BibTeX import/export problems

I just started using Zotero and think that it is a wonderful App. However, I am strongly tied to BibTeX and must have proper import/export for large databases. I have observed the following problems in import/export:

1. Case is not preserved properly on export. In BibTeX, uppercase in the title is only guaranteed if encapsulated by {}. Zotero does not encapsulate uppercase with {} on export, and this leads to problems. Example: if the title is Title={My SPECIAL Title}, then this should be exported as Title={My {SPECIAL} {T}itle}. In this case, the import from .bib respects case, but the export to .bib breaks it.

2. The \url tag is not properly imported from BibTeX files. For example, if I have a note field:
Note = {\url{}},
after import and then export this turns into:
Note = {{\textbackslash}url},

3. @url, @electronic, and @conference entries are converted to @misc entries.

4. The citation key debate seems to rage on the forums, but it seems that the citation key should be maintained from import to export. This isn't a total deal breaker for me, since I can always regenerate the key in my own preferred style...

Issues 1-3 are potential deal-breakers since it means a LOT of work for me to convert my current .bib files over to Zotero. Any advice???
From other posts
  • I just noticed another problem:

    The ~ character is not imported correctly from a .bib file. That means that URLs are broken on output. There is a ℆ character instead (c/u) - I have no idea what that is...
  • edited October 10, 2008
    Re. 1: The final case in your citations should depend upon the particular .BST file that you use w/ BibTeX. While I can see a justification for encapsulating words that have an uppercase letter in other-than-the-first position, I don't know if I feel the same way about words that only have an initial cap (title case).

    Given the title "DNA: The Secret of Life," you will almost always want to capitalize all of "DNA," but the initial case of other words may vary.
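
The rule described in this reply (protect words with an uppercase letter in a non-initial position, leave plain title-case words alone) can be sketched as follows. This is only an illustration, not the code of the Zotero translator; the function name `protectInnerCaps` is hypothetical, and the sketch assumes the title does not already contain braces.

```javascript
// Wrap any word that has an uppercase letter after its first character in
// braces, so that a BibTeX style cannot lowercase it. Words with only an
// initial capital are left untouched, since their case may vary by style.
function protectInnerCaps(title) {
  return title.replace(/\p{L}[\p{L}\p{N}]*/gu, (word) =>
    /.\p{Lu}/u.test(word) ? "{" + word + "}" : word
  );
}

console.log(protectInnerCaps("DNA: The Secret of Life"));
// prints: {DNA}: The Secret of Life
```

A word such as "Title" is not wrapped, so a proper noun with only an initial capital would still need manual protection.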
  • Re. 2.: We currently mostly map individual characters or strings. Perhaps the right thing to do would be to map BibTeX's '\url' to nothing, so that your zotero note will be "" instead of "\url". This would seem to be much cleaner. I don't know about the map back to BibTeX, though--in some cases, people might not want to use '\url'; they might prefer no special package or '\href'.

    Any ideas on something that both has a clean representation in Zotero and a friendly export for those that want to roundtrip BibTeX?
  • edited October 9, 2008
    Re. 3: @url and @electronic are very uncommon. I don't know if I'd want any zotero type to map to them. Which mappings would you propose for these types?

    I think that @conference should map to Zotero's "Conference Paper" type. I think that the "Conference Paper" type should continue to map to "@inproceedings," though. @inproceedings is much more common (I think that @conference might be a legacy format that is almost always handled identically to @inproceedings). Would there be some significant disadvantage of losing the explicit @conference mapping that I am missing?

    EDIT: I don't think that zotero should write out any type that isn't a core, enumerated type. Postel's law supersedes the ability to round-trip data.
  • It seems to me that the ability to reliably export to BibTeX is critical for a significant segment of the target users for Zotero. Round-tripping data is not as critical as getting a robust Zotero->BibTeX export. Postel's law in this context suggests that Zotero's exporting capabilities need to be flexible and robust since the end use of Zotero is to manage bibliographies in documents.

    Once data is imported into Zotero, even if the import requires a lot of cleanup, I would certainly be using Zotero as my primary database, but would be routinely exporting to BibTeX for document and bibliography creation. Thus, a reliable export that requires little or no cleanup is crucial.

    Regarding some of the questions you raised:

    1. Capitalization is a critical issue that must be addressed somehow. The ability to override a .bst setting to capitalize acronyms (e.g. DNA) and proper nouns is critically important. Otherwise each BibTeX export will need to be reformatted. In cases where we export thousands of entries to a BibTeX database, this is not a viable solution.

    2. I agree that different people may want to tag with \url or with \href. Why not create an export filter for BibTeX that allows the user control over such options. IMHO this would greatly improve the usability of Zotero for LaTeX users.

    3. Conference paper should *not* map to proceedings. Many conferences do not have formal published proceedings, but one needs to cite presentations made there nonetheless. This is the distinction between @conference and @inproceedings.
  • I, too, am eager to use Zotero in conjunction with BibTeX. I've noticed two bugs when importing from .bib files:

    1. For @incollection (and possibly other citation types), editors are imported as authors.

    2. In TeX the double hyphen (--) represents an en dash (–), but for page ranges Zotero imports an em dash instead. Also, Zotero imports a TeX triple hyphen (---) as an em dash followed by a hyphen; it should merely import an em dash.
  • addresses most concerns raised in this thread to some degree.

    @neatnate, I can't reproduce (1) in the trunk. (2) is fixed by the patch (-- was actually being replaced by a quotation dash, which is a distinct Unicode character from both the en dash and the em dash).
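
For readers following the dash discussion, here is a minimal sketch of the import mapping being asked for, assuming plain string replacement is enough. The function name `importTexDashes` is hypothetical and this is not the patch mentioned above. The order matters: the triple hyphen has to be replaced first, otherwise it would be read as an en dash followed by a hyphen.

```javascript
// Convert TeX dash ligatures to Unicode on import.
function importTexDashes(text) {
  return text
    .replace(/---/g, "\u2014") // triple hyphen -> EM DASH (must run first)
    .replace(/--/g, "\u2013"); // double hyphen -> EN DASH
}

console.log(importTexDashes("pages 101--104"));
```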
  • Am I right, that now the import/export of the bibtex-key is the only open issue in this thread?
  • I propose that Zotero survey its users through a poll email to establish which databases and citation programs users need to import from and which they want to export to. Zotero already has the login email addresses of all forum users and could put a link to the online survey on the website. There is no need to get precise numbers; the point of such a poll is to learn where the bulk of the demand lies.

  • The import/export of the bibtex key is NOT the only open issue.

    Another issue of critical importance is maintaining case in a BibTeX export. It would be nice to at least maintain upper case (enforced by enclosing text in {}) for acronyms. However, allowing a user to choose whether to respect all upper case in the Zotero database in a BibTeX export would be really nice as well.

    Lastly, the URL issue (see number 2 in my initial post) should also be addressed. Potentially, this could also be addressed via a user-configurable output filter.
  • @ftr: what does that proposed poll have to do with BibTeX?
    The case issues were addressed by the patch I described. The URL issue was at least partially addressed: "\url" and "\href" are stripped on import. This is the right way for the links to be stored in Zotero, at least until fields fully support semantic markup/rich text. The text is still exported as-is, without an explicit link. I disagree that an option is the right way to solve this, but I don't know what the right way is.
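
One way to picture the import-side stripping is the sketch below. It is an assumption about what "stripped" could mean, not the actual patch: `\url{X}` becomes `X`, and `\href{U}{T}` becomes `T (U)`. The function name `stripLinkCommands` and the output format for `\href` are hypothetical, and arguments containing nested braces are not handled.

```javascript
// Remove \url and \href wrappers from a field value, keeping their arguments.
function stripLinkCommands(value) {
  return value
    .replace(/\\href\{([^{}]*)\}\{([^{}]*)\}/g, "$2 ($1)") // keep text, then URL
    .replace(/\\url\{([^{}]*)\}/g, "$1"); // keep the bare URL
}

console.log(stripLinkCommands("See \\url{http://example.org/paper}"));
// prints: See http://example.org/paper
```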
  • I just find it strange that Zotero treats the 4th issue (i.e. the bibtex_key) as a troublesome feature.
    There is a simple solution.

    When the user chooses to import/export BibTeX:
    1. On import from a .bib file, offer an "import user_key" checkbox (the default should obviously be "yes").
    2. On export from Zotero to a .bib file, offer a "use user_key" checkbox (the default should be "yes").

    In the preferences dialog, add an options tab for BibTeX,
    e.g. a checkbox for "show user_key in the right column" (the default can be "off").

    The rule Zotero uses to generate the bibtex_key is pointless for journals in East Asian languages, e.g. Chinese, Japanese, and Korean: the author name is not suitable for a BibTeX key and leads to funny keys like "____1992".

    I wish the Zotero team would adopt this simple solution.
  • As I've said in another thread on this issue, Zotero a) should not adopt ANY features specific to BibTeX, and b) needs to consider these requests in the context of larger goals. WRT this example, the relevant larger goals are the forthcoming social networking and collaboration functionality.

    A BibTeX key is a local natural language identifier. WRT a BibTeX file that's imported, the key is local to (and hence unique within) just that file.

    For an app like Zotero 1.0, the key would have to be local to the database. Here you're already faced with a practical problem of resolving potentially duplicate keys. E.g. if you import two files and both have keys "smith1999" but are in fact referring to different items, then you need some way to resolve them.

    Zotero 2.0 is intended to allow social-networking-like sharing and discovery of items. Imagine you have an item that is part of the libraries of 100 users. Let's say we give that item a global identifier like , and/or maybe . What's the key, and what's its scope?

    You might reasonably say for the first problem that Zotero should just automatically resolve the duplicate keys (though that's not necessarily so straightforward). Or, for the second, you might say the local scope remains the user, so that we might in theory have 100 different keys for the same item. But I'm just emphasizing that this isn't so straightforward as you suggest.
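
To make the first problem concrete, here is a minimal sketch of automatic duplicate-key resolution, assuming items are plain objects with a `key` property. The function name `assignUniqueKeys` and the numeric suffix scheme are hypothetical. As the post says, this is not a full answer: the sketch cannot tell whether two items with the same key are the same work, and a renamed key no longer matches citations already written against the original file.

```javascript
// Give every imported item a key that is unique within the library by
// appending "-1", "-2", ... to keys that have already been seen.
function assignUniqueKeys(items) {
  const used = new Set();
  return items.map((item) => {
    let key = item.key;
    let n = 0;
    while (used.has(key)) {
      n += 1;
      key = item.key + "-" + n;
    }
    used.add(key);
    return { ...item, key };
  });
}

const merged = assignUniqueKeys([
  { key: "smith1999", title: "First file's item" },
  { key: "smith1999", title: "Second file's item" },
]);
console.log(merged.map((item) => item.key).join(", "));
// prints: smith1999, smith1999-1
```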
  • Another hiccup in exporting to Bibtex: if a date field contains something other than a standard date (e.g. "forthcoming" or "ms"), the entry gets exported with no date field at all. Help? Thanks.
  • "As I've said in another thread on this issue, Zotero a) should not adopt ANY features specific to BibTeX, and b) needs to consider these requests in the context of larger goals. WRT this example, the relevant larger goals are the forthcoming social networking and collaboration functionality."
    I strongly agree with your desire not to wed Zotero to problematic citation solutions in ways that might impede its progress towards forward-looking (elegant, easy, standardized, or even just, working) mechanisms for academic citations, and you have collaborated in and led some of the most interesting recent work on these things. However, I'm not convinced that explicit Zotero support of Bibtex keys would mean that Zotero would be taking on cruft which would slow down progress to better citation technologies.

    The request for good management of BibTeX keys could just be rolled into a broader request for usable, self-consciously local, human-readable record IDs, as you and I discussed in this thread. These would then presumably be mapped onto a robust global equivalent of the BibTeX key (URIs). You wrote:
    "So, for example, you have a generic notion of a citation and a reference. A citation is an ordered list of references. Each reference contains, say, a local id label, and a global id (a URI), plus optional locators, prefix and suffix, etc."
    What I'm trying to suggest is that local identifiers (like BibTeX keys) are one pretty versatile means by which high-quality citation formatting could be added easily to text written in a wide variety of writing tools, including those which use plain-text formatting like the recent spate of lightweight markup languages. The only odd thing in the present situation, and the reason these requests come forward as "BibTeX support" requests, is that BibTeX is the sole longstanding, open implementation of this idea. Whatever its own faults, managing citations the way BibTeX does (with a simple human-generatable reference to (1) a prior work needing to be cited and (2) a citation style with which to format the citation) is a pretty good approach. As I suggest in that thread, it could also be a workable path for Zotero/CSL citation to be usable for a broad range of currently unsupported word-processors and text editors, both on- and offline.

    So yes, you're right to urge keeping Zotero from developing a debt to legacy citation engines which we all hope will be superseded, and to local and individually-focused solutions in particular. But a human-readable "LocalID" field could (it seems) be implemented in the near term, in a way that is deliberately aimed at providing a means for working with other citation tools alongside BibTeX (citeproc-hs comes to mind), but which could also provide immediate benefits to those who are working with various incarnations of LaTeX. For those who use them, this field would be mapped to BibTeX keys. For all its cruft, LaTeX still has a lot of users who have found nothing to replace its breadth, design flexibility, and massive FLOSSy extensibility. For it, and for any solution which operates in plain text, nice, usable, local IDs seem to me to be one good way for the writer to specify just what prior work s/he wants to make reference to.
  • A similar problem: if page numbers include letters, zotero adds curly braces. (Some journals do this to distinguish the "letters" section from the main journal.) Then the .bib file ends up with double curly braces, e.g.

    pages = {{L101--L104}} rather than pages = {L101--L104}

    I can find these and edit them, but it's a big pain.
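
Until the export itself is fixed, the manual edit could be automated with a small post-processing step on the exported .bib text. This is a hedged sketch, assuming the file is read into a string first; the function name `unwrapPages` is hypothetical, and it only touches `pages` fields whose value is wrapped in exactly two pairs of braces.

```javascript
// Turn "pages = {{L101--L104}}" into "pages = {L101--L104}" throughout
// the text of an exported .bib file.
function unwrapPages(bibText) {
  return bibText.replace(/(\bpages\s*=\s*)\{\{([^{}]*)\}\}/gi, "$1{$2}");
}

console.log(unwrapPages("pages = {{L101--L104}},"));
// prints: pages = {L101--L104},
```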
  • okey dokey. Dunno how you keep up with all this stuff !!
  • Hi, I guess this thread is relevant. I have a capitalization {Gy.} which I wish to preserve, but all other words in the title should follow sentence-case capitalization.

    Can I markup the title in zotero somehow to tell it that the G should stay capitalized on export to bibtex?
  • Not at this time: there's no semantic markup in Zotero fields & there is no way to differentiate proper nouns/etc. in title case.
  • OK. Thank you noksagt.

    I was able to work around this by including curly braces within the zotero title (as I currently do for bibtex), but obviously this is not a long-term solution.
  • I would just like to echo scot's remarks. A "LocalID" field that could double as a BibTeX citekey would be a huge help for BibTeX users like myself and might also prove to be useful for non-BibTeX users further down the line.

    A human-readable LocalID makes sense for citation management in general. I can imagine many cases where it would be useful to type a LocalID in-line in any text editor and then be able to do post-processing that uniquely matches that LocalID with an entry in Zotero.

    I realize that Zotero must remain forward looking, but I think this could be very beneficial. I saw that in January 2008, Dan Stillman suggested that exposing a LocalID in the UI would be a goal for Zotero 1.5:

    I hope it's not too late to get that into the final release.

    Thanks again to all of you zotero developers.
  • For those interested, I made a copy of the BibTex translator that allows you to export only the \cite{key}. This enables me to use Zotero to drag-and-drop or copy-paste the citekey into a waiting text editor. The cite key is the same as the one that the default zotero BibTex translator uses.

    Link is here:

    Instructions here:
  • Hi,

    There is an issue when exporting an entry containing the Polish ł to BibTeX. I have no problems with any other accented character (they are exported correctly to their \accent{letter} format), but maybe there are others that are missing.

    The LaTeX code for ł is \l{} (and \L{} for the capital version).
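
The missing entries could look roughly like this. It is a sketch only: the names `polishLMap` and `escapePolishL` are hypothetical and are not the identifiers used in the Zotero translator.

```javascript
// Map the Polish letter l-with-stroke to the LaTeX commands given above.
const polishLMap = {
  "\u0142": "\\l{}", // LATIN SMALL LETTER L WITH STROKE
  "\u0141": "\\L{}", // LATIN CAPITAL LETTER L WITH STROKE
};

function escapePolishL(text) {
  return text.replace(/[\u0141\u0142]/g, (ch) => polishLMap[ch]);
}

console.log(escapePolishL("Paw\u0142owski"));
// prints: Paw\l{}owski
```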

  • Thank you noksagt for pointing these out, and thank you Dan Stillman for fixing much of the BibTeX capitalization problems. I was wondering if there are any new ideas regarding how the BibTeX exporter should handle capitalization for the title field.

    I know this was being considered as a general issue with Zotero, and I like the about:config setting solution, but obviously this doesn't affect the BibTex output.

    I have a lot of citations with "C. elegans" in the middle of the title that I need to output to Bibtex. This is doubly tricky because "C." is always capitalized but "elegans" never is. Is there an obvious way to deal with that?

    If it's simple I could even edit the Translator file myself.
    Thanks again.
  • "thank you Dan Stillman for fixing much of the BibTex capitalization problems"
    Credit for the patch goes to noksagt. I just checked it in.
  • I just ran into the issue of upper/lower case when exporting to BibTeX. It appears to me that the biggest issue is that of proper nouns getting lower-cased by various style sheets applied to BibTeX entries. It appears that if a token (~word) has an upper case letter in non-initial position, the export process correctly brackets it. This (and maybe some other tricks I haven't noticed) goes a long way towards a solution, but of course it doesn't solve the problem for most proper nouns.

    I can think of two possible (albeit partial) solutions to the proper noun issue: have a dictionary of proper nouns, and bracket any token matching a word in that dictionary; or have a dictionary of non-proper nouns, and bracket any token *not* in that dictionary. The latter seems much more reasonable, since such dictionaries already exist (one could start with the aspell dictionaries, for example); and the list of proper nouns is open ended.

    Would something like this be feasible? It won't work where a noun is ambiguous (e.g. "Bush" in "George W. Bush" is also a common noun), but it might go a long way.
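
The second option (a dictionary of ordinary words, bracketing capitalized tokens that are not in it) might be sketched like this. The tiny word list stands in for a real dictionary such as the aspell lists, and `protectUnknownWords` is a hypothetical name; the sketch assumes the title contains no braces yet.

```javascript
// Bracket every capitalized word whose lowercase form is NOT in the
// dictionary of ordinary words, on the guess that it is a proper noun.
function protectUnknownWords(title, commonWords) {
  return title.replace(/\p{L}+/gu, (word) => {
    const capitalized = /^\p{Lu}/u.test(word);
    const known = commonWords.has(word.toLowerCase());
    return capitalized && !known ? "{" + word + "}" : word;
  });
}

// Stand-in dictionary; a real one would hold tens of thousands of words.
const commonWords = new Set(["the", "genome", "of", "bush"]);

console.log(protectUnknownWords("The genome of Chlamydomonas reinhardtii", commonWords));
// prints: The genome of {Chlamydomonas} reinhardtii
console.log(protectUnknownWords("George W. Bush", commonWords));
// prints: {George} {W}. Bush
```

The second call shows the ambiguity raised in the post: "Bush" is left unprotected because "bush" is an ordinary word.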
  • Feature requests for Bibtex export:

    1.) When exporting to Bibtex, it would be great if we could preserve the "LastName, FirstName" ordering so that last names consisting of multiple words show up correctly in the final Latex document. For example "Frances K. Del Boca" now shows up in the exported Bibtex db exactly as written, instead of "Del Boca, Frances K." as it exists in my Zotero library. The resulting Latex output is "Boca, F. K. D." whereas it should be "Del Boca, F. K."

    2.) It would be great if the default Bibtex key was "AuthorYearTitle." On my own system, I've changed my BibTeX.js output script so that line 59 reads:

    var citeKeyFormat = "%a%y%t";

    But it would be really nice if this was the default, or at least there was some way for people to change the output without manually editing the js.

    3.) It would also be nice if the character mapping table mapped all dash-like output to the "-" character. I think otherwise Bibtex chokes on those characters (though I am running Windows XP and perhaps it is something about Windows). Have others had this issue? I have had to change my BibTeX.js file as follows:

    var mappingTable = {
        "\u2013":"-", // EN DASH
        "\u2014":"-", // EM DASH
        "\u2015":"-", // HORIZONTAL BAR or QUOTATION DASH (not in LaTeX -- use EM DASH)
        // ... remaining entries unchanged
    };

    Thanks very much to all. Zotero makes my life much easier.
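
For request 1.), the desired author output could be produced along these lines. This is a sketch under assumptions: creators are taken to be objects with `firstName` and `lastName` properties, and `formatBibtexAuthors` is a hypothetical helper, not a function from BibTeX.js.

```javascript
// Write each creator as "Last, First" so that multi-word last names such as
// "Del Boca" survive BibTeX's name parsing; join names with " and ".
function formatBibtexAuthors(creators) {
  return creators
    .map((c) => (c.firstName ? c.lastName + ", " + c.firstName : c.lastName))
    .join(" and ");
}

console.log(formatBibtexAuthors([
  { firstName: "Frances K.", lastName: "Del Boca" },
  { firstName: "John", lastName: "Smith" },
]));
// prints: Del Boca, Frances K. and Smith, John
```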

  • I also have the problem concerning capitalized words in BibTeX exports; in my case "Chlamydomonas reinhardtii" became "chlamydomonas reinhardtii". I would expect article titles to be printed as they are, and I don't get why LaTeX performs its own capitalization. However, a solution to the problem is to encapsulate the whole title in { }. The following diff (against a freshly checked out subversion trunk as of 2010-05-06) resolves the problem by adding curly braces around the title in .bib files:

    Index: translators/BibTeX.js
    --- translators/BibTeX.js (revision 6025)
    +++ translators/BibTeX.js (working copy)
    @@ -1833,8 +1833,12 @@
     if (Zotero.getOption("exportCharset") != "UTF-8") {
         value = value.replace(/[\u0080-\uFFFF]/g, mapAccent);
    -Zotero.write(value);
    -if(!isMacro) Zotero.write("}");
    +if ((!isMacro) && (field == "title"))
    +    Zotero.write("{");
    +Zotero.write(value);
    +if ((!isMacro) && (field == "title"))
    +    Zotero.write("}");
    +if(!isMacro) Zotero.write("}");

     function mapEscape(character) {

    Is there any chance this will make it into any of the upcoming releases? I would be happy if I didn't have to patch this file every time Zotero is updated.

    Cheers - Micha.