BibTeX import/export problems
I just started using Zotero and think that it is a wonderful App. However, I am strongly tied to BibTeX and must have proper import/export for large databases. I have observed the following problems in import/export:
1. Case is not preserved properly on export. In BibTeX, uppercase in a title is only guaranteed to survive if it is enclosed in {}. Zotero does not wrap uppercase text in {} on export, and this leads to problems. Example: if the title is Title={My SPECIAL Title}, it should be exported as Title={My {SPECIAL} {T}itle}. In this case, the import from .bib respects case, but the export to .bib breaks it.
2. The \url tag is not properly imported from BibTeX files. For example, if I have a note field:
Note = {\url{http://www.paraview.org}},
after import and then export this turns into:
Note = {{\textbackslash}urlhttp://www.paraview.org},
3. @url, @electronic, and @conference entries are converted to @misc entries.
4. The citation key debate seems to rage on the forums, but it seems that the citation key should be maintained from import to export. This isn't a total deal breaker for me, since I can always regenerate the key in my own preferred style...
Issues 1-3 are potential deal-breakers since it means a LOT of work for me to convert my current .bib files over to Zotero. Any advice???
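To make issue 1 concrete, here is a minimal sketch of the kind of brace protection being asked for. The function name protectCase is made up for illustration and this is not Zotero's actual export code. It assumes plain ASCII titles and does not check for braces that are already present.

```javascript
// Hypothetical helper (not part of Zotero): wrap capitals in {} so that
// BibTeX styles which lowercase titles leave them alone.
function protectCase(title) {
  return title.replace(/\b[A-Z][A-Za-z]*\b/g, (word, offset) => {
    // All-caps words such as acronyms are protected as a whole.
    if (word.length > 1 && word === word.toUpperCase()) {
      return "{" + word + "}";
    }
    // The first word of the title is left as-is.
    if (offset === 0) {
      return word;
    }
    // Any other capitalized word gets its first letter protected.
    return "{" + word[0] + "}" + word.slice(1);
  });
}

console.log(protectCase("My SPECIAL Title")); // My {SPECIAL} {T}itle
```

Running this on a title that already contains braces would brace it twice, so a real exporter would need to skip protected text.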
From other posts
The ~ character is not imported correctly from a .bib file. That means that URLs are broken on output. There is a ℆ character instead (c/u) - I have no idea what that is...
Given the title "DNA: The Secret of Life," you will almost always want to capitalize all of "DNA," but the initial case of other words may vary.
Any ideas on something that both has a clean representation in Zotero and a friendly export for those that want to roundtrip BibTeX?
I think that @conference should map to Zotero's "Conference Paper" type. I think that the "Conference Paper" type should continue to map to "@inproceedings," though. @inproceedings is much more common (I think that @conference might be a legacy format that is almost always handled identically to @inproceedings). Would there be some significant disadvantage of losing the explicit @conference mapping that I am missing?
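The mapping proposed here can be written down as two small tables. This is only a sketch: the table names are invented, and "conferencePaper" is used as the identifier for Zotero's "Conference Paper" type.

```javascript
// Hypothetical tables illustrating the proposal, not the translator's real ones.
// Import: both BibTeX types land on the same Zotero type.
const bibtexToZotero = {
  inproceedings: "conferencePaper",
  conference: "conferencePaper",
};

// Export: the Zotero type is always written as @inproceedings.
const zoteroToBibtex = {
  conferencePaper: "inproceedings",
};

// What an entry type looks like after import followed by export.
function roundTripType(bibtexType) {
  return zoteroToBibtex[bibtexToZotero[bibtexType]];
}

console.log(roundTripType("conference")); // inproceedings
```

The cost of the proposal is visible in the last line: an entry that came in as @conference goes out as @inproceedings.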
EDIT: I don't think that Zotero should write out any type that isn't a core, enumerated type. Postel's law supersedes the ability to round-trip data.
Once data is imported into Zotero, even if the import requires a lot of cleanup, I would certainly be using Zotero as my primary database, but would be routinely exporting to BibTeX for document and bibliography creation. Thus, a reliable export that requires little or no cleanup is crucial.
Regarding some of the questions you raised:
1. Capitalization is a critical issue that must be addressed somehow. The ability to override a .bst setting to capitalize acronyms (e.g. DNA) and proper nouns is critically important. Otherwise each BibTeX export will need to be reformatted. In cases where we export thousands of entries to a BibTeX database, this is not a viable solution.
2. I agree that different people may want to tag with \url or with \href. Why not create an export filter for BibTeX that gives the user control over such options? IMHO this would greatly improve the usability of Zotero for LaTeX users.
3. Conference paper should *not* map to proceedings. Many conferences do not have formal published proceedings, but one needs to cite presentations made there nonetheless. This is the distinction between @conference and @inproceedings.
1. For @incollection (and possibly other citation types), editors are imported as authors.
2. In TeX the double hyphen (--) represents an en dash (–), but for page ranges Zotero imports an em dash instead. Also, Zotero imports a TeX triple hyphen (---) as an em dash followed by a hyphen; it should import just an em dash.
@neatnate, I can't reproduce (1) in the trunk. (2) is fixed by the patch (-- was actually being replaced by a quotation dash, which is a distinct Unicode character from both the en dash and the em dash).
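The expected import behaviour can be sketched in a few lines. The function name importTexDashes is an assumption for illustration, not the translator's own code. The point is the order: the triple hyphen has to be replaced before the double hyphen, otherwise --- turns into a dash followed by a stray hyphen.

```javascript
// Hypothetical helper: convert TeX dash ligatures to Unicode on import.
function importTexDashes(text) {
  return text
    .replace(/---/g, "\u2014") // triple hyphen -> EM DASH, must run first
    .replace(/--/g, "\u2013"); // double hyphen -> EN DASH
}

console.log(importTexDashes("101--104"));
console.log(importTexDashes("before---after"));
```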
-ft
Another issue of critical importance is maintaining case in a BibTeX export. It would be nice to at least maintain upper case (enforced by enclosing text in {}) for acronyms. However, allowing a user to choose whether to respect all upper case in the Zotero database in a BibTeX export would be really nice as well.
Lastly, the URL issue (see number 2 in my initial post) should also be addressed. Potentially, this could also be addressed via a user-configurable output filter.
@jamessuthe:
The case issues were addressed by the patch I described. The URL issue was at least partially addressed--"\url" and "\href" are stripped on import. This is the right way for the links to be stored in Zotero at least until full fields support semantic markup/rich text. It is still exported as-is, without an explicit link. I disagree that an option is the right way to solve this, but don't know what the right way is.
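For illustration, stripping the \url wrapper on import could look roughly like this. It is a sketch under assumptions: stripUrlCommand is a made-up name, only \url is handled, and URLs containing braces are not matched. How \href{url}{text} should be reduced is a separate choice that is left out here.

```javascript
// Hypothetical helper: drop the \url{...} wrapper and keep the bare URL.
function stripUrlCommand(value) {
  return value.replace(/\\url\{([^{}]*)\}/g, "$1");
}

console.log(stripUrlCommand("\\url{http://www.paraview.org}"));
// http://www.paraview.org
```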
Indeed, there is a simple solution.
When the user chooses to import or export BibTeX:
1. When importing from a .bib file, allow one to check "import user_key" (obviously the default value should be "yes").
2. When exporting from Zotero to a .bib file, allow one to check "use user_key" (the default value should be "yes").
In the preferences dialog, add an options tab for BibTeX,
e.g. a checkbox for "show user_key in the right column" (the default value can be "off").
Indeed, the rule Zotero uses to generate the BibTeX key is pointless for journals in East Asian languages, e.g. Chinese, Japanese, and Korean: the author name is not suitable for a BibTeX key and leads to funny keys like "____1992".
I wish the Zotero team would adopt this simple solution.
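In code, the proposal amounts to something like the following. Everything here is hypothetical: the userKey field, the useUserKey option, and the generateKey callback are names invented for the sketch, not existing Zotero settings.

```javascript
// Hypothetical export-side logic for the "use user_key" checkbox.
function chooseCiteKey(item, options, generateKey) {
  // Prefer the key that came in with the .bib file, if the user asked for it.
  if (options.useUserKey && item.userKey) {
    return item.userKey;
  }
  // Otherwise fall back to whatever rule generates keys today.
  return generateKey(item);
}

const item = { userKey: "zhang1992review" };
console.log(chooseCiteKey(item, { useUserKey: true }, () => "____1992"));
// zhang1992review
```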
A BibTeX key is a local natural language identifier. With respect to an imported BibTeX file, the key is local to (and hence unique within) just that file.
For an app like Zotero 1.0, the key would have to be local to the database. Here you're already faced with a practical problem of resolving potentially duplicate keys. E.g. if you import two files and both have keys "smith1999" but are in fact referring to different items, then you need some way to resolve them.
Zotero 2.0 is intended to allow social-networking-like sharing and discovery of items. Imagine you have an item that is part of the libraries of 100 users. Let's say we give that item a global identifier like , and/or maybe . What's the key, and what's its scope?
You might reasonably say for the first problem that Zotero should just automatically resolve the duplicate keys (though that's not necessarily so straightforward). Or, for the second, you might say the local scope remains the user, so that we might in theory have 100 different keys for the same item. But I'm just emphasizing that this isn't so straightforward as you suggest.
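As a rough illustration of "automatically resolve the duplicate keys", one naive scheme is to append a numeric suffix on collision. This is an assumption made for the sketch, not something Zotero does, and the function name uniqueKey is invented.

```javascript
// Hypothetical helper: return a key that is not yet in usedKeys,
// appending -2, -3, ... when the requested key is taken.
function uniqueKey(key, usedKeys) {
  let candidate = key;
  for (let n = 2; usedKeys.has(candidate); n++) {
    candidate = key + "-" + n;
  }
  usedKeys.add(candidate);
  return candidate;
}

const used = new Set();
console.log(uniqueKey("smith1999", used)); // smith1999
console.log(uniqueKey("smith1999", used)); // smith1999-2
```

The catch is that the second file's documents still cite "smith1999", so a silent rename breaks them. That is part of why this is less straightforward than it looks.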
The request for good management of BibTeX keys could just be rolled into a broader request for usable, self-consciously local, human-readable record IDs, as you and I discussed in this thread. These would then presumably be mapped onto a robust global equivalent of the BibTeX key (URIs). You wrote: What I'm trying to suggest is that local identifiers (like BibTeX keys) are one pretty versatile means by which high-quality citation formatting could be added easily to text written in a wide variety of writing tools, including those which use plain-text formatting like the recent spate of lightweight markup languages. The only odd thing in the present situation, and the reason these requests come forward as "BibTeX support" requests, is that BibTeX is the sole longstanding, open implementation of this idea. Whatever its own faults, the idea of managing citations the way BibTeX does (with a simple human-generatable reference to (1) a prior work needing to be cited and (2) a citation style with which to format the citation) is a pretty good one. As I suggest in that thread, it could also be a workable path for Zotero/CSL citation to be usable for a broad range of currently unsupported word processors and text editors, both on- and offline.
So yes, you're right to urge keeping Zotero from developing a debt to legacy citation engines which we all hope will be superseded, and to local and individually-focused solutions in particular. But a human-readable "LocalID" field could (it seems) be implemented in the near term, in a way that is deliberately aimed at providing a means for working with other citation tools alongside BibTeX (citeproc-hs comes to mind), but which could also provide immediate benefits to those who are working with various incarnations of LaTeX. For those who use them, this field would be mapped to BibTeX keys. For all its cruft, LaTeX still has a lot of users who have found nothing to replace its breadth, design flexibility, and massive FLOSSy extensibility. For it, and for any solution which operates in plain text, nice, usable, local IDs seem to me to be one good way for the writer to specify just what prior work s/he wants to make reference to.
pages = {{L101--L104}} rather than pages = {L101--L104}
I can find these and edit them, but it's a big pain.
http://forums.zotero.org/discussion/5762/
Can I mark up the title in Zotero somehow to tell it that the G should stay capitalized on export to BibTeX?
I was able to work around it by including curly braces within the Zotero title (as I currently do for BibTeX), but obviously this is not a long-term solution.
A human-readable LocalID makes sense for citation management in general. I can imagine many cases where it would be useful to type a LocalID in-line in any text editor and then be able to do post-processing that uniquely matches that LocalID with an entry in Zotero.
I realize that Zotero must remain forward looking, but I think this could be very beneficial. I saw that in January 2008, Dan Stillman suggested that exposing a LocalID in the UI would be a goal for Zotero 1.5:
http://forums.zotero.org/discussion/1952/bibtex-key-consistancyexposure-within-zotero/
I hope it's not too late to get that into the final release.
Thanks again to all of you Zotero developers.
Link is here:
http://www.people.fas.harvard.edu/~leifer/zotero/BibTexCiteKeyOnly.js
Instructions here:
http://forums.zotero.org/discussion/146/integration-with-latex/#Item_18
There is an issue when exporting an entry containing the Polish ł to BibTeX. I have no problems with any other accented character (they are exported correctly to their \accent{letter} format), but maybe there are others that are missing.
The LaTeX code for ł is \l{} (and \L{} for the capital version).
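A sketch of what the missing entries might look like, written as a standalone snippet rather than a patch to the translator. The names extraMappings and escapeForBibtex are invented for illustration.

```javascript
// Hypothetical mapping entries for the Polish l with stroke.
const extraMappings = {
  "\u0142": "\\l{}", // LATIN SMALL LETTER L WITH STROKE
  "\u0141": "\\L{}", // LATIN CAPITAL LETTER L WITH STROKE
};

// Replace only the two characters above; everything else is left untouched.
function escapeForBibtex(text) {
  return text.replace(/[\u0141\u0142]/g, (ch) => extraMappings[ch]);
}

console.log(escapeForBibtex("\u0141ukasz")); // \L{}ukasz
```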
Cheers,
Francois.
I know this was being considered as a general issue with Zotero, and I like the about:config setting solution, but obviously this doesn't affect the BibTeX output.
I have a lot of citations with "C. elegans" in the middle of the title that I need to output to BibTeX. This is doubly tricky because "C." is always capitalized but "elegans" never is. Is there an obvious way to deal with that?
If it's simple I could even edit the translator file myself.
Thanks again.
I can think of two possible (albeit partial) solutions to the proper noun issue: have a dictionary of proper nouns, and bracket any token matching a word in that dictionary; or have a dictionary of non-proper nouns, and bracket any token *not* in that dictionary. The latter seems much more reasonable, since such dictionaries already exist (one could start with the aspell dictionaries, for example); and the list of proper nouns is open ended.
Would something like this be feasible? It won't work where a noun is ambiguous (the "Bush" in "George W. Bush" is also the ordinary word "bush"), but it might go a long way.
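Here is a minimal sketch of the second approach, bracketing any capitalized token that is not in a dictionary of ordinary words. The function name and the tiny word list are placeholders. A real list would come from something like the aspell dictionaries mentioned above.

```javascript
// Hypothetical helper: brace capitalized tokens that are not ordinary words.
function protectProperNouns(title, commonWords) {
  return title.replace(/[A-Za-z]+/g, (word) => {
    const capitalized = /^[A-Z]/.test(word);
    const ordinary = commonWords.has(word.toLowerCase());
    return capitalized && !ordinary ? "{" + word + "}" : word;
  });
}

// Toy dictionary for the example only.
const commonWords = new Set(["the", "life", "of", "bush"]);
console.log(protectProperNouns("The Life of George W. Bush", commonWords));
// The Life of {George} {W}. Bush
```

The output shows the ambiguity problem directly: "Bush" is left unprotected because "bush" is in the dictionary.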
1.) When exporting to BibTeX, it would be great if we could preserve the "LastName, FirstName" ordering so that last names consisting of multiple words show up correctly in the final LaTeX document. For example, "Del Boca, Frances K." in my Zotero library is written to the exported BibTeX database as "Frances K. Del Boca". The resulting LaTeX output is "Boca, F. K. D." whereas it should be "Del Boca, F. K."
2.) It would be great if the default BibTeX key was "AuthorYearTitle." On my own system, I've changed my BibTeX.js output script so that line 59 reads:
var citeKeyFormat = "%a%y%t";
But it would be really nice if this was the default, or at least if there was some way for people to change the output without manually editing the js.
3.) It would also be nice if the character mapping table mapped all dash-like output to the "-" character. I think otherwise BibTeX chokes on those characters (though I am running Windows XP and perhaps it is something about Windows). Have others had this issue? I have had to change my BibTeX.js file as follows:
var mappingTable = {
    ...
    "\u2013":"-", // EN DASH
    "\u2014":"-", // EM DASH
    "\u2015":"-", // HORIZONTAL BAR or QUOTATION DASH (not in LaTeX -- use EM DASH)
    ...
};
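For request 1 above, the export would need to write each author as "LastName, FirstName". A minimal sketch, assuming each creator is an object with firstName and lastName fields; the function name formatAuthors is invented and this is not the translator's current code.

```javascript
// Hypothetical helper: build a BibTeX author field in "Last, First" form.
function formatAuthors(creators) {
  return creators
    .map((c) => (c.firstName ? c.lastName + ", " + c.firstName : c.lastName))
    .join(" and "); // BibTeX separates authors with " and "
}

console.log(formatAuthors([{ firstName: "Frances K.", lastName: "Del Boca" }]));
// Del Boca, Frances K.
```

With the comma form, BibTeX no longer has to guess where the last name starts, so "Del Boca" stays together.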
Thanks very much to all. Zotero makes my life much easier.
-Solomon
Cheers - Micha.