Default encoding in BibTeX export

I am running Zotero 3.0.8 on Firefox 3.0.1 on Ubuntu.

I noticed that for quick copy (Ctrl+Alt+C), the BibTeX export format does not do LaTeX substitution for non-ASCII Unicode characters (for example, u with an umlaut is not rendered as \"{u}). With some help from a friend, I found that the BibTeX.js translator does have the capability to translate to LaTeX code like the above, but for some reason does not use it during quick copy.

I think this happens because of the following line (line 10 in BibTeX.js):

"exportCharset": "UTF-8",

I could solve my problem by changing this line to

"exportCharset": "ISO-8859-1",

This ensures that the Unicode to latex translation kicks in during quick copy.
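
To illustrate the behavior described above, here is a minimal sketch of how a translator might gate the Unicode-to-LaTeX substitution on the export charset. This is not the actual BibTeX.js code: the names `LATEX_MAP` and `exportField` are hypothetical, the mapping table is a tiny illustrative subset, and the assumption that the substitution is skipped exactly when the charset is UTF-8 is inferred only from what I observed.

```javascript
// Hypothetical sketch, NOT the real BibTeX.js implementation.
// A tiny illustrative mapping; a real table would cover many more characters.
const LATEX_MAP = {
  "\u00FC": '\\"{u}', // u with umlaut
  "\u00E9": "\\'{e}", // e with acute accent
  "\u00F1": "\\~{n}", // n with tilde
};

// Assumption: the substitution runs only when the target charset is not UTF-8.
function exportField(value, exportCharset) {
  if (exportCharset === "UTF-8") {
    return value; // leave Unicode characters untouched
  }
  // Replace each non-ASCII character that has a known LaTeX equivalent;
  // characters missing from the table are passed through unchanged.
  return value.replace(/[^\x00-\x7F]/g, (ch) => LATEX_MAP[ch] || ch);
}

console.log(exportField("M\u00FCller", "UTF-8"));      // Müller
console.log(exportField("M\u00FCller", "ISO-8859-1")); // M\"{u}ller
```

With a gate like this, quick copy would emit raw Unicode whenever the header says UTF-8, which matches what I saw before changing the line.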

Since the most popular LaTeX implementation today is pdflatex, which has little support for Unicode, and since (I believe) most users would therefore prefer their BibTeX output to use LaTeX code rather than Unicode characters that pdflatex cannot handle, I think the default version of the translator should have

"exportCharset": "ISO-8859-1",

rather than UTF-8. Is there a reason that the translator uses UTF-8 rather than ISO-8859-1?

Of course, another way to solve this problem would be to give an option for specifying a default character encoding for quick copy, but I don't know how to do that.
  • There's an open ticket for providing translator options for quick copy.

    There are two reasons we use UTF-8 by default instead of ISO-8859-1:

    1. All non-Unicode character sets are broken by design, in that a large proportion of the world's population can't use them to write in their native language. Unicode is over 20 years old, and there are several fully Unicode-capable LaTeX implementations, including XeTeX and LuaTeX, which is the official successor to pdfTeX.

    2. A lot of people use BibTeX export with other software that handles UTF-8, but can't handle BibTeX entities.

  • Several BibTeX implementations, and I understand an increasing number, have UTF-8 support, which is why it's set as the default.
    You can set the export charset when you right-click export to BibTeX (when the option is checked under export in the preferences).
    I believe Zotero remembers the last char-set option for quick-copy.
  • Simon:

    Thanks for the response. I understand the decision to use UTF-8 in the interests of general portability.

    adamsmith:
    I am not sure if bibtex does, but pdflatex certainly has a problem. Fortunately, there seems to be an easy way to fix it at least for some of the common characters (accented roman characters and copyright symbols etc.) even while using pdflatex, by adding the line

    \usepackage[utf8]{inputenc}

    in the LaTeX source code. However, this may not work properly with some BibTeX styles (typically those which try to include the UTF-8 character in the citation key).

    On the other hand, yes, it is certainly possible to set the export charset when you do an export to file. However, during quick-copy, this is not possible, as Simon pointed out above.
  • Hi - sorry if thread necromancy is considered bad form here, but just wondering whether there had been any progress on making the encoding for quick copy selectable?
    I'm building a .bib file as I go using drag and drop, which is wonderful, but non-ASCII characters get mangled as soon as I run BibTeX :-(

    Thanks
  • no, nothing new on quick copy.
  • Ah well, thanks anyway.
    I've modified the encoding for quick copy in bibtex.js for now, but I have a feeling that I'll have to do that again every time there is a Zotero update... ;-)
  • that's correct - it's actually not just Zotero updates, but also translator updates which happen in the background whenever a translator changes.
    You can have a separate bibtex translator with a different id, but that way you wouldn't get the automatic updates, which often contain small fixes & improvements.
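
For anyone who ends up re-applying the one-line change after each update, the edit itself can be scripted. The following is only a sketch: `setExportCharset` is a hypothetical helper, and it assumes the translator header still contains a line of the form `"exportCharset": "UTF-8",` as quoted at the top of this thread.

```javascript
// Hypothetical helper: rewrites the value of "exportCharset" in translator source text.
// If no such entry is found, the text is returned unchanged.
function setExportCharset(source, charset) {
  return source.replace(
    /("exportCharset"\s*:\s*")[^"]*(")/,
    // A function replacement avoids any special handling of "$" in the charset.
    (match, before, after) => before + charset + after
  );
}

// Example on the header line quoted in this thread.
const sample = '"exportCharset": "UTF-8",';
console.log(setExportCharset(sample, "ISO-8859-1"));
// "exportCharset": "ISO-8859-1",
```

In Node.js the same function could be applied to a local copy of the translator by reading it with `fs.readFileSync` and writing the result back with `fs.writeFileSync`. The patched file would still be overwritten by the next translator update, so this only saves the manual editing step.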