German Umlauts in BibTeX export

Hi

If I export my collection as BibTeX with ISO-8859-1 or -15 encoding, the German Umlauts ä, ö and ü are converted to e.g. \"{u}. This is wrong; it has to be {\"u}. I have no idea about the ß, but it's probably similar behaviour.
Are the special characters of other languages coded the same way?

It is important that this bug gets fixed, since BibTeX does not recognize the letters. I get "nice" bibliographic labels such as M\"90 (yes, a 9 with two dots above), which should be M{\"u}90 instead.

Cheers,
Mathias
  • Both of these are valid ways to write "ü" in LaTeX:
    \"u
    \"{u}
    But yes, BibTeX (according to Oren Patashnik's manual) wants all accented characters in a single set of braces. This should be relatively easy to do, but we will need to double-check against the brace hack that preserves capitalization when non-initial uppercase letters appear in a field.

    The output should be:
    author = {M{\"{u}}ller, Erwin},
    title = {The {IP}{\"{O}} is the Institute for Polar Ecology}
    and should NOT be:
    title = {The {IP{\"{O}}} is the Institute for Polar Ecology}
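
The bracing described above can be sketched roughly as follows. This is only an illustration, not Zotero's actual export code: the `ACCENTS` table and the `brace_accents` function are hypothetical names, the table covers only the German characters from this thread, and the capitalization brace hack is not handled at all.

```python
# Hypothetical, deliberately tiny table; a real exporter needs far more entries.
ACCENTS = {
    "ä": r'{\"{a}}', "ö": r'{\"{o}}', "ü": r'{\"{u}}',
    "Ä": r'{\"{A}}', "Ö": r'{\"{O}}', "Ü": r'{\"{U}}',
    "ß": r'{\ss}',
}


def brace_accents(text: str) -> str:
    """Replace each known accented character with a TeX entity
    wrapped in its own single brace group, as BibTeX expects."""
    return "".join(ACCENTS.get(ch, ch) for ch in text)


print(brace_accents("Müller, Erwin"))  # M{\"{u}}ller, Erwin
```

Because every accented character gets its own brace group, a separate step that protects capitals would have to emit {IP} and {\"{O}} side by side rather than nesting one inside the other.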
  • edited August 5, 2009
    Mmhh, right. I didn't know about this brace hack :)
    But as you've already said: BibTeX wants those braces around accented characters. I don't see a way around it. At the moment the BibTeX export is not very useful - I'll always have to edit the bib file manually.

    Is it possible to get a bugfix with the next update?
  • Another question that came up in the discussion on de.comp.text.tex:
    Why is Zotero masking the Umlauts in ISO-8859-15? They could be printed as ä, ö and ü instead. Masking them is only necessary in full ASCII mode...
  • Zotero uses the LaTeX entities in all character sets other than 'UTF-8'.

    The transliteration tables doubtless include some characters that are absent from other character sets & transliteration is "all" or "nothing."

    I don't know, off-hand, what characters are missing from which sets.

    But why use ISO-8859-15 anyway? Are there any common tools that don't work with the full set of UTF-8, but wouldn't need the TeX-encoded entities?
  • Yes, e.g. BibTeX and BibLaTeX can handle those symbols in ISO-8859-15 encoded files. But those tools can't handle UTF8.
    You might have a look into the BibLaTeX documentation, section 2.4.3. It explains all the details.

    I don't know if you understand German?! Actually, there is a discussion in the German newsgroup. Have a look here:
    http://groups.google.de/group/de.comp.text.tex/browse_thread/thread/13307dde8a8de7a3?pli=1
  • My German is poor, but I believe that you are mistaken. bibtex is 7-bit only & must use TeX-encoded entities to work. The newer 'bibtex8' and 'biber' programs can handle 8-bit character sets. bibtex8 can't do multibyte, but biber should be able to.
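
Whether a given .bib file is safe for a 7-bit-only tool can be checked before running it. The snippet below is a minimal sketch under that assumption; `find_non_ascii` is a hypothetical helper name and the sample entry is made up from the examples earlier in this thread.

```python
def find_non_ascii(bib_text: str) -> list[tuple[int, str]]:
    """Return (line number, character) pairs for every character
    outside the 7-bit ASCII range."""
    hits = []
    for lineno, line in enumerate(bib_text.splitlines(), start=1):
        for ch in line:
            if ord(ch) > 127:
                hits.append((lineno, ch))
    return hits


sample = (
    'author = {Müller, Erwin},\n'
    'title = {The {IP}{\\"{O}} is the Institute for Polar Ecology}'
)
print(find_non_ascii(sample))  # [(1, 'ü')]
```

An empty result means the text contains only ASCII, so any accented characters in it must already be written as TeX entities.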
  • Right, I think we are talking about the same thing - I just wasn't being that exact. :)
    If I run a bib file with 8-bit characters such as ä, ö and ü through bibtex I get readable results. But this case is tricky, sometimes some letters are left out etc. Definitely not a way to go.
    But bibtex8 can do it. So what I meant up there was BibLaTeX in combination with BibTeX8. In this case I could use ISO-8859-15 encoded files with 8-bit characters, couldn't I?
    So maybe the standard encoding for the bibtex export filter in Zotero should be ASCII with the appropriate symbols {\"u} (not \"{u}) for maximum compatibility. But if one chooses ISO-8859-15 one also expects ISO-8859-15 with 8-bit characters. ;)

    I hope this is possible.
  • Presently, there is not a separate ability to export using ASCII encoding.

    And, as above, the ability to transliterate characters to/from TeX encoding has complex tables. It doesn't seem obvious to me that the added complexity of having per-character set transliteration tables is worth it. I'm aware of no other program that really does this (including BibTeX-specific applications, such as JabRef).

    UTF-8 or the ASCII-subset of ISO-8859-1 with TeX-encoded entities seem sufficient for the vast majority of use cases. If TeX encoded characters were properly braced, I don't see that you've raised any advantage to also having single-byte 8-bit output.
  • Right. As long as UTF-8 and ASCII are working well, there is no need. But currently at least the ASCII notation is not correct and BibTeX complains about it...
  • edited November 14, 2011
    Just for posterity:
    > Presently, there is not a separate ability to export using ASCII encoding.

    "English (US-ASCII)" is an option in the character encoding list in the US version of Firefox (though seemingly not in some other versions), and it causes all extended characters to be replaced with TeX entities.
  • hello--writing in October 2018, I find the comment by the OP to still be a problem, e.g. exporting names w/ umlauts from Zotero to Bibtex, {\"u} got turned into \{{\textbackslash}"u\} (in fact, even worse than OP!). This is still incorrect, I need the name to be exported with exactly this {\"u} , or I have to go through and manually correct. Any chance this will be corrected? Thank you.
  • I probably would not have resurrected this thread, as your gripe is only related tangentially.

    The behavior you see is intended.

    Zotero is a general-purpose reference manager and the stock version does not intend you to use LaTeX markup in any fields. When exporting to bibtex, it will use markup to best preserve what you have entered literally.

    If you edit your title to use "ü" instead, it will export to BibTeX in the way you expect.

    Alternatively, the Better Bib(La)TeX extension will export literal LaTeX that you enclose in pre tags. But this is overkill if you're just trying to enter "ü" and will make your database less useful for any workflows that do not use better BibTeX.
  • Thank you noksagt. Guess I thought my gripe was dead on! well, maybe I miss the point.

    Anyway, though, Zotero doesn't do what you say: if I enter "ü", the letter is simply dropped from the exported author name (resulting in Bchner rather than Büchner); this observation is what set me off on this search. On the off-chance you meant me to include the quotes, well I tried that and amusingly get a square-root symbol! I wonder, though, is Zotero *supposed* to just carry the ü through, as you say? Perhaps that would help other bibliographies, but for Bibtex I'd still need the proper markup, {\"u}. And, maybe my version is out of whack? (5.055.1, only slightly behind, can't imagine that's the problem...)

    I have not tried the Better Bib(La)TeX extension, but will give it a shot, thanks for pointing it out. I don't have any workflows outside of Bibtex, so no worries there.
  • > if I enter "ü", the letter is simply dropped from the exported author name (resulting in Bchner rather than Büchner)

    Check that you choose Western as the character encoding in the BibTeX export instead of UTF-8.
  • edited October 12, 2018
    Zotero should never drop the ü, but the treatment of Umlauts in bibtex depends on the character encoding. In UTF-8 they just stay, in all other encodings they are LaTeX escaped.
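
The encoding-dependent behaviour described here can be pictured with a small sketch. It is an assumption-laden illustration, not Zotero's implementation: `export_field` and the `TEX` table are hypothetical, and the table holds only a few German characters.

```python
# Hypothetical mini table; a real translator covers many more characters.
TEX = {"ä": r'{\"{a}}', "ö": r'{\"{o}}', "ü": r'{\"{u}}', "ß": r'{\ss}'}


def export_field(value: str, charset: str) -> bytes:
    """Keep characters as-is for UTF-8; otherwise replace known
    characters with TeX entities before encoding."""
    if charset.lower() in ("utf-8", "utf8"):
        return value.encode("utf-8")
    escaped = "".join(TEX.get(ch, ch) for ch in value)
    # Characters missing from the table stay 8-bit if the charset has them,
    # and raise UnicodeEncodeError if it does not.
    return escaped.encode(charset)


print(export_field("Büchner", "UTF-8"))       # b'B\xc3\xbcchner'
print(export_field("Büchner", "ISO-8859-1"))  # b'B{\\"{u}}chner'
```

With UTF-8 the ü survives as two bytes; with a Western charset it comes out as a braced TeX entity and the output is plain ASCII.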
  • gotcha! now that was useful, thank you, worked like a charm.
  • In BBT it's not dependent on the charset, it's a setting in the preferences.