Bibtex key generation: changed treatment of ä, ö, ü, ß

Hi,

Until very recently, the Bibtex key generation algorithm translated these letters in the following way:
ä -> ae
ö -> oe
ü -> ue
ß -> (deletion)
ł -> (deletion)

After updating to the latest version, these letters are now translated as:
ä -> a
ö -> o
ü -> u
ß -> s
ł -> l

(I assume that more letters are affected)

I would very much appreciate it if the old behaviour could be restored. If this is not going to happen, can you please confirm that this new behaviour is going to stay? It is very time-consuming to change the keys of dozens of affected references in my document.
  • I wasn't involved in that, but I believe what prompted this were problems in key generation that demonstrated the need for a more comprehensive list of conversions of non-ASCII characters.

    Zotero is now using what I believe is a comprehensive list from a third party, so yes, that should stay the same.
    I doubt there's much of a chance of reverting to prior behavior, though.
  • This new system seems to work in most cases if 1) the only purpose is to place items in alphabetical order; and 2) we are only interested in the first character of a word. This "translation" system results in words being spelled differently depending upon when the conversion was done. This inconsistency seems like a bad thing. This is the first I've heard of converting German decorated characters to a single undecorated letter rather than two. I'm especially annoyed with "ß" being converted to "s" rather than "ss".

    All of this mess because of BibTeX problems with UTF-8?
  • This is for BibTeX keys, not for the records themselves. In BibTeX records exported from Zotero, items are either left in UTF-8 (encoding UTF-8) or converted to the BibTeX code, i.e. \"a for ä etc. (encoding ISO...)

    What the conversion does is remove all diacritics etc. from all letters and convert them to their simplest form (the function in Zotero is actually called removeDiacritics). For the purpose of citation keys that makes sense to me, though I agree that it is quite unfortunate that this changed over time.
  • Before this change, many extended characters in keys were just being stripped, as dlandert's last two examples show, which is why the change was made. I think we had hard-coded replacements for a few characters in the BibTeX translator itself, including the first three above.

    It might still make sense in a few cases to adjust removeDiacritics() to convert certain characters (e.g., 'ß') to two letters. The only other place removeDiacritics() is used right now is in duplicate detection.
  • Thanks to all of you for your quick replies! I guess in this case I'll have to change all the keys... I really hope that this is the last time the conversion changes, though. In my opinion, it is not that important how the characters are converted, after all this is not visible in the finalised document. But it is quite annoying if one has to change all the keys in a large document (in particular so shortly before my deadline, when I'd prefer to feel happy about the fact that by using latex I avoid having to struggle with such last-minute technical problems...).
  • I do agree with dlandert. It does not matter how the BibTeX keys are generated, as long as it is consistent. I am facing the problem of manually changing the keys yet again.
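
For readers who want to see the difference between the two conversions discussed above, here is a minimal Python sketch. It is not Zotero's removeDiacritics() code (which is JavaScript and uses its own character list); the function name fold_for_key and the two override tables are hypothetical, chosen only to illustrate the idea. It strips combining marks via Unicode decomposition, applies explicit overrides for letters such as ß and ł that do not decompose, and drops anything that is still non-ASCII.

```python
import unicodedata

# Hypothetical override tables for illustration only (not Zotero's data).
# Single-letter style: matches the new behaviour reported in this thread.
SINGLE_LETTER = {"ß": "s", "ł": "l", "Ł": "L"}
# Two-letter style: the German convention some posters would prefer.
TWO_LETTER = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}

def fold_for_key(text, overrides):
    """Reduce text to ASCII for use in a citation key."""
    out = []
    for ch in text:
        # Explicit overrides win over generic diacritic stripping.
        if ch in overrides:
            out.append(overrides[ch])
            continue
        # Decompose (e.g. "ü" becomes "u" plus a combining diaeresis)
        # and discard the combining marks.
        decomposed = unicodedata.normalize("NFD", ch)
        stripped = "".join(
            c for c in decomposed if not unicodedata.combining(c)
        )
        # Letters with no decomposition and no override are dropped.
        out.append(stripped if stripped.isascii() else "")
    return "".join(out)

print(fold_for_key("Müller", SINGLE_LETTER))  # Muller
print(fold_for_key("Müller", TWO_LETTER))     # Mueller
print(fold_for_key("Strauß", SINGLE_LETTER))  # Straus
print(fold_for_key("Strauß", TWO_LETTER))     # Strauss
```

Whichever table is used, the point made in the replies holds: existing keys only stay valid if the mapping never changes between versions.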