Zotero > BibTeX :: transliteration of Arabic
I have a problem with exporting to BibTeX format. After a day or two, I have figured out how this whole thing works, but one problem still remains.
I have a lot of Unicode characters in my Zotero DB. The BibTeX export translator does a great job converting those characters into LaTeX code, but when it creates a BibTeX key, it throws away all the non-ASCII letters, so the keys become unintelligible: for example, abar_victory_1990 (where the author's name is Ṭabarī). So the question is: is there any way to add a similar conversion for authors' names and titles when the key is built? That is, Ṭ would be converted into T and ī into i, so that the key would read tabari_victory_1990.
Strangely enough, I have figured it out myself (with the help of a friend). Here is the updated BibTeX translator that suits those needs (every change I made is marked). I have not changed anything in the code itself; it turned out that I just needed to add a few more entries to the mapping tables:
http://www-personal.umich.edu/~romanov/BibTeX.js
What it does:
1. BibTeX keys: Changes transliterated names and titles into simplified versions without dots and macrons (e.g.: "Ṭabarī" becomes "tabari"), thus the keys are more intelligible (tabari_victory_1990 instead of abar_victory_1990).
Symbols added: ṭ ū ī ā ṣ ḍ ḥ ḳ ẓ Ṭ Ū Ī Ō ō Ā Ṣ Ḍ Ḥ Ẓ; ("ʾ", "ʿ" - symbols used for Ayns and Hamzas are deleted).
2. Two more transliteration symbols, for 'Ayns and Hamzas, are now converted into LaTeX codes: "ʾ" is converted into "\Alif" and "ʿ" into "\Ayn" (use \usepackage{semtrans} to activate their conversion).
3. The error with "i with macron" is also fixed: the translator now produces the code for a dotless "i" with a macron, \={\i}, instead of \={i}.
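For anyone who wants to see the idea behind point 1 in isolation, here is a minimal sketch of it. It is not the code from the translator itself: the names RAW_MAP, KEY_MAP and simplifyForKey are made up for illustration, and the author_title_year key layout is just taken from the example keys above.

```javascript
// Hypothetical table: each transliteration character and its plain replacement.
const RAW_MAP = {
  "ṭ": "t", "ū": "u", "ī": "i", "ā": "a", "ō": "o",
  "ṣ": "s", "ḍ": "d", "ḥ": "h", "ḳ": "k", "ẓ": "z",
  "Ṭ": "T", "Ū": "U", "Ī": "I", "Ā": "A", "Ō": "O",
  "Ṣ": "S", "Ḍ": "D", "Ḥ": "H", "Ẓ": "Z",
  "\u02BE": "", // hamza sign, dropped from keys
  "\u02BF": ""  // ayn sign, dropped from keys
};

// Normalise the table keys to composed form (NFC) so lookups match composed input.
const KEY_MAP = new Map(
  Object.entries(RAW_MAP).map(([from, to]) => [from.normalize("NFC"), to])
);

// Replace mapped characters, lowercase, then drop anything that is not a-z or 0-9.
function simplifyForKey(text) {
  let out = "";
  for (const ch of text.normalize("NFC")) {
    out += KEY_MAP.has(ch) ? KEY_MAP.get(ch) : ch;
  }
  return out.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Assumed key layout: author, first title word, year, joined by underscores.
const key = [simplifyForKey("Ṭabarī"), simplifyForKey("Victory"), "1990"].join("_");
console.log(key); // tabari_victory_1990
```

Without the table, the final filtering step alone would turn "Ṭabarī" into "abar", which is exactly the problem described in the first post.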
I hope somebody else finds it helpful.
http://lehelk.com/2011/05/06/script-to-remove-diacritics/
it's included in the utilities.js file in Zotero:
https://github.com/zotero/zotero/blob/master/chrome/content/zotero/xpcom/utilities.js
It's a function that can be called from within the translator - if I understand correctly, you could just use it instead of the tidyAccents function (which you wrote?).
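To show roughly what such a helper does, here is a standalone sketch that can be run outside Zotero. Note that it swaps in a different technique: it uses standard Unicode decomposition (String.prototype.normalize) rather than the lookup table of the linked script, and stripDiacritics is a hypothetical name, not a Zotero function.

```javascript
// Decompose each character into base letter + combining marks (NFD),
// then delete the combining marks in the U+0300..U+036F block.
function stripDiacritics(text) {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

console.log(stripDiacritics("Ṭabarī")); // Tabari
```

This approach only covers letters that Unicode can decompose. The ayn and hamza signs (ʿ, ʾ) are modifier letters rather than combining marks, so they pass through unchanged and would still need separate handling, and a letter such as ø has no decomposition at all.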
http://www-personal.umich.edu/~romanov/BibTeX.js
This would be good to cover with unit tests, but we don't have support for unit tests for export translators yet. But it looks to work fine.
ZU.removeDiacritics is only applied in Zotero 3.0, so diacritics will not be removed in Zotero 2.1.x.
- The unicode markup for both latin capital and small letter O with a stroke has changed from using the proper '\u' to '\U'
- The LaTeX entities for both of the macroned I's ('\\={I}') now have escaped I's ('\\={\\I}')
- The closing brace of the creator string (ca. line 2060) has been removed
Otherwise, this looks ok.
As to utilities.js - I had no idea how it works exactly. I am quite new to both Zotero (switched a month or two ago) and LaTeX (started just a week or a week and a half ago), so pardon my screw-ups.
As to "removing the special treatment of corporate authors when writing out creator names," I do not think I touched anything like that at all.
Again, thanks a lot, ajlyon! The translator works perfectly well for me at the moment.
As something of an aside, is it still not possible to handle these in a native Unicode LaTeX processor, avoiding the whole morass of character substitutions? It's been some time since I last used LaTeX, but I thought these approaches were on their way out.