biblatex import/export: CSL "language" = biblatex "langid"
The CSL "language" variable should be mapped from and to the biblatex "langid" field upon import and export, *not* from/to the biblatex "language" field.
The biblatex "langid" field (which used to be called "hyphenation" until biblatex v2.8, released 2013-10-21) specifies the (main) language of the metadata, and is used to switch hyphenation patterns and capitalization routines. It thus matches the CSL "language" variable perfectly.
In biblatex, the "langid" identifier must be a language name known to the babel/polyglossia packages. A fairly complete list, including mappings between biblatex and CSL, can be found in the pandoc-citeproc sources; see http://hackage.haskell.org/package/pandoc-citeproc-0.1.2.1/docs/src/Text-CSL-Input-Bibtex.html.
The biblatex "language" field, by contrast, describes the language(s) of the content, e.g., "greek and latin and english", but has no counterpart in CSL.
The biblatex import filter should thus map biblatex "langid" (ideally also considering "langidopts" where details such as "variant=british" might be specified) and "hyphenation" (for backwards compatibility) to CSL "language", translating babel language names to CSL language identifiers.
The export filter should map CSL "language" to biblatex "langid". Mapping anything to the "langidopts" field is not essential, AFAICS, since all available languages can be specified without using langidopts (which was introduced to facilitate the use of the polyglossia package).
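To make the proposal concrete, here is a minimal sketch of the two mappings. It is not the code of the actual translator: the function names, the table contents, and the choice of preferred babel names on export are assumptions for illustration, and the tables are deliberately incomplete.

```javascript
// Hypothetical, deliberately incomplete table: babel language name -> CSL code.
const BABEL_TO_CSL = new Map([
  ["english", "en-US"],
  ["american", "en-US"],
  ["british", "en-GB"],
  ["german", "de-DE"],
  ["ngerman", "de-DE"],
  ["french", "fr-FR"],
]);

// Assumed preferred babel name for each CSL code on export.
const CSL_TO_BABEL = new Map([
  ["en-US", "american"],
  ["en-GB", "british"],
  ["de-DE", "ngerman"],
  ["fr-FR", "french"],
]);

// Import: prefer "langid", fall back to the older "hyphenation" field.
// The biblatex "language" field is deliberately never consulted.
function importLanguage(fields) {
  const name = (fields.langid || fields.hyphenation || "").trim().toLowerCase();
  // "langidopts" such as "variant=british" may refine a generic "english".
  const variant = /variant\s*=\s*(\w+)/.exec(fields.langidopts || "");
  if (name === "english" && variant) {
    const refined = BABEL_TO_CSL.get(variant[1].toLowerCase());
    if (refined) return refined;
  }
  return BABEL_TO_CSL.get(name) || null;
}

// Export: write only "langid"; emit no field at all for unknown codes.
function exportLanguage(cslLanguage) {
  const babelName = CSL_TO_BABEL.get(cslLanguage);
  return babelName ? { langid: babelName } : {};
}
```

With these tables, importLanguage({ hyphenation: "british" }) yields the CSL code for British English, and exportLanguage never produces a "language" or "langidopts" field.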
https://github.com/zotero/translators/blob/master/BibLaTeX.js#L529
We can look at this on import: we can use langid as a fallback for language, but I don't think we'll prefer it (there is a single bibtex/biblatex import translator, so when in doubt we follow bibtex rather than biblatex).
What I now get upon export is hyphenation = {en}. biblatex, however, will not understand this; it should be hyphenation = {english}.
Also, I think the way this is written should produce "english" and not "en" for langid - what's in the Zotero field?
The only thing I'm still not too happy about is the biblatex "language" field.
Both standard biblatex and biblatex-chicago print the content of this field by default, so the way things are now, you get, e.g.,
Author, Ann. 2012. Title. en-GB. Place: Publisher.
Probably not what anyone would ever want.
I would recommend not writing the biblatex "language" field at all - but if you feel you must, please do not use the content of the Zotero "language" field as is; map it to the appropriate biblatex "language name" instead (see biblatex manual, "4.9.2.18 Language Names").
That it writes out both language and langid is simply a mistake. I just forgot to remove the old direct mapping to language when I added the smarter handling of langid. Easily fixed.
The other thing that it does is writing out language literally to the language field if no matching language is found in the "smart" selection of languages (which writes to langid). Maybe this is wrong if the meaning of the CSL-language field is a language only used for localization and not written out in the bibliography. Biblatex obviously distinguishes between these two uses and provides two different fields, which doesn't seem to be the case for CSL and Zotero.
Maybe then we should never write to the biblatex language field. Do you think I should do that? Would people expect anything else?
I based the language mapping on the list of supported languages in biblatex according to the manual, but now I realize that "supported" probably refers only to the availability of localization strings, not to hyphenation. So this list could be extended to all languages supported by babel or polyglossia, which would at least give correct hyphenation in the bibliography. I'll look into this.
I haven't added support for polyglossia yet though, only babel's language variants (see languageMap in the file). But I guess that could be done quite easily. I'll just have to look up which languages and variants are supported in polyglossia.
Thanks for the input.
https://github.com/zotero/translators/pull/664
I'll add support for more languages and polyglossia later, as fixing this is of course more urgent. More people are likely to be affected by erroneously printed languages in bibliographies than depend on an integration of Zotero/CSL language and biblatex langid for more languages than are already supported now (probably no one).
Thanks to anjo7539
It would be really nice if the export worked with any string entered into the Zotero language field but showed a warning when entries don't match any supported BibLaTeX language (possibly with a list of the non-matching entries), so people can change those, for example by using a text editor's "find and replace" function.
I have a bibliography with hundreds of entries that I maintain in Zotero. I use language entries like "German" or "English" that make it perfectly clear to anyone searching my Zotero bibliography which language an entry is in, but of course BibLaTeX doesn't recognize such language values.
The current BibLaTeX export seems to ignore said entries, so I'd have to change them from "German" to "ngerman", "English" to "USenglish" or "american" and so on to make the BibLaTeX export work.
First of all, that's an unnecessary differentiation within Zotero bibliographies and could lead to confusion, as people might not be familiar with BibLaTeX languages. Second, I'd have to do said replacements manually, as Zotero lacks an option to find and replace within specific fields. With previous export versions I was able to just find and replace in the exported entries with a text editor, but now I can't, as no "langid" fields are exported at all.
*edit* Typos and clearing it up.
*edit2*: Neither "american" nor "USenglish" nor "ngerman" nor "german" get exported right now, although some of them are listed in BibLaTeX.js' "var languageMap". This utterly breaks my thesis' references which make use of language switching. "english" works though.
*edit3*: Now I get the logic behind the mapping (e.g. "en:US" or "en-US" for "american") and I helped myself using the find-and-replace JavaScript code found here: https://forums.zotero.org/discussion/7707/. It still would be nice if the BibLaTeX export had an option to just export to "langid" whatever the Zotero language field contains.
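For what it's worth, the behaviour requested above (accept whatever is in the Zotero language field, map what can be mapped, and report the rest) could look roughly like the sketch below. The table is a made-up excerpt, not the real languageMap from BibLaTeX.js, and normalizeCode and mapLanguages are hypothetical helper names.

```javascript
// Hypothetical excerpt of a code -> babel name table; the real languageMap
// in BibLaTeX.js may be organised differently.
const CODE_TO_BABEL = new Map([
  ["en", "english"],
  ["en-us", "american"],
  ["en-gb", "british"],
  ["de", "ngerman"],
  ["de-de", "ngerman"],
]);

// Accept "en-US", "en_US" and "en:US", ignoring case and outer whitespace.
function normalizeCode(value) {
  return value.trim().toLowerCase().replace(/[_:]/g, "-");
}

// Map each item's language to a langid and collect the items that could not
// be matched, so the caller can show a warning instead of dropping them silently.
function mapLanguages(items) {
  const unmatched = [];
  const mapped = items.map((item) => {
    const langid = CODE_TO_BABEL.get(normalizeCode(item.language || "")) || null;
    if (langid === null && item.language) {
      unmatched.push({ key: item.key, language: item.language });
    }
    return { key: item.key, langid };
  });
  return { mapped, unmatched };
}
```

An entry whose language field reads "German" would end up in the unmatched list here, which is exactly the kind of value a warning could point out.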
I noticed a change because the single Zotero field "Issue" is now translated into two biblatex fields ("Issue" and "Number") with the same value. This results in incorrect output like this:
Brader, Ted A., Joshua A. Tucker, and Dominik Duell. 2013. "Which Parties Can Lead Opinion? Experimental Evidence on Partisan Cue Taking in Multiparty Democracies." Comparative Political Studies 46, no. 11 (11): 1485-1517.
However there are other changes. Here is the result of standard biblatex export 2 weeks ago:
@article{brader_which_2013,
title = {Which Parties Can Lead Opinion? Experimental Evidence on Partisan Cue Taking in Multiparty Democracies},
volume = {46},
issn = {0010-4140, 1552-3829},
url = {http://cps.sagepub.com/content/46/11/1485},
doi = {10.1177/0010414012453452},
shorttitle = {Which Parties Can Lead Opinion?},
language = {en},
issue = {11},
pages = {1485-1517},
journaltitle = {Comparative Political Studies},
shortjournal = {Comparative Political Studies},
author = {Brader, Ted A. and Tucker, Joshua A. and Duell, Dominik},
urldate = {2013-12-01},
date = {2013-11-01},
keywords = {Great Britain, Hungary, partisanship, party cues, Poland, political parties, public opinion, survey experiments}
}
Here is what I get now:
@article{brader_which_2013,
title = {Which Parties Can Lead Opinion? Experimental Evidence on Partisan Cue Taking in Multiparty Democracies},
volume = {46},
issn = {0010-4140, 1552-3829},
url = {http://cps.sagepub.com/content/46/11/1485},
doi = {10.1177/0010414012453452},
shorttitle = {Which Parties Can Lead Opinion?},
abstract = {Political parties not only aggregate the policy HERE GOES COMPLETE ABSTRACT.},
issue = {11},
pages = {1485-1517},
number = {11},
journaltitle = {Comparative Political Studies},
shortjournal = {Comparative Political Studies},
author = {Brader, Ted A. and Tucker, Joshua A. and Duell, Dominik},
urldate = {2013-12-01},
date = {2013-11-01},
langid = {english},
keywords = {Great Britain, Hungary, partisanship, party cues, Poland, political parties, public opinion, survey experiments}
}
Questions:
1. How to fix the problem with double issue/number?
2. Is there a place where such changes are documented? It makes using Zotero unreliable when, without warning, the same operation gives a different result after a short period of time.
https://github.com/zotero/translators/commit/f49a0f18a4edab6df9f18a1c7c60f6be0b40ce7f#diff-25752e7a73f4fcccddce475afb61db43R355
doesn't look like it's working as intended - if anjo could take a look that'd be great.
2. Translator changes are not individually documented, but are all available via the history on https://github.com/zotero/translators
In general we change export translators rarely because of stability concerns, but since we've only had BibLaTeX for two months and are still figuring out how to best implement some details, I'm allowing more changes. E.g. the BibTeX translator hasn't seen such dramatic changes for at least a year.
2. I see the problem with new translators. By the way, as far as I understand, with these changes in language fields you have just fixed a problem that was already fixed on the biblatex end (http://tex.stackexchange.com/questions/147749/excessive-fields-in-biblatex-chicago-author-date-style). I understand that there are many implementations of biblatex, but some sort of coordination with at least the major ones could save resources on both ends. It may be unfeasible, though.
3. I definitely can't do it, but it may also just not be possible given the way the forum software works.
And ideally I would prefer biblatex to use only the fields I need rather than limit the information Zotero is storing. However, as far as I can understand, the partial solution here is exactly in the fine-tuning of the export translators.
https://github.com/zotero/translators/pull/665.
It'll update whenever @adamsmith pulls it.
Note that the old behaviour (where you got only "issue") was actually wrong, and what you will get now is only "number" (for numeric "issue numbers", as in your example).
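Roughly, the rule is: a numeric Zotero "Issue" goes to biblatex "number", anything else (a designation like "Spring") goes to "issue", and never both. A minimal sketch, assuming a simple numeric test; issueFields is a hypothetical name and this is not the code from the pull request:

```javascript
// Decide which biblatex field receives Zotero's "Issue" value.
// Assumption: a value counts as numeric if it consists of digits, optionally
// followed by a second number for ranges or double issues ("3-4", "3/4").
function issueFields(zoteroIssue) {
  const value = zoteroIssue == null ? "" : String(zoteroIssue).trim();
  if (value === "") return {};
  const isNumeric = /^\d+([-–\/]\d+)?$/.test(value);
  // Exactly one of the two fields is written, never both.
  return isNumeric ? { number: value } : { issue: value };
}
```

For the entry above, issueFields("11") gives only a "number" field, so the duplicated "11" in the formatted reference goes away.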
Quoting the biblatex (2.8a) manual:
The language fields in people's real-life Zotero databases don't necessarily look that nice, though. I've noticed that you can end up with many different language codes when using the web translators.
Thanks!
I should note that with this new translator, which follows the "biblatex" guidelines and provides numbers instead of issues, I have to add the "biblatex-chicago" command "numbermonth=false" to the preamble of my LaTeX document to get the right output, e.g. "American Economic Review 98 (3): 808-842".
Absent this command I was getting "American Economic Review 98, no. 3 (May 1): 808-842".