Please document/explain the use of the language field

I've been unable to find any explanation of what the language field is meant to contain.

  1. Should it contain "en", "fr" or "english", "french"?

  2. Is it meant to be the language of the author's manuscript, or the language in which it was first published? [1]

  3. Which of the available translators actually recognize it?

Thanks in advance,
Nicolas

[1] To quote Kurt Weber in a Wikipedia discussion (not about Zotero, but still insightful):
The description for the "Language" field is "Language of original book," which I'm afraid is a bit ambiguous. Is that the language of the author's manuscript, or the language in which it was first published? Doctor Zhivago, for instance, was written in Russian, but as it was first published in Italy it was done so simultaneously in Italian and Russian--so if it's the language of the author's manuscript then the "original book" was in Russian, but if it's the language of the first publication then the "original book" was in both Italian and Russian.
  • In many cases, bibliographic styles do nothing with the language field, so many people just leave it blank.
    Should it contain "en", "fr" or "english", "french"?
    Which would you like to see in your bibliography? I would think in most cases you would want "French" instead of "fr".
    Is it meant to be the language of the author's manuscript, or the language in which it was first published?
    In general, what you enter in Zotero is what you want to appear in your citations and bibliography. So if you are citing the Russian edition of Doctor Zhivago, you would enter Russian; if you are citing the Italian edition, you would enter Italian.
  • On a related note...

    I've never come across an item whose language was filled in automatically when I added it to my Zotero library. I've always had to enter the language manually. Is the language field ever populated automatically?
  • Which would you like to see in your bibliography? I would think in most cases you would want "French" instead of "fr".
    That doesn't work if you're dealing with multiple languages. I would have thought a fixed list of language options tied to the language codes, and properly localized for display, would be more forward-looking.
  • I would have thought a fixed list of language options tied to the language codes, and properly localized for display, would be more forward-looking.
    Agreed, though it's possible that would be overly limiting. RFC 4646, which Dublin Core recommends for the Language field, allows for more detail than a fixed list would allow. (See some of the examples in Appendix B.)

    The other question is how we would handle migration from the current freeform field to a defined vocabulary.

    One option that might solve both problems is a toggle between a freeform field and a defined list, where switching from the former to the latter automatically preserves the basic locale part if it is detectable. If the Language value were then used somewhere that required a controlled vocabulary (say, a test for a particular language), the same logic could be used to parse a usable locale from the freeform field. Existing values not in locale format would probably just be left as is and wouldn't parse. (Eventually, batch editing could help clean those up.)
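To make the "preserve the basic locale part if detectable" idea concrete, here is a minimal sketch in Python. It is not Zotero code: the function name `parse_primary_language` and the loose pattern are assumptions made for illustration, and the pattern only checks the general shape of an RFC 4646 tag rather than validating subtags against a registry.

```python
import re
from typing import Optional

# Loose shape of an RFC 4646-style tag: a 2-3 letter primary language
# subtag, optionally followed by more subtags (script, region, ...).
# Underscores are accepted as separators because values such as "de_DE"
# are common in practice, even though the RFC itself uses hyphens.
_TAG_RE = re.compile(r"^([A-Za-z]{2,3})(?:[-_][A-Za-z0-9]{1,8})*$")


def parse_primary_language(value: str) -> Optional[str]:
    """Return the lowercased primary language subtag, or None.

    Hypothetical helper: values that do not look like a language tag
    (for example "French") are not guessed at; they simply fail to parse.
    """
    match = _TAG_RE.match(value.strip())
    if match is None:
        return None
    return match.group(1).lower()


if __name__ == "__main__":
    for raw in ["en", "fr-CA", "zh-Hans-CN", "French", ""]:
        print(repr(raw), "->", parse_primary_language(raw))
```

Because the check is purely structural, a short freeform word such as "old" would be mistaken for a code, so a real implementation would presumably also look the subtag up in a list of known language codes.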
  • edited January 14, 2009
    Agreed, though it's possible that would be overly limiting. RFC 4646, which Dublin Core recommends for the Language field, allows for more detail than a fixed list would allow.
    Yes, that's what I mean; just didn't have the details handy at the time.

    As for implementation details, sounds good.