Localization problems (CSL processor and styles, fi-FI)

Hello,

I'm planning to contribute to the localization by translating the CSL locale into Finnish and creating some citation styles for Finnish journals. However, for some reason I haven't succeeded in building a functioning test environment for this effort.

This is what I have tried:

- I've made a quick and dirty translation of the CSL locale (locales-fi-FI.xml) and added it to my Zotero files, in both the Firefox and Standalone versions. I tried two methods: first unzipping zotero.jar, adding the locale to content/zotero/locale/csl and zipping it all again; then changing chrome.manifest to point to the unpacked files.

- extensions.zotero.export.bibliographyLocale in Firefox and in the Standalone version is set to "fi-FI".

- Then, I have a CSL style with default-locale also set to "fi-FI".

- Now, when I test the style either with csledit.xul or by exporting selected bibliography from the context menu, I still get English strings ("In...", "and" etc.).
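
A quick sanity check at this stage is whether the hand-edited locale file is well-formed XML at all, since a file the parser rejects could be silently ignored. Below is a minimal sketch in Python using only the standard library. The helper name check_well_formed and the sample strings are made up for illustration, and it only tests well-formedness, not validity against the CSL schema:

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses, else a short error description."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple pointing at the problem
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None


# Hypothetical snippets standing in for the contents of locales-fi-FI.xml
good = '<terms><term name="and">ja</term></terms>'
bad = '<terms><term name="and">ja</terms>'  # missing </term>

print(check_well_formed(good))
print(check_well_formed(bad))
```

For a file on disk, ET.parse(path) raises the same ParseError, so the same try/except pattern applies.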

Any ideas what I am missing here?

The debug log shows one related entry: "CSL: Warning: unknown locale fi-FI, setting to en-US". This seems to originate from citeproc.js, from the function CSL.localeResolve, which apparently tries to find the given locale in a hardcoded list (CSL.LANG_BASES) that doesn't include fi-FI. I even tried just adding it to the list, but that caused even more problems (it broke the parser somehow).
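
The behaviour implied by that warning can be sketched as follows. This is a Python illustration of a lookup against a fixed list with an en-US fallback, not the actual citeproc.js code; the names KNOWN_LOCALES and resolve_locale and the list contents are invented for the example:

```python
import sys

# Hypothetical stand-in for a hardcoded list of supported locales;
# the real contents of CSL.LANG_BASES are not reproduced here.
KNOWN_LOCALES = {"en-US", "de-DE", "fr-FR"}


def resolve_locale(requested, known=KNOWN_LOCALES, default="en-US"):
    """Return requested if it is in the known list, else warn and fall back."""
    if requested in known:
        return requested
    print(f"Warning: unknown locale {requested}, setting to {default}",
          file=sys.stderr)
    return default


print(resolve_locale("de-DE"))  # known, returned unchanged
print(resolve_locale("fi-FI"))  # not in the list, falls back to the default
```

Under this assumption, a locale file that is present on disk would still never be consulted unless its language tag passes the lookup.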

Curiously enough, the only way I've found to translate the strings is to hardcode them into the CSL file with "<locale xml:lang="en">" (sic! not fi, but en). What is more, this trick works for normal bibliography export, but not for csledit.xul (which stubbornly gives English strings). Shouldn't csledit and normal bibliography export give the same results, uh?
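
For reference, the workaround looks roughly like the fragment embedded in the following Python sketch, which parses a stripped-down style and lists the terms it overrides. The fragment is illustrative only (a real style needs more elements, such as info and citation), the Finnish strings are just examples, and the helper name style_term_overrides is made up:

```python
import xml.etree.ElementTree as ET

CSL_NS = "http://purl.org/net/xbiblio/csl"

# Stripped-down style fragment: terms are overridden inside the style itself,
# under xml:lang="en" as in the workaround described above.
STYLE = f"""
<style xmlns="{CSL_NS}" version="1.0" default-locale="fi-FI">
  <locale xml:lang="en">
    <terms>
      <term name="and">ja</term>
      <term name="in">teoksessa</term>
    </terms>
  </locale>
</style>
"""


def style_term_overrides(style_xml):
    """Map each in-style locale language to its {term name: text} overrides."""
    root = ET.fromstring(style_xml)
    xml_lang = "{http://www.w3.org/XML/1998/namespace}lang"
    overrides = {}
    for locale in root.iter(f"{{{CSL_NS}}}locale"):
        terms = {
            term.get("name"): (term.text or "")
            for term in locale.iter(f"{{{CSL_NS}}}term")
        }
        overrides[locale.get(xml_lang)] = terms
    return overrides


print(style_term_overrides(STYLE))
```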

Thanks in advance!
  • I would suggest you just submit your locale file to the repository and it will ship with the next version of Zotero. I'm pretty sure it's related to CSL.LANG_BASES, but I can't tell you why adding fi doesn't work.

    As for testing - defining the terms in the style (with xml:lang="en" - because, as you note, CSL doesn't recognize "fi" as a valid locale) should work, including in csledit - I can't tell you why it's not working for you without seeing the style.
  • Thanks for the suggestion - that is evidently the second best option. I just thought that it would have been easier to test the locale file in practice with different CSL styles than to just theoretically speculate what exactly the different terms are supposed to do. I've read the CSL specification, but it is not too verbose on this.

    Some of the terms are a bit tricky to translate accurately because of noun declension in Finnish. For example, "presented at X" would translate as "esitetty X:[ssa/ssä/lla/llä]", i.e., the preposition "at" translates to a case suffix that depends on the properties of the noun. I wonder if CSL will ever support these little oddities. :) So, the locale will not be perfectly elegant, but maybe it's still better than nothing.

    Still, I would welcome a way to patch the locale in while waiting for the official release.

    As for the possible csledit bug, I will try to reproduce it once more and come back with a concrete example.
  • generally CSL will improve support for gendered word endings (e.g. in many languages ordinal numbers correspond to the following noun).
    I have my doubts whether that will ever extend to all the intricacies of Finnish cases, though. I suggest you go with what will fit most often - in this case probably the Finnish word for conference.

    Most styles don't contain many of the terms - even something relatively common like "presented at" is only in a handful of styles - so testing with existing styles wouldn't work super well in the first place. I would not overthink this too much and just come up with a translation that works. It's easy enough to fine-tune it later.
    If you have specific usage questions just ask - I guess we could try to define the terms somewhere, but that's a tricky business.
  • Ok, let's try with specific CSL locale terms (I'm scanning the locale file from the beginning to the end):

    - "at", "by": Where could these be used? They are practically untranslatable due to the declension issue (they would be expressed with a suffix or such)

    - "reference" (short and long, singular and plural forms): I'm not 100% sure of the intended context for this. Any examples?

    - "ordinal-01"-"ordinal-04": a dot (.) should be enough, if that's ok

    - "verse": different translation if referring to a metric division of a poem/composition or to a subdivision of a chapter of the Bible/Quran

    - short locator forms: most of these do not exist in general use but can of course be invented (if that's better than to leave them blank)

    - "author" (long and short role forms): left blank in en-US, why is that?

    - "translator": if taken literally, no problem (->"kääntäjä"), but actually the form that is more commonly used in Finnish bibliographies is "suomentaja", meaning "translator into Finnish". The verb form would be similarly "suomentanut" or "suom." ("translated into Finnish by..."). This could form a problem when designing actual CSL styles that would expect to have "Suom." instead of "Käänt." (translated).

    - "recipient": in en-US this is "to" - if that is what is intended, it is untranslatable as such (would translate as a declension suffix)

    That's about it, otherwise it should be all clear.
  • By the way, it seems that I just found the source of the original problem: a tiny bug in my XML locale code broke the parser. Test and doubt your own code first; that should be obvious enough, but isn't always so... Sorry for the false alarm! Now it works in csledit as well as in normal export.

    It doesn't seem to matter much whether the locale is included in the hardcoded LANG_BASES list, but adding it there also works now.
  • "at", "by": Where could these be used? They are practically untranslatable due to the declension issue (they would be expressed with a suffix or such)
    "at" is mostly used in front of URLs (see e.g. Trend Journals), as in "at www.exampleurl.fi". It's used in about 10 styles.
    "by" is used in one style (Journal of Fish Disease) in front of an editor and probably shouldn't be used there. The only situation I can think of where it might come up in the future is for a book author as in:
    Elliott, Emory. Afterword. The Jungle. By Upton Sinclair. New York: Signet, 1990. 342-50. Print.
    "reference" (short and long, singular and plural forms): I'm not 100% sure of the intended context for this. Any examples?
    No - I think it's a legal thing. We're not using it in a single style.
    "ordinal-01"-"ordinal-04": a dot (.) should be enough, if that's ok
    Yes, that's the case in many languages.
    "verse": different translation if referring to a metric division of a poem/composition or to a subdivision of a chapter of the Bible/Quran
    that has never come up. Intuitively I'd say go with the poem - that's my spontaneous association in English. Obviously not ideal - we may have to address that.
    - short locator forms: most of these do not exist in general use but can of course be invented (if that's better than to leave them blank)
    These will appear in many citations, e.g. if people cite a chapter or so. If it's not common to use abbreviations there, repeat the long form. Don't leave those empty, else a citation of page 3 won't be distinguishable from chapter 3.
    - "author" (long and short role forms): left blank in en-US, why is that?
    because authors aren't usually labeled. If someone is an editor (translator, etc.) you add Ed. (or , editor or eds. etc.) after the name(s). You don't do that (at least in all languages I'm aware of) for authors.
    - "translator": if taken literally, no problem (->"kääntäjä"), but actually the form that is more commonly used in Finnish bibliographies is "suomentaja", meaning "translator into Finnish". The verb form would be similarly "suomentanut" or "suom." ("translated into Finnish by..."). This could form a problem when designing actual CSL styles that would expect to have "Suom." instead of "Käänt." (translated).
    use the more common form.
    - "recipient": in en-US this is "to" - if that is what is intended, it is untranslatable as such (would translate as a declension suffix)
    This is used for letters: In English they are usually cited as "Abraham Lincoln to Henry Pierce" where Henry Pierce is the recipient. If that can be approximated somehow, do that. Else leave it blank.
  • I've added the locale to the citeproc-js source, so that it will roll out with the next release.
  • edited February 20, 2012
    Ok, that solved the rest!

    I will, just to be sure, try to rephrase the problem with the term "translator" once more. In a Finnish context, different translations should be used depending on the target language of the translation. For example, some of my bibliographies contain items that are translated e.g. from Italian to English (and thus 'translated by'->'kääntänyt') while some are translated, say, from English to Finnish (and thus 'translated by'->'suomentanut'). But I suspect that the data structure doesn't allow for this kind of distinction? So as you suggested, I'll stick with the more common option for now.

    Anyway, now the locale file is ready to be delivered. What command should I use to commit it to the CSL locale repository? I have Git installed and configured, but I'm just not quite sure on how to use it. :)
  • "verse": different translation if referring to a metric division of a poem/composition or to a subdivision of a chapter of the Bible/Quran

    that has never come up. Intuitively I'd say go with the poem - that's my spontaneous association in English. Obviously not ideal - we may have to address that.
    That has come up in French (vers/verset). AFAIR, we chose the "poem version".
  • Anyway, now the locale file is ready to be delivered. What command should I use to commit it to the CSL locale repository?
    The instructions for locale files are the same as for styles: https://github.com/citation-style-language/styles/wiki/Submitting-Styles
  • Thanks. Forked, committed and a pull request sent. Hope it works!
  • up already - will be in the next Zotero version - thanks.