Capitalisation and Titles with Multiple Languages

edited July 19, 2020
Q: how can I stop Zotero capitalising titles in citations inappropriately?
A: put 'XX' in the Language field of the item in the database.


@dstillman recently brought my attention to the fact that you can use the Language field in the database to affect the way citations are presented. However I found no documentation on the precise way it works or what codes should be used, so I did some experiments. I share my results here, but speak with no authority and welcome additions and corrections.

I am aware of the discussion thread https://forums.zotero.org/discussion/5464/apa-citation-style-capitalization and I thank the contributors for the tips, especially the one about changing the import preferences. However, I find the sources (I use Worldcat a lot) are often wrong anyway. I am posting a new thread because I am not using APA, but MHRA3 (Modern Humanities Research Association standard version 3), which is also supported by Zotero. For the rules see http://www.mhra.org.uk/style. Actually, does the citation standard make any difference to Zotero’s capitalisation rules?


My problem is that I am writing a thesis in English that cites titles with two languages because I have the original title followed by an English translation, for example:
„Demokratie ist Lustig“: Der politische Künstler Joseph Beuys [“Democracy is Fun” Joseph Beuys the Political Artist]
I have carefully used the correct standards for both German and English parts so was annoyed when Zotero rendered the title in the citation as:
„Demokratie Ist Lustig“: Der Politische Künstler Joseph Beuys [“Democracy Is Fun” Joseph Beuys the Political Artist]

Oddly, it capitalised ‘Is’ but not ‘the’ in the English version and incorrectly capitalised ‘Ist’, ‘Lustig’ and ‘Politische’ in the German part. I thought I could do nothing about it until I heard about the Language field’s role in citations.

So what values to use for the language codes? Looking at what has been imported, I see all sorts, like En, eng, EN, for English; Ge, ger or DE for German, FR or fre for French and so on. I also see compound codes like ‘EN-US’. I have not found a list of codes for Zotero (is there one?) but found some standards online. The Library of Congress specifies three letter codes in English like ‘eng’ for English, ‘fre’ for French and ‘ger’ for German (https://www.loc.gov/marc/languages/language_code.html#e). The ISO standard allows up to four variants (ISO 639-1, ~2/T, ~2/B, and ~3) for two- or three-letter codes in English or the language itself: So English can be ‘en’ or ‘eng’; German ‘de’, ‘deu’ or ‘ger’; French ‘fr’, ‘fra’, or ‘fre’ (https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes).

What does Zotero do? After several experiments, I concluded that it seems only to distinguish between ‘English’ and ‘not English’. If the text is English, nearly all words are capitalised, except the first word and a few like ‘the’, of which it presumably has a list. Does the list only include English words? If the text is not English, then it is left as it is – which is what I want even for English titles. By experiment I found that a blank Language field is read as English; so English is the default. ‘en’, ‘eng’, ‘EN’ and the word ‘English’ are also accepted as meaning the title is in English. Capitalisation in the language codes seems to be ignored. ‘de’, ‘ger’, ‘fr’, ‘fre’, ‘fra’ and capitalised equivalents all seem to mean ‘not English’ and leave the capitalisation of the title in the citation alone. I also note that a random code like ‘XXX’ is read as ‘not English’. Am I underselling Zotero here? Actually this is fine by me; I’d rather write the titles myself than have Zotero try and figure out what is ‘right’.

Conclusion: to cite a title without change to the capitalisation put any code in the Language field that is NOT ‘en’, ‘eng’, or ‘English’. If in doubt, ‘XXX’ will do fine.
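
The behaviour found in the experiments above can be summarised in a small Python sketch. It only models what was observed from the outside; it is not Zotero's code, and the names `ENGLISH_VALUES` and `treated_as_english` are made up for illustration. Compound codes such as 'EN-US' were not part of the experiments, so the sketch does not try to model them.

```python
# Illustrative model of the observed behaviour only; NOT Zotero's implementation.
# Hypothetical names: ENGLISH_VALUES, treated_as_english.

# Values that appeared to be read as English (a blank field counts as English).
ENGLISH_VALUES = {"", "en", "eng", "english"}


def treated_as_english(language_field: str) -> bool:
    """Return True if, per the experiments, the title would be title-cased."""
    # Capitalisation of the code itself seemed to be ignored.
    return language_field.strip().lower() in ENGLISH_VALUES


for value in ["", "en", "EN", "eng", "English", "de", "ger", "fr", "XXX"]:
    action = "title-cased" if treated_as_english(value) else "left as typed"
    print(f"{value!r:10} -> {action}")
```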

Happy citing!



P.S. If you are wondering how I typed the correct forms of the quotation marks into Zotero, I didn’t. I copied the imported title to WORD, set the language for each part of the text, re-typed the quotation marks, then copied and pasted it all back into Zotero.
  • The wall of text is a bit dense to read. What exactly is the problem you’re trying to solve, briefly?
  • edited July 19, 2020
    See paragraph 3, the one starting "my problem is"...
  • ... and I did solve it; see the last paragraph starting "Conclusion..."
  • @bwiernik In response to your comment on the original text, I added the first two lines to the text, which put the question and the answer in a nutshell. Hope that helps!
  • @dstillman recently brought my attention to the fact that you can use the Language field in the database to affect the way citations are presented. However I found no documentation on the precise way it works or what codes should be used
    I linked to documentation in the same sentence where I said this was possible.
  • @AndySymons That is not a good practice. You should enter an actual language into the field or leave it blank. If there is an English language title with words that should never be capitalized, wrap those in <span class="nocase"> </span> tags.
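
For anyone who finds the tags hard to type, here is a rough Python sketch of the wrapping suggested above. The helper `protect_words` is hypothetical (it is not part of Zotero); it simply builds the text you would then paste into the Title field, using the English part of the Beuys title from the first post as the example.

```python
import re


def protect_words(title: str, words: list[str]) -> str:
    """Wrap each listed word or phrase in a nocase span (hypothetical helper)."""
    if not words:
        return title
    # Longest entries first, so that phrases win over their shorter parts.
    alternation = "|".join(
        re.escape(w) for w in sorted(words, key=len, reverse=True)
    )
    # Match whole words only, so "is" is not matched inside "Artist".
    pattern = re.compile(rf"(?<!\w)(?:{alternation})(?!\w)")
    return pattern.sub(
        lambda m: f'<span class="nocase">{m.group(0)}</span>', title
    )


print(protect_words("“Democracy is Fun” Joseph Beuys the Political Artist", ["is"]))
# “Democracy <span class="nocase">is</span> Fun” Joseph Beuys the Political Artist
```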
  • edited July 19, 2020
    ISO 639-3 has a code for multiple languages, 'mul'; it also has a code for non-linguistic content, 'zxx', and a code for undetermined, 'und'. Any of those would be better than XX or xxx, because they would be accurate and would be valid BCP47 codes. https://tools.ietf.org/html/bcp47

    @AndySymons BCP47 uses all three ISO 639 standards and ties them together for IT uses. It is how we get the Locale IDs. -2 and -3 were synchronized some years ago, but -3 has more codes than -2 did, so it became the default code standard. -1 was already widely used by the IT industry, so BCP47 says something like: if a language has a -1 code, use it; if it doesn't, use -3. The same language might be written in different ways in different places, so when needed for specificity, add a country code. BCP47 spells this all out, including where to get each code.

    My understanding was that the Zotero Language field was for the language of the content of the resource. The linked documentation seems to indicate that the Language field is used to link to a CSL locale. Locale identifiers are BCP47 compliant (with a few exceptions, but new ones are to be BCP47 compliant), so is the correct answer on what goes in this field "BCP47 tags", for how the citation content will be styled (or, more narrowly, 'choose a locale')?
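
The rule of thumb described in this post (use the two-letter code if one exists, otherwise the three-letter code, and add a country code only when needed) could be sketched as follows. This is a simplification under stated assumptions: the lookup table is a tiny hand-written sample, `language_tag` is a hypothetical name, and real BCP47 tags can carry further subtags (script, variant) that are ignored here.

```python
from typing import Optional

# Tiny hand-written sample: ISO 639-3 code -> ISO 639-1 code.
# 'mul', 'und' and 'zxx' have no two-letter equivalent, so they are absent.
ISO_639_1 = {"eng": "en", "deu": "de", "fra": "fr"}


def language_tag(iso_639_3: str, region: Optional[str] = None) -> str:
    """Build a simple language tag from a three-letter code (hypothetical helper)."""
    code = iso_639_3.lower()
    # Prefer the two-letter code when the language has one.
    primary = ISO_639_1.get(code, code)
    # Append a region only when extra specificity is wanted.
    return f"{primary}-{region.upper()}" if region else primary


print(language_tag("eng", "us"))  # en-US
print(language_tag("deu"))        # de
print(language_tag("mul"))        # mul
```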
  • @dstillman Yes you did point to the documentation, which mentions some codes that can be used. I agree with most of what is said there, except that titles should 'usually be in sentence case'. With a mixed-language title, one has to stop all conversion, so all parts should be in the capitalisation appropriate to the language.
  • edited July 19, 2020
    @bwiernik Sorry, the type of tagging you suggest is too cryptic for me; not easy enough to remember.
  • edited July 19, 2020
    @hughp3 Thanks for telling me about BCP47. Yours is an excellent suggestion and I may well change my 'xx' codes to 'zxx', although this is still not strictly compliant with the BCP47 recommendations. What I really mean by 'xx' is 'this citation is in English but I want to fool Zotero into not changing the capitalisation'. Where the document is in German, say, I already use the 'de' code; it then does not change the capitalisation of any part of the title so the German and English parts have to be correctly capitalised in the title field.
  • One reason I want to stop Zotero changing my capitalisation is that it changes both the Title and the Series fields (possibly other fields too?). I often (mis-)use the Series field for additional text that I want in the citation, like "catalogue of an exhibition at the Staatliche Kunsthalle Baden-Baden, 6 December 1986 to 15 February 1987" and I don't want this all in capitals.
  • Using the multilingual language tag like that will prevent proper casing of English titles.

    I suggest you store that sort of information in Extra like this:
    Medium: catalogue...

    That will be picked up and formatted correctly by citation styles.

    I also recommend wrapping non-English text in the span tags I mentioned.
  • edited July 19, 2020
    @bwiernik 'Using the multilingual language tag like that will prevent proper casing of English titles.' --> that's exactly what I want!

    'I suggest you store that sort of information in Extra like this...' I tried that, but the Extra field is not used in MHRA3 citations.
  • ... to be clear, I want the final citation to look like this:

    Hofer, Sigrid, ed., Entfesselte Form: fünfzig Jahre Frankfurter Quadriga [Unshackled Form: Fifty Years of the Frankfurt Quadriga Group], catalogue of an exhibition at the Städelsches Kunstinstitut and the Städtische Galerie Frankfurt am Main, 1 October 2002 to 5 January 2003 (Basel; Frankfurt am Main: Stroemfeld, 2002)

    ... with the title italicised; I don't know how to reproduce that here.
  • edited July 27, 2020
    Is Hofer, Sigrid the editor of the catalogue or are they the collection curator? It is unclear to me if you are citing a book (the catalogue describing the collection) [by intention or because citing a collection in Zotero is hard and generally the collection catalogue is the only remnant of a collection after the exhibit is over] or if you are citing the collection itself... Citing an exhibit itself might be better done as a presentation. Have you looked at the fields for a presentation in your specific stylesheet?

    If Hofer, Sigrid is the curator and the catalogue editor, and you go the presentation-type route, then adding curator as a role in CSL 1.0.1 might be a good thing. @AndySymons you could add a comment to the pinned CSL item, as they are asking for feedback at the moment.
  • I am citing the BOOK, which is itself the catalogue of an exhibition. That is precisely why I want the additional text "catalogue of an exhibition...". Apart from the additional text, the details are as for any other book. Sigrid Hofer is in this case the editor of the book, so correctly cited. She was quite likely the exhibition curator too, but that is irrelevant here.

    I do not see any need or way to cite the 'collection itself'; I can mention it in the text of course, but a citation is to something concrete that the reader can go and find for him/herself. I sometimes refer to a web page with installation shots or other details of a past exhibition, when available, but that is not the subject of the question in this thread.

    I don't use 'presentation' at all and its fields are not useful for an exhibition or art collection. I'm not sure what it is meant for. For a presentation at a seminar, I use the item type 'conference paper'. I'm not sure what other kind of presentation one might want to cite?
  • “Presentation” is for presentations at conferences and similar things. “Conference paper” is for papers formally published in a conference proceedings book or journal.
  • edited July 27, 2020
    Conference Paper "A paper presented at a conference and subsequently published in a formal conference proceedings publication (e.g., as a book, report, or issue of a journal). For conference papers that have **not** been published in a proceedings, use `Presentation`." The Zotero documentation is helpful in these sorts of cases: https://www.zotero.org/support/kb/item_types_and_fields (though I'm not always the first to know what is or isn't in the documentation...)
  • Thanks to both @bwiernik and @hughp for those clarifications.
  • Hello,

    I have just spent 2 hours trying to clean up my bibliographical references.

    Trying to follow the rules, I have used automation to put all my titles in sentence case, have checked the language of all my references (mostly en or fr).
    Running a few tests (not on all titles), I am stumbling on three issues.

    1) This bibliography item in MLA:

    The Dubliners. ‘The Town I Loved so Well’. Plain and Simple, Polydor, 1973.

    Does anyone have any idea why the word “so” has not been capitalised?
    Of course, I can undo what I have just done (sentence case) and add a capital “s” manually, but I would prefer to understand so that I can see if the problem might occur elsewhere.

    2) Another problem, this time with a reference in French. I have set the language to fr, which means I suppose Zotero should capitalise the first word of the subtitle, but I get this:

    Mason, Roger, and Steve Waring. Guitare américaine : spécial instrumental. Le chant du monde, 1972.

    I do not understand why the word “spécial” does not have a capital “S”. Could it be something wrong with punctuation ? I have put a space before the colon, because that is the way it is done in French. Shouldn’t I have?

    3) In the same way, I suppose Zotero should add a capital “R” on “rire” in the following title (that is the rule in French titles after a definite article) and add a capital “E” in “essai” (first word of the subtitle), but it doesn’t:

    Bergson, Henri. Le rire : essai sur la signification du comique. Flammarion, 2013.

    Thank you!
  • 1) How have you entered the title in Zotero and what is entered in the Language field?

    2) No, no casing at all is applied for non-English items. Sentence versus title case is a uniquely English thing, whereas to my knowledge other languages are fairly uniform in their capitalization rules. So, you should store non-English titles with correct capitalization for that language.

    3) As above.
  • 1) I'm seeing "so" not being title cased and that's definitely a bug -- not sure how long that's been around.
  • @jcmeunier It seems that you are talking about using the text conversion within the Zotero interface, not the outputs of Zotero's content via a style sheet. I just want to confirm that that is the case.

    CSL does not capitalize "so" in output see: https://docs.citationstyles.org/en/stable/specification.html#title-case-conversion

    As I understand the purpose of this Language field, the value "fr" will only tell the CSL style sheet to use the French version of the style sheet, if there is one. The Language field doesn't directly impact anything in Zotero.
  • @adamsmith @Rintze Do you recall the origin of the stop words list?
  • I don't, and the actual list used by citeproc-js is (as you know) significantly longer -- I don't think there's a title-casing standard that wouldn't capitalize "So," though (Chicago, MLA, AP definitely would), so I'm puzzled about that one.
  • @bwiernik

    1) The title I entered in Zotero is this: “The town I loved so well”
    The language I put is just “en”

    “So” is an adverb, so it should be capitalised in English, but Zotero leaves it in lower case.

    2) Actually, capitalisation in French is far from “uniform”. The rules are VERY complex and sometimes totally counterintuitive… which maybe explains why it is complex to ask a computer to know what to capitalise or not.

    So, if I understand well, apart from “en”, which deals with capitalisation, the other languages do not do ANYTHING? Does that mean that, in fact, it doesn’t matter whether I enter “fr” or “it” for an Italian work, for example?
    … in which case… should I just leave the field blank whenever it is not English? Does writing “it” or “fr” make any difference?

    Thank you very much for your answers. Of course, now I have to go back and put the capital letters where I changed them to “sentence case”, but at least I have learned something and will not do it again ;-)
  • @hughp3

    If I understand your initial remark, yes, you are right, I was not trying to cite anything for an article, I was just using the Zotero tool to generate bibliography from an item in order to TEST it.
    However, when I do that, I do choose MLA, so I suppose it corresponds to what would happen if I cited it in MLA, wouldn’t it?
    What is the difference?


    Thank you for the link. Some things surprise me, reading this: I understand that “a”, “an”… should not be uppercased, but some of these words are problematic:

    1) “so” should not be uppercased when it is a conjunction, but it should when it is an adverb (no idea how Zotero/the CSL could know that)

    2) similarly, I suppose that “up” should be capitalised when it is a particle/adverb and not a preposition: “He’s up to no good” vs “He climbed up the ladder” (again, I wonder how Zotero deals with these things).

    3) “to” is rarely an adverb but often an infinitive marker. I am not sure if it should be capitalised then…

    As a conclusion, I just wonder: in cases when they SHOULD be capitalised, should I always capitalise “so” or “up” (and “to”?) myself when I enter my titles in the title field, knowing that Zotero/the CSL will always lowercase them by default?

    Also, you have answered one of my questions to @bwiernik above: when it is NOT English, I NEED to write something other than “en”, or run the risk of the CSL considering it is “en” by default.
    It still doesn’t tell me if it makes any difference whether I write “fr” or “it”, though.
  • We generally follow Chicago Manual's rules for title case. That is made explicit in the next version of the specs.

    1) Chicago Manual only says to "[l]owercase the common coordinating conjunctions and, but, for, or, and nor." While "so" can be used as a coordinating conjunction, I think it should always be capitalized. Same with "yet." I think this should get updated.

    2) That's right. There's no reliable solution for these edge cases, I believe.

    Re 3) we follow Chicago Manual title casing rules, which do lowercase "to" as an infinitive marker (though do uppercase it as an adverb as in "Come To")
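
To make the stop-word discussion concrete, here is a deliberately simplified Python sketch of title-casing with a stop-word list. It is not the citeproc-js implementation: the list below is only a small illustrative subset made of words mentioned in this thread, the function name `title_case` is hypothetical, and the rule of always capitalising the first and last word is an assumption of the sketch. It cannot tell an adverb from a conjunction, which is exactly the edge case discussed above.

```python
# Small illustrative subset of stop words taken from this thread;
# the list actually used by citeproc-js is longer.
STOP_WORDS = {"a", "an", "the", "and", "but", "for", "or", "nor", "so", "to", "up"}


def title_case(title: str, stop_words: set = STOP_WORDS) -> str:
    """Capitalise every word except stop words (simplified, hypothetical)."""
    words = title.split()
    result = []
    for i, word in enumerate(words):
        first_or_last = i == 0 or i == len(words) - 1
        if word.lower() in stop_words and not first_or_last:
            # Stop words stay lower case regardless of their part of speech.
            result.append(word.lower())
        else:
            result.append(word[:1].upper() + word[1:])
    return " ".join(result)


print(title_case("The town I loved so well"))
# The Town I Loved so Well
print(title_case("The town I loved so well", STOP_WORDS - {"so"}))
# The Town I Loved So Well
```

Removing "so" from the list, as in the second call, gives the capitalisation expected for the Dubliners title.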
  • I am basing my style on MHRA 3, and have that setting in Jurism, though I am not obliged to follow MHRA exactly.
    On Title case, the MHRA Guide version 3 says:
    In English titles the initial letters of the first word and of all nouns, pronouns (except the relative ‘that’), adjectives, verbs, adverbs, and subordinating conjunctions are capitalized, but those of articles, possessive determiners (‘my’, etc.), prepositions, and the co-ordinating conjunctions ‘and’, ‘but’, ‘or’, and ‘nor’ are not […] The first word of a subtitle following a colon is capitalized […]
    So far, so good.
    On foreign titles it says
    English works with foreign titles are normally capitalized according to the English convention rather than that of the language of the title
    With that, I strongly disagree! Luckily, so do the makers of Jurism, so I am happy that my German and French titles are rendered as I write them.

    My remaining problem is with my translation into English of titles that include foreign words or names, e.g.
    “Oil Paintings, Watercolours and Pencil Drawings from the van der Grinten Collection”
    This is tagged in Jurism as an English variant (en), so is erroneously rendered as
    “Oil Paintings, Watercolours and Pencil Drawings from the Van Der Grinten Collection”
    Since I switched to Jurism, I can no longer apply my workaround of pretending it is not in English by using the ‘zxx’ language code.

    For me, Jurism / Zotero is overthinking capitalisation. I do not need Jurism / Zotero to capitalise for me; I am quite capable of doing it myself. I like the “Title case” option in Zotero fields; that helps, but I would still like to be able to make corrections and have them stand.
    I am not very interested in long debates about what the ‘right’ rules are, either.

    I would prefer an option in Zotero to simply turn off capitalisation

  • @AndySymons Please do not post the same response in multiple threads. Ask your question once and leave it at that.