Non-Western Name Ordering in Bibliographies

Special thanks to the developers of Zotero for a wonderful service and a great product. I am hoping to someday move all of my work into Zotero.

I have a question whose answer I am unable to find in these discussions, and I am sure that it is relevant to other users. My work is on Southeast Asia, meaning that I deal with all sorts of name ordering conventions other than Western ones. I do not see how Zotero supports these other naming conventions when it generates bibliographies and in-text citations.

My problem comes from Malay and Chinese names. Consider two (fake) authors whose work I cite, Abdul Rahman Abdul Malik (Malay) and Khoo Kin Tay (Chinese).

The "surnames" of these authors are "Abdul Rahman" and "Khoo," respectively. So in in-text citations, they should appear as (Abdul Rahman 1994) and (Khoo 2000). They would also be alphabetized by these names. But as far as I can tell, if I put these names in the "last name" field, any bibliography created by Zotero will output their names as follows:

Abdul Rahman, Abdul Malik
Khoo, Kin Tay

These are improper renderings of these names. If I try to put the authors' entire names into the last name field and nothing in the first name field, then I can get the bibliographies to look right, but any citations created through the Word plugin will be (Abdul Rahman Abdul Malik 1994) and (Khoo Kin Tay 2000), which are also improper. Moreover, even if I enter these selected citations by hand, it will not be possible to format the names correctly in bibliographic styles that abbreviate first and middle names (where the proper renderings would be Abdul Rahman AM and Khoo KT).

These problems are not confined to just Chinese and Malay names: in principle, this will affect every Chinese, Japanese, Korean, Vietnamese, Burmese, and Malay author, to say nothing of Arabic or Persian or Central Asian names, along with a substantial number of Indonesian, Thai, and Khmer names (where there are multiple conventions depending on which ethnic group the author is a member of) and a number of others which I am certainly forgetting. I know that the Library of Congress has detailed procedures for deciding how to alphabetize Asian names.

I am frankly rather surprised that no one has encountered this problem before, so I am hoping that there is an easy fix for this that I've overlooked. My apologies if this question has been asked and answered before, but I have searched through the discussions for Chinese, Japanese, Asian, Vietnamese, Malay, Indonesian, Burmese, and Arabic and found no such help.

If there is no such fix available, it would seem natural to me that this is a prime area for future development. It seems obvious to me that bibliographers and software developers will someday have to confront the fact that at least a third of the world's people (probably closer to half) do not follow Western name ordering conventions!

Again, thank you so much for a great product, and I apologize if this question has an obvious answer that I have overlooked.
  • Try not using the "last name" field but rather clicking on the icon to the right of the field, which toggles on the "full name" field. This will allow you to get proper sorting and display (though probably not initialization).

    The Citation Style Language (CSL) Zotero uses (and that I designed) is pretty generic about names. It basically assumes two types of personal names: those that sort as displayed (Asia, etc.), and those that don't (most Western names). But I've yet to come across anyone that's ever really tested this.
  • Thanks for your reply. I just tried it out. I can get the bibliographies to display correctly if I do this (as I noted in my initial post), but again, it does not produce proper in-text citations with the Word plugin, nor does it allow for proper name initialization.

    Also note that when toggling back and forth with a Malay (and by implication any Arabic) name, you can get some important errors. Here's what happens with the name "Abdul Rahman Abdul Malik." I start out with "Abdul Rahman" in the LAST NAME field and "Abdul Malik" in the FIRST NAME field. When I toggle to one name, I get "Abdul Malik Abdul Rahman," which is incorrect (wrong order). When I toggle back, I get "Rahman" in the LAST NAME field and "Abdul Malik Abdul" in the FIRST NAME field. If I start with the correct full name in the single field ("Abdul Rahman Abdul Malik") and toggle to two names, I get "Malik" in the LAST NAME field and "Abdul Rahman Abdul" in the FIRST NAME field.

    Thanks so much for your help. Again, Zotero is a fantastic product and this is one of the only things that keeps me from embracing it completely.
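
The toggle behavior reported in the post above is consistent with a simple rule: merging writes first name then last name, and splitting treats the final space-separated word as the last name. A minimal Python sketch of that rule follows. The function names `merge_fields` and `split_field` are hypothetical, and this models only the observed behavior; it is not taken from Zotero's source.

```python
def merge_fields(last: str, first: str) -> str:
    """Two-field -> single-field: assumed to emit Western display order."""
    return f"{first} {last}".strip()


def split_field(full: str) -> tuple[str, str]:
    """Single-field -> two-field: assumed to take the final word as the last name.

    Returns (last, first).
    """
    first, _, last = full.rpartition(" ")
    return last, first


# Start with the Malay name entered the way the poster describes.
merged = merge_fields("Abdul Rahman", "Abdul Malik")
print(merged)                                   # Abdul Malik Abdul Rahman
print(split_field(merged))                      # ('Rahman', 'Abdul Malik Abdul')
print(split_field("Abdul Rahman Abdul Malik"))  # ('Malik', 'Abdul Rahman Abdul')
```

Because the split step has no way of knowing where a multi-word cited name ends, a round trip loses information for any name whose cited part is more than one word.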
  • edited May 8, 2008
    Yeah, the problem is that the Zotero team started from a problematic premise: that all names are Western names. This isn't really their fault exactly (even though I warned them!), b/c the vast majority of their users and possible data sources also reflect this bias.

    The problem with the full name option is you then have no way to distinguish the given names (which can be dropped or initialized) from the family names. I think to really solve your (very legitimate) issue, Zotero would need to make it possible (maybe with a checkbox?) to distinguish among different kinds of personal names. From a discussion on the CSL dev list, I recall someone mentioned there are actually three forms, with the third being IIRC "Icelandic."
  • edited May 8, 2008
    BTW, I'll add a followup question:

    I'm working on an import/export model that Zotero plans to adopt. We'll use FOAF for describing people and organizations. FOAF isn't per se designed for this, but I think the solution is to encode family and given names using a language tag, and to tie formatting and sorting rules to those tags.

    Does that make sense to you?
  • Thanks again for following up. I'm hoping that the developers deal with this issue in future Zotero releases. If I may, I'd like to make a suggestion or two. First, there are dozens of different naming conventions, not just "the Asian way" and "the Icelandic way." For Indonesian names alone, see A. Kohar Rony, "Indonesian Names: A Guide to Bibliographic Listing," Indonesia, Vol. 10 (Oct. 1970), pp. 27-36. I think that this article identifies something like 32 separate conventions in Indonesia alone.

    Icelandic is really just a patronymic system like Arabic, so the same forms should work. Abdul Rahman Abdul Malik means implicitly "Abdul Rahman son of Abdul Malik," just like Martin Magnusson means "Martin son of Magnus."

    Second, I think that a checkbox for bibliographic entries would be ideal. But in general, I think that most names could be dealt with in the sense that the last name is connected to the first name with a comma (the standard way) or that it is not (the Malay and Chinese [and Vietnamese, etc.] way).

    The checkbox could allow users to put in names as they see fit (last name or surname or cited name in one field, first and second and whatever else in another field), and then choose to either display names in the standard or nonstandard way. In bibliographies, alphabetized by the cited name, this would mean with a comma separating the last name from the first name, or not. In footnotes, where names appear as they are spoken, this would mean putting first in front of last (as standard), or not (last would stay in front of first, no commas separating them). I bet that this would solve problems for 99% of the world's languages, and it should be relatively trivial to implement, no?

    Does this make sense?
  • edited May 8, 2008
    "Yeah, the problem is that the Zotero team started from a problematic premise: that all names are Western names"
    No, we really didn't. We started with the premise that most users will be entering at least some Western names, and having an option to enter "first" and "last" names for those both makes the UI clearer and allows for some additional autocomplete functionality. There's nothing stopping a user from switching to single-field mode and leaving that the default.

    However, the single-field mode does currently have certain limitations, as you've discovered, tpep. See my post on the XBiblio list for more about this. What I suggest there (that we add a per-creator field for short citation name, and possibly a sort field for each creator as well) might be insufficient, but, as I also say there, "it seems that any solution that asks the user to enter discrete parts for non-Western names might be inadequate". (How do you have users enter "Abu Karim Muhammad al-Jamil ibn Nidal ibn Abdulaziz al-Filistini" and have all parts of the software do what they're supposed to?)

    Suggestions more than welcome.
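
The per-creator fields suggested in the post above (a short citation name and possibly a sort field) can be pictured as a small record. This is only a sketch under stated assumptions: the class `Creator` and its field names are hypothetical and do not reflect Zotero's data model, and the short and sort names chosen for the example authors are illustrative choices by whoever enters the data.

```python
from dataclasses import dataclass


@dataclass
class Creator:
    full_name: str        # the name exactly as it should be displayed
    short_name: str = ""  # hypothetical per-creator field for in-text citations
    sort_name: str = ""   # hypothetical per-creator field for alphabetization

    def citation_name(self) -> str:
        # Fall back to the full name when no short form was entered.
        return self.short_name or self.full_name

    def sort_key(self) -> str:
        return (self.sort_name or self.full_name).casefold()


creators = [
    Creator("Jane A. Smith", short_name="Smith", sort_name="Smith, Jane A."),
    Creator("Khoo Kin Tay", short_name="Khoo"),
    Creator(
        "Abu Karim Muhammad al-Jamil ibn Nidal ibn Abdulaziz al-Filistini",
        short_name="Abu Karim",
    ),
]

for c in sorted(creators, key=Creator.sort_key):
    print(c.citation_name(), "|", c.full_name)
```

The design pushes every decision onto the person entering the data, which avoids guessing at name parts but, as the post notes, may still be insufficient for tasks such as initializing given names.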
  • I just saw your follow-up. I'm probably not the best person to be asking about FOAF, but just identifying "family names" is really tricky in non-Western contexts. Indonesian president Susilo Bambang Yudhoyono, for example, doesn't have a "family name" in the strict sense. His parents just gave him three names, and he has decided that he wants to go by President Yudhoyono rather than President Susilo.

    It need not have been this way. His predecessor, Megawati Sukarnoputri, did not go by Sukarnoputri, but rather by Megawati. (Sukarnoputri just means daughter of Sukarno.)
  • edited May 8, 2008
    "No, we really didn't. We started with the premise that most users will be entering at least some Western names, and having an option to enter "first" and "last" names for those both makes the UI clearer and allows for some additional autocomplete functionality."
    How is basing software on the notion of "first" and "last" names not a problematic, culturally-specific, premise? If you'd designed for more international-friendliness, you would have not made any assumptions about name order (except perhaps to by default assume that one sorts on family name, but displays it last).

    But let's not get wrapped up in that discussion of past decisions, since it doesn't help us move forward, and this is a difficult issue (as we all know). My bad for bringing it up.

    tpep: you understand the problems; how would *you* solve them?
  • edited May 8, 2008
    The Abu Karim Muhammad... example is great. It shows us just how confusing these things can be when we get away from firstname-lastname.

    To answer your question, bdarcus, I would implement the very system that I just suggested. Still have the last name field and the first/middle name field, just an option to connect with a comma/reorder or not. So we have someone whose name is Jane A. Smith. Enter Smith as LASTNAME and Jane A. as FIRSTNAME. If you leave the box for standard style checked, you get

    STANDARD STYLE
    bibliography: Smith, Jane A.
    footnotes: Jane A. Smith
    citations: Smith

    If you uncheck a box for standard style, you get

    NONSTANDARD STYLE
    bibliography: Smith Jane A.
    footnotes: Smith Jane A.
    citations: Smith

    Because it's a Western name, we know to use the standard style. If you do the same thing with Khoo Kin Tay, you get the following.

    STANDARD STYLE
    bibliography: Khoo, Kin Tay
    footnotes: Kin Tay Khoo
    citations: Khoo

    NONSTANDARD STYLE
    bibliography: Khoo Kin Tay
    footnotes: Khoo Kin Tay
    citations: Khoo

    Because it's a Chinese name, we know to use the non-standard style.

    This would handle Vietnamese names, traditional Japanese and Korean names, and even every kind of Malay and Indonesian and Thai and Icelandic convention that I can think of. Martin Magnusson wouldn't even have to worry--if he wanted to be alphabetized by Martin, he could put Martin as his LASTNAME and use the non-standard style; if not, just enter it like any other European name.

    Incidentally, it would not solve every Arabic problem, but it would solve the vast majority of them. For Abu Karim Muhammad al-Jamil ibn Nidal ibn Abdulaziz al-Filistini, presumably the person wants to be cited as Abu Karim. Then just choose the non-standard naming convention, enter "Abu Karim" as LASTNAME, and enter "Muhammad al-Jamil ibn Nidal ibn Abdulaziz al-Filistini" as FIRSTNAME. Only if the person wants to be cited by some name in the middle (in this case, the example would be Muhammad) would there be a problem. But this is probably rare. People who formally choose to include Abu Karim (I think this means "father of Karim") as part of a name on a published document do so because they want to be known by it.
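
The checkbox scheme proposed in the post above can be sketched in a few lines of Python. Everything here is an assumption for illustration: the class `Name`, its field names, and the `standard` flag are hypothetical and are not part of Zotero or CSL. The `initialized` method is one possible way to produce the abbreviated forms mentioned at the start of the thread (Abdul Rahman AM, Khoo KT).

```python
from dataclasses import dataclass


@dataclass
class Name:
    cited: str             # LASTNAME field: the name used for citing and sorting
    rest: str              # FIRSTNAME field: everything else
    standard: bool = True  # the proposed checkbox; True means Western handling

    def bibliography(self) -> str:
        # Standard style joins with a comma; non-standard leaves the name as spoken.
        separator = ", " if self.standard else " "
        return f"{self.cited}{separator}{self.rest}"

    def footnote(self) -> str:
        # Standard style flips to given-name-first; non-standard keeps sort order.
        if self.standard:
            return f"{self.rest} {self.cited}"
        return f"{self.cited} {self.rest}"

    def citation(self) -> str:
        return self.cited

    def initialized(self) -> str:
        # Abbreviate only the non-cited part, one letter per word.
        initials = "".join(part[0] for part in self.rest.split())
        return f"{self.cited} {initials}".strip()


smith = Name("Smith", "Jane A.")
khoo = Name("Khoo", "Kin Tay", standard=False)
malik = Name("Abdul Rahman", "Abdul Malik", standard=False)

print(smith.bibliography(), "|", smith.footnote(), "|", smith.citation())
print(khoo.bibliography(), "|", khoo.footnote(), "|", khoo.citation())
print(malik.initialized(), "|", khoo.initialized())
```

Keeping the cited name in its own field is what makes the in-text citation and the initialized form work; the flag only controls ordering and the comma.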
  • edited May 8, 2008
    I also want to mention something else quickly as we think about design issues:

    Part of the challenge here is UI. The current UI has real estate limitations that partly drive the design.

    But what if agents got full class UIs (also)? So imagine in Zotero 2.0 every agent gets a URI, and an associated web page, to which additional properties can be added. That changes the possibilities (and perhaps the challenges).
  • Has there been any further work on issues regarding non-Western author names? If so, could someone point me to material that explains how to enter non-Western author names, and how to handle them in bibliographies and citations?

    And if solutions do not exist, could we perhaps revitalize that discussion? I'm having problems with non-Western names, and I'd like to standardize how I enter them. Currently, I am entering the entire name in a single, rather than double, author entry. However, as noted above, appropriate in-text citations for these should only include the family name, not the given name.

    Thanks.
  • The new (not yet deployed) CSL processor recognizes an additional toggle field in the name, which controls whether the full name is rendered in Western order or in same-as-sort-order order. It's essentially tpep's solution.

    It should be complemented with a style option in CSL that permits a style to turn off this behavior, since many scientific journals seem to force Japanese, Chinese and other non-Western names into Western order; but the functionality for preserving native-language name ordering is there. To make use of it, there would need to be some means of setting the toggle value in the Zotero UI.
  • Thanks for the reply. So, how can we promote the solutions that you've outlined (i.e., to create a style option in CSL to turn off the behavior, and to amend the Zotero UI to allow a toggle for these options)? Should I post in the feature requests section? Or is this issue already being addressed for future development?

    Thanks again,
    J
  • Just to be clear, any real solution here has to recognize that a bibliography may mix different types of names, with different display rules. CSL is currently based on that understanding; it's up to implementations (like Zotero) to figure out how to make that work for its users.
  • edited July 14, 2009
    JonEP,

    The Zotero developers follow the forums, so they will have read your note and the replies. I myself work in Japan, so this is an item of interest to me.

    What would be really helpful is to have one or more pointers to style guides that require native ordering for non-Western names. If you have a link to hand, feel free to post it to this thread. It raises the profile of an issue to pin it down as a requirement for a particular style or category of styles.

    (In response to Bruce above, the solution I have in mind in the processor would support mixed formatting.)
  • We also need to figure out how to handle the different-names-in-different-scripts problem.
  • Background: This isn't exclusively a Zotero issue. I've used other bibliographic databases and each has had a similar name entry / sorting / format for footnote, reference list, in-text problem. I'm not arguing against working on a problem-fix. However, the problem may not have a resolution that is good for everyone.

    As someone pointed out above, some journals mash up the names of their authors. Indeed, it is well known in library and information science circles that conducting a search using author names can be very messy. The same author's name can be represented in the literature several different ways.

    How should the same author be listed in a bibliographic database if the journals force different name formatting (and sometimes even different spellings)? I can cite examples of an author name attached to an article being different from the same author's printed name when attached to an author's reply letter to a comment. The author's citation to the original article may use the name format as printed in the original article, or the name may be changed to match the format of the author's name attached to the letter.

    There is a solution for this problem but it is beyond this discussion of Zotero capabilities. What is needed is an Author Authority. That will require the cooperation of academics, policy makers, and publishers worldwide. An author authority could also help disambiguate the problem when there are many authors with common names.
  • How should obvious misspellings of authors' names be handled? Should the correct spelling be used when entered into the database or the incorrect spelling as printed in the source? What if there is a printed correction to the author name in a later journal issue?
  • This is somewhat orthogonal to the issue we're discussing here (which is about non-Western name display and sorting), but ...

    I'm not convinced a (single) "author authority" that involves all these players is either feasible or necessary. It seems entirely feasible that services like Zotero and Open Library can allow for more informal, and crowd-sourced, efforts that achieve the same thing. For example, why can't I, as an author, say how my name should be represented, and what I've published, just as I have with OL? With linked data conventions and technologies, that data can then be aggregated.
  • "How should obvious misspellings of authors' names be handled? Should the correct spelling be used when entered into the database or the incorrect spelling as printed in the source?"
    The former.
    "What if there is a printed correction to the author name in a later journal issue?"
    Not a problem given my answer above ;-)
  • edited July 14, 2009
    bdarcus,
    "We also need to figure out how to handle the different-names-in-different-scripts problem."
    I wouldn't say it's been definitively solved, but the new processor at least has a working solution in place. We reorder names, subject to the exception flag mentioned above, only if they are in Roman-derived or Cyrillic scripts. Everything else appears as in sort order. This at least nicely covers some of the obvious cases (Chinese, Japanese, Korean, Thai, Lao, Khmer).

    Not sure about Arabic and other RTL languages. Can anyone reading this speak for those? The problem is not with handling the romanized forms (we probably have that covered, touch wood), but the name in the original script.
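
The script-based rule described in the post above (reorder only names written in Roman-derived or Cyrillic scripts, unless an exception flag says otherwise) can be approximated in Python with the standard `unicodedata` module. This is a rough sketch, not the citeproc-js implementation: the function names, the `keep_sort_order` flag, the use of Unicode character names to detect script, and the space used as a separator are all assumptions made for illustration.

```python
import unicodedata


def is_latin_or_cyrillic(text: str) -> bool:
    """True if every letter in text has a Unicode name starting with LATIN or CYRILLIC."""
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if not name.startswith(("LATIN", "CYRILLIC")):
                return False
    return True


def display_name(family: str, given: str, keep_sort_order: bool = False) -> str:
    """Render a name for running text.

    keep_sort_order stands in for the per-name exception flag: it keeps a
    romanized name (e.g. Khoo Kin Tay) in family-first order.
    """
    if keep_sort_order or not is_latin_or_cyrillic(family + given):
        return f"{family} {given}"  # same order as used for sorting
    return f"{given} {family}"      # Western display order


print(display_name("Smith", "Jane A."))                      # Jane A. Smith
print(display_name("Khoo", "Kin Tay", keep_sort_order=True))  # Khoo Kin Tay
print(display_name("村上", "春樹"))                            # 村上 春樹
```

Script detection cannot tell a romanized Chinese name from a Western one, which is why the flag is still needed for names such as Khoo Kin Tay.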
  • edited July 15, 2009
    Regarding style instructions for non-Western names, and FBennet's request that

    "What would be really helpful is to have one or more pointers to style guides that require native ordering for non-Western names. If you have a link to hand, feel free to post it to this thread. It raises the profile of an issue to pin it down as a requirement for a particular style or category of styles." [could someone tell me how to put quoted text into a dashed-line box?--any instructions out there for using vanilla? The vanilla homepage isn't hugely helpful to pedestrians like me... thanks]

    Here are some materials:

    http://www.chicagomanualofstyle.org/ch18/ch18_sec074.html
    or http://www.press.uchicago.edu/Misc/Chicago/CHIIndexingComplete.pdf
    The Chicago manual of style, sections 18.74 on, provides guidance on ordering non-western names. This guidance is applicable to reference styles. Several instruction manuals on citation styles point explicitly to this section to provide guidance to authors (see, for instance, http://www.durhamtech.edu/html/prospective/library/MLA.pdf page 2, “Name Order”).

    http://bcasnet.org/assets/files/cas-author-guidelines.pdf
    This is the journal Critical Asian Studies. See the specimen references, which include non-western examples cited as per examples in this discussion.

    http://www.aaanet.org/publications/style_guide.pdf see section X. This is the American Anthropological Association style, which is used by many of their publications. Glancing through the references cited lists of any AAA publication will demonstrate the style in use, which reflects all the nuances raised by tpep at the start of this thread.
  • JonEP,

    Perfect, thanks!

    (Re quotes, just manually type in <blockquote></blockquote> tags around the text to be quoted.)
  • I am wondering, is there any progress on the issue of non-western names?

    Back at the time of this thread, there was some discussion of a new, not yet implemented, CSL processor. Did that eventually get implemented? It seemed at the time like a number of issues were pending its deployment.

    Thanks.
  • No, not yet. The new CSL processor seems to be almost done, but then it still needs to be implemented in Zotero, which will very likely take some more months.
  • edited March 5, 2010
    Yep. Both CSL 1.0 and a new processor for Zotero are on the verge of final release. The Zotero team are busy with other important tasks at the moment, but a few months seems a likely horizon for the first development versions with CSL 1.0 integration.
  • Hi, I wonder if there has been further progress on the issue of non-Western names, or if it would be possible to spur some work on this issue?

    Thanks.
  • Not sure (don't recall the issues, and don't have time ATM to refresh my memory; Frank is probably better to comment on this), but the CSL 1.0 spec section on names is here. If you have any suggested revisions, let us know. How it gets implemented in the Zotero UI is another question of course.
  • Names handling in the citeproc-js processor is described here (scroll down to the part on "non-Byzantine" names -- terrible nomenclature, forgive me).

    Basically, for non-Western scripts, we automagically render names in sort order ("family" name first), which should get it right most of the time. An override toggle is available in the processor; if there are problems with the non-Western names that cannot be discriminated by script, that could be tied into the Zotero UI to give full control.

    There are probably some name forms out there that the kit will not handle correctly, but we'll see how it goes. The new processor is running in the trunk version of Zotero, if you're inclined to experiment. The trunk is alpha code, so run it in a separate instance, be sure to take regular backups, etc.
  • Hi all,

    Just wondering if this issue might be ready to advance a bit; the new processor is "out", yes?

    Thanks!