Non-Western Name Ordering in Bibliographies

  • My understanding is that this should just work. Are you running into any issues?
  • edited January 26, 2011
    Umm, hmm, not sure how this would work. I have a lot of Vietnamese authors in my list. An author like Lê Cao Đài should be alphabetized under Lê, and an inline citation such as (Lê 2004) should use Lê as the family name. However, the author's full name is written Lê Cao Đài, with the family name first. If I enter Lê in the "(first)" field of the Zotero entry, though, bibliographies will include entries for Cao Đài Lê, and inline citations will be in the format (Cao Đài 2004). So my understanding was that the new processor would make it possible for the Zotero interface to let me mark the "first" name as the "family" name. It seems like Zotero would need some sort of button on the left (similar to the "Move down" feature that was implemented a while ago) that would let one assign that. Then the various citation styles would have to be updated to order everything appropriately.

    What I understood from fbennett's post was that it is now possible to make the changes to the UI, but I wasn't aware that they had been made.

    Perhaps (indeed I hope) I've missed it, and it's possible to do this?
  • Ah yes. This is done intelligently for names that can be easily identified as non-Byzantine (in Frank's terms), but Vietnamese can't be identified by script alone (it can be identified by specific character usage, though). Frank coded logic to default to non-Byzantine for Thai, Chinese, Japanese scripts (and maybe others). Languages with Latin scripts that need this behavior will indeed have to wait for exposure in the UI.

    Perhaps this could be folded into the creator changes for the multilingual branch?
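    To make the distinction concrete, here is a minimal Python sketch of a script-based check. It is not the citeproc-js logic: the function name and the list of script prefixes are assumptions chosen for illustration. It shows why names written in Thai or CJK scripts can be picked out automatically, while a Vietnamese name in Latin script cannot.

```python
import unicodedata

# Assumed list of scripts whose names are treated as family-name-first.
# Unicode character names begin with the script name, e.g. "THAI CHARACTER KO KAI".
FAMILY_FIRST_PREFIXES = ("THAI", "CJK", "HIRAGANA", "KATAKANA")


def is_family_first_script(name: str) -> bool:
    """Return True if any letter in the name belongs to one of the listed scripts."""
    for ch in name:
        if not ch.isalpha():
            continue  # skip spaces, punctuation, and combining marks
        if unicodedata.name(ch, "").startswith(FAMILY_FIRST_PREFIXES):
            return True
    return False


for sample in ["李範文", "ธงชัย วินิจจะกูล", "Lê Cao Đài"]:
    print(sample, is_family_first_script(sample))
```

    The Chinese and Thai samples are flagged because their letters belong to a distinct script. Lê Cao Đài consists only of Latin letters with diacritics, so a check by script alone lets it through as an ordinary Western-order name.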
  • I overlooked Vietnamese. It has nice diacritics over all vowels, so a scan by character set should be able to pick it up pretty reliably. I'll take a look, and post here when something is running in the multilingual client for testing. If we can control this automatically (touch wood), we can keep it out of the UI.
  • edited January 27, 2011
    Hmm. A little investigation quickly disabused me of the idea that it will be easy to automate this, at least without some risk of slippage. To avoid driving people up the wall, we'll have to tie the language code on the field (available in the multilingual version) to name ordering, as Avram suggests.

    A question for Jon: would you like to pursue a heuristic for automation (in which case I'll have a bunch of questions for you about the conventions applicable to Vietnamese names), or is a requirement that the language of the author name be set manually on every Vietnamese name acceptable? It's your call.

    (I've gone ahead and implemented a heuristic in the multilingual branch. It's a hidden option, because of the small possibility that it will turn up false positives. If you run multilingual [after backing everything up], open the about:config page in Firefox, find the option "extensions.zotero.csl.autoVietnameseNames", and set it to "true". Vietnamese names should then come out correctly when rendered with CSL. I've also set things up so that names explicitly tagged as "vn" in the headline entry will be formatted with Vietnamese ordering, so you can do things that way as well.)
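    A heuristic of this general kind might look like the following Python sketch. This is not the code in the multilingual branch: the character ranges and the function name are assumptions chosen for illustration. It flags a name as Vietnamese if it contains a letter from the Vietnamese portion of the Latin Extended Additional block (U+1EA0 to U+1EF9) or one of the letters Đ, Ơ, Ư.

```python
import unicodedata

# Assumed set of "distinctive" letters: D with stroke, O with horn, U with horn.
DISTINCTIVE = "\u0110\u0111\u01a0\u01a1\u01af\u01b0"


def looks_vietnamese(name: str) -> bool:
    """Guess whether a Latin-script name is Vietnamese from its characters."""
    # Compose base letters and combining marks so precomposed ranges can be tested.
    text = unicodedata.normalize("NFC", name)
    for ch in text:
        if 0x1EA0 <= ord(ch) <= 0x1EF9:  # Vietnamese precomposed letters
            return True
        if ch in DISTINCTIVE:
            return True
    return False


for sample in ["Lê Cao Đài", "Nguyễn Văn Thiệu", "Khin Sok", "Novak Đoković"]:
    print(sample, looks_vietnamese(sample))
```

    The last sample shows the kind of false positive mentioned above: Đ is also used in the Serbo-Croatian Latin alphabet, so a purely character-based test will occasionally misfire. A Vietnamese name that happens to contain none of these letters would also be missed, which is why an explicit language tag on the name remains the more reliable route.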
  • edited January 28, 2011
    Quick reply: we should keep an eye out for other languages for which the bibliographic convention is to romanize the text. For instance, the Thai author Thongchai Winichakul should be alphabetized under Thongchai (his family, and first, name). Similarly, a Khmer (Cambodian) author would be romanized and sorted by first (family) name, e.g., Khin Sok.

    (Edited) There's an exhaustive treatment of the subject of alphabetization for Vietnamese authors available here.

    A quick question for fbennett: you mention that this is moving ahead in multilingual Zotero. I've searched a bit and come up with this thread, which includes a link to download an xpi installer for multilingual Zotero. So is that the route we'd take to be able to use this?
  • If you use the multilingual Zotero, you can specify languages for both headline entries and romanizations (transliterations of all sorts). Citeproc-js should be able to take advantage of this and order names correctly. I do encourage you to take the multilingual build for a spin-- I'm using it full-time these days, and it is very pleasant to use for someone who uses a lot of languages and scripts.
  • edited January 28, 2011
    Jon,

    The multilingual branch will probably do most of what you need, with the hidden option mentioned above turned on. If you install it over a copy of your Zotero database, it will perform an upgrade, and the mainstream beta versions will no longer run out of the box. The database upgrade is non-destructive, though; we just add a few tables used to store and manage multilingual data.

    If you're interested in running the multilingual client, your best starting point would be the multilingual thread on the zotero-dev list:

    http://groups.google.com/group/zotero-dev/browse_thread/thread/c40b71277c8b6963

    The first post in that thread provides a link to the multilingual client xpi.

    On other issues ...

    Thai (and Laotian) names should come out in the correct order, family name first, regardless of the order specified in the citation style. Western name ordering is only applied to names written in Roman, Cyrillic, or Greek script. For sorting, we're pretty much at the mercy of the Firefox JavaScript engine, but it seems to do well with most scripts. For languages that don't sort in their normal written form (Cambodian, Chinese, Japanese), you can add transliterations and specify them as the controlling form for sorting. You can also render transliterations, append translations to titles, or append the proper written form of transliterated author names. There's not much that it won't handle.
  • I'm not sure if this is the right thread (if not, I'd appreciate being pointed toward the right one), but I recently posted elsewhere about a problem I'm having with sorting names of Arabic origin that actually should be sorted like Western names but are not. (https://forums.zotero.org/discussion/30974/any-idea-why-an-a-author-comes-last-in-the-bibliography/#Comment_161709). I'm using my own slightly modified Chicago author-date style.

    I have several author last names starting with Abu- in my bibliography--Abu-Lughod, Abu-Zahra--and these are sorted into L and Z respectively, although the Abu- is left at the beginning of the line, so it just looks like an out-of-place "A" name.

    I've never seen such names sorted this way in any bibliography and I don't think any reader would see (Abu-Zahra 1970) and then go looking for it under the Z section in the bibliography. So this seems to be a bug.

    Thanks.
  • I hope it is okay to resurrect threads when (imho) relevant to the previous discussion.

    When I export a Japanese bibliography to APA format the order of the names is not the same as that shown in the Zotero interface (when ordered by the name column), nor alphabetical, nor any other order I can recognise. I wonder what the order is.

    I think that Japanese bibliographies are usually given in Japanese alphabet (a i u e o, ka, ki, ku, ke, ko) order but I would not know how to put them in that order unless there were a separate field for name reading and a way of ordering those readings according to the Japanese alphabet.
  • For non-alphabetic scripts (Chinese, Japanese), the only way to impose a phonetic sort order is to use a tool designed for that purpose, such as Juris-M. https://juris-m.github.io/
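    To illustrate why a separate reading is needed for a phonetic order, here is a small Python sketch. It is not how Zotero or Juris-M sorts; the `reading` field and the sample names are assumptions for illustration. It compares sorting by the written (kanji) form with sorting by a hiragana reading.

```python
from typing import NamedTuple


class Author(NamedTuple):
    written: str  # name as written in kanji
    reading: str  # hypothetical extra field: the reading in hiragana


authors = [
    Author("山田", "やまだ"),
    Author("青木", "あおき"),
    Author("佐藤", "さとう"),
    Author("加藤", "かとう"),
]

# Sorting on the kanji compares code points, which carry no phonetic information.
by_written = sorted(authors, key=lambda a: a.written)

# Sorting on the reading follows kana order (a, ka, sa, ..., ya) for these samples.
by_reading = sorted(authors, key=lambda a: a.reading)

print([a.written for a in by_written])
print([a.written for a in by_reading])
```

    Plain code-point comparison of hiragana only approximates dictionary order (voiced and small kana are placed differently from common practice), so a real tool would use a proper collation on top of the reading.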
  • Thank you. It is about time I take the plunge.
  • If you want to make two leaps in one go, I think I might be putting up a beta version of Juris-M 5.0 on Monday or Tuesday.
  • I ran into this very issue, and I’m not completely sure I understand the solution proposed here.

    Our use case is Chinese names written in the Latin alphabet, e.g. "Li Fanwen". The alphabet-checking heuristic in citeproc-js therefore fails: the name uses Latin characters, but should be treated as non-Western with regard to name-part order.

    I understand that there is no way to express this in the UI. But are there hidden switches to achieve this result? Can this be solved on the CSL level? I guess it would require the ability to specify either language or name-part order on a per-name basis in the input JSON file, but I could not find any such possibility.
  • edited February 18, 2020
    For Jurism, the processor recognizes an extended version of the CSL specification (CSL-M) that allows name ordering to be controlled on a per-locale basis, via attributes set in the style (name-as-sort-order and name-never-short). The settings won't work well with Zotero because, apart from triggering a warning that the style is not valid CSL, they rely on a locale setting on the name field itself. For this, you would specify the locale of the name as a transliteration of Chinese, using something like "zh-alalc97", and in the style list "zh" as one of the values of name-as-sort-order and name-never-short.
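    The mechanism described here can be pictured with a small Python sketch. It illustrates the idea only and is not CSL-M or citeproc-js code: the function names and the way the locale is passed alongside the name are assumptions. The name carries a locale tag, and the style supplies a list of languages whose names are rendered family name first.

```python
from typing import Iterable, Optional


def primary_language(locale: str) -> str:
    """Return the primary subtag of a locale tag, e.g. 'zh' for 'zh-alalc97'."""
    return locale.split("-", 1)[0].lower()


def render_name(
    family: str,
    given: str,
    locale: Optional[str] = None,
    family_first_langs: Iterable[str] = (),
) -> str:
    """Render family-first when the name's language is listed by the style."""
    if locale and primary_language(locale) in set(family_first_langs):
        return f"{family} {given}"
    return f"{given} {family}"


# Name tagged as a romanization of Chinese; the style lists "zh".
print(render_name("Li", "Fanwen", locale="zh-alalc97", family_first_langs=["zh"]))
# Same name without a locale tag falls back to Western order.
print(render_name("Li", "Fanwen"))
```

    Without a locale on the name there is nothing for the style setting to match against, which is why this approach depends on per-name language data.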
  • @fbennett, thank you very much for your response! What you describe is exactly what we are looking for. However, we depend on a standards-conforming approach: we are using pandoc-citeproc in our pipeline, so we could not easily switch to Juris-M.

    Do you know if there are initiatives to incorporate such capabilities into the CSL standard? Would it make sense to raise the issue on the xbiblio-devel list? If this were part of the CSL spec, we could more easily push for inclusion in pandoc-citeproc.
  • edited February 19, 2020
    There is no harm in raising it, but CSL-M/Jurism has been around for a while, and there probably won't be any movement in the short term. There is an initiative for a citeproc-js replacement (citeproc-rs, written in Rust), initially targeting official Zotero, but with a declared objective of ultimately supporting CSL-M as well. The developer used a pandoc bridge for some of the proof-of-concept testing, so the finished processor could eventually bring CSL-M to pandoc. It's a large task and a long road, though, so not an immediate solution. I don't know any Haskell, so I wouldn't know how much work would be needed (or whether it would be possible) to somehow hack in citeproc-js as a substitute pandoc citation processor as a shorter-term fix, but that might be an alternative.

    No immediate help there, unfortunately, but that's what I know about the state of play.
  • (A less elegant route would be to just kludge the Chinese names as single-field values, of course. Not sure whether that would create any other problems.)
  • Thank you very much! That is indeed helpful information. citeproc-rs looks very promising, and the pandoc bridge could make it a drop-in replacement for our use case.

    We are currently using workarounds, but a proper solution would be very much welcome. I’ll keep an eye on citeproc-rs.
  • "(A less elegant route would be to just kludge the Chinese names as single-field values, of course. Not sure whether that would create any other problems.)"
    Yes, that’s what we’re currently doing. But we’re using "family", because then we can use "suffix" for the original name in Chinese script. Very hackish, and it also creates some problems, but it allows us to achieve 90% of the output we want.
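    For readers following along, the workarounds discussed in the last few posts could look like this as CSL-JSON name objects, built here in Python. The item id, the item type, and the Chinese-script value are illustrative assumptions; `family`, `given`, `suffix`, and `literal` are standard CSL-JSON name fields.

```python
import json

# Regular structured name: a Western-order style would render "Fanwen Li".
structured = {"family": "Li", "given": "Fanwen"}

# Single-field kludge: the whole name is one literal string.
literal_kludge = {"literal": "Li Fanwen"}

# Variant from the last post: whole romanized name in "family",
# original-script form (illustrative value) in "suffix".
family_suffix_kludge = {"family": "Li Fanwen", "suffix": "李範文"}

item = {
    "id": "li-example",  # hypothetical citation key
    "type": "book",
    "author": [family_suffix_kludge],
}

# ensure_ascii=False keeps the Chinese characters readable in the output.
print(json.dumps(item, ensure_ascii=False, indent=2))
```

    Because the full romanized name sits in a single field, the processor has no name parts to reorder. How the suffix is punctuated and placed still depends on the style, which is presumably one source of the remaining problems.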