Chinese creator names - sort logic in Zotero library

I have a lot of Chinese creators (written in Chinese characters, not pinyin) in my Zotero library.

When I sort by the "Creator" field in Zotero, I get all the Romanized last names in alphabetical order, then the Chinese creators. I can't figure out how this latter group is sorted. As far as I can tell it's not by the pinyin pronunciation of their last names (for example, I see 韓 (Han) entries coming after 王 (Wang) when sorted in ascending order), nor is it by number of strokes. Can anyone enlighten me as to what order these creators are presented in?

I know I can just do a search for the author, but my preferred method is to scroll.

  • edited July 26, 2023
    Does anyone have any insight? Since I started this thread, my library has tripled in size, and so has the number of Chinese references in it, so I'd really like to know the answer. (Unfortunately, it also has not magically come to me during the course of using Zotero...)
  • edited July 26, 2023
    They'll sort correctly if you use one of the Chinese locales, but not in other locales. We're going to see what we can do to get Pinyin sorting in other locales.
  • Sorry, what do you mean by "Chinese locale"?
  • If you set Zotero's language to zh-CN or zh-TW in the preferences. That will change the interface language to Chinese, so presumably not what you want — I'm just noting that it will sort Chinese characters correctly in that case.
  • edited July 27, 2023
    Hello @EmmaWK,

    First, could you describe what you use the author-name ordering for? My understanding is that the display order is just for display. But I can see how it can be useful for certain workflows.

    Second, practically, what would your desired sorting order be? This will help me think of work-arounds.

    For example, what would be the desired ordering of 韓 (hán), 王 (wáng), and Jin, when put in ascending order?

    - 韓 -> 王 -> Jin: This is the order if you set the "language" option, in Zotero Settings, to 中文(简体) [Chinese (Simplified)]. It sorts the Han characters by Pinyin, and all Roman characters follow Han.

    - 王 -> 韓 -> Jin: This is the ordering under language setting 正體中文(繁體) [Chinese (traditional)]. It sorts the Han characters by stroke, and again, Han precedes Roman.

    - Jin -> 王 -> 韓: This is under language setting English. Here, Roman precedes Han, and Han characters are sorted by the numeric order of their code-point (more on this later).

    No matter which language setting you choose, there will be other major effects: all text in the app's user interface will be turned into that language.

    So, if you need to retain the app UI language (say, English), that will limit your options for getting a particular order for the Han characters.

    The "default" or code-point ordering of Han characters (such as the one used under the "English" language setting) is not arbitrary for the majority of the highest-frequency characters: it follows the traditional radical-stroke ordering. First, sort by radical, as defined in the 18th-century Kangxi Dictionary. Then, under each radical, sort by the number of strokes.
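A minimal sketch of the code-point comparison described above, written in Python purely for illustration (Zotero itself does not run this code). Python's default string comparison is by code point, so it only approximates the "English" ordering for these three sample names; the real collation rules may differ for other characters.

```python
# Sample names taken from the example in this thread.
names = ["韓", "王", "Jin"]

# Print the Unicode code point of each name's first character.
for name in names:
    print(name, f"U+{ord(name[0]):04X}")

# Default string sorting compares code points, so Latin letters
# (low code points) come first, followed by the Han characters.
by_code_point = sorted(names)
print(by_code_point)
```

Because 王 has a lower code point than 韓, it sorts first here, matching the Jin -> 王 -> 韓 order reported for the English setting.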
  • BTW, this is not a Zotero thing per se. The behaviour arises from the underlying definition of locale-dependent sorting order as created by the International Components for Unicode project. The defaults may be fairly reasonable for mostly monolingual settings, but of course they may not always fit a particular usage pattern.

    In theory, the same underlying mechanism of handling Unicode characters also supports fine-tuning the sorting order, but currently these are not exposed as user settings in Zotero. I can't say for sure, but I guess with more feedback, there will eventually be improvements in the direction of language/locale support.
  • edited July 27, 2023
    I am using the English UI of Zotero with some items in my library being in Chinese.
    I would like to use the sorting by creator to identify all publications from the same first author. Therefore, the Pinyin ordering within Roman characters would be useful for me:
    韓 -> Jin -> 王
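The ordering above could be sketched as a sort key that romanizes the leading Han character before comparing. This is only an illustration under stated assumptions: `SURNAME_PINYIN` and `pinyin_sort_key` are hypothetical names, the lookup table is hand-written for the three surnames mentioned in this thread, and nothing here reflects how Zotero is implemented.

```python
# Hypothetical, hand-written lookup table (illustration only).
# A real implementation would need a complete pinyin dictionary.
SURNAME_PINYIN = {"韓": "han", "王": "wang", "李": "li"}


def pinyin_sort_key(name: str) -> tuple[str, str]:
    """Return a key that sorts a creator by the pinyin of its first character."""
    # Fall back to the name itself when the first character is not in the table.
    romanized = SURNAME_PINYIN.get(name[0], name)
    # The original name is kept as a tie-breaker for equal romanizations.
    return (romanized.casefold(), name)


names = ["王", "Jin", "韓"]
interleaved = sorted(names, key=pinyin_sort_key)
print(interleaved)
```

Names whose first character is missing from the table simply keep their code-point position, so an incomplete table degrades to the current behaviour rather than failing.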

    Most of the time, Chinese names are simply transliterated into English using their Pinyin. So that would be the most logical ordering to use under the English language setting, in my opinion. I don't see why "code-point" ordering would be useful for someone using an English UI (or even anyone at all?).

    I don't know if integrating the Chinese characters within the Roman characters could be an issue for some workflows. But with separate ordering of Chinese and Roman characters, I will always need to search separately in each language. This is also the problem with searching tools, where "Li" does not search for Chinese characters with this pinyin and therefore cannot find Creators containing "李".

    Since Zotero cannot handle multi-language metadata at the moment, I need to store two items for each publication in Chinese (one in English and one in Chinese), so that I can switch between them depending on the language I want to cite it in. [I don't know if there is a better way to do that at the moment...] The Pinyin ordering within Roman characters would show the two items together when ordering by Creator, which would be very useful. With the current ordering, I don't see any way to find out that the same publication is stored in both languages in my library.
    For example:
    李飞, 李雁淮, 徐可为, 宋忠孝, 张智, 崔红, & 周莉. (2014). 氧化锆空心球粉体制备及其涂层性能研究进展. 稀有金属材料与工程, 43(12), 3183–3187.
    Li, F., Li, Y., Xu, K., Song, Z., Zhang, Z., Cui, H., & Zhou, L. (2014). Research Progress of the Preparation of Zirconia Hollow Sphere Powder and the Performance of Its Coating. Rare Metal Materials and Engineering, 43(12), 3183–3187.
    http://www.rmme.ac.cn/rmme/ch/reader/view_abstract.aspx?file_no=20141263

    One of the nice things about the ordering is that you can quickly reach a creator by typing the first letters of their name. This is not possible for Chinese names. It would be nice to get this functionality working with the pinyin of the Chinese characters as well.
  • I'm afraid none of the collation order options, as specified by the JavaScript API, would support a purely Pinyin-based ordering for mixed language content. So any implementation has to be crafted specially for this purpose and is therefore unlikely to become part of the mainline Zotero.

    If there's a way to do what you describe (the ability to sort and search by Pinyin), it's likely going to be a plugin.
  • To clarify, better sorting of CJK items in non-CJK locales is certainly something we'd want in Zotero proper — we'll see if there's any reasonable way to do that.

    Search or find-as-you-type by Pinyin probably isn't in the cards from us any time soon, though we'd consider patches.
  • Thanks everyone for your input. I know it is not ideal, but I definitely want to retain the English interface, and now at least I know the sort order. Who knew I would be using my Kangxi dictionary again...!!
  • Sorry @ZoeCMa, neglected to answer your question about usage. I guess because I normally type in English, I found it cumbersome to have to switch to a Chinese input method to search. Most of my Chinese sources are sorted into various folders already, so if the last names in those folders can then be sorted in pinyin order (which is the most intuitive for me), then when I switch to those folders it will be easy to locate the author/entry.

    It would also be cool if typing "Wang" in the search box pulls up entries containing 王 *and* "Wang," but as far as I can tell this is not the case. I've tried toggling the search options in the dropdown menu. So basically, it appears that my preferences are similar to @mjthoraval. I understand that it is not a priority to have pinyin sort order right now with mixed language, so I will make do with Kangxi sort order.
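The search behaviour wished for above could be sketched roughly as follows. This is a hedged illustration only: `SURNAME_PINYIN` and `matches` are hypothetical names, the lookup table is hand-written for the surnames mentioned in this thread, and Zotero's search does not work this way today.

```python
# Hypothetical, hand-written lookup table (illustration only).
SURNAME_PINYIN = {"王": "wang", "李": "li", "韓": "han"}


def matches(query: str, creator: str) -> bool:
    """Return True if the query matches the creator directly or via pinyin."""
    q = query.casefold()
    # Direct, case-insensitive substring match on the stored name.
    if q in creator.casefold():
        return True
    # Replace each known Han character with its pinyin, then match again.
    romanized = "".join(SURNAME_PINYIN.get(ch, ch) for ch in creator)
    return q in romanized.casefold()


creators = ["王", "Wang", "李飞", "Jin"]
hits = [c for c in creators if matches("Wang", c)]
print(hits)
```

A plain substring match like this would also return unrelated names that happen to contain the query, so a real implementation would likely need word-boundary or prefix matching.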