Inconsistent sorting with Chinese names

I have a lot of Chinese/Taiwanese authors in my database, and I've noticed that the sorting is not always consistent when using Pinyin names. I'm not sure why. The photo shows an example where, when sorted alphabetically, I get some Chens and then some Chengs and then some more Chens. Even single Chen authors are coming AFTER Chen and X studies.

Perhaps I'm too used to APA style and expect it in a certain order, but sorting it this way does make it a bit difficult to find what I'm looking for when scrolling. Why is Zotero sorting like this?


  • Nothing?

    This is also happening with the names Ho, Lo, and others. They aren't consistently sorted in alphabetical order, and I can't find a reason why they shouldn't be.
  • Could you export a triplet of the missorted entries as RDF, upload the file somewhere (like Dropbox), and post the link here?
  • Here are a few of the names that seem to have problems. Again, it's only Chinese names in Pinyin that seem to have problems. Other names are sorted correctly.

    So for the first file, Chan, you'll see that there's a Chan, then on to Chandler and Chang, but some single-author Changs, then Changs et al., then more single-author Changs, then eventually back to Chan. Clearly something is getting missorted, and I can't figure out what it might be.

    This happens with the four files in the folder as well as many more throughout my collection.

  • This should be fixed in the latest 5.0 Beta, and the fix will be included in 5.0.26.

    (When you tell the JavaScript sorting mechanism to ignore punctuation, it apparently ignores whitespace too, so "Chang" sorts before "Chan H" (because "g" comes before "h"). We've disabled punctuation-ignoring for now.)
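
    For anyone who wants to see the effect in isolation, here is a minimal sketch using the standard Intl.Collator API. The sort keys are made-up examples, and this is not necessarily how Zotero builds its collator; it only shows that the ignorePunctuation option also skips whitespace.

    ```typescript
    // Sketch only: the keys below are invented, and Zotero's real collator
    // setup may differ from this.
    const ignoring = new Intl.Collator("en", { ignorePunctuation: true });
    const strict = new Intl.Collator("en", { ignorePunctuation: false });

    // Hypothetical "family name + given name" sort keys.
    const keys: string[] = ["Chan H", "Chang", "Chan A"];

    // Whitespace is skipped here, so "Chan H" is compared as if it were "ChanH".
    const sortedIgnoring: string[] = keys.slice().sort(ignoring.compare);

    // Whitespace counts here, so every "Chan ..." key stays ahead of "Chang".
    const sortedStrict: string[] = keys.slice().sort(strict.compare);

    console.log(sortedIgnoring);
    console.log(sortedStrict);
    ```

    With the option on, "Chang" lands between the two Chans, which is the interleaving reported above. With it off, the Chans stay together.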
  • It's definitely better now (5.0.27), but still a bit inconsistent:

    Chan et al.
    Chan et al.
    Chan et al.

    So it seems now it sorts primarily by the name of the first author, which is fine in most cases, but still causes some confusion between single- and multiple-author documents like this.
  • Can you give an RDF of those items like you did last time?
  • Actually I still have these names, so no need for a new RDF. I'll take a look.
  • But @drberg, you're right about what's happening here. It's sorting by the first author, including the given name when the family name is the same, and ignoring subsequent authors as long as the first author isn't entirely the same.

    This is by design, and the alternative isn't obviously better:

    Chan, Alice
    Chan, Victoria
    Chan, Michael + Smith, Bob + Taylor, Linda

    This would appear correct:

    Chan et al.

    But the actual sorting (having Victoria Chan before Michael Chan) wouldn't really make sense if you knew the author you were looking for.

    There's actually a hidden pref, extensions.zotero.sortCreatorAsString, to ignore everything but what's in the creator field itself. It can be toggled from the Config Editor in the Advanced pane of the preferences.
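
    To make the difference concrete, here is a rough sketch of the two orderings. The Creator type and both functions are hypothetical illustrations, not Zotero's actual code, and the display-string rule ("X", "X and Y", "X et al.") is an assumption about what the creator column shows.

    ```typescript
    // Hypothetical types and helpers for illustration; not Zotero's code.
    interface Creator { family: string; given: string; }

    const collator = new Intl.Collator("en");

    // Compare author by author: family name first, then given name.
    // Later authors only matter when all earlier ones are identical.
    function compareCreators(a: Creator[], b: Creator[]): number {
      const n = Math.min(a.length, b.length);
      for (let i = 0; i < n; i++) {
        const byFamily = collator.compare(a[i].family, b[i].family);
        if (byFamily !== 0) return byFamily;
        const byGiven = collator.compare(a[i].given, b[i].given);
        if (byGiven !== 0) return byGiven;
      }
      // Same leading authors: the shorter list comes first.
      return a.length - b.length;
    }

    // Assumed display form of the creator column.
    function displayString(c: Creator[]): string {
      if (c.length === 1) return c[0].family;
      if (c.length === 2) return `${c[0].family} and ${c[1].family}`;
      return `${c[0].family} et al.`;
    }

    const items: Creator[][] = [
      [{ family: "Chan", given: "Alice" }],
      [{ family: "Chan", given: "Victoria" }],
      [
        { family: "Chan", given: "Michael" },
        { family: "Smith", given: "Bob" },
        { family: "Taylor", given: "Linda" },
      ],
    ];

    // Default-style ordering: first author, including the given name.
    const byFirstAuthor = items.slice().sort(compareCreators).map(c => c[0].given);

    // String-only ordering: compare just what the column displays.
    const byDisplay = items
      .slice()
      .sort((a, b) => collator.compare(displayString(a), displayString(b)))
      .map(c => c[0].given);

    console.log(byFirstAuthor);
    console.log(byDisplay);
    ```

    The first ordering puts Michael Chan's multi-author item between Alice and Victoria. The second groups the two plain "Chan" rows ahead of "Chan et al.", at the cost of placing Victoria before Michael.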
  • The new/current Zotero behavior is consistent with most citation style sorting, so I think it makes sense.
  • Fair enough. Your explanation makes sense, but for some reason my thinking is that the sorting should organize the single authors before the multiple authors. Anyway, it's much better after having removed whitespace; I suspect within a week or two I'll be used to looking at it the way you do.

  • An unfortunate consequence of this change is that items with titles beginning with punctuation now sort at the top of the list, rather than based on the first letters (items beginning with numerals are also at the top, after punctuation, but I'm not sure if that's a change).

    Could the punctuation-ignoring be only disabled for the Creator field sorting?
  • This behavior is not new, at least for some punctuation marks, such as the colon. I like punctuation-aware ordering--please keep it. I've been relying on it for forced sorting of notes and collections.
  • edited November 25, 2017
    No, this is a regression. When we were ignoring punctuation for sorting by default, we were explicitly sorting on initial punctuation, except for a few characters. Those characters are now being ignored as well (causing them to sort at the top), so that will need to be fixed. Issue created.

    Initial colons (and underscores) aren't affected either way.