Sort non-numeric dates last

edited August 1, 2018
I've been trying to figure this out for hours, and I'm lost.

See also this discussion and my (rather long) post just now:
https://forums.zotero.org/discussion/882/other-options-for-date-in-press-etc/p3

When I write "forthcoming" (etc.) in the date field, the result is as desired, except for sorting. In the bibliography, it seems that it will default to "0000" for sorting purposes, so that "Smith forthcoming" is always placed before "Smith 1999" and "Smith 2018".

I've tried a number of ways to work around this in the CSL style:
--Checking if the date is numeric (it's not, and there's no other way to test)
--Prefixing it with text (or similar) so that sort should treat it as alphabetic rather than as a date. Zotero seems to ignore this and, for sorting purposes only, insists on treating it as the original date, no matter how many macro layers it is embedded in.
--There seems to be no way to check what the value is, or what type of value it is, within CSL.

The answer seems to be that Zotero's programming makes a bad assumption: there's no advantage that I can think of to considering it "0000" instead of "9999", and clearly an advantage in sorting these items last. But getting an update through Zotero might take a long time, and I'm not sure what will happen to dates with coming updates anyway (see linked discussion above).
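
To make the "0000" versus "9999" point concrete, here is a minimal Python sketch. It is not Zotero or citeproc-js code; the `sort_year` and `ordered` helpers and the sample entries are hypothetical, and the only assumption is that an unparsed date falls back to some fixed year for sorting.

```python
import re

def sort_year(date_field, fallback):
    """Return the first four-digit year in the field, else the fallback year."""
    match = re.search(r"\b([0-9]{4})\b", date_field)
    return int(match.group(1)) if match else fallback

# Hypothetical entries: (author, contents of the date field)
entries = [("Smith", "2018"), ("Smith", "forthcoming"), ("Smith", "1999")]

def ordered(fallback):
    """Sort by author, then by the (possibly fallback) year; return the date fields."""
    ranked = sorted(entries, key=lambda e: (e[0], sort_year(e[1], fallback)))
    return [date for _, date in ranked]

print(ordered(0))     # fallback 0000: "forthcoming" comes before 1999
print(ordered(9999))  # fallback 9999: "forthcoming" comes after 2018
```

Under that assumption, the choice of fallback value alone decides whether "Smith forthcoming" lands before or after the dated items.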

So is there anything at all that can be done within CSL?

Sorting non-numeric dates last (ascending) is obviously the correct thing to do.

I realize there are other ways I could attempt to deal with "forthcoming" etc. (see my linked comment above for why I don't want to do that), but in short this is the most straightforward way, and would be ideal if it just sorted correctly.

In the big picture, this isn't a huge problem, but it's frustrating because it's logically so simple and easy to fix, but Zotero actually seems to block access to this in the way that it forces a numeric interpretation of dates for sort.

--
P.S.: There actually is an argument for treating "n.d." as 0000, i.e., sorting it first. If a manuscript is old enough to have an unknown date, it's probably older than other dated items. That's quite different from a "forthcoming" paper. And if "n.d." is simply when there is no date added, that seems easy to accommodate. But if these must be treated the same way, then I think it's much less wrong to put "n.d." at the end (like "forthcoming") than to put "forthcoming" at the beginning.
  • So is there anything at all that can be done within CSL?
    Not without using status, no.
  • Thanks.

    Can this be considered a bug then? What possible reason is there for "forthcoming" (etc.) to be sorted first?
  • Definitely not a bug, no. Standard alphabetical sorting sorts numbers before letters, so this is behaving according to specifications.

    I understand there is a real issue there, but that's the question of how to properly address forthcoming et al. Reversing sort order in a non-standard way because of how you want to see it implemented puts the cart before the horse.
  • edited August 1, 2018
    What? No. It's sorting letters before numbers. That's not standard (compare ASCII, etc.).

    So by your definition, it would be a bug. And by mine too.

    (Note: I don't think this is a sorting issue, but a bug with how the dates are encoded in Zotero as "0000" if words are inserted. It's correctly sorting "0000" first, but incorrectly forcing "forthcoming" as "0000" for sorting purposes.)
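
As a quick check of the ASCII comparison mentioned above, here is a small Python sketch. It shows plain code-point ordering only, which is an assumption for illustration and not a description of what citeproc-js does internally; the sample values are made up.

```python
# Digits have lower code points than letters in ASCII/Unicode.
print(ord("9"), ord("A"), ord("a"))

# Hypothetical date-field values, sorted as plain strings.
values = ["forthcoming", "2018", "0000", "1999", "n.d."]
by_code_point = sorted(values)
print(by_code_point)
```

With a plain string sort, the numeric strings come first and the words come last, so a literal "forthcoming" only sorts first if something has already replaced it with a value like "0000".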
  • I responded in your other thread (could you try to keep discussion of the same issue in the same thread?). In general, don’t enter text instead of an actual date in the date field.
  • Thanks.

    I consider these relevantly different questions, but they blur together because of Zotero's limited date handling. Essentially they're all the same question because the answer is "no" :)
  • Right, sorry. The issue is the date parser. I don't think its behavior is ideal, but unexpected treatment of unexpected/"illegal" content is hardly a bug. You can make the case for literal passthrough, and that's certainly not an unreasonable one, but that's very much in the realm of feature requests.
  • edited August 1, 2018
    Thanks for clarifying what is going on.

    Hm. You're not wrong.

    But given that Zotero isn't perfect (no software is) I'd rather see flexibility left open. It's not standard use, so I'd assume 'use at my own risk', but if it's easy (e.g., sorting after, not before), that would be nice, versus 'punishing' my 'illegal' usage because the software doesn't have a legitimate solution. Note that any other approach here is also a hack of some sort, like using the extra field. For a lot of this, I feel like perfection is getting in the way of workable. It's interesting looking back on threads ~10 years old and seeing problems that haven't been solved and could have been solved in easy ways, but instead we're waiting on the perfect solutions.
  • Technical debt introduced by quick-stop solutions is a very real issue. There's a reason to strive to do things "right" the first time around and not have to spend 10 times as much time fixing them in a backwards compatible way later on. The whole reason we have to deal with the Extra field workaround for new fields, for example, is the inability to add new fields introduced by technical debt in the sync infrastructure that took something like 5 years to address.

    There's also no punishing going on, i.e. Zotero isn't doing anything on purpose. It just doesn't have code in place to gracefully handle dates it can't parse, so you're seeing some default fallback.
  • edited August 1, 2018
    My intention is not at all to complain about this wonderful free software. However, if we're getting into the philosophy of it:

    Is it really reasonable to wait 10 years for something that should be basic functionality, when it could be implemented easily, in a decent but not ideal way?

    In principle, I agree with you, but there's a tradeoff with current usability.

    The help I've gotten on the forum has been great though.
  • The Year column would be a Zotero issue, but if this is about bibliographies, this is a question of citeproc-js behavior, not Zotero behavior. And citeproc-js does support literal passthrough for dates, which Zotero uses for dates without parsed years. So then the question of sorting those in bibliographies would be for @fbennett.
  • edited August 2, 2018
    "The Year column would be a Zotero issue, but if this is about bibliographies, this is a question of citeproc-js behavior, not Zotero behavior."
    I don't understand precisely what that difference would be, and since I can't see where the date is being treated in these different ways (displayed as literal text, but sorted as "0000") I'm not sure where that difference originates anyway.
    Thank you for trying to narrow it down!

    Note: this discussion got split across multiple threads (my fault), and there's some more info here: https://forums.zotero.org/discussion/22616/capitalization-differences-for-non-year-dates
  • "I don't understand precisely what that difference would be"
    citeproc-js isn't developed by Zotero developers, so I was CCing the person who would be able to answer your question.
  • Thanks. Sorry, I meant that I wasn't sure exactly when this happens (whether it's internal to Zotero or not). I haven't actually SEEN it happening. I just know that it NEVER changes between Zotero and eventually being sorted at the top, and I never can actually access that information (e.g., to print it out in the bibliography, or via Word's data field JSON info).

    Why do you think this is in citeproc-js? (I'm learning on the fly here, and probably missing something.)
  • (Sorry, partly my fault, I did this from memory). It's citeproc-js because that's what handles the actual sorting of the bibliography, based on the data passed on by Zotero -- the data is what you saw in the field code, i.e. includes the literal date.
  • edited August 4, 2018
    Thanks. Interesting! That explains then why the output of the date macro doesn't match how it's actually sorted. I thought something was going on behind the scenes.

    I wonder how hard it would be to change/fix this in citeproc-js. I don't think anyone would mind if literal dates were sorted as text, after the numbers.
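
For illustration only, the idea of sorting literal dates as text, after the numbers, could look like the Python sketch below. This is not how citeproc-js is written; `date_sort_key` is a hypothetical helper, and it assumes the date field holds either a bare year or free text.

```python
import re

def date_sort_key(date_field):
    """Numeric years sort first by value; anything else sorts afterwards as text."""
    text = date_field.strip()
    if re.fullmatch(r"[0-9]+", text):
        return (0, int(text), "")   # group 0: real years, in numeric order
    return (1, 0, text.lower())     # group 1: literal dates, in alphabetical order

# Hypothetical date-field values
dates = ["forthcoming", "2018", "in press", "1999", "n.d."]
sorted_dates = sorted(dates, key=date_sort_key)
print(sorted_dates)
```

The tuple key keeps the two groups separate, so no sentinel year such as "0000" or "9999" is needed, and a descending sort could treat the groups independently if that were wanted.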

    I don't see that file anywhere. Is that something I can change myself, or would it need to be part of a revision to the software itself?
  • It's part of Zotero and you can change it, but I believe you'd need to rebuild the app after making any changes. It's also rather massive & complex at 17k lines of code (https://github.com/Juris-M/citeproc-js/blob/master/citeproc.js) so it's likely not a quick & easy fix for someone not already familiar with the code.
  • OK, too much for me to take on at this time then. If someone does feel kind enough to fix this for future releases, it would be nice, and I see no downside. But I understand if not.
    Thank you for walking me through everything!
  • @djross3 The respondents to your post are simply saying that you should "hold your horses" for a day or two. Frank Bennett (who designed the CSL processor that allows Zotero styles to do their magic) is a regular reader of this forum. He is responsible for making key changes to the processor. He is very responsive to requests for changes when the change is reasonable and possible.

    See: https://en.m.wikipedia.org/wiki/CiteProc
  • @DWL-SDCA thanks! No, I didn't catch that subtext, sorry. Happy to wait for this!!
  • Following up on some of this, I've identified some bizarre behavior for sorting from Zotero. I discuss it in more detail here in response to an in depth search through the entries (even into the database data) to find (and correct/mark) my literal date entries:
    https://forums.zotero.org/discussion/comment/314188#Comment_314188

    The relevant summary that I hope alerts the developers to inconsistencies is as follows:
    A date of "forthcoming" will be displayed as "0000" in the Zotero middle pane, but sorted last, after all actual dates (e.g., 1999, then 2000, then 2018, then '0000'). Yet when actually used in a bibliography (either in the style preview window within Zotero or in a document elsewhere), the literal text "forthcoming" will be displayed, but it will be sorted first.