Better Date Field

  • edited November 7, 2009
    As Dan says, ordinary seasons are fully structured; they are passed as integers 1-4. Season-like phrases such as "Michaelmas" can be passed as strings through the same season element. In that case they are handled in the same way as proper seasons, but cannot be localized. Examples are given in the interface description, linked above. The latter case ("Trinity", "Michaelmas", "Summer Solstice") is indeed a compromise of the kind you suggest, and similar to the "other" date element recognized by the existing processor.

    In rendering, a season is handled in much the same way as is done in the current Zotero implementation. It will appear on output only if no month and no day are specified. It replaces the month, and acquires the same formatting characteristics. It would be possible to allow seasons to be formatted independently, but it seems reasonable to wait and see whether that's necessary.

    Seasons should probably not appear in CSL 1.0 localized dates of form "numeric". I don't think I've considered that wrinkle; it should probably be added to the test suite.
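
    As a rough illustration of the season rules described above (not the processor's actual code), a minimal Python sketch might look like this. The function name, the English term table, and the output layout are all assumptions made for the example.

```python
from typing import Optional, Union

# Hypothetical English locale terms for the four structured seasons.
SEASON_TERMS = {1: "Spring", 2: "Summer", 3: "Autumn", 4: "Winter"}


def render_date(year: int,
                month: Optional[str] = None,
                day: Optional[int] = None,
                season: Union[int, str, None] = None) -> str:
    """Render a date, letting the season stand in for a missing month."""
    if month is None and day is None and season is not None:
        if isinstance(season, int):
            # Integers 1-4 are looked up, so they can be localized.
            label = SEASON_TERMS.get(season, str(season))
        else:
            # Strings such as "Michaelmas" are passed through verbatim.
            label = season
        return f"{label} {year}"
    if month is None:
        return str(year)
    if day is None:
        return f"{month} {year}"
    return f"{month} {day}, {year}"
```

    Here the season is ignored as soon as a month or day is present, and a string season is never translated.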
  • MG6
    edited August 12, 2010
    I have what appears to be a related concern. I work with very old items, which quite often cannot be precisely dated. It happens most of all with manuscripts, but very frequently with printed material, too. I'm eager for date ranges, but using "c." or "ca." for "circa" poses a special problem. If I enter something like "c. 1600's" I get this sort of result:

    Anonymous. “Title text,” c.'s 1600.

    Notice it's flipped. "c.1726-1732" yields "c.-1732 1726." For the former, mousing over the date field, I see "1600-00-00," which is itself an inaccurate representation of my intent, being far too precise. Obviously, I can't just fudge it.

    I'm just not sure how to handle this, and would like to ask that it be considered as a legitimate and serious need. Thanks.
  • MG6,

    We hear you. In fact, the handling of fuzzy dates with ranges is currently ticketed as an issue for the citeproc-js CSL processor, put there by a library technologist with exactly the same concern.

    What would be very helpful is a list of examples that ring the changes of how fuzzy date markers and ranges should play together in the output. Once all of the possibilities are in front of us, it will be easier to work back and figure out how to set things up at the start of the processing chain.
  • MG6
    edited August 12, 2010
    Thank you, that's good news.

    Another problem I can think of is the "Old Style" or Julian calendar, in which the new year starts in March. In the case of England prior to 1750, the year in dates between January and March was, and still is, often represented like this: "Feb. 16, 1615/16" (with 1616 being the "new" reckoning). Right now, I'm entering them like this: "Feb. 16, 1616 [1615/16]," but that is reduced to "February 16, 1616," which can confuse readers as to which calendar you're working with, and mess up a chronology.

    As well as dashes in ranges, I see greater/less than signs ("before 1616" or "<1616"), and uncertain dates may contain brackets and/or question marks: "[1616?]"

    I'd imagine that classicists and historians of the ancient world might have concerns about dates BCE, when the numbers get smaller going forward.
  • May I suggest that if you take the plunge, the exceptions will start to roll in of their own accord? It might be impossible to predict everything people are going to want?
  • but the problem is that you don't want to adjust things too often - and depending on what you do you might even lock things in to some degree. So you want to have a system where you can accommodate 99% of all people 99% of the time - and then you can tell the remaining 1% of people or the ones hitting 1% of cases: "tough luck this can't be done systematically - edit manually at the end".
  • edited August 12, 2010
    Alternative calendars are off the development map, I'm afraid, at least for the time being. When I work with pre- and early-Meiji Restoration materials from Japan, dates in the sources are in the lunar calendar, with intercalary months and only a loose correspondence to Gregorian dates. It's not really feasible to code for all the different output possibilities without the user interface becoming impossibly cluttered. The best alternative for the present is some sort of literal passthrough mechanism, which we can build in fairly easily.

    Re fuzzy dates and ranges, the reason for wanting a roster of possible forms before doing further work on it is that the citation processor input format, and therefore Zotero's internal machinery, will need to be altered to accommodate them. Making modifications that penetrate that deep is costly in time, so we'd want to get things pretty well covered with a single step (edit: as adamsmith says).
  • Regarding the various ways of formatting plain uncertain dates, CSL 1.0 will have is-uncertain-date as a test attribute, so you'll be able to do pretty much anything with affixes and whatnot. The only unsolved problem involving fuzzy dates appears to be their interaction with ranges.
  • @fbennett: Some kind of pass-through seems more than good enough to me. The biggest downside would be the inability to sort them as dates within Zotero, correct? From the perspective of people writing, that's a small price to pay for correct citation, which is like the gateway drug to using bibliography management software in the first place. WYSIWYG formatting allowed Word to conquer the world.
  • I've always admired how the "creator" field works, with the toggle for one- or two-word names. Is there a way to toggle the date format as either numeric dates or free text?
  • You might be able to get passthrough out of the current Zotero by enclosing the date field in quotes. Not sure.
  • "You might be able to get passthrough out of the current Zotero by enclosing the date field in quotes. Not sure."
    No, you don't at the moment.
  • edited August 13, 2010
    And there's something big to be said for a) standards, and b) not reinventing wheels.

    As with names, dates are critical in many cases for basic functionality, like sorting.

    Outputting dates "as is" is an outstanding ticket, so I don't think it's possible in Zotero 2.0.
  • @bdarcus: Please don't take this the wrong way, but I feel very strongly about the need for this accommodation. I understand the developer perspective, but this is not some arbitrary foible. If Zotero is meant to serve its users' needs rather than those of its makers, HTML standards cannot possibly be allowed to trump citation standards that have already existed for hundreds of years. Nor is the LOC the sole authority on dating in academia. It has, after all, existed only for 200 years, and bibliographers see their field as a science. History and literature are not sciences. They are fuzzy.
  • edited August 13, 2010
    I am a user too. I have no problem with the view that we need support for circa dates and date ranges; I subscribe to it. But defining a need should not be conflated with a particular solution to that need, particularly if it breaks critical functionality elsewhere (like sorting).

    As for age of organizations or traditions, that's irrelevant. What is relevant is that the LOC works with these kinds of data, and has identified these precise needs.
  • MG6
    edited August 13, 2010
    You are, of course, saying that we're doing our jobs wrong. Please don't turn Zotero into a revolution.
  • No, you're not doing your job wrong, and I don't think that Bruce is trying to say otherwise. Zotero is trying to store and meaningfully work with data in automated ways, an enterprise that can be greatly helped by using existing standards like the LOC one. Your job is probably different, at least in part because it isn't supposed to be automated and doesn't require that dates be unambiguous.

    The pass-through of some dates doesn't seem like too much of a sacrifice, however, since I'm afraid we won't see anything approaching support for the LOC standard in the near future.

    If Zotero were to support entry of the wide range of time expressions that the standards have described, how would input look? This is a serious question -- as I think about it, it seems to me that we would need to implement some sort of advanced date entry pop-up to handle such dates, so that users could explicitly choose calendars and whatnot. If that were done, then the field content (at the database level) could be in LOC date format, and we would just interact with it via a pop-up for complex dates or by entering the date string as it currently stands. The two modes could be alternated between a la one- and two-name input for authors.

    Continuing my advanced data entry pipe dreams, a similar type of pop-up would be appropriate for multiple-language titles, and for multi-part names with ordering rules, titles, and dropping particles. Some UI innovations like this will be necessary if we ever want to get more complex item data into Zotero. Frank has put months of work into perfecting citation styling for carefully constructed bibliographic data, but we don't yet have a way of entering such carefully constructed data.
  • edited August 13, 2010
    From the HTML5 doc linked by Bruce:
    "User agents could be instructed to ignore any unrecognised CALSCALE value, treating the contents of the element as plain text for data-processing (but not styling) purposes."
    It's safe to say that Zotero 2.1 is likely to support some form of literal passthrough, and that such a facility will be with us for some time to come, pending the invention of this particular wheel.
  • edited August 13, 2010
    Yeah, I didn't say you're "doing [y]our jobs wrong"; I meant exactly what I wrote. I meant that how you may do things in your discipline or area does not or should not necessarily translate directly into how it gets implemented in Zotero.

    If you're using footnote styles without bibliographies (fbennett, and I suspect MG6), then you don't have to worry about sorting. It's not surprising you think literal dates are an OK solution.

    If, OTOH, you work in the social sciences (like me, more-or-less), and you use author-date styles, then being able to reliably sort dates (including circa dates) is as critical a feature as there is. If a date doesn't sort correctly, then you can end up with wildly incorrect bibliographies, and mistakes with in-text citation references. Stuff breaks, and this is not a trivial thing.

    For that reason, I consider literal dates a highly non-ideal solution, and I would like to isolate its use to a very narrow range of cases.

    And I'm not actually suggesting we wait on the LoC to resolve the issues. Some of us have already contributed to that document (the list is open for contribution from anyone, BTW), and so we could always use some of the principles behind it before it's actually "done."

    For example: if we support "circa" dates in Zotero (and we should), then it's critical that it works a) internationally (in case different locales do things differently than "c. 1435"), and b) for sorting. The LoC's draft proposal for this (adding a "~" at the end), I believe, comes from decades of practice with MARC, and has the characteristic that it achieves the two requirements I note.
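
    To make the two requirements concrete, here is a minimal Python sketch. It assumes only a four-digit year with an optional trailing "~", which is far less than the draft actually covers; the function names and the "c." default are made up for the illustration.

```python
from typing import Tuple


def parse_year(value: str) -> Tuple[int, bool]:
    """Split a year string such as "1435~" into (year, is_circa)."""
    text = value.strip()
    is_circa = text.endswith("~")
    return int(text.rstrip("~")), is_circa


def display(value: str, circa_term: str = "c.") -> str:
    """Render the year with a circa term that a locale could supply."""
    year, is_circa = parse_year(value)
    return f"{circa_term} {year}" if is_circa else str(year)


# Sorting uses the numeric year only, so circa dates fall into place.
years = ["1500", "1435~", "1402"]
ordered = sorted(years, key=lambda v: parse_year(v)[0])
```

    Because the marker is stored separately from the number, the same record sorts as 1435 and can still be rendered as "c. 1435", "ca. 1435", or whatever a locale prefers.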

    How this is done in the UI is a separate matter. It could be, for example, that people learn this syntax, that Zotero support already existing conventions like "c. 1435", or that the UI add some sort of checkbox by the date box. I have no strong opinion, so long as it's clear, and it works (broadly).

    So to bring this down to something concrete, I see the following cases (which have been discussed elsewhere here, and which have been outlined in the EDTF documents):

    - circa or uncertain dates ("c. 1456")
    - date ranges ("1786-1787")
    - compound dates ("March/April, 2000")
    - different calendar systems

    The first three, it seems to me, are the low-hanging fruit that we can and should support sooner than later.

    FWIW, the RIS spec does not have literal dates. But it does have literal date parts. So the compound example in RIS would be "2000///March/April", which means the "March/April" bit is treated as a literal, but the "2000" is treated as a year. I actually think this is a pretty practical solution.
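
    A small Python sketch of how such a value could be taken apart, assuming the slash-separated year/month/day/other layout implied by the example above. The class and function names are hypothetical.

```python
from typing import NamedTuple, Optional


class RisDate(NamedTuple):
    year: Optional[int]
    month: Optional[int]
    day: Optional[int]
    other: str  # free-text remainder, kept as a literal


def parse_ris_date(value: str) -> RisDate:
    """Parse a "YYYY/MM/DD/other" string, tolerating empty parts."""
    # Split on the first three slashes only, so that slashes inside
    # the literal part ("March/April") survive.
    parts = value.split("/", 3)
    parts += [""] * (4 - len(parts))
    year, month, day = (int(p) if p.strip() else None for p in parts[:3])
    return RisDate(year, month, day, parts[3])
```

    With this layout the year stays available for sorting even though the month part is only a literal.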
  • edited August 14, 2010
    Discussion of dates often gets confusing because of the number of processing layers involved:

    (1) Zotero translation
    Read dates in a variety of formats and, if possible, convert them into a stable internal representation for automatic processing;

    (2) Storage conversion and retrieval
    Reduce the parts of the object produced by (1) to a string representation that can be stored in the Zotero DB and quickly restored to the internal form;

    (3) Processor conversion
    Transform the object produced by (1) to a form suitable for digestion by the CSL processor;

    (4) Export conversion
    Convert the object produced by (1) to a compact standard form that can be exchanged with other systems.

    Standards dictate that the object generated at (1) be capable of expressing the elements and values of which the standard date exchange format used at step (4) is capable. User requirements dictate that all date forms (including non-standard ones) that enter the system at (1) be transferrable between systems in some way using the standard form used at (4).

    My own concerns mostly start at step (3), which is why I probably seem to be talking at cross purposes at times.
    "The first three, it seems to me, are the low-hanging fruit that we can and should support sooner than later."
    Just so it's clear for everyone, these three are all included in the CSL 1.0 date specification, and are supported (for rendering and for sorting) by the citeproc-js processor that will debut in Zotero 2.1. For step (1) above (parsing of input), the citeproc-js sources include date parsing code (disabled by default) that can produce the internal representation from the human-readable form of such dates. So significant parts of this problem can be solved rather quickly and rather soon.
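
    For readers who have not seen processor input, here is a hedged Python sketch of step (3). The key names ("date-parts", "circa", "literal") follow the CSL-JSON convention as I understand it; the helper function itself is hypothetical, and what Zotero stores internally may well differ.

```python
from typing import Any, Dict, List, Optional


def to_processor_date(start: List[int],
                      end: Optional[List[int]] = None,
                      circa: bool = False,
                      literal: Optional[str] = None) -> Dict[str, Any]:
    """Build a processor-style date object from already parsed parts."""
    if literal is not None:
        # Passthrough: the text is rendered as is and cannot be sorted.
        return {"literal": literal}
    # One inner list for a single date, two for a range.
    date: Dict[str, Any] = {"date-parts": [start] + ([end] if end else [])}
    if circa:
        date["circa"] = True
    return date
```

    For example, `to_processor_date([1726], [1732], circa=True)` describes the "c. 1726-1732" case raised earlier in the thread as a range plus a flag, rather than as a string.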
  • Just a note to say I'm glad to hear that there is likely to be some sort of literal pass through for dates. I know the developers have heard a lot about how desirable this would be for a number of reasons. The entire issue of dates continues to be both complicated and difficult for the user to follow. There are initiatives afoot to get things like n.d., in press, etc. included (or excluded) from the date field, and there is the entire discussion regarding original date of publication. It seems like all of these things take a long time to get wired into the CSL and will take a longer time to get integrated into styles. In the meantime, users have to go through and break the connection of documents to Zotero in order to hand-edit dates to insert these things. A pass-through option will be a nice step in the direction of better date handling.
  • edited September 22, 2010
    I'd like to add to the comments that the field behaves in a _very_ confusing way.

    I am German. I enter something like 12.9.2010 for September 12, 2010.
    Zotero parses it and then displays "12.9.2010" J M T.

    That is wild, because I entered T.M.J (D.M.Y in English, you probably got that).

    So now I think my way of entering is wrong.

    I enter 2010 9 12 according to the J M T directive I got from the UI.

    Then Zotero shows: "2010 9 12" J T

    Here, my wisdom left the building. Whiskey Tango Foxtrot is going on in Z.?

    Call this a layer 8 problem, if you like. The user here cannot for the life of them figure out how and what Zotero has made of their entered date. There should be less ambiguous ways to DISPLAY the result once the field is parsed.

    So please advise:

    - How shall I enter the date?
    - How shall I confirm that our little helper has got it right?

    Thanks a lot, dear programmer!
  • Move the cursor over the field after saving the date, without clicking on the field. It will display the saved data in unambiguous numeric year-month-day form.
  • Although I have learnt to trust that Zotero does it alright, the interface always keeps confusing me. After years of using Zotero I still have no idea what the grey text to the right of the saved field is supposed to be doing or saying.
  • A "Y" indicates that Zotero has found a year. An "M" indicates a month, and a "D" a day. Mouse over to see what Zotero is interpreting the date as. Zotero shows it as "YYYY-MM-DD" in the tool-tip that appears when you mouse over the date.
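
    A minimal Python sketch of the two displays described above, assuming that missing parts show up as zeros in the tool-tip (as in the "1600-00-00" reported earlier in the thread). The function name is hypothetical, and this is not Zotero's actual code.

```python
from typing import Optional, Tuple


def describe(year: Optional[int],
             month: Optional[int] = None,
             day: Optional[int] = None) -> Tuple[str, str]:
    """Return (tooltip, indicator) for the parts that were recognized."""
    # Unrecognized parts are shown as zeros in the YYYY-MM-DD tool-tip.
    tooltip = f"{year or 0:04d}-{month or 0:02d}-{day or 0:02d}"
    # The indicator lists which parts were found, always in y, m, d order,
    # regardless of the order in which the user typed them.
    indicator = " ".join(letter for letter, part in
                         (("y", year), ("m", month), ("d", day)) if part)
    return tooltip, indicator
```

    The indicator says which parts were found, not the order in which they were typed.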
  • edited September 30, 2010
    Okay. So it really is a UI problem. I've seen many reports here of people thinking that the grey bold text to the right of the field indicates the order in which Z has parsed the input (this is also thomassprinzing's issue, above). So I have "March 27, 2009" and the grey text says "y m d". And I have "Feb., 1985" and the grey text says "y m". That is simply confusing.

    I am aware of at least four desiderata here. One is to be liberal with regard to input format; Zotero is good at parsing dates so it does not force the user to supply it in a set format. The second is to be liberal with regard to input content: Zotero knows that "Autumn 1981" is used sometimes so it allows that, even though it parses only the year because Autumn doesn't resolve to a numeric month. The third is to show how the date has been parsed so that the user can check things are okay. The fourth is to show what is in the date field (which may be different from what's parsed due to #2).

    The way these desiderata are implemented now is clearly confusing to users. Here are two options that meet the same desiderata but that make much more sense from a UI point of view:

    A. Always store and display all dates in YYYY-(MM-(DD)) format insofar as they can be parsed and stored as such; display unparsed stuff as is. This yields "2008-03-27" and "Autumn 1983". Display how Zotero has parsed it ("y m d" and "y", respectively) in a tooltip; it's not that important to users in the majority of cases.
    A1. Leave it to the user how they want to display the date. Or obey the OS language settings. Both options are better than the current situation.

    B. Keep things as they are now but adjust the "as parsed" display to match the order in the input field. This is a little less confusing than how things currently are, and therefore an improvement, but personally I think the prominence of the "y m d" display is just a case of nerdview. If the date field just displays as parsed, the problem is solved, especially if I can just say I want a certain display format across all entries.

    My vote would thus be for option A. What's the use of displaying "March 27, 2009" if it has been parsed (and intended) as 2009-03-27? It's especially annoying when different repositories supply dates in different formats. I don't care that Reference Global gives "02/2009" while JSTOR gives "Feb., 1985". I think none of us cares really. If Zotero parses both as "y m", why not display them in YYYY-MM format (or some other specified format, see A1)?

    Once again, this doesn't have implications for partly parsed dates, which are the only dates for which it makes sense to be displayed as literals.
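
    To show what option A could mean in practice, here is a deliberately tiny Python sketch. It recognizes only the two month/year shapes mentioned above and returns anything else unchanged; the function name and the English month table are assumptions, and a real parser would need to handle far more.

```python
import re

# English three-letter month abbreviations, numbered from 1.
MONTHS = {name: i for i, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}


def normalize(text: str) -> str:
    """Return YYYY-MM for two known month/year shapes, else the input."""
    stripped = text.strip()
    # Numeric shape, e.g. "02/2009".
    m = re.fullmatch(r"(\d{1,2})/(\d{4})", stripped)
    if m and 1 <= int(m.group(1)) <= 12:
        return f"{m.group(2)}-{int(m.group(1)):02d}"
    # Named-month shape, e.g. "Feb., 1985" or "February 1985".
    m = re.fullmatch(r"([A-Za-z]{3})[a-z]*\.?,?\s+(\d{4})", stripped)
    if m and m.group(1).lower() in MONTHS:
        return f"{m.group(2)}-{MONTHS[m.group(1).lower()]:02d}"
    # Anything not fully understood is displayed as a literal.
    return text
```

    Both repository formats collapse to the same YYYY-MM display, while "Autumn 1983" is left exactly as entered.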
  • edited November 14, 2010
    Any news about date ranges? I have just realised Zotero still doesn't support them, and this would be a real deal-breaker for me. What is the use of supporting multivolume works (with number-of-volumes), if you can't enter anything as simple as Author, Title (Publisher, 1985-1987), 2 vols.? Why, even journals occasionally have a two-year issue (vol. X, 2001-2002).

    Or am I missing something?
  • For some reason, I thought date ranges were supposed to work in 2.1. They are certainly supported in citeproc-js, the new citation processor. If they aren't happening yet, perhaps this can still happen in the 2.1 beta cycle...