A UI proposal for a better date field
I'm crossposting this from here because I think that other thread has become a bit diffuse, and Zotero's functionality has changed since that thread was first started. Just to avoid misunderstanding, this is not about literal passthrough, nor does it have anything to do with CSL. It is just about the Zotero user interface for dates, which is currently confusing and could be improved easily.
Here is the proposal.
There is already a date field, an "as parsed" indicator and a tooltip. I'm suggesting that Zotero:
1. Consistently display YYYY-MM-DD (or some format of the user's or OS's choosing) in the date field, except for unparsed stuff which is displayed as literal.
2. Get rid of the bold grey "as parsed" text to the right of the field and instead display it in the tooltip.
Here's the issue. There are many reports in the forums of people thinking that the grey bold text to the right of the date field indicates the order in which Zotero has parsed the input (this is also thomassprinzing's issue, above). So I have "March 27, 2009" and the grey text says "y m d". And I have "Feb., 1985" and the grey text says "y m". That is simply confusing. A further source of confusion is that dates always stay as they have been entered (by a site translator or, more rarely, by the user). Thus stuff grabbed from Reference Global has 02/1985 whereas JSTOR has "Feb., 1985". That is confusing and in most cases unwanted clutter from the user's point of view. I don't care that Reference Global stores its dates differently than JSTOR does, and I certainly don't want my own database to replicate their idiosyncrasies!
I am aware of at least four desiderata with regard to the date field:
(1) Be liberal with regard to input format; Zotero is good at parsing dates so it does not force the user to supply it in a set format.
(2) Be liberal with regard to input content: Zotero knows that "Autumn 1981" is used sometimes so it allows that, even though it parses only the year because Autumn doesn't resolve to a numeric month.
(3) Show how the date has been parsed so that the user can check things are okay.
(4) Show what is in the date field (which may be different from what's parsed due to #2).
The way these desiderata are implemented currently is clearly confusing to users. The above proposal meets the same desiderata but makes much more sense from a UI point of view.
To reiterate: Always display all dates in YYYY-(MM-(DD)) format insofar as they can be parsed and stored as such; display unparsed stuff as is. This yields "2008-03-27", "1985-02" and "Autumn 1983". Display how Zotero has parsed it ("y m d", "y m" and "y", respectively) in a tooltip; it's not that important to users in the majority of cases.
Optionally (and even better), leave it to the user how they want to display the date. Or obey the OS language settings. Both options are better than the current situation.
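To make the proposed display rule concrete, here is a minimal sketch in Python. It is not Zotero's code: the ParsedDate class, its unparsed leftover field, and the two function names are all hypothetical, and it assumes a parser has already split the input into year, month, day and whatever text it could not interpret.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedDate:
    """Hypothetical parser result; not Zotero's actual data model."""
    original: str                 # text exactly as entered or imported
    year: Optional[int] = None
    month: Optional[int] = None
    day: Optional[int] = None
    unparsed: str = ""            # leftover text the parser could not interpret


def display_value(d: ParsedDate) -> str:
    """Return YYYY(-MM(-DD)) for fully parsed dates, else the literal text."""
    if d.year is None or d.unparsed:
        return d.original
    text = f"{d.year:04d}"
    if d.month is not None:
        text += f"-{d.month:02d}"
        if d.day is not None:
            text += f"-{d.day:02d}"
    return text


def tooltip_text(d: ParsedDate) -> str:
    """Return the 'as parsed' indicator that would move into the tooltip."""
    parts = []
    if d.year is not None:
        parts.append("y")
    if d.month is not None:
        parts.append("m")
    if d.day is not None:
        parts.append("d")
    if not parts:
        return "Not parsed"
    return "Parsed as: " + " ".join(parts)


examples = [
    ParsedDate("March 27, 2009", 2009, 3, 27),
    ParsedDate("Feb., 1985", 1985, 2),
    ParsedDate("Autumn 1983", 1983, unparsed="Autumn"),
]
for d in examples:
    print(display_value(d), "|", tooltip_text(d))
```

Under these assumptions, "Feb., 1985" and "02/1985" would both display as "1985-02", while "Autumn 1983" stays as entered and only the tooltip reveals that just the year was parsed.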
First, for people who might be more familiar with the field, is there any user-interface research out there that might provide support for this proposal?
Second, how would you extend it to the more advanced data functionality we've discussed?
We'd also likely need to still store the date as entered, because part of the point of the current implementation is that parsing logic can improve and better support previously entered dates. So when users clicked on the field, it would probably still show them the date in its original format (which would help if it was parsed incorrectly).
But generally speaking, I could see displaying parsed dates in locale format and/or a user-configurable format. Removing the parsing indicator might be all right, but I suspect there would be plenty of users who wouldn't understand (or simply notice for a given date) that there was a difference between a parsed date displayed in a predetermined format and an unparsed date displayed as is. But perhaps those users wouldn't benefit from an indicator either.
http://forums.zotero.org/discussion/14578
- with Zotero 2.1, seasons can also be recognized, right? Does this mean we will be seeing "y s" as well (e.g. for "March 2010")?
- would it clarify things if the tooltip would include the order of the parsed date elements? E.g. "2010-10-01 (YYYY-MM-DD)" instead of "2010-10-01".
- would it clarify things if the tooltip would only show the date elements found? E.g. "2008-04" instead of "2008-04-00".
- As has been suggested elsewhere, the "y m d" indicator could also stay but display the actual order of parsed elements, since that seems to be the source of confusion for some people.
- Zotero could display a formatted date (if available) in, e.g., the Date column in the middle pane but still display the unparsed date in the metadata pane.
So maybe I should rephrase for clarity: considering that (1) the great majority of items enters the library through translators, and (2) the great majority of items has completely parseable dates, my proposal is that parseable dates (which would be the great majority of all dates in the db) are all displayed in the same format by default. The proposal is concerned simply with maximizing consistency while respecting exceptions. I don't see what good it does to perpetuate informationally equivalent idiosyncrasies, at least not in basic user-facing UI elements.
Storing the original format is fine if that's useful info (I see the point about possible future parsing improvements, though this does not obtain in common cases like "09/2002" vs. "Sept., 2002".) I think it would make sense to display the stored date in the tooltip, along with the (reordered) parsing indicator.
Thinking from the user's point of view, having consistent behaviour for the majority of cases is way more important than knowing about parsed or unparsed state. The current UI gives priority to idiosyncrasies and exceptional cases. I propose to do away with the needless idiosyncrasies and to treat the exceptional cases for what they are: exceptions. My guess (which I think will be confirmed by the devs) is that the majority of dates in the average db is in fact parseable. Let us take that fact as the starting point.
- A consistent presentation of date in the Date column would be helpful as I very often sort on that column and use it as a quick way of understanding the evolution of some issue/concept/etc.
- A user selectable presentation of date in that column and in the Date Modified column would be very helpful. I prefer y-m-d, but because of my Windows region settings, Date Modified defaults to D-M-Y.
- I'm much less concerned about the presentation in the metadata panel. I don't often get actually confused, but it does seem odd that there is often a mismatch in ordering between the greyed "y m d" and the date as entered.
Thanks!
Tom
From Dan's first comment I gather that he agrees that the proposal is sensible. And I do believe I have countered Rintze's worry that the distinction between parsed and unparsed dates might be confusing. (Basically, the present UI is confusing in more cases and to more users.)
Will take another look though.
I really don't have an answer; just asking.
My views on this will also be influenced by my question on extending it for more advanced date features. So if I want to indicate a "circa" date, do you imagine the same approach, but with a) a "circa" checkbox to the right of the field, or b) the need for the user to add some modifier (like "1954-04-03~") to the displayed field?
Just compare this with how it currently works. You enter a date as 02-1986. JSTOR gives you "Feb., 1986". Reference Global gives you "1986/02". Some other translator gives you "February 1986". All of these bloody variants stay just the way they are. And the user is supposed to understand that Zotero has in fact parsed them? Oh no, there is some grey text to the right of the field that makes it seem like some are parsed in the wrong order! Then you display the "date" column in the middle pane and you see that the idiosyncrasies even have propagated there! There is something in a mouseover tooltip that you only discover after months. But by that time the average user has been so thoroughly confused that they (1) certainly won't have understood Zotero's date parsing power and (2) feel forced to enter the dates in some consistent format themselves. I've seen this happen.
As for the second issue, I'm not sure how it is relevant. To reiterate, my issue is just the consistent display of parseable dates in the UI. If Zotero's parsing is improved to deal with "circa" dates I guess some way can be found to indicate that the date has been parsed as such.
I think everyone understands and has taken on board your complaints about the current UI, there's no need to repeat them. But there will be a lot of work and some risk involved in moving to a fresh set of compromises, and people are wanting to be sure that when we jump, we jump to the best possible position in the field of possibilities. Further to that thought ... The ability to handle "circa" dates has already been implemented in the CSL processor, so this isn't something lurking off in a distant future. Whatever solution is adopted in the next weeks or months is likely to actually land on people's desktops, and we're open to (eager for, even) concrete and specific suggestions on how to get it right -- how best to handle this case in the data layer, and how best to represent it in the UI.
If input works like that and if the parsing and handling of this is taken care of in the CSL processor, they will just be delighted that everything works as expected. I don't think a checkbox or a special convention, as per Bruce's suggestion, is necessary; this adds to visual and cognitive clutter. Besides, "ca." is a widespread convention already for encoding uncertainty. (In fact it is so widespread that I wonder whether localization would be needed. I can't think of a scholarly tradition that doesn't use the Latinate "ca.". But I might be wrong here.)
For users who worry about how their dates are parsed (of which there are very few), I would use the tooltip to say "Parsed as: circa YYYY-MM".
So I like your ideas basically. I guess I would say, in answer to my own questions, that a circa date would be displayed (as parsed) exactly as it should be stored in the data layer, or at least how it's interpreted for processing and export, which is to say I should see "1954-03-05~", and when hovering over it see "c. March 5, 1954" (or whatever).
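As a rough illustration of that round trip, here is a sketch in Python. Everything in it is an assumption made for the example: the CircaDate type, the function names, and the short list of accepted prefixes are hypothetical, and the text after the prefix is assumed to be in YYYY, YYYY-MM or YYYY-MM-DD form already. It is not how Zotero or the CSL processor handles this.

```python
import re
from typing import NamedTuple

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Hypothetical list of prefixes treated as "circa"; localized terms would be added.
_CIRCA_PREFIX = re.compile(r"^\s*(?:circa|ca\.?|c\.)\s*", re.IGNORECASE)


class CircaDate(NamedTuple):
    iso: str      # "YYYY", "YYYY-MM" or "YYYY-MM-DD"
    circa: bool   # True when the input carried a circa marker


def parse_circa(text: str) -> CircaDate:
    """Strip a leading circa marker and remember that it was there."""
    stripped, count = _CIRCA_PREFIX.subn("", text, count=1)
    return CircaDate(iso=stripped.strip(), circa=count > 0)


def stored_form(d: CircaDate) -> str:
    """Form shown in the field and kept in the data layer, e.g. '1954-03-05~'."""
    return d.iso + ("~" if d.circa else "")


def hover_text(d: CircaDate) -> str:
    """Human-readable form for the tooltip, e.g. 'c. March 5, 1954'."""
    parts = [int(p) for p in d.iso.split("-")]
    if len(parts) == 1:
        text = str(parts[0])
    elif len(parts) == 2:
        text = f"{MONTHS[parts[1] - 1]} {parts[0]}"
    else:
        text = f"{MONTHS[parts[1] - 1]} {parts[2]}, {parts[0]}"
    return ("c. " if d.circa else "") + text


d = parse_circa("ca. 1954-03-05")
print(stored_form(d))
print(hover_text(d))
```

The point of the sketch is only that typing "ca." could be enough input: the marker is detected once, stored as a suffix, and rendered back in words on hover, with no checkbox involved.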
Part of the problem, though, is that this field is also for editing. So I'm not sure about the fundamental design disconnect here, which is that the stored date is not what the software actually ultimately sees. It may be that this needs a rethink.
2) You bring up the fact that different translators produce dates in different date formats (even if the date was originally structured, as is the case for e.g. PubMed records), and that Zotero should display these dates in a single format. Wouldn't it be preferable to solve this at the source, i.e. in the translators? Maybe translators can be allowed to save dates in a structured format (the date field is currently always plain text, right?), or, alternatively, we could just agree on a common unambiguous format (e.g. YYYY-MM-DD).
3) Although I recognize the problems with the current approach, a redeeming feature is the ability to see both the unparsed (in the field) and parsed date (in the tooltip) at the same time. With Dan's suggestion ("So when users clicked on the field, it would probably still show them the date in its original format (which would help if it was parsed incorrectly)."), you would seem to lose that benefit.
4) Regarding "circa": first, most scientists don't have to deal with approximate dates, so for them a checkbox would be a (confusing) distraction. Also, it wouldn't account for date ranges with an approximate date at one end of the date range. Furthermore, CSL 1.0 locales contain full ("long") and abbreviated ("short") terms for "circa", so it might be possible to parse localized variants.
For dates that weren't fully parsed, the editing view and the display view would be the original value, but hovering would still show the parsed value.
The one still-kind-of-ugly part of that is having the editing view show the inconsistent original format. One option might be to have the date processing code flag dates that are totally unambiguous (which would include but not be limited to dates in the agreed-upon structured format) and store and display those consistently rather than as is. There's not really any ambiguity about "March 5, 1954", so that could conceivably be stored solely in a structured format. On display, dates stored that way would display in a consistent format (which might just be the structured format or might be something else unlikely to be made ambiguous during editing).
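A sketch of what such a flag could look like, in Python. The function name and the two accepted patterns are hypothetical choices for illustration; a real parser would recognize many more unambiguous forms. Anything the function does not positively recognize is treated as ambiguous and would stay stored as entered.

```python
import re
from datetime import date
from typing import Optional

_MONTHS = {name: i for i, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

_ISO = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")            # 1954-03-05
_NAMED = re.compile(r"^([A-Za-z]+) (\d{1,2}), (\d{4})$")   # March 5, 1954


def structured_if_unambiguous(text: str) -> Optional[str]:
    """Return YYYY-MM-DD if the input has exactly one reading, else None."""
    text = text.strip()
    m = _ISO.match(text)
    if m:
        year, month, day = (int(g) for g in m.groups())
    else:
        m = _NAMED.match(text)
        if not m:
            return None              # e.g. "03/05/1954": day and month unclear
        month = _MONTHS.get(m.group(1).lower(), 0)
        day, year = int(m.group(2)), int(m.group(3))
    try:
        return date(year, month, day).isoformat()   # rejects impossible dates
    except ValueError:
        return None


for entered in ["March 5, 1954", "1954-03-05", "03/05/1954", "Autumn 1981"]:
    print(repr(entered), "->", structured_if_unambiguous(entered))
```

With this approach the original string only needs to be kept for inputs that return None, which are exactly the ones where future parsing improvements could change the result.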
We might want to look through some examples of how translators handle dates in order to develop a recommendation for translator authors, but this probably won't affect the underlying design in any case.
In other words, just reverse the current behavior, but replace the tooltip with a clickable edit button. The hint and extra step might help avoid user alarm when clicking on the field changes the representation.
I don't know what the performance issues are of attempting to parse in _all_ locales, but it would be awfully nice if I didn't have to be careful to type English dates, even when the remaining data is all in, say, Russian.
I suppose there's also a small risk of ambiguity, if a month name in one language is a different month in a different language, but ... let's hope that's not a significant issue.
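Both concerns can be explored with a small sketch in Python. The month table below is a tiny hand-written sample (English, Czech, Croatian, Russian), and the function names are hypothetical; a real implementation would load complete locale data. Merging all locales into one dictionary keeps each lookup a single hash access however many locales are loaded, and names that map to different months in different languages can be detected up front and refused.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set

# Tiny illustrative sample only, not complete locale data.
MONTH_NAMES: Dict[str, Dict[str, int]] = {
    "en": {"march": 3, "october": 10, "november": 11},
    "cs": {"březen": 3, "říjen": 10, "listopad": 11},
    "hr": {"ožujak": 3, "listopad": 10, "studeni": 11},
    "ru": {"март": 3, "октябрь": 10, "ноябрь": 11},
}


def build_index(names: Dict[str, Dict[str, int]]) -> Dict[str, Set[int]]:
    """Merge all locales into one name -> {month numbers} lookup table."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for months in names.values():
        for name, number in months.items():
            index[name.casefold()].add(number)
    return index


def lookup_month(name: str, index: Dict[str, Set[int]]) -> Optional[int]:
    """Return the month number, or None if unknown or ambiguous across locales."""
    candidates = index.get(name.casefold(), set())
    if len(candidates) == 1:
        return next(iter(candidates))
    return None


def ambiguous_names(index: Dict[str, Set[int]]) -> List[str]:
    """List names that mean different months in different locales."""
    return sorted(name for name, nums in index.items() if len(nums) > 1)


index = build_index(MONTH_NAMES)
print(lookup_month("Март", index))
print(lookup_month("listopad", index))
print(ambiguous_names(index))
```

The sample does contain one real collision: "listopad" is November in Czech but October in Croatian. So the risk is not purely theoretical, and an ambiguous name would probably need the item's language or the user's locale to break the tie.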