A UI proposal for a better date field

mark · October 1, 2010

I'm crossposting this from here because I think that other thread has become a bit diffuse, and Zotero's functionality has changed since that thread was first started. Just to avoid misunderstanding, this is not about literal passthrough, nor does it have anything to do with CSL. It is just about the Zotero user interface for dates, which is currently confusing and could be improved easily.

Here is the proposal.
There is already a date field, an "as parsed" indicator and a tooltip. I'm suggesting that Zotero:
1. Consistently display YYYY-MM-DD (or some format of the user's or OS's choosing) in the date field, except for unparsed stuff which is displayed as literal.
2. Get rid of the bold grey "as parsed" text to the right of the field and instead display it in the tooltip.

Here's the issue. There are many reports in the forums of people thinking that the grey bold text to the right of the date field indicates the order in which Zotero has parsed the input (this is also thomassprinzing's issue, above). So I have "March 27, 2009" and the grey text says "y m d". And I have "Feb., 1985" and the grey text says "y m". That is simply confusing. A further source of confusion is that dates always stay as they have been entered (by a site translator or, more rarely, by the user). Thus stuff grabbed from Reference Global has 02/1985 whereas JSTOR has "Feb., 1985". That is confusing and in most cases unwanted clutter from the user's point of view. I don't care that Reference Global stores its dates differently than JSTOR does, and I certainly don't want my own database to replicate their idiosyncrasies!

I am aware of at least four desiderata with regard to the date field:
(1) Be liberal with regard to input format; Zotero is good at parsing dates so it does not force the user to supply it in a set format.
(2) Be liberal with regard to input content: Zotero knows that "Autumn 1981" is used sometimes so it allows that, even though it parses only the year because Autumn doesn't resolve to a numeric month.
(3) Show how the date has been parsed so that the user can check things are okay.
(4) Show what is in the date field (which may be different from what's parsed due to #2).

The way these desiderata are implemented currently is clearly confusing to users. The above proposal meets the same desiderata but makes much more sense from a UI point of view.

To reiterate: Always display all dates in YYYY-(MM-(DD)) format insofar as they can be parsed and stored as such; display unparsed stuff as is. This yields "2008-03-27", "1985-02" and "Autumn 1983". Display how Zotero has parsed it ("y m d", "y m" and "y", respectively) in a tooltip; it's not that important to users in the majority of cases.
Optionally (and even better), leave it to the user how they want to display the date. Or obey the OS language settings. Both options are better than the current situation.
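
To make the proposal concrete, here is a minimal sketch in Python. It is not Zotero's actual parser: the function names (`parse_date`, `display`, `indicator`) and the handful of input patterns are assumptions chosen only to cover the examples in this post. Anything that does not parse cleanly as a whole is shown exactly as entered.

```python
import re
from typing import Optional, Tuple

Parsed = Tuple[int, Optional[int], Optional[int]]  # (year, month, day)

MONTH_NAMES = ["january", "february", "march", "april", "may", "june", "july",
               "august", "september", "october", "november", "december"]


def month_number(word: str) -> Optional[int]:
    """Map 'Feb', 'Sept' or 'March' to a month number; None for other words."""
    word = word.lower()
    if len(word) < 3:
        return None
    for number, name in enumerate(MONTH_NAMES, start=1):
        if name.startswith(word):
            return number
    return None


def _checked(year: int, month: Optional[int], day: Optional[int]) -> Optional[Parsed]:
    """Reject obviously impossible month or day values."""
    if month is not None and not 1 <= month <= 12:
        return None
    if day is not None and not 1 <= day <= 31:
        return None
    return (year, month, day)


def parse_date(raw: str) -> Optional[Parsed]:
    """Return (year, month, day) only if the whole field parses cleanly."""
    text = raw.strip()
    # YYYY, YYYY-MM, YYYY-MM-DD
    m = re.fullmatch(r"(\d{4})(?:-(\d{1,2})(?:-(\d{1,2}))?)?", text)
    if m:
        y, mo, d = m.groups()
        return _checked(int(y), int(mo) if mo else None, int(d) if d else None)
    # MM/YYYY, as in "02/1985"
    m = re.fullmatch(r"(\d{1,2})/(\d{4})", text)
    if m:
        return _checked(int(m.group(2)), int(m.group(1)), None)
    # "Feb., 1985" or "March 27, 2009"
    m = re.fullmatch(r"([A-Za-z]+)\.?,?\s+(?:(\d{1,2}),\s*)?(\d{4})", text)
    if m:
        month = month_number(m.group(1))
        if month is not None:
            day = int(m.group(2)) if m.group(2) else None
            return _checked(int(m.group(3)), month, day)
    return None


def display(raw: str) -> str:
    """Text for the date field: YYYY-(MM-(DD)) if parsed, else the input as is."""
    parsed = parse_date(raw)
    if parsed is None:
        return raw
    year, month, day = parsed
    parts = [f"{year:04d}"]
    if month is not None:
        parts.append(f"{month:02d}")
        if day is not None:
            parts.append(f"{day:02d}")
    return "-".join(parts)


def indicator(raw: str) -> str:
    """Text for the tooltip: which elements were recognised."""
    parsed = parse_date(raw)
    if parsed is None:
        # Not cleanly parsed: report a bare year if one can be found.
        return "y" if re.search(r"\b\d{4}\b", raw) else ""
    return " ".join(label for label, value in zip("ymd", parsed) if value is not None)
```

With this sketch, "Feb., 1985" and "02/1985" both show up as "1985-02", while "Autumn 1983" stays as entered and only reports "y" in the tooltip.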

bdarcus · October 1, 2010

I really have no strong opinions on this, but have two questions:

First, for people who might be more familiar with the field, is there any user-interface research out there that might provide support for this proposal?

Second, how would you extend it to the more advanced date functionality we've discussed?

ajlyon · October 1, 2010

I can see having the date displayed on mouse-over conform to the user's locale.

dstillman · October 1, 2010

So I agree that displaying parsed dates in lots of different formats is unnecessarily confusing, but the solution may not be quite as simple as you make it.

1. Consistently display YYYY-MM-DD (or some format of the user's or OS's choosing) in the date field, except for unparsed stuff which is displayed as literal.

If we did this, "unparsed stuff" would probably have to mean "the entire field as is, if it doesn't parse cleanly", because the order of the data is lost after parts are pulled out during parsing. And not to mix (contentious) issues, but that's how literal passthrough will likely work—it will just pass through the whole field if the whole thing doesn't parse cleanly.

We'd also likely need to still store the date as entered, because part of the point of the current implementation is that parsing logic can improve and better support previously entered dates. So when users clicked on the field, it would probably still show them the date in its original format (which would help if it was parsed incorrectly).

But generally speaking, I could see displaying parsed dates in locale format and/or a user-configurable format.

2. Get rid of the bold grey "as parsed" text to the right of the field and instead display it in the tooltip.

Removing the parsing indicator might be all right, but I suspect there would be plenty of users who wouldn't understand (or simply notice for a given date) that there was a difference between a parsed date displayed in a predetermined format and an unparsed date displayed as is. But perhaps those users wouldn't benefit from an indicator either.

dstillman · October 1, 2010

Not that it should happen, but for what it's worth:

http://forums.zotero.org/discussion/14578

Rintze · October 1, 2010

My preference would be to keep the current behavior of showing the unparsed date in the date field. Otherwise some fields will show a parsed date, while others will show an unparsable/unparsed date, and I don't know how a) you'd communicate to users that date fields can toggle between these two states, and b) how you'd show whether a given field contains a parsed or unparsed date. Some thoughts:

- with Zotero 2.1, seasons can also be recognized, right? Does this mean we will be seeing "y s" as well (e.g. for "Spring 2010")?
- would it clarify things if the tooltip would include the order of the parsed date elements? E.g. "2010-10-01 (YYYY-MM-DD)" instead of "2010-10-01".
- would it clarify things if the tooltip would only show the date elements found? E.g. "2008-04" instead of "2008-04-00".

dstillman · October 1, 2010

Other thoughts, to go along with Rintze's:

- As has been suggested elsewhere, the "y m d" indicator could also stay but display the actual order of parsed elements, since that seems to be the source of confusion for some people.
- Zotero could display a formatted date (if available) in, e.g., the Date column in the middle pane but still display the unparsed date in the metadata pane.

Rintze · October 1, 2010

the "y m d" indicator could also stay but display the actual order of parsed elements

Just noting that in this case my second point becomes even more important (because the "y m d" order of the indicator currently corresponds with the order of the date elements shown in the tooltip).

mark · October 1, 2010

Responding to Dan:

If we did this, "unparsed stuff" would probably have to mean "the entire field as is, if it doesn't parse cleanly", because the order of the data is lost after parts are pulled out during parsing.

I have no problem with this. My wording was ambiguous but that is what I intended.

We'd also likely need to still store the date as entered, because part of the point of the current implementation is that parsing logic can improve and better support previously entered dates.

This is fine. Mine is a UI issue; I trust Zotero to keep handling things under the hood as needed. I am not proposing any loss or impairment of functionality, just a better way of organizing the UI.

Removing the parsing indicator might be all right, but I suspect there would be plenty of users who wouldn't understand (or simply notice for a given date) that there was a difference between a parsed date displayed in a predetermined format and an unparsed date displayed as is. But perhaps those users wouldn't benefit from an indicator either.

I think the latter is the case (users who don't understand/notice the difference between parsed and unparsed are unlikely to care about the parsing indicator). But in a way the parsing indicator is a minor issue; I think there is a much larger set of users who presently are confused as to why the date field would be displaying different things for entirely equivalent dates (my Reference Global vs. JSTOR example).

So maybe I should rephrase for clarity: considering that (1) the great majority of items enters the library through translators, and (2) the great majority of items has completely parseable dates, my proposal is that parseable dates (which would be the great majority of all dates in the db) are all displayed in the same format by default. The proposal is concerned simply with maximizing consistency while respecting exceptions. I don't see what good it does to perpetuate informationally equivalent idiosyncrasies, at least not in basic user-facing UI elements.

Storing the original format is fine if that's useful info (I see the point about possible future parsing improvements, though this does not obtain in common cases like "09/2002" vs. "Sept., 2002".) I think it would make sense to display the stored date in the tooltip, along with the (reordered) parsing indicator.

mark · October 1, 2010

Responding to Rintze:

My preference would be to keep the current behavior of showing the unparsed date in the date field. Otherwise some fields will show a parsed date, while others will show an unparsable/unparsed date, and I don't know how a) you'd communicate to users that date fields can toggle between these two states, and b) how you'd show whether a given field contains a parsed or unparsed date.

But the present situation is worse (in a measurable way, namely in terms of the number of informationally equivalent fields that display different formats, thus adding to visual clutter and potential confusion): some fields show a JSTOR date ("Feb., 1985"), others show a Reference Global date ("02/2009"), others show my own input system ("2009-02"). Users have little clue as to what this means and as if that's not enough the parsing indicator confuses them. As I say above, the proposal is concerned simply with maximizing consistency while respecting exceptions.

Thinking from the user's point of view, having consistent behaviour for the majority of cases is way more important than knowing about parsed or unparsed state. The current UI gives priority to idiosyncrasies and exceptional cases. I propose to do away with the needless idiosyncrasies and to treat the exceptional cases for what they are: exceptions. My guess (which I think will be confirmed by the devs) is that the majority of dates in the average db is in fact parseable. Let us take that fact as the starting point.

mark · October 1, 2010

Responding to Bruce:

First, for people who might be more familiar with the field, is there any user-interface research out there that might provide support for this proposal?

I'm not very familiar with the field, but in this case I think the basic principle really is common sense: take the majority case as the starting point for basic UI elements, and for the exceptions, find sensible solutions that do not get in the way of the majority case.

bentle · October 1, 2010

Just a few comments from a probably not unusual user:

- A consistent presentation of date in the Date column would be helpful as I very often sort on that column and use it as a quick way of understanding the evolution of some issue/concept/etc.
- A user selectable presentation of date in that column and in the Date Modified column would be very helpful. I prefer y-m-d, but because of my Windows region settings, Date Modified defaults to D-M-Y.
- I'm much less concerned about the presentation in the metadata panel. I don't often get actually confused, but it does seem odd that there is often a mismatch in ordering between the greyed "y m d" and the date as entered.

Thanks!
Tom

mark · October 1, 2010

Just to note: this discussion is primarily about the date field in the metadata panel, but indeed, much the same holds for the "Date" column in the middle panel. In particular, that column, too, perpetuates idiosyncrasies for no good reason. Consistency there would be good too.

mark · October 7, 2010

I'm a bit unclear as to why this thread suddenly fell silent. Dan, Bruce, Rintze, have you seen my replies to your comments?

From Dan's first comment I gather that he agrees that the proposal is sensible:

But generally speaking, I could see displaying parsed dates in locale format and/or a user-configurable format.

...

Removing the parsing indicator might be all right

And I do believe I have countered Rintze's worry that the distinction between parsed and unparsed dates might be confusing. (Basically, the present UI is confusing in more cases and to more users.)

bdarcus · October 7, 2010

Sorry, I think I saw it, but am easily distracted with a very busy semester. I tend not to reply if I can't figure out what to say in about 10 seconds ;-)

Will take another look though.

bdarcus · October 7, 2010

I'm not very familiar with the field, but in this case I think the basic principle really is common sense: take the majority case as the starting point for basic UI elements, and for the exceptions, find sensible solutions that do not get in the way of the majority case.

Well, I guess inevitably the question, then, is if the "majority case" is clear and uncontroversial? Would your proposed solution appear intuitive for both advanced and beginning users, people from Japan as well as the United States and France?

I really don't have an answer; just asking.

My views on this will also be influenced by my question on extending it for more advanced date features. So if I want to indicate a "circa" date, do you imagine the same approach, but with a) a "circa" checkbox to the right of the field, or b) the need for the user to add some modifier (like "1954-04-03~") to the displayed field?

mark · October 8, 2010

The answer to the first question is simply yes. You enter a date whichever way you like. Translators enter dates whichever way they like. But all dates, insofar as they are parseable (= the great majority in most libraries), are displayed in a consistent way, according to the locale you have selected (or in some other hidden-pref customizable format). What could be less controversial?

Just compare this with how it currently works. You enter a date as 02-1986. JSTOR gives you "Feb., 1986". Reference Global gives you "1986/02". Some other translator gives you "February 1986". All of these bloody variants stay just the way they are. And the user is supposed to understand that Zotero has in fact parsed them? Oh no, there is some grey text to the right of the field that makes it seem like some are parsed in the wrong order! Then you display the "date" column in the middle pane and you see that the idiosyncrasies have even propagated there! There is something in a mouseover tooltip that you only discover after months. But by that time the average user has been so thoroughly confused that they (1) certainly won't have understood Zotero's date parsing power and (2) feel forced to enter the dates in some consistent format themselves. I've seen this happen.

As for the second issue, I'm not sure how it is relevant. To reiterate, my issue is just the consistent display of parseable dates in the UI. If Zotero's parsing is improved to deal with "circa" dates I guess some way can be found to indicate that the date has been parsed as such.

fbennett · October 8, 2010

mark,

I think everyone understands and has taken on board your complaints about the current UI, there's no need to repeat them. But there will be a lot of work and some risk involved in moving to a fresh set of compromises, and people are wanting to be sure that when we jump, we jump to the best possible position in the field of possibilities. Further to that thought ...

I guess some way can be found to indicate that the date has been parsed as such.

The ability to handle "circa" dates has already been implemented in the CSL processor, so this isn't something lurking off in a distant future. Whatever solution is adopted in the next weeks or months is likely to actually land on people's desktops, and we're open to (eager for, even) concrete and specific suggestions on how to get it right -- how best to handle this case in the data layer, and how best to represent it in the UI.

mark · October 8, 2010

I think everyone understands and has taken on board your complaints about the current UI, there's no need to repeat them.

Thanks for the reassurance Frank. I guess it didn't feel like that because the discussion was so open-ended.

we're open to (eager for, even) concrete and specific suggestions on how to get it right

I can't say much about the data layer but as for representing circa dates in the UI, I think most users would be happy if the date field would say "ca. 1850" for a circa date. In keeping with my main proposal, it should be consistent everywhere — so map a list of variants ("ca.", "circa", "c.", if those are all unique) to the main indicator ("ca.") so that when I enter a date as "circa 1850" and a translator delivers "c. 1850" both end up being displayed in the same way. Note that this in itself is already a clue to the user that sensible parsing is being done — unlike the current UI, which suggests (to most users) that no parsing is done at all.

If input works like that and if the parsing and handling of this is taken care of in the CSL processor, they will just be delighted that everything works as expected. I don't think a checkbox or a special convention, as per Bruce's suggestion, is necessary; this adds to visual and cognitive clutter. Besides, "ca." is a widespread convention already for encoding uncertainty. (In fact it is so widespread that I wonder whether localization would be needed. I can't think of a scholarly tradition that doesn't use the Latinate "ca.". But I might be wrong here.)

For users who worry about how their dates are parsed (of which there are very few), I would use the tooltip to say "Parsed as: circa YYYY-MM".
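
A rough sketch of that mapping, in Python rather than anything Zotero actually runs. The variant list and the function names `split_circa` and `display_circa` are my own assumptions, and localized variants are left out entirely.

```python
import re
from typing import Tuple

# Assumed list of input variants; all of them map to the single marker "ca."
_CIRCA_PREFIX = re.compile(r"^\s*(?:circa|ca\.?|c\.)\s+", re.IGNORECASE)


def split_circa(raw: str) -> Tuple[bool, str]:
    """Return (is_circa, remainder) for a raw date string."""
    m = _CIRCA_PREFIX.match(raw)
    if m:
        return True, raw[m.end():].strip()
    return False, raw.strip()


def display_circa(raw: str) -> str:
    """Show every circa variant with the same 'ca.' marker."""
    is_circa, rest = split_circa(raw)
    return f"ca. {rest}" if is_circa else rest
```

So "circa 1850" typed by me and "c. 1850" delivered by a translator would both be displayed as "ca. 1850", and a plain "1850" is left alone.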

bdarcus · October 8, 2010

Besides, "ca." is a widespread convention already for encoding uncertainty.

Certainly in English display conventions it is, but I really worry about using what may be quite localized conventions for a tool that aspires to be internationally friendly.

So I like your ideas basically. I guess I would say, in answer to my own questions, that a circa date would be displayed (as parsed) exactly as it should be stored in the data layer, or at least how it's interpreted for processing and export, which is to say I should see "1954-03-05~", and when hovering over it see "c. March 5, 1954" (or whatever).

Part of the problem, though, is that this field is also for editing. So I'm not sure about the fundamental design disconnect here, which is that the stored date is not what the software actually ultimately sees. It may be that this needs a rethink.

Rintze · October 8, 2010

1) My proposals above were mostly meant as a stopgap solution.

2) You bring up the fact that different translators produce dates in different date formats (even if the date was originally structured, as is the case for e.g. PubMed records), and that Zotero should display these dates in a single format. Wouldn't it be preferable to solve this at the source, i.e. in the translators? Maybe translators can be allowed to save dates in a structured format (the date field is currently always plain text, right?), or, alternatively, we could just agree on a common unambiguous format (e.g. YYYY-MM-DD).

3) Although I recognize the problems with the current approach, a redeeming feature is the ability to see both the unparsed (in the field) and parsed date (in the tooltip) at the same time. With Dan's suggestion ("So when users clicked on the field, it would probably still show them the date in its original format (which would help if it was parsed incorrectly)."), you would seem to lose that benefit.

4) Regarding "circa": first, most scientists don't have to deal with approximate dates, so for them a checkbox would be a (confusing) distraction. Also, it wouldn't account for date ranges with an approximate date at one end of the date range. Furthermore, CSL 1.0 locales contain full ("long") and abbreviated ("short") terms for "circa", so it might be possible to parse localized variants.

bdarcus · October 8, 2010

3) Although I recognize the problems with the current approach, a redeeming feature is the ability to see both the unparsed (in the field) and parsed date (in the tooltip) at the same time.

I really think we should reevaluate this fundamental assumption: the only reason we have this design is because of uncertainty about translation (and about a structured date representation; though I think we now have a better handle on this).

dstillman · October 8, 2010

I guess I would say, in answer to my own questions, that a circa date would be displayed (as parsed) exactly as it should be stored in the data layer, or at least how it's interpreted for processing and export, which is to say I should see "1954-03-05~", and when hovering over it see "c. March 5, 1954",

Except we're now saying the reverse. You'd see the localized "c. March 5, 1954", hovering over it you'd see the as-parsed "1954-03-05~", it'd be stored as, say, "1954-03-05~ circa March 5 1954" (since Zotero stores a multipart date with a sortable portion first), and clicking into the field to edit it you'd see the original "circa March 5 1954".

For dates that weren't fully parsed, the editing view and the display view would be the original value, but hovering would still show the parsed value.
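
As a rough illustration of those four views (display, hover, stored, edit), here is a small Python sketch. The `DateField` class, its field names and the English-only formatter are hypothetical and do not reflect Zotero's data layer; the only thing taken from the description above is that the stored value puts the sortable portion first.

```python
from dataclasses import dataclass

MONTH_NAMES = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]


def format_english(parsed: str) -> str:
    """Turn '1954-03-05~' into 'c. March 5, 1954' (English only, for illustration)."""
    circa = parsed.endswith("~")
    parts = [int(p) for p in parsed.rstrip("~").split("-")]
    year = parts[0]
    if len(parts) == 3:
        text = f"{MONTH_NAMES[parts[1] - 1]} {parts[2]}, {year}"
    elif len(parts) == 2:
        text = f"{MONTH_NAMES[parts[1] - 1]} {year}"
    else:
        text = str(year)
    return f"c. {text}" if circa else text


@dataclass(frozen=True)
class DateField:
    parsed: str         # sortable, as-parsed portion, e.g. "1954-03-05~"
    original: str       # what the user or translator entered
    fully_parsed: bool  # False if only part of the input was understood

    def stored_value(self) -> str:
        """Multipart value with the sortable portion first."""
        return f"{self.parsed} {self.original}"

    def display_view(self) -> str:
        """Formatted date if fully parsed, otherwise the original value."""
        return format_english(self.parsed) if self.fully_parsed else self.original

    def hover_view(self) -> str:
        """Tooltip always shows the as-parsed value."""
        return self.parsed

    def edit_view(self) -> str:
        """Clicking into the field shows the original input."""
        return self.original
```

For example, `DateField("1954-03-05~", "circa March 5 1954", True)` displays "c. March 5, 1954", hovers as "1954-03-05~" and edits as "circa March 5 1954", while `DateField("1981", "Autumn 1981", False)` displays and edits as "Autumn 1981" and hovers as "1981".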

The one still-kind-of-ugly part of that is having the editing view show the inconsistent original format. One option might be to have the date processing code flag dates that are totally unambiguous (which would include but not be limited to dates in the agreed-upon structured format) and store and display those consistently rather than as is. There's not really any ambiguity about "March 5, 1954", so that could conceivably be stored solely in a structured format. On display, dates stored that way would display in a consistent format (which might just be the structured format or might be something else unlikely to be made ambiguous during editing).

dstillman · October 8, 2010

Maybe translators can be allowed to save dates in a structured format (the date field is currently always plain text, right?), or, alternatively, we could just agree on a common unambiguous format (e.g. YYYY-MM-DD).

The main downside to adjusting translators would be that they'd need to do more date parsing to determine whether to pass dates in in the structured format or as is, which could lead to a lot of duplication of date processing code, instead of just letting the data layer deal with it. But at least for more reliable data sources it could make sense.

bdarcus · October 8, 2010

The main downside to adjusting translators would be that they'd need to do more date parsing to determine whether to pass dates in in the structured format or as is, which could lead to a lot of duplication of date processing code, instead of just letting the data layer deal with it. But at least for more reliable data sources it could make sense.

Picking up ideas we were discussing on zotero-dev, could we have a parseDate JS function that could cut down on that duplication you note?

bdarcus · October 8, 2010

Except we're now saying the reverse.

Either way is fine (notwithstanding the tricky details we've noted). I just haven't had time to read this thread carefully.

dstillman · October 8, 2010

Picking up ideas we were discussing on zotero-dev, could we have a parseDate JS function that could cut down on that duplication you note?

I'm not sure the issue I raised is a valid concern. strToDate() is actually already available in translators, but there wouldn't be a point to running that and looking for the 'unambiguous' flag, because any date that triggered that would be stored that way anyway given my proposal above. So any logic that did exist to determine date validity would probably be pretty translator-specific.

We might want to look through some examples of how translators handle dates in order to develop a recommendation for translator authors, but this probably won't affect the underlying design in any case.

fbennett · October 8, 2010

If the UI shows a universal or localized form of the date on the surface (such as "2002-03-22"), you could provide a clickable rollover or left-click popup that provides a hint against the original content, such as "Edit: 23 Mar 2002". Clicking on the item would open the field with this original content in place for editing.

In other words, just reverse the current behavior, but replace the tooltip with a clickable edit button. The hint and extra step might help avoid user alarm when clicking on the field changes the representation.

erazlogo · October 10, 2010

Have you figured out how to deal with months/seasons? It would be great to sort June 2001 or Summer 2001 before September 2001 or Fall 2001, i.e. to parse the seasons as well. But if you display the parsed date (2001-09-01) first and have users roll over every time to make sure it's a month/season and not an exact date, it would be extremely confusing and inconvenient to the user.
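
One way to picture the sorting part without changing what is displayed: a Python sketch of a sort key that treats seasons as rough positions in the year. The season-to-month mapping (northern hemisphere, Spring=3, Summer=6, Fall/Autumn=9, Winter=12) and the name `sort_key` are purely assumptions for illustration, not how Zotero does or will do it.

```python
import re
from typing import Tuple

MONTH_NAMES = ["january", "february", "march", "april", "may", "june", "july",
               "august", "september", "october", "november", "december"]
MONTHS = {name: number for number, name in enumerate(MONTH_NAMES, start=1)}

# Assumed mapping of seasons to a position in the year (northern hemisphere).
SEASONS = {"spring": 3, "summer": 6, "fall": 9, "autumn": 9, "winter": 12}


def sort_key(raw: str) -> Tuple[float, int, int]:
    """Key for sorting 'June 2001' / 'Summer 2001' style dates chronologically."""
    m = re.fullmatch(r"\s*([A-Za-z]+)\s+(\d{4})\s*", raw)
    if not m:
        return (float("inf"), 0, 0)  # unrecognised input sorts last
    word, year = m.group(1).lower(), int(m.group(2))
    if word in MONTHS:
        return (year, MONTHS[word], 0)
    if word in SEASONS:
        return (year, SEASONS[word], 1)  # a season sorts after the same month
    return (year, 0, 0)  # only the year was understood


dates = ["Fall 2001", "September 2001", "Summer 2001", "June 2001"]
ordered = sorted(dates, key=sort_key)
```

Here `ordered` comes out as June 2001, Summer 2001, September 2001, Fall 2001, and the strings themselves are untouched, so nobody has to roll over anything to see whether it was a month or a season.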

ajlyon · October 10, 2010

While we're adding to the wishlist / use cases / specifications for a new date system, can we handle multiple languages as well? It looks like months and seasons are/will be parsed in the current locale only (or just in English?) -- this is of course not a good thing.

I don't know what the performance issues are of attempting to parse in _all_ locales, but it would be awfully nice if I didn't have to be careful to type English dates, even when the remaining data is all in, say, Russian.

I suppose there's also a small risk of ambiguity, if a month name in one language is a different month in a different language, but ... let's hope that's not a significant issue.

erazlogo · October 10, 2010

Presumably foreign month names can be parsed by using localized CSL terms lists--not sure about seasons. I'm still in favor of free-form date entry field with ability to parse as much as possible and rendering rare cases as entered, as discussed on this ticket. That said, I haven't been following the recent date discussion on zotero-dev--I will go back and read those messages.