Parsing problem on Italian names
And as a general comment, while I understand Dan's unwillingness to encourage data entry hacks, as a Zotero user I cannot help but feel neglected, since for many years now Zotero has done little to make it easier to achieve correct name and title formatting (see also https://forums.zotero.org/discussion/51980/bold-italic-etc/?Focus=233795#Comment_233795), while it seems that both should be core competencies of a referencing tool. I probably bump into these issues more often than most, since I deal with a lot of Dutch authors and paper titles filled with italics and the like, but these limitations must be affecting many users, and some flexibility on allowing stopgap solutions would be greatly appreciated. Maybe a switch to HTML fields is just around the corner, but the multi-year wait for a database field revision doesn't make me hopeful we will see a "perfect" solution any day soon.
First, the dynamic menu does not introduce any new data entry conventions; it only helps users to apply the conventions that we already have with considerably less effort. (The double-quote escape hack has been with us for quite a while, and as nickbart notes, it is currently the only means we have of forcing a sort on leading lowercase elements of a surname. For better or for worse, the workaround needs to be retained until something more elegant emerges, since sorting errors in a bibliography can be a factor in manuscript rejection.)
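For readers unfamiliar with the double-quote escape, here is a rough Python sketch of the idea. It is not the citeproc-js parser: the function names `split_family` and `sort_key` and the "leading lowercase words are particles" heuristic are assumptions made for illustration only. It assumes a style that moves the particle behind the family name when sorting, and shows how wrapping the family field in double quotes keeps the lowercase element in front.

```python
def split_family(raw):
    """Split a family-name field into (particle, family).

    Hypothetical heuristic: leading all-lowercase words are treated as a
    particle, unless the whole field is wrapped in double quotes, in which
    case it is kept as one literal family name.
    """
    if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
        return "", raw[1:-1]
    words = raw.split()
    i = 0
    # Always leave at least one word for the family name itself.
    while i < len(words) - 1 and words[i].islower():
        i += 1
    return " ".join(words[:i]), " ".join(words[i:])


def sort_key(raw):
    """Sort key with the particle moved behind the family name."""
    particle, family = split_family(raw)
    return (family + " " + particle).strip().casefold()


names = ["Zeller", "van der Vlist", '"van der Waals"', "Adams"]
ordered = sorted(names, key=sort_key)
print(ordered)
```

In this sketch the unquoted "van der Vlist" files under V, while the quoted "van der Waals" files under "van", which is the sorting effect the workaround exists to force.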
Second, the menu provides a contextual illustration of the particle categories and their significance, which should be helpful on the forums. Frustration over particle handling has been partly my fault, for past bugs and inconsistencies in the CSL processor; but even in the best support scenario, users often need an assist to grasp the structure behind the freehand conventions (i.e. what are particles, what are their categories, how are they entered, what are their effects). Quite a bit of effort has gone into our exploration of the "particle space," as shown by the threads linked below. The dynamic menu is a partial distillation of what we have learned, and I think it would make support on the forums significantly less burdensome.
Anyway, that's the why of it.
- “first-discussed” thread (4 Jul 2008→6 Jul 2008) 5 posts
- “di Estos” thread (20 May 2010→16 Jul 2015) 31 posts
- “multipart last name” thread (22 Feb 2011→22 Feb 2011) 13 posts
- “van der Aalst” thread (23 Mar 2011→23 Mar 2011) 3 posts
- “Author Names” thread (29 May 2011→30 May 2011) 6 posts
- “De” thread (12 Dec 2011→14 Nov 2012) 27 posts
- “Rafael La Porta” thread (14 Jun 2012→26 Aug 2012) 5 posts
- “Eric von Hippel” thread (13 Aug 2012→14 Aug 2012) 16 posts
- “Einstein” thread (13 Nov 2012→16 Nov 2012) 6 posts
- “von, van, de” thread (14 Feb 2013→3 Aug 2015) 62 posts
- “Del Maestro” thread (19 Feb 2013→29 Mar 2013) 10 posts
- “BBC” thread (8 Jul 2013→14 Jul 2013) 12 posts
- “R.S. De Groot” thread (16 Jul 2013→16 Jul 2013) 2 posts
- “abu-” thread (22 Jul 2013→19 Aug 2015) 85 posts
- “de Villepin” thread (6 Nov 2013→13 Nov 2013) 3 posts
- “Juan de la Chica Caicedo” thread (16 Mar 2014→15 Aug 2014) 9 posts
- “Ab Halim” thread (18 Mar 2014→20 Mar 2014) 7 posts
- “van den Heuvel” thread (22 Apr 2014→22 Apr 2014) 13 posts
- “N.C.F. Van Sas” thread (5 Sep 2014→11 Sep 2014) 13 posts
- “Van Welie” thread (17 Nov 2014→18 Nov 2014) 4 posts
- “Claudio De Felice” thread (3 Dec 2014→11 Aug 2015) 12 posts
- “Adolph von Harnack” thread (15 Dec 2014→15 Dec 2014) 3 posts
- “French dropping particle” thread (28 Jan 2015→30 Jan 2015) 14 posts
- “María Isabel del Val Valdivieso” thread (25 Feb 2015→25 Feb 2015) 5 posts
- “Names reform” thread (28 Feb 2015→11 May 2015) 41 posts
- The “Dutch” thread (31 Aug 2015→21 Sep 2015) 32 posts
- The “Italian” thread (6 Sep 2015→22 Sep 2015) 95+ posts
(EDIT: listing sorted and styled for clarity)
※ original proposal that non-dropping particles be placed in a separate field
※ first recommendation of quoted input, implemented 3 May 2010
To quote mark here: Here we have a rather difficult-to-implement feature, very long and *time-consuming* discussions, and finally an incredible work from Frank…
Please don't do a remake of "counters/hints to notes, tags, and related tabs".
At least… yes!
From a GUI point of view, double quotes can be replaced by non-breaking spaces in the future, but that has the drawback of being invisible when checking a large amount of data. An indicator/tooltip like the one which exists for the date field could be helpful.
Our concerns are broader than just the bibliographic issue here, and include how data is displayed, processed, and analyzed in many different contexts. If the plain-text version is implemented in citeproc-js, it will work after Zotero is updated to that version, but I don't want to add a GUI feature that adds hacks to visible data or encourage the use of such a format, because it will make various other functionality across the Zotero ecosystem not work properly. One example: the middle pane of Zotero itself. Frank's screencast shows the quotes appearing there, which we certainly wouldn't want. So this informal markup format would need to be parsed for display, sort, search, and processing, there and everywhere else that Zotero data is handled.
But we already have a markup format — HTML — that we deal with in notes, and people use it in titles because of the citeproc-js support, so simply broadening the expectation that Zotero fields are HTML is far more acceptable and seems like the obvious solution here. That means the wait will be a little longer, but 5.0 is nearing completion. And if we can do this all via HTML and don't need additional fields, this can happen in 5.0 proper, without waiting for data model changes.
(Edit: It's also worth mentioning that the basic processor input format for personal names hasn't changed significantly since the release of CSL 1.0 in May of 2010. The aim here has been to get Zotero and the processor working more smoothly together, so that CSL can do its thing.)
Especially if you think HTML markup is the way to go, why are UI features for adding that exact same markup not acceptable as an interim solution until 5.0 is released? Can't you empower your users, and give them the choice of whether any improved functionality is worth the ungainly sight of some unparsed markup tags?
In my field of study, rich text markup in titles is everywhere, and I've always been pushing for better support, first (and successfully) in CSL, and later in Zotero. I've created and shared workarounds from before citeproc-js made it into Zotero (https://forums.zotero.org/discussion/3875/rich-text-in-titles/#Item_11), and have been very thankful for citeproc-js' ability to parse title markup in more recent years. It's disappointing to me that after all these years, Zotero still doesn't have shortcuts to quickly add this markup.
With regard to Frank's particle menu, I feel the same way. Can't you accept his PR if we replace the double-quote markup with an HTML span that can later be parsed by 5.0?
Dan, I love Zotero and I greatly admire your work, and we all know Zotero has a small team and limited resources, but it's saddening and disappointing to me to see that too often Frank and others put a lot of work into a new feature, only to see the resultant PR gather dust (the "counter" PR referenced by Gracile above is a prime example). I really think Zotero would be a better project if its user community were better equipped to influence its development.
To the issue at hand: when people add manual markup, they understand that it's a hack. That's not the case with a GUI menu. Adding a button that caused markup to show up in the creator field or in the middle pane would, in my opinion, be embarrassing — an obvious unfinished hack — and I don't see any reason to do it when a proper solution is right around the corner.
But you still seem to be responding as if I'm rejecting this request. At this point I'm really just asking for exactly what you say — a version of Frank's PR that generates HTML (plus some minor UI changes, which we can discuss later) — but against master. Since 5.0 will be in beta, releasing it with visible markup would be somewhat more acceptable, and we can add in HTML parsing, editing, and rendering separately. I'm happy to help move that forward. It just can't happen on 4.0.
(Re: counters and other pull requests: we accept pull requests all the time, of course. But some do get neglected, and I apologize for that. Bumping doesn't always work, as the counters PR demonstrates, but I'm happy for people to do that when they think something has slipped through the cracks. As you know, we're also currently hiring additional developers and our first product designer, so we'll have more people to help tend the PR queue and also build polished UIs to accompany them. But that's not the issue here.)
PR = pull request (the preferred method of submitting contributions to an open development project)
It took me quite some time to figure this out. I hope this helps others, particularly non-participants, understand that this thread isn't as hostile as it might seem at first glance when viewed out of context.
One thing I'm not clear on: with appropriate HTML markup, a GUI menu, and flexibility of presentation, would the display of the particles in the given vs. family name field still be ideal? Or is that just a workaround given the plain text?
<span class="family-name">de Gaulle</span>
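As a rough illustration of what "parsing for display" could involve if name fields carried markup like the span above, here is a minimal Python sketch using the standard library's `html.parser`. The helper names `NameFieldParser` and `parse_name_field` are made up for this example, and the `non-dropping-particle` class in the usage line is a hypothetical choice, not something Zotero or citeproc-js is known to accept.

```python
from html.parser import HTMLParser


class NameFieldParser(HTMLParser):
    """Collect text runs from a name field, tagged with the enclosing span class."""

    def __init__(self):
        super().__init__()
        self.parts = []    # list of (css_class or None, text)
        self._stack = []   # classes of the currently open <span> elements

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self._stack.append(dict(attrs).get("class"))

    def handle_endtag(self, tag):
        if tag == "span" and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        current = self._stack[-1] if self._stack else None
        self.parts.append((current, data))


def parse_name_field(markup):
    """Return (display_text, {css_class: text}) for a marked-up name field."""
    parser = NameFieldParser()
    parser.feed(markup)
    parser.close()  # flush any trailing text still buffered by the parser
    display = "".join(text for _, text in parser.parts)
    labeled = {cls: text.strip() for cls, text in parser.parts if cls}
    return display, labeled


# The span from the post above, plus a hypothetical particle-only span.
print(parse_name_field('<span class="family-name">de Gaulle</span>'))
print(parse_name_field('<span class="non-dropping-particle">van der</span> Vlist'))
```

The display text is what a list pane or search index would want, while the labeled parts are what a citation processor would need; the sketch only shows that both can be derived from one stored string.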
For rich text titles, would you be willing to accept a PR with just shortcuts (against master), as I offered at https://forums.zotero.org/discussion/51980/bold-italic-etc/?Focus=233795#Comment_233795 ?
(Dan, also, in general, is it still worthwhile to work on UI features implemented in XUL? If you have any thoughts about how Zotero will be dealing with XUL's imminent demise, I'd appreciate hearing them.)
edit: that said, I always find editing two-field names in Zotero annoying because with a narrow right-hand column, editing one of the fields hides the other.
Anyway, for the current issue: I was assuming that there'd be a span around the particle with a class indicating its type, at least for some of the modes, but maybe that doesn't make sense. Someone more familiar with the issues here will have to take the lead on that.
I certainly agree with the first part. Not sure I follow the second. What I'm asking is whether it would still be necessary for the menu actions to move the particle between the given and family name fields, as it does in Frank's PR, or if we could get rid of that and just add appropriate HTML classes around the particle.
Let's discuss that on the dev list. Short answer is that we obviously want to minimize XUL work, we can make minor changes if need be, and we should start laying the groundwork for HTML everywhere.
The benefit of the current division of particles across the given and family name fields is that at least it gives a cue to the user that not all particles are the same (although discoverability is currently an issue, since it's not obvious where particles should be entered, and how this affects name processing).