[Suggestion] Problems with Disambiguate by First Name

I love Zotero and have used it for more than a decade.

However, I have a suggestion regarding disambiguation by first name. Unfortunately, metadata sources are inconsistent in how they provide author names. For example, when I download articles, my own name might be filled in as "Hudson, N. W.", "Hudson, Nathan W.", or other variants.

This leads to a problem with the default APA Zotero style sheet, which has "disambiguate by first name" enabled. Zotero treats the variants as different people, so my papers end up with citations such as (Hudson, 2021), (N. W. Hudson, 2021), and (Nathan W. Hudson, 2021). Naturally, I don't have time to go through my Zotero database of thousands of articles to make sure every author's name is listed consistently.

Thus, every time a new APA style sheet is released, I have to modify it to turn off disambiguation by first name.

However, this is not particularly ideal if there are truly two different authors with the same last name.

So I'm wondering whether it might be smarter for Zotero to disambiguate by first name ONLY IF the initials don't match. That is, Zotero would assume that "N. W. Hudson" and "Nathan W. Hudson" are the same person. This seems like it would produce the correct behavior in the VAST majority of cases (e.g., where metadata is entered incorrectly or differently across articles). It would cause problems in only a small number of unusual cases (e.g., where authors share the exact same initials but have different names).
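To make the suggested rule concrete, here is a minimal Python sketch. It is not how Zotero or its citation processor actually works; the function names `initials` and `cite_names` are hypothetical, and it assumes given names are written with spaces, periods, or hyphens between the parts.

```python
import re
from collections import defaultdict


def initials(given: str) -> str:
    """Reduce a given-name string to uppercase initials, e.g. 'Nathan W.' -> 'NW'."""
    parts = re.split(r"[\s.\-]+", given.strip())
    return "".join(p[0].upper() for p in parts if p)


def cite_names(authors):
    """Return in-text citation names for a list of (family, given) pairs.

    Hypothetical rule: add initials only when two authors with the same
    family name have DIFFERENT initials. Never expand to full first names.
    """
    # Collect the distinct sets of initials seen for each family name.
    seen = defaultdict(set)
    for family, given in authors:
        seen[family].add(initials(given))

    names = []
    for family, given in authors:
        if len(seen[family]) > 1:
            # Initials differ, so show them: 'N. W. Hudson'.
            dotted = " ".join(f"{c}." for c in initials(given))
            names.append(f"{dotted} {family}")
        else:
            # Same initials (or a single author): family name only.
            names.append(family)
    return names
```

Under this rule, `cite_names([("Hudson", "N. W."), ("Hudson", "Nathan W.")])` gives both items the plain name "Hudson", while two Hudsons with different initials would each get their initials added. The cost is the edge case already mentioned: different people with identical initials are not told apart in the citation.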
  • We're about to update the APA style to do just that (in order to accommodate new APA rules).
    We won't do the same for Chicago style, though, as there has to be a way to get this right and Chicago wants disambiguation by full first name if initials are the same.
  • edited June 11, 2021
    Please tell me I have misunderstood your comment, @adamsmith

    Please don't have Zotero and APA-7 style "assume" that N. W. Hudson and Nathan W. Hudson are the same person. That isn't how I interpret the rule. If necessary, I can provide hundreds of real examples of authors with identical initials who are different people.

    My own name David Williams Lawrence -- there are at least 3 other DW Lawrence authors in my database: David Wyndham Lawrence, David Wilson Lawrence, and two other David W Lawrence authors who don't provide their full names but have papers of _very_ different topics from one another and different from any of the full-name David W Lawrences. Then there is Duncan W. Lawrence ...

    Zotero has long recommended that, when there are different versions (different completeness) of author names, names be edited so that all have the same complete version.
  • @DWL-SDCA APA 7 changed its disambiguation rules. It’s not that it “assumes” Nathan W. and N. W. are the same person. It just doesn’t want disambiguation beyond initials, so if two authors have the same initials, it doesn’t disambiguate. (I think APA has decided the only purpose of disambiguation is to guide readers to the correct entry in the reference list.)

    @nate.hudson As a regular practice, you should ensure that an author in your library has their name spelled consistently across all items. I recommend always entering the names fully.
  • @bwiernik That seems correct to me. In actual research and publishing settings in psychology, in all versions of APA style (at least for the past 12+ years), we have never disambiguated, for example, a paper written by John Smith (2001) and Jane Smith (2002). They would simply be (Smith, 2001) and (Smith, 2002). Likewise, it makes little sense to disambiguate (Jane Smith, 2001) from (John Smith, 2001); we would instead use the (Smith, 2001a) and (Smith, 2001b) form. It seems that APA 7 is just updating to what authors and journals already do.

    I do understand your point about database maintenance. Unfortunately, I have literally thousands of articles in my database. I correct obvious errors (e.g., titles, page numbers). But remember that the average user is going to do what's easiest. Especially given that even full reference lists use only initials in APA style (e.g., I'm always listed as Hudson, N. W.), there's little incentive to go through my database, find all instances of my own name (much less other authors' names), and correct them. And this doesn't touch on the fact that if Smith, J. K. is imported into Zotero, there's essentially zero chance that I or other users are going to look up the author's full name to figure out what the initials stand for. It's the classic battle between what UX designers wish users would do and how users actually use the product.
  • I am a psychology researcher—I’m well aware of publishing practices.

    My recommendation isn’t to try to clean an entire database at once. Rather, as you import items, take a second to check the data after import (e.g., complete author names, put the title into sentence case, etc.). Zotero does a great job at this a lot of the time, but it is good to take the few seconds to correct any issues when they happen.

    Then, only update existing items when you run into a surprising citation as you are writing—correct the two items to make the author names consistent.
  • And this isn't that we really want people to do what we say and turn into mini-catalogers, but that there are a number of things that just don't work quite right in unexpected ways if your metadata isn't right: bibliographies of styles that use full first names might not sort correctly, for example. If you're fairly certain that you're only ever going to publish in APA style, this particular issue is unlikely to matter, but that's a pretty big if, even in psychology.
  • @adamsmith you're 100% correct. In at least my fields of psychology (social/personality), the changes you're making are 100% correct. None of our journals ever disambiguate by initials or first name, because the purpose of disambiguation is to point people to the correct references. Indeed, APA style doesn't include first names at all in reference lists (only initials are included).

    I'm grateful that you're aware of changes in how APA7 handles citations and that you're correcting it to a form that will work for psychological researchers without further modifications.

    As I've said before, most Zotero users likely don't have the technological know-how to edit Style Sheets. And even if they do, they're likely not motivated to do so. Zotero needs to "just work" for most people to adopt it. I'm so glad that you're actively working on changing the APA7 stylesheet to reflect how actual researchers and journals expect citations to be used! Thank you!
  • edited June 12, 2021
    Sorry this is off-topic but here goes:

    @nate.hudson @bwiernik

    Particularly, Nate: You are lucky that essentially all of your articles include your first name and middle initial in PubMed, PsycINFO, and my database, SafetyLit. There are other NW Hudson-named authors who have published in the behavioral or psychology field in the years you published, with whom you might not want to be confused.

    Having a few thousand articles and several thousand author names is probably too much for one person to fix retroactively, but you have graduate students and possibly staff who could work on this if your budget will allow. One of my first well-paying full-time indoor jobs was doing exactly that, editing Reference Manager records in a professor's database in 1983-1985 (under CP/M --before there was DOS). Tracking down authors' full names was detective work that satisfied my OCD tendencies. My boss was clearly more OCD than I because she valued the idea of full author names. This required seeking out printed university catalogs and foundation annual reports because the Internet and the World Wide Web were only a dream at that time.

    Carrying on my OCD and speaking of thousands of articles, SafetyLit currently has 660 thousand journal articles and several million authors. We have volunteers who use a tool to disambiguate author names by comparing publication dates, co-authors, subject matter, and author biographies and their institutional affiliation.

    By the way, there are 15 Nathan W. Hudson articles and 9 Brenton M. Wiernik articles in the database. Brenton, your "green period" articles didn't fit the inclusion criteria.
  • @adamsmith wrote:
    If you're fairly certain that you're only ever going to publish in APA style, this particular issue is unlikely to matter, but that's a pretty big if, even in psychology.
    It also presumes that APA will not change their style standards to demand full (first) names.