Merge inconsistently named authors
I haven't seen this discussed directly here. I saw some people in the forums having trouble with name disambiguation, and I also saw that this is addressed in the documentation: https://www.zotero.org/support/kb/given_name_disambiguation
I was wondering if a tool for merging inconsistently named authors is out of the question, much like we already merge duplicate items.
1. All items in which the (possibly) same author appears are displayed in a list;
2. user selects the items which belong to the same author;
3. user selects which version of the author's name should be the canonical one;
4. Zotero changes all names in those items to the one the user selected.
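To make the four steps above concrete, here is a minimal Python sketch. It is not Zotero code: the `Item` class, the function names, and the sample library are all hypothetical, and the "user selects" steps are simulated by passing every candidate item straight through.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical stand-in for a library item; not Zotero's data model."""
    title: str
    authors: list = field(default_factory=list)


def items_with_any(items, variants):
    """Step 1: list every item in which one of the name variants appears."""
    return [it for it in items if any(a in variants for a in it.authors)]


def merge_author(selected_items, variants, canonical):
    """Steps 2-4: rewrite the variants to the canonical name in the items
    the user selected, and return how many names were changed."""
    changed = 0
    for it in selected_items:
        for i, name in enumerate(it.authors):
            if name in variants and name != canonical:
                it.authors[i] = canonical
                changed += 1
    return changed


# Made-up sample data, for illustration only.
library = [
    Item("Paper A", ["James G. March", "Herbert A. Simon"]),
    Item("Paper B", ["James March"]),
    Item("Paper C", ["James G March"]),
    Item("Paper D", ["Janice March"]),
]
variants = {"James G. March", "James March", "James G March"}
candidates = items_with_any(library, variants)  # what the user would review
changed = merge_author(candidates, variants, "James G. March")
```

In a real tool the user would prune `candidates` before the merge, which is the step that keeps a different author with a similar name ("Janice March" here) from being touched.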
Now that I think about it, the same logic could be used to merge any inconsistently written field, as long as sensible similarity measures are used for each field.
I understand there is a batch editing functionality planned for Zotero version 5.2, so if the developers believe that it will solve the problem I've described, then ignore this.
I used a bib file I exported from my collection, found authors with the same initials or with names that differ only by accents etc., and tried to keep the least abbreviated ones. I'll try to provide some executable or command-line program in the future, if people are interested.
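The script itself is not posted here, so the following is only a rough Python sketch of the idea as described, not the actual program. It assumes names are already split into (first, last) pairs, groups them by accent-free last name plus first initial, and proposes the longest first name in each group as the one to keep. The function names and sample names are made up for illustration.

```python
import unicodedata
from collections import defaultdict


def strip_accents(text):
    """Drop combining marks, so 'José' and 'Jose' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def grouping_key(first, last):
    """Accent-free, case-insensitive last name plus the first initial."""
    initial = strip_accents(first).strip()[:1].casefold()
    return (strip_accents(last).strip().casefold(), initial)


def suggest_merges(names):
    """Group (first, last) pairs that may be one author and propose the
    least abbreviated spelling of each group as the canonical form."""
    groups = defaultdict(set)
    for first, last in names:
        groups[grouping_key(first, last)].add((first, last))
    suggestions = {}
    for variants in groups.values():
        if len(variants) > 1:
            # Longest first name wins; ties fall back to tuple ordering.
            canonical = max(variants, key=lambda n: (len(n[0]), n))
            suggestions[canonical] = sorted(variants - {canonical})
    return suggestions


names = [
    ("James G.", "March"), ("James", "March"), ("J.", "March"),
    ("José", "García"), ("Jose", "Garcia"), ("Herbert", "Simon"),
]
suggestions = suggest_merges(names)
```

Grouping by initial is deliberately loose: it would also lump two different people who share a last name and initial, so the output is a list of suggestions to review, not something to apply blindly.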
It is getting really frustrating. There must be a way to consolidate the different forms of the same author's name into a single one. Is there any way? It is really annoying that you can’t sort all the articles of ONE AUTHOR because there may be different combinations of first and middle names.
Also, if you create a "saved search" folder, a newly added item may use a combination of that creator's name that you hadn’t previously considered, so the search misses it.
Some sort of similarity function may be concocted in order to ease the merging process, so that authors with similar names are presented close to each other in some sort of list, but ultimately, the user would have to judge each case individually.
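One possible similarity function, sketched in Python with the standard library's `difflib`. The choice of `SequenceMatcher` and the 0.8 threshold are assumptions for illustration; the point is only to put likely matches next to each other for a human to judge.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Ratio in [0, 1]; 1.0 means identical after case folding."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def similar_pairs(names, threshold=0.8):
    """Return (score, name_a, name_b) for pairs at or above the threshold,
    most similar first, for a human to review."""
    pairs = [
        (similarity(a, b), a, b)
        for a, b in combinations(sorted(set(names)), 2)
    ]
    return sorted((p for p in pairs if p[0] >= threshold), reverse=True)


names = ["James G. March", "James March", "James G March", "Herbert Simon"]
pairs = similar_pairs(names)
for score, a, b in pairs:
    print(f"{score:.2f}  {a}  <->  {b}")
```

Comparing every pair is quadratic in the number of names, so a large library would need some pre-grouping (by last name, say) before scoring.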
1. As always, I want to emphasize that I believe Zotero should be seen as knowledge management software. Otherwise, users and developers who think only about managing references end up focusing on "items" (e.g. articles, books, webpages, etc.) instead of knowledge. By knowledge, I mean a bigger picture that connects dots you could not connect by the power of your mind alone.
2. Thinking about knowledge, it is important to capture the creators, because they provide the identity for the sectors of the mental map you have. This sectoral identity provides a better understanding of the bigger picture, so you can compare and relate different ideas (knowledge maps). This is critical for knowledge creation (particularly for those of us who write academic articles).
3. Keeping in mind all of that, it is crucial for all of us to remember and remind who says what and who creates and proposes certain ideas. Therefore, it is necessary to view the ideas of particular creators in a nutshell.
This is my point of view that "we should be able to observe the contents, materials, items, and ideas through the perspective of creators."
Another point of view could be "Publications". There could be a section where you focus only on publication titles, so that you can manage their parameters in one place. Have you noticed that people in this forum sometimes ask how to define the Journal Abbreviation for a specific publication? There are also questions about how to assign that Journal Abbreviation to all the items published in the same journal.
These types of issues come exactly from the fact that the focus is on the items rather than on "creators" and "publications".
Hope the developers and users understand my points.
Another example:
Have you ever faced this confusing situation regarding a particular author?
1. James G. March
2. James March
3. James G March
Although this has been resolved in many other applications many years ago, Zotero considers these three examples to be three distinct authors! This issue also creates inconsistencies in our bibliographies! Having said that, there should be a way for me to review the names of the creators to see whether there are any problems of this type.
Another example:
Have you ever thought about how many "different" publications you have in your library? Besides, do you have any publication title that by chance has a typo? For instance, how can you find this anomaly if it exists in your library?
1. American Journal of Sociology
2. American Journal of Sciology
I know that spell checking and correction can be one way, but the best and most systematic way is the ability to have an overview of all of your publication titles so you can spot those bizarre cases.
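As an illustration of what such an overview could flag, here is a small Python sketch that lists each publication title with a more frequent near-identical title next to it. It uses `difflib.get_close_matches` from the standard library; the 0.9 cutoff, the function name, and the sample list are assumptions, not anything Zotero does.

```python
from collections import Counter
from difflib import get_close_matches


def suspect_titles(titles, cutoff=0.9):
    """Map each rarer title to a more frequent near-identical title."""
    counts = Counter(titles)
    suspects = {}
    for title, n in counts.items():
        more_common = [t for t, m in counts.items() if m > n]
        close = get_close_matches(title, more_common, n=1, cutoff=cutoff)
        if close:
            suspects[title] = close[0]
    return suspects


titles = (
    ["American Journal of Sociology"] * 3
    + ["American Journal of Sciology", "Administrative Science Quarterly"]
)
suspects = suspect_titles(titles)
for odd, likely in suspects.items():
    print(f"{odd!r} looks like {likely!r}")
```

The assumption that the rarer spelling is the typo is only a heuristic, so the result is a review list rather than an automatic correction.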
Although this has been resolved in many other applications many years ago...
I'd love to know your examples of software that does this automatically. Especially any software that can handle J. G. March, JG March, J March, Janice Giselle March, John Garfield March, etc.
My own (non-Zotero) online bibliographic database has a logic system to help with the different but similar names / same author problem. It involves weights for identical coauthors, similarities between topics, and publication year proximity to arrive at a calculated probability of a match. I still need a human to make the final decision and do the edits. My system still has more than 14,000 names that are very similar to one, two, or three other names out of 860,000 total names in the database.
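A toy Python version of that kind of weighted logic might look like the following. It is not the system described above: the weights, the 20-year window, the record layout, and the placeholder data are all invented for illustration, and the result is an uncalibrated score rather than a true probability.

```python
def jaccard(a, b):
    """Overlap of two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def year_proximity(y1, y2, window=20):
    """1.0 for the same year, falling linearly to 0.0 at `window` years."""
    return max(0.0, 1.0 - abs(y1 - y2) / window)


# Illustrative weights only; they sum to 1 so the score stays in [0, 1].
WEIGHTS = {"coauthors": 0.5, "topics": 0.3, "year": 0.2}


def match_score(rec_a, rec_b, weights=WEIGHTS):
    """Weighted evidence that two name records are the same person."""
    return (
        weights["coauthors"] * jaccard(rec_a["coauthors"], rec_b["coauthors"])
        + weights["topics"] * jaccard(rec_a["topics"], rec_b["topics"])
        + weights["year"] * year_proximity(rec_a["year"], rec_b["year"])
    )


# Placeholder records, not real bibliographic data.
rec_a = {"coauthors": {"Coauthor A", "Coauthor B"},
         "topics": {"topic x", "topic y"}, "year": 2000}
rec_b = {"coauthors": {"Coauthor A"}, "topics": {"topic x"}, "year": 2005}
rec_c = {"coauthors": {"Coauthor Z"}, "topics": {"topic z"}, "year": 1950}

score_ab = match_score(rec_a, rec_b)
score_ac = match_score(rec_a, rec_c)
```

As in the post, a score like this can only rank candidates; the final decision and the edits still need a human.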
edit: Add name variants, where for example a name can be published with a single German decorated vowel or an English 2-vowel spelling, (oe, œ, ö) and the decision becomes even more subtle.
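For the vowel-variant case, one option is to fold the spellings into a shared comparison key before matching. The Python sketch below assumes German-style transliteration (ö and œ become oe); the mapping table and function name are illustrative, and the extra ä, ü and ß entries go beyond what the post mentions.

```python
import unicodedata

# German-style transliterations; an assumption, since other languages
# romanize the same letters differently (a Swedish ö is often plain o).
TRANSLITERATIONS = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss", "œ": "oe"}


def variant_key(name):
    """Fold a name so that ö, œ and oe spellings share one comparison key."""
    folded = unicodedata.normalize("NFC", name).casefold()
    for char, replacement in TRANSLITERATIONS.items():
        folded = folded.replace(char, replacement)
    # Strip any remaining accents (é -> e) after the explicit mappings.
    decomposed = unicodedata.normalize("NFKD", folded)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


keys = {n: variant_key(n) for n in ["Schröder", "Schrœder", "Schroeder", "Schroder"]}
```

Here the first three spellings share a key while "Schroder" does not, which is exactly the subtle case where a person still has to decide.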
https://github.com/retorquere/zotero-creator-metaphone/releases
Sorry, but there are some misunderstandings here. I don't remember saying anything about automatic correction; please correct me if I'm wrong. The only thing I want to convey is that we have had this problem of not being able to "globally" change a detail across many items at the same time. Again, think about the example above, and consider how many items are related to each variant:
1. James G. March ---> 38 items
2. James March ---> 17 items
3. James G March ---> 15 items
Now, think about consolidating all the variations of the author's name. In this case, I want to keep "James G. March" as the principal name and correct the others:
James March ---> James G. March
James G March ---> James G. March
How can I do that? The only way is to go through 17+15 items and make the change in each individual item. Isn't it ridiculous? Couldn't it be done in a more systematic way? Seriously, think about all the other issues related to this.
Another example:
American Journal of Sociology ---> 340 items
American journal of sociology ---> 157 items
My goal is the following:
American journal of sociology ---> American Journal of Sociology
Is it possible to do that manually?
All I wanted to say is that Zotero must have a feature that lets users globally change an attribute that has been assigned to several items.
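To show what such a global change amounts to, here is a minimal Python sketch over plain dictionaries, not Zotero's own objects. It treats values that differ only by letter case as one group and rewrites them to the most frequent spelling. Picking the majority spelling is an assumption; a real feature would let the user choose. The function name is hypothetical, and the counts reuse the 340 and 157 from the example above.

```python
from collections import Counter, defaultdict


def consolidate_field(items, field):
    """Rewrite `field` in every item to the most frequent spelling among
    values that differ only by letter case. Returns the rename map."""
    counts = Counter(it[field] for it in items if it.get(field))
    groups = defaultdict(list)
    for value in counts:
        groups[value.casefold()].append(value)
    renames = {}
    for variants in groups.values():
        canonical = max(variants, key=lambda v: counts[v])
        for v in variants:
            if v != canonical:
                renames[v] = canonical
    for it in items:
        if it.get(field) in renames:
            it[field] = renames[it[field]]
    return renames


items = (
    [{"publicationTitle": "American Journal of Sociology"} for _ in range(340)]
    + [{"publicationTitle": "American journal of sociology"} for _ in range(157)]
)
renames = consolidate_field(items, "publicationTitle")
```

Returning the rename map makes it easy to show the user what changed before anything is saved.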
To the best of my knowledge, in Papers 3 you have a separate view to see the items in your library by "Authors" and "Publication Title". In that case, if you make any change to a publication title, the change is applied to all the other "nested items".
I appreciate it, since I didn't know about the feature. Besides, I'm not an advanced user, but I hope to be able to do that.
By the way, I'll back up before every change. But for normal users, it would be easier to do that in a simpler manner.
Many thanks again.
@aliakhavan89 has nicely captured the spirit of what Zotero should be endeavouring to do. I have adopted this workflow with my use of the item object holding an 'article' to also be used to 'capture the knowledge' by tossing in reflections, emails, notes and deliberations about the key ideas etc.
Not only is there the irritation of alternative names, but some authors prefer to use a pseudonym, such as Hergé (Georges Prosper Remi), which has muddled my referencing a tad. Recognising that this is still one person anchors all these dimensions. I am similarly curious how Zotero 5.2 will implement this. Perhaps it may have an author view similar to tags, coupled with a validation prompt with suggestions from an external verification source.
It does what @douglasrizzom's script does, plus it will also try to fix entries with wonky creators that have just one "name" field rather than a "firstName" and a "lastName" (I have discovered the RIS files exported from articles in Science look like this, and it can mess up your formatted citations, depending on the style you're using). It does all this using the pyzotero API, so you'll need your library ID (group or user) and an API key to make this work. There are instructions on how to assemble the necessary credentials here: https://pypi.org/project/pyzotero/.
Just run this on your command line, and then right-click on your library in your client and choose 'sync'. Your author names should be fixed! (Or at least as fixed as this method can get them.) It worked great for me, but YMMV.
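The script itself is not reproduced in this thread, so what follows is only a rough Python sketch of the single-"name"-field fix, not the actual code. `split_single_field_creator` and `fix_library` are hypothetical names and the splitting rules are assumptions. The pyzotero calls used (`zotero.Zotero`, `items`, `everything`, `update_item`) are the library's documented ones, but the library part is untested here and defaults to a dry run.

```python
def split_single_field_creator(creator):
    """Turn {'name': 'March, James G.'} or {'name': 'James G. March'} into
    firstName/lastName form. Two-field creators are returned unchanged."""
    name = creator.get("name", "").strip()
    if not name:
        return creator
    if "," in name:
        last, first = (part.strip() for part in name.split(",", 1))
    elif " " in name:
        first, last = (part.strip() for part in name.rsplit(" ", 1))
    else:
        return creator  # one word; possibly an institution, leave alone
    return {"creatorType": creator.get("creatorType", "author"),
            "firstName": first, "lastName": last}


def fix_library(library_id, library_type, api_key, dry_run=True):
    """Apply the split to every item in a library through pyzotero."""
    from pyzotero import zotero  # third-party: pip install pyzotero

    zot = zotero.Zotero(library_id, library_type, api_key)
    for item in zot.everything(zot.items()):
        creators = item["data"].get("creators", [])
        fixed = [split_single_field_creator(c) for c in creators]
        if fixed != creators:
            print(item["data"].get("title", ""), creators, "->", fixed)
            if not dry_run:
                item["data"]["creators"] = fixed
                zot.update_item(item)


example = split_single_field_creator(
    {"creatorType": "author", "name": "March, James G."}
)
```

One caveat: institutional authors with several words legitimately use the single "name" field and would be split wrongly by the space rule, so it is worth reading the dry-run output before calling `fix_library(..., dry_run=False)`.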
Citavi does that (for both authors and journals, opening a new window with a list with authors or journals, you can sort it in alphabetical order and then manually merge entries).
In this case, I somehow would really prefer a manual solution over an automated solution that potentially destroys my entire database but call me paranoid. ;-)
Sooo... Zotero 7 up for it?
[removed — code is here — D.S.]
I have no understanding of Java, but was nonetheless able to open the JavaScript interface (Tools -> Developer -> Run JavaScript), paste her code into the dialogue, and substitute the names I wanted to change.
20 years ago, Reference Manager 10 allowed users to batch replace terms through a dialogue box. There was a list of authors (or journal names, etc.) and one could either select individual names and edit them manually (edits would be applied to all instances of the term) or select several and merge. The absence of a tool for doing this in Zotero has been a real irritant. So it is great to have a workable patch at last.