ORCID
An ORCID identifier [1] is like an ISBN or DOI for a researcher or other author. Each identifier is a URI whose right-hand part is a unique 16-digit string, separated into four groups of four by hyphens; for example, mine is http://orcid.org/0000-0001-5882-6823. The ORCID website [2] has technical details, such as the range and check digit system.
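To illustrate the check digit system mentioned above: ORCID documents the final character as an ISO 7064 MOD 11-2 checksum over the first 15 digits, with "X" standing for a value of 10. The following is a minimal Python sketch under that assumption; the function names are made up for illustration and are not part of any ORCID or Zotero API.

```python
DIGITS = "0123456789"


def orcid_check_digit(base_digits: str) -> str:
    """Return the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(identifier: str) -> bool:
    """Check the shape and checksum of an ORCID given as a bare ID or as a URI."""
    # Keep only the part after the last slash, so full URIs are accepted too.
    bare = identifier.rstrip("/").rsplit("/", 1)[-1]
    groups = bare.split("-")
    if len(groups) != 4 or any(len(group) != 4 for group in groups):
        return False
    compact = "".join(groups)
    if not all(ch in DIGITS for ch in compact[:15]):
        return False
    return orcid_check_digit(compact[:15]) == compact[15].upper()


print(is_valid_orcid("http://orcid.org/0000-0001-5882-6823"))
```

A check like this could let a scraper or a manual-entry form reject mistyped identifiers before they are stored.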
ORCID is an open project, run by a not-for-profit foundation.
Zotero should include an ORCID field for each author in a citation. Once that's available, scrapers can capture ORCIDs when pulling in citation metadata from websites or other sources, and include them in their various output formats. A user could also manually (or in a semi-automated process) add an ORCID to a citation, where they are confident of the author's identity. And, of course, a scraper should be provided that can fetch citation metadata from an author's ORCID profile page (I've asked the ORCID team to add semantic markup such as COinS, a microformat, or microdata to their pages).
Within the Zotero library, ORCID identifiers should link to the subject's profile on the ORCID website (such as mine, above).
Anyone may register, free, for an ORCID identifier, at the ORCID website [2]. It takes less than a minute to do so, and I would encourage you all to get one. Indeed, the Zotero forums should include an ORCID parameter in user profiles.
There was some discussion of ORCID here previously [3] but it went off at a tangent; I'll comment there shortly, linking to this discussion.
[1] https://en.wikipedia.org/wiki/ORCID
[2] http://orcid.org/
[3] https://forums.zotero.org/discussion/3913/single-author-with-two-different-name-spellings-short-vs-longhand
ORCID does have an API, but the API is not sufficient for Zotero's use case. What Zotero needs to be able to do is query ORCID with a unique identifier for a journal article (DOI) and be able to get an ordered list of authors with their associated ORCIDs. I don't see a way to do this currently.
Adding an ORCID (or similar identifier) field would, at best, come with 4.2, but we need to see if this will be at all usable. This includes deciding how disambiguation is supposed to be handled, as pointed out in the thread that "went off on a tangent".
Would love to see this happen though.
The ORCID system/ API is still being built.
Publishers are starting to include ORCIDs, both on-page [1],[2],[3] and in metadata [4]. We're also including them in Wikipedia articles [5] and Wikidata entries [6], and working on including them in citations on Wikipedia.
[1] http://www.hindawi.com/80647648/
[2] http://www.nature.com/ni/journal/v14/n7/full/ni.2633.html (click on name of "Chen Dong" to see pop-up profile)
[3] http://www.plantcell.org/content/early/2014/03/17/tpc.113.121830 (on PDF)
[4] http://www.ncbi.nlm.nih.gov/pubmed/24592396?report=xml&format=text
[5] https://en.wikipedia.org/wiki/Category:Wikipedia_articles_with_ORCID_identifiers
[6] https://www.wikidata.org/wiki/Special:WhatLinksHere/Property:P496
Once you have an author's ORCID, then the ORCID API will tell you who they are (their website, other identifiers, etc.) and what else they have published, as well as other names/ name variants they have used.
One note on the API (this is the example from the documentation)
http://pub.orcid.org/v1.1/search/orcid-bio/?q=digital-object-ids:"10.1087/20120404" returns authors for http://www.ingentaconnect.com/content/alpsp/lp/2012/00000025/00000004/art00004?token=004b14c36a64405847447b496e2f2a3125763b6b634c7633757e6f3f2f2730673f582f6bf2f but it returns "Paglione, Laura" twice with two different ORCIDs. I wonder if this is a common issue and how we're supposed to deal with it. I'll contact ORCID, but just wanted to note it here.
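A rough Python sketch of the lookup described above, and of a check for the duplicate-name problem. The endpoint and the `digital-object-ids` field are taken from the example URL in this post; the helper names are hypothetical, and the result list is assumed to have already been parsed into (name, ORCID) pairs, since the response format is not shown here.

```python
from collections import defaultdict
from urllib.parse import quote

# Search endpoint as it appears in the documentation example quoted above.
SEARCH_BASE = "http://pub.orcid.org/v1.1/search/orcid-bio/"


def doi_search_url(doi: str) -> str:
    """Build the search URL for all ORCID records that list the given DOI."""
    query = f'digital-object-ids:"{doi}"'
    return f"{SEARCH_BASE}?q={quote(query)}"


def names_with_multiple_ids(results):
    """Return names that occur with more than one ORCID.

    `results` is an iterable of (name, orcid) pairs (an assumed, already
    parsed form of the search response).
    """
    ids_by_name = defaultdict(set)
    for name, orcid in results:
        ids_by_name[name].add(orcid)
    return {name: sorted(ids) for name, ids in ids_by_name.items() if len(ids) > 1}


print(doi_search_url("10.1087/20120404"))
```

Any name returned by `names_with_multiple_ids` would need manual review, because the search result alone cannot say which of the identifiers is the right one.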
ORCID currently has no standard for entry of author name variants. It has no standard for entering publication metadata. Those of us who have been active from the beginning (my SafetyLit bibliographic database was a "launch partner") have been offering to volunteer to help with name and article metadata structures; however my offer to help as an expert in cataloging and bibliographic database programming was not accepted. I know of several others who offered to help even to spend their own money to travel to meetings. These offers were rejected or not even acknowledged.
When an author tries to use the semi-automatic systems to import their works into the ORCID system, the result is entries with author names and publication names in multiple forms; multiply duplicated entries, and different formats for abbreviations, page ranges, publication dates, etc.
Although the publication records can be deleted, there isn't a system that allows the records to be edited. One can hand-enter publications but the data entry form is nearly impossible to use.
The search system is crazy bad. Entering my own name as firstname middle name last name shows many possibilities who have a first name the same as my last name and a last name the same as my first name. A similar problem occurs when I enter my name in the form lastname, firstname.
The system allows users to import works from CrossRef but the way the ORCID system queries the CR system finds hundreds of partial matches (including books with titles that include any of the author's names) but omits some publications that are in the CR system.
This has become a rant. I apologize. One day, ORCID or some similar project will make it easier to do our bibliographic work.
I have already explained how, today, Zotero could capture, store and emit ORCID identifiers along with other bibliographic metadata. Adding the parameter now will facilitate various use cases, and tools to deliver them, and will provide future-proofing for the later addition of more advanced functionality. I sat as a volunteer on ORCID's 'works metadata working group'; I found it collegial and effective, and I understand that ORCID are busy working to implement its recommendations (which document is cited in my ORCID profile), including those which address many of the issues you raise. It is also possible to raise issues via ORCID's feedback system [1]; I have done so; and have received feedback indicating that work to implement some of my suggestions is in hand.
[1] http://support.orcid.org/forums/175591-general
With API syncing in place, future changes to the data model will be easier, so even if they don't make it into 4.2 they can be added later.
The biggest concern from my side would be that in order to make this worth the dev time as well as GUI space and complication, there needs to be a tangible benefit to a broad range of users. I have my doubts about how important ORCIDs are - at least in their current state - for Zotero's core functionality (i.e. reference management), but given yesterday's announcement that Zotero will strive towards integrating better with paper repositories there's probably more of an immediate need. I'd expect that to come up in the Penn State collaboration.
If your assumption that each of the paper's authors has an ORCID identifier is valid, then the publisher should include those identifiers in the paper (and ideally in the online metadata), and that is where you should obtain it. ORCID will then tell you who that author is, disambiguate them from others with the same name, and list their other publications.
The statement which I called bold, and quoted, was not by you, BTW.
So basically, at this point, this is pending wider use of ORCID in metadata and, of course, Zotero 4.2.
P.S. with a proper ORCID API that would do what I was talking about above, retroactively assigning ORCIDs to existing items in users' libraries (once ORCIDs are supported) would make this a whole lot easier. Given that most publishers will adopt ORCIDs in metadata, our only other option would be essentially to re-fetch metadata from publisher web pages (this is planned in general, but may take even longer than 4.2 to accomplish).
http://support.orcid.org/forums/175591-orcid-ideas-forum/suggestions/5791894-request-authors-to-specify-which-author-they-are-w
http://support.orcid.org/forums/175591-orcid-ideas-forum/suggestions/5791947-make-it-impossible-to-assign-more-than-one-orcid-t
http://support.orcid.org/forums/175591-orcid-ideas-forum/suggestions/5792028-provide-exact-matching-for-the-search-api
The problem is getting ORCID data.
We are collecting the metadata of everything with a DOI on Wikidata (the database sidekick of Wikipedia). The database already has some ORCIDs, and as these become common it will incrementally incorporate them. Getting ORCIDs from Wikidata to a user's Zotero collection would be totally automated, and would not rely on the publisher offering ORCID metadata on their article pages (although it would be nice if they all did; perhaps Wikidata metadata would help them insert it).
Details:
https://forums.zotero.org/discussion/36151/wikified-copyleft-bibliographic-database/
Zotero could start offering an ORCID field now, as long as it is not intrusive for papers that lack ORCIDs. I don't think that there is much doubt that ORCIDs will become widespread, so it's a question of implementing now or later. Is there any advantage to waiting?
Edit: We could probably start rolling out a non-syncing alpha earlier though
I suspect that (at least until the publishers start using ORCIDs systematically) the easiest way is for academics and their co-workers to add (and correct) their own ORCID data, ideally through a user-friendly interface that they already use, like Zotero.
With ambiguous Asian names (Zheng, Tanaka, etc.), some scientific publishers have recently adopted the sensible practice of appending the full author name in the original script, in parentheses after the romanized form. Unfortunately, since ORCID records do not provide language variants of an author's name (and seldom provide it in the original script), we can't draw on them for this use case, even if the author's ID is known.
ORCID is a great thing for altmetrics work, and it helps authors to hold out their full publication history to a broader community under their own ID. The flip-side of that appeal is that the content in ORCID records tends to be "helpfully" translated into English, regardless of the language of the underlying resource.
I could be wrong, but I don't particularly see ORCID as a game-changer for reference managers and citation formatting tools. It is a very meaningful piece of metadata, though, and the idea of tying Zotero in as a means of crowd-sourcing improvements to related records is interesting.
(In a broader rant, the multilingual metadata provided by cite aggregation services like CiNII [Japan] and CNKI [China] is often of ghastly poor quality, and it would be great to see a community effort to clean up those messes.)
ORCID is obviously very important and meaningful metadata, and when the dev cycle touches the schema again, I'm sure we'll see provision made for it in Zotero records.
https://forums.zotero.org/discussion/3913/single-author-with-two-different-name-spellings-short-vs-longhand#latest
Also see new thread on affiliations here: https://forums.zotero.org/discussion/76274/authors-and-affiliations
Each author is a database item, which contains all available public information about J. Smith, including things like their ORCID, where they work, where they got their PhD, their birth and death dates, and so forth. If you wanted to write a query asking for the names of all the people whose spouses were born in the 1700s in the same city as one of the grad students of one of the authors of the paper, you could. More usefully, you can search all papers by the same author since 2010, or all papers which share two authors with the current paper, etc.
For example, see
https://www.wikidata.org/wiki/Q56855591
That unique identifier, Q56855591, represents Donna Strickland, and links her to insanely detailed information on what her names are, the URL of her official website, who her doctoral advisor was, her fields of work, her past and present professional positions, and what IDs, including ORCIDs, are associated with her.
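As a sketch of the "all papers by the same author since 2010" search mentioned above, the Python below builds a SPARQL query for the Wikidata Query Service. It assumes the Wikidata properties P50 (author) and P577 (publication date); the function name is hypothetical, and the query is only constructed here, not sent.

```python
def works_by_author_since(author_qid: str, year: int) -> str:
    """Build a SPARQL query for works whose author (P50) is the given item."""
    if not (author_qid.startswith("Q") and author_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {author_qid!r}")
    return f"""
SELECT ?work ?workLabel ?date WHERE {{
  ?work wdt:P50 wd:{author_qid} ;
        wdt:P577 ?date .
  FILTER(YEAR(?date) >= {int(year)})
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
ORDER BY ?date
""".strip()


print(works_by_author_since("Q56855591", 2010))
```

The query only finds works whose author statement already points at the author's item, so papers that record the author as a plain name string would be missed.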
The Wikidata bibliographic database is now utterly huge, and has a good API. I would estimate that Zotero software has been used to build most of it. If Zotero included a download/upload feature, so that Zotero could interface with Wikidata and Zotero users could upload disambiguated and corrected bibliographic information, I think it would be useful to all parties. I hate proofreading bibliographies and would love to crowdshare the process, and it would provide what is probably the only way to disambiguate two "Smith, J."s.
Academics are also involved in proofreading their own data. There's also an #icanhazwikidata campaign; the idea is that academics tweet it to request the addition of their paper etc. to the database.
https://www.wikidata.org/wiki/Wikidata:Zotero
(disclosure: I'm that page's main author)
See the ORCID support topic: Is it possible to register an ORCID iD for a deceased person? - "No. Our policy is that an ORCID iD can only be created by the individual themselves, not by any other person. This is because a core principle of ORCID is individual control." Source: https://support.orcid.org/hc/en-us/articles/360024829193-How-are-ORCID-records-for-deceased-people-handled- retrieved: 2019-11-19
As someone who is working on resolving some paper authors to ORCID and Wikidata, I see definite gaps: cases where publishers have not (yet?) applied an ORCID, where an ORCID could not be included with a publication, and where people may never link their ORCID. Even so, integrating Zotero with Wikidata and ORCID makes sense, to facilitate entity resolution across authors and further linked open data tools. Combining an ORCID, VIAF, Wikidata Q and other authority identifiers
Other Identifiers such as VIAF (https://viaf.org/) and others applied by Libraries, Archives, etc could be a way to approach reliability of resolved entities since there is momentum around linked open data.
WikiCite (http://wikicite.org/) is working on some of this, focused on Wikipedia citations, but others are curating specific topics for research. WikiCite is referenced in the link pigsonthewing provided above, and an ORCID is a key identifier that helps with the resolution to a Q in Wikidata and can enhance the resolved linked open data.
Tooling for papers and author disambiguation in Wikicite and used in Wikidata has improved for such resolution of author entities.
While coverage is not perfect, Zotero can make it easier to curate this and import resolved content; the effort to curate this is in progress by various parties, with support from Crossref and more.
Tools for disambiguation:
https://tools.wmflabs.org/author-disambiguator/work_item.php?id=Q56705592&doit=Get+author+links+for+work
Finding gaps in resolved authors to papers: https://tools.wmflabs.org/scholia/topic/Q1093434/missing
Integrating Zotero profiles with ORCIDs and an associated Wikidata Q could enable distributing the work of verifying accurate resolution. It would allow nearly effortless extended generated profiles based on Wikidata curated information in linked open data which is open to edit and monitor.
Enables:
Generated presentations of Authors: https://tools.wmflabs.org/reasonator/?&q=57978392
Generated Topic Summaries: https://tools.wmflabs.org/scholia/topic/Q1093434
Other tools - imports/sync to ORCID and DOI:
https://tools.wmflabs.org/sourcemd/ (generates Quickstatements like the current Zotero work)
Author presentation: https://tools.wmflabs.org/scholia/author/Q57978392
Work presentation: https://tools.wmflabs.org/scholia/work/Q56705592
Related curation and entity resolution/modelling work by Library professions integrating linked open data with Wikidata: https://wiki.lyrasis.org/display/LD4P2/LD4-Wikidata+Affinity+Group
To sum this up, ORCID is nice, but I think there should be a broader discussion on how to integrate with the linked open data ecosystem, for a more holistic view of how to build on the existing work being done. ORCID is only a partial solution for resolving authors, and at least two IDs are needed to tackle living and historical authors. Encouraging ORCID adoption by living authors, though, is a positive move to help facilitate curation of the papers long term.
Wikidata QIDs are often seen as a linking hub that binds together ORCID, VIAF, and other canonical identifiers like Twitter handles (heh).
On the library interest front, see also:
Allison-Cassin, S., & Scott, D. (2018). Wikidata: A platform for your library’s linked open data. The Code4Lib Journal, (40). Retrieved from https://journal.code4lib.org/articles/13424 [1]
ARL Task Force on Wikimedia and Linked Open Data. (2019). ARL White Paper on Wikidata. https://www.arl.org/resources/arl-whitepaper-on-wikidata/
1. Confession: I was a co-author of this article.
Within the Zotero database, creators are in a table that keeps a unique ID and a first and last name (just two text fields). I propose that a table be added to the database that is "creatorIDs", with three columns: the internal id from the creator table, a text string that is a unique ID of some kind, and a schema or "translator" for that ID. The combination of the first two would be the unique key for the table. This structure would allow any identifiers to be added to a creator, with multiple supposedly unique identifiers per person. The meaning of the schema would depend on software support, and the Zotero software could choose to enforce a uniqueness constraint on the creator id and the schema (so only allow one ID for various schema). The identifiers could be ORCIDs, Wikidata QIDs, email addresses, ResearcherIDs, Social security numbers ;), or whatever. One option should be the Zotero user id from the API. They should be correctly associated with the person whose name is in the creator table (but that has to be done manually).
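The proposed table could be sketched as follows, using Python's sqlite3 module with an in-memory database. The `creators` table here is a simplified stand-in rather than Zotero's actual schema, and the column names `identifier` and `scheme`, the index name, and the helper functions are all assumptions made for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE creators (
    creatorID INTEGER PRIMARY KEY,
    firstName TEXT,
    lastName  TEXT
);
CREATE TABLE creatorIDs (
    creatorID  INTEGER NOT NULL REFERENCES creators(creatorID),
    identifier TEXT NOT NULL,   -- e.g. an ORCID or a Wikidata QID
    scheme     TEXT NOT NULL,   -- which kind of identifier this is
    PRIMARY KEY (creatorID, identifier)
);
-- Optional constraint: at most one identifier per scheme for each creator.
CREATE UNIQUE INDEX creatorIDs_one_per_scheme ON creatorIDs (creatorID, scheme);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create the two tables in memory and add one creator with one identifier."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO creators VALUES (1, 'Donna', 'Strickland')")
    conn.execute("INSERT INTO creatorIDs VALUES (1, 'Q56855591', 'wikidata')")
    return conn


def add_identifier(conn, creator_id: int, identifier: str, scheme: str) -> bool:
    """Try to attach an identifier; return False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO creatorIDs VALUES (?, ?, ?)",
            (creator_id, identifier, scheme),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Dropping the unique index would give the looser variant of the proposal, where a creator may carry several identifiers under the same scheme.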
Within Zotero (desktop or online), a new tree root could be created for creators, which would list all the people in the database. A leaf of that tree would be "Duplicate names," which would propose names for merging. Options could control the merge criteria, but the default could be matching last name and at least one matching unique ID. In the "creator" display, the user could also select two or more names and merge them, much like the current merging of records. If a user chooses not to merge creators, that is their choice - the names would continue to be used as entered/scraped/imported. If they merge them, the user will choose which names to use (probably the longest), and the other unique IDs associated with the creator id will be merged without duplicates. When clicking on a creator name, the user would see an editable form like the item editor, allowing the unique IDs to be managed. There might be a dropdown of supported ID schema.
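The default merge criterion described above (same last name plus at least one shared unique ID) might look roughly like this in Python. The data layout, a dict from creator id to a (last name, set of (scheme, identifier) pairs) tuple, and both function names are assumptions for the sake of the sketch.

```python
from collections import defaultdict
from itertools import combinations


def merge_candidates(creators):
    """Propose creator pairs that share a last name and at least one identifier.

    `creators` maps creator id -> (last_name, set of (scheme, identifier)).
    """
    ids_by_last_name = defaultdict(list)
    for creator_id, (last_name, _identifiers) in creators.items():
        ids_by_last_name[last_name.casefold()].append(creator_id)

    pairs = []
    for creator_ids in ids_by_last_name.values():
        for a, b in combinations(sorted(creator_ids), 2):
            if creators[a][1] & creators[b][1]:  # any identifier in common
                pairs.append((a, b))
    return sorted(pairs)


def merged_identifiers(creators, keep, drop):
    """Union the identifiers of two creators, without duplicates."""
    return creators[keep][1] | creators[drop][1]
```

Only proposing pairs, rather than merging automatically, keeps the decision with the user, as the post suggests.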
These IDs could be exported into CSL/CSL-JSON as properties on the name object as a URI (based on the assigned schema/translator). The renderer could then choose to render that URI in whatever format it wanted (e.g., as a link, in parens after each name, etc.). Exporting to something like Bibtex would be more of a problem, but they can be just dropped and the status quo maintained.
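A possible shape for that export, as a Python sketch. The `uri` property on the name object is hypothetical and not part of the current CSL-JSON schema; the URI templates follow the ORCID and Wikidata URL forms quoted earlier in this thread, and the function names are invented for illustration.

```python
# Hypothetical mapping from an ID scheme to the URI form of that identifier.
URI_TEMPLATES = {
    "orcid": "http://orcid.org/{id}",
    "wikidata": "https://www.wikidata.org/wiki/{id}",
}


def csl_name(family, given, identifiers):
    """Build a CSL-JSON style name object with an assumed `uri` list."""
    name = {"family": family, "given": given}
    uris = [
        URI_TEMPLATES[scheme].format(id=value)
        for scheme, value in identifiers
        if scheme in URI_TEMPLATES  # unsupported schemes are dropped
    ]
    if uris:
        name["uri"] = uris  # hypothetical property, not in the CSL-JSON schema
    return name


def render_name(name):
    """One rendering choice: the first URI in parentheses after the name."""
    text = f'{name["given"]} {name["family"]}'
    if name.get("uri"):
        text += f' ({name["uri"][0]})'
    return text


print(render_name(csl_name("Strickland", "Donna", [("wikidata", "Q56855591")])))
```

A renderer that does not know the property can ignore it, which matches the idea of dropping the IDs for formats such as BibTeX.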
Otoh, better author management in Zotero, e.g. the simple ability to merge/unify names across the database quickly, would solve a lot of issues people have on a daily basis with a lot less overhead.
I like the way this is done in Citavi. They provide an interface for certain field types where you are presented with all entries in your db for that field type and tools for merging and editing the entries. This exists for names (personal and institutional), journal and newspaper titles, series titles, publisher names, keywords, and some Citavi-specific fields.
The editing tools in that interface allow access to all the different aspects of the field type, like all the different name parts, abbreviations, ISSNs etc., and an extra note field.