one-click update metadata for a zotero object
I use zotero to manage a library of journal articles. These articles are often first published electronically, and don't get page numbers, etc., until they are published in print. I often find myself making one zotero item when the article is published online, then when the print info is available, making another zotero item, attaching a link to the pdf, putting in new html tags to italicize things correctly, adding a journal abbreviation if it is not already there, merging duplicates, etc. I don't think that there is a way to have zotero update the metadata for an item automatically. It seems like it would be pretty easy to implement a one-click way to ask zotero to retrieve any changes to the metadata associated with a certain article, similar to the functionality that lets you create a new item by DOI or PMID. It would be great if it could do this in such a way that it would not replace all fields, just empty ones, so things like html tags for italics in titles and user-added info like journal abbreviations wouldn't need to be entered manually again.
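To make the "just fill the empty fields" idea concrete, here is a minimal sketch in Python. It is not Zotero code; the function name and the field names (title, journalAbbreviation, volume, pages) are assumptions picked for illustration. Fields the user has already filled in, such as a title with html italics, are left alone.

```python
def fill_empty_fields(existing: dict, fetched: dict) -> dict:
    """Return a copy of `existing` with only its empty fields filled from `fetched`."""
    merged = dict(existing)
    for field, new_value in fetched.items():
        current = merged.get(field)
        # Treat missing, None, and whitespace-only values as empty.
        is_empty = current is None or str(current).strip() == ""
        if is_empty and new_value not in (None, ""):
            merged[field] = new_value
    return merged


# Hypothetical item saved when the article first appeared online.
item = {
    "title": "Effects of <i>E. coli</i> exposure",
    "journalAbbreviation": "J. Ex. Biol.",
    "volume": "",
    "pages": "",
}
# Hypothetical metadata retrieved after print publication.
update = {
    "title": "Effects of E. coli exposure",
    "volume": "12",
    "pages": "101-110",
}
result = fill_empty_fields(item, update)
```

Here the hand-edited title and the journal abbreviation survive, while volume and pages are filled in from the update.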
Regarding journal abbreviations, this is already somewhat automated (using a list of journal abbreviations from MEDLINE), so you do not need to worry about entering it as much.
Google Scholar (GS) has no alternative for retrieving metadata from PDFs because it's the only comprehensive full-text archive, but once we have a DOI, PMID, or even a title, we can work with those.
THX a lot.
So we have to be careful not to mix all those up.
The "retrieve PDF metadata" command would be enough to get this done, since it searches for the DOI anyway. The DOI never changes, even when an article goes from an ASAP to a published paper; thus if the same process were used for a parent item it should work.
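Since the DOI stays stable, a re-query by DOI is easy to sketch. The Python below is only an illustration and not how Zotero does it: it assumes the public CrossRef REST endpoint at api.crossref.org/works/ and the "volume", "issue" and "page" keys in its JSON response. The function names are made up for the example.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi: str) -> str:
    """Build the lookup URL for a DOI, percent-encoding unusual characters."""
    return CROSSREF_WORKS + quote(doi.strip(), safe="/")


def fetch_print_fields(doi: str) -> dict:
    """Fetch the fields that are usually missing for ahead-of-print articles."""
    with urlopen(crossref_url(doi), timeout=30) as response:
        message = json.load(response)["message"]
    # Any of these may still be absent if the article has no issue yet.
    return {
        "volume": message.get("volume", ""),
        "issue": message.get("issue", ""),
        "pages": message.get("page", ""),
    }
```

Calling fetch_print_fields with an item's DOI would return whatever print details the registry currently holds, which could then be compared against the stored item.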
Ideally, it'd also have a mechanism to work where a DOI is absent, though I'd say that's a lesser consideration.
I'm pretty sure zotero's lead dev would accept patches if someone wants to have a go, though I'd recommend discussing the general approach on zotero-dev before diving into it.
The GUI is there, the code is there, the feature is already known by its exact name.
I know pretty much nothing about programming or coding but I would definitely be willing to learn in this case. It just screams to be done!
I have 2 such update systems for my web-based (non-Zotero) bibliographic database. One system uses the PMID to requery the PubMed database 12 months after the online publication and (if the metadata hasn't been updated) every 2 months until the record is complete. My system inserts "ePub" for missing volume, issue, and pagination when these metadata items are missing (but I store the information for each field originally provided by PubMed -- more on this later).
For the many journals that aren't included in PubMed we poll CrossRef using the DOI with the same 1 year delay and 2 month repeat.
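The timing rule described above (first check 12 months after online publication, then every 2 months until the record is complete) could be sketched like this in Python. This is an illustration, not the actual system: the function names, the field names, and the choice to clamp the day of the month to 28 are all assumptions.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to 28 so it stays valid."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, min(d.day, 28))


def next_check(online_pub: date, checks_done: int) -> date:
    """First check 12 months after online publication, then every 2 months."""
    return add_months(online_pub, 12 + 2 * checks_done)


def is_complete(record: dict) -> bool:
    """A record is complete once volume, issue and pages are real values."""
    return all(
        record.get(field) not in (None, "", "ePub")
        for field in ("volume", "issue", "pages")
    )


first = next_check(date(2018, 1, 31), 0)
second = next_check(date(2018, 1, 31), 1)
```

Polling stops as soon as is_complete returns True for the stored record.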
Some publishers send updates of their metadata but the update contains what I think is useless information. For example, Taylor and Francis Group journals' initial metadata (depending on the datasource) might contain empty fields or "ePub" but a "0" for the pagination field. Sometimes TFG will update the "0" pagination metadata and provide the number of pages that the article will consume but still not provide any new volume, issue or page-range metadata. I don't want to store that number-of-pages value because it could be later confused with an article item number for electronic publications.
Some publishers of online only journals will provide metadata that includes an article number before the article has been assigned to a volume/issue. Sometimes that article number will be unchanged when the article is assigned to a volume/issue. In those cases the article number is often the right-most characters of the DOI. Other times the article number _will_ change when the article has been assigned to a volume/issue.
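The "article number is the right-most characters of the DOI" pattern is simple to test for. A minimal Python sketch follows; the function name is an assumption, and the DOI and article number in the example are invented, not taken from any real publisher.

```python
def article_number_matches_doi(doi: str, article_number: str) -> bool:
    """True if the article number is the right-most part of the DOI."""
    doi = doi.strip().lower()
    number = article_number.strip().lower()
    # An empty article number would match every DOI, so rule it out.
    return bool(number) and doi.endswith(number)


# Hypothetical values for illustration only.
stable = article_number_matches_doi("10.1234/example.2019.e00123", "e00123")
changed = article_number_matches_doi("10.1234/example.2019.e00123", "45")
```

A match suggests the article number is likely to stay the same once a volume/issue is assigned; a mismatch means it may be one of the cases where the number changes.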
The question here is: do we want temporary/transitory information in metadata fields, or should ePub metadata not be updated until final and complete information is available? I don't want a zero in the pagination field or the number of pages in that field. I believe that kind of metadata is unwanted by journal editors or professors. I have chosen to not provide temporary metadata but to wait until it is complete. I use the publishers' original metadata and knowledge of each individual publisher's metadata-release patterns in an algorithm to determine whether the updated metadata is yet useful or not.
Another problem is how to time the requests for updates so that the system that holds the metadata will not be adversely affected by my requests.
Human Kinetics (late January 2019) for this journal article provides the pagination as 1-30 for this ahead-of-print publication. When assigned to an issue the actual page numbers will likely change but the publisher's pagination metadata isn't always updated. This is especially true when the prepublication article is assigned to an online only journal -- the number of pages masquerading as a page range often isn't updated to reflect the article number.
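A rough Python sketch of a check for the provisional pagination patterns mentioned in this thread ("0", "ePub", a bare page count, or a range like 1-30 on an ahead-of-print article). This is a guess at one possible heuristic, not the publisher-specific algorithm described above. The function name and the ahead_of_print flag are assumptions, and a range starting at 1 can be perfectly legitimate for an article that opens an issue, so a flagged value still needs a human look.

```python
import re


def looks_provisional(pages: str, ahead_of_print: bool) -> bool:
    """Guess whether a pagination value is a placeholder rather than final."""
    value = pages.strip()
    # Empty, "0" and "ePub" carry no real pagination.
    if value in ("", "0") or value.lower() == "epub":
        return True
    if not ahead_of_print:
        return False
    # A bare number before issue assignment may just be a page count.
    if value.isdigit():
        return True
    # A range starting at 1 before issue assignment is probably a page
    # count written as a range, such as 1-30.
    match = re.fullmatch(r"(\d+)\s*[-\u2013]\s*(\d+)", value)
    return bool(match) and match.group(1) == "1"
```

For example, looks_provisional("1-30", True) flags the value, while "245-260" on the same item would be accepted.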
By the way, have a look at this publisher's page source html for this article.
Summary: Automatically updating article metadata will have limits to full accuracy and some amount of hand-editing will likely be needed.
Is there a timeframe for having this feature implemented/released?
If updating a whole item is too complicated, it would be nice to have an add-on to just update one field.