'Confirm' option for automatic metadata retrieval
On a few occasions, the automatic metadata retrieval function has produced incorrect results, leading to the PDF in question being attached to an incorrect item. As far as I can tell, there is no easy way to undo this and 'un-attach' the PDF so that it is once more standalone in my library. I either have to delete the whole item and re-add the PDF from wherever it came from, or create the correct item there and then, drag the PDF from the incorrect item to the correct one, and then delete the incorrect item.
It seems to me that the most suitable solution would be to have the option to confirm the automatically retrieved metadata before creating the new item(s). Shouldn't be too hard to implement, just a check box next to each item.
This may also help when retrieving the metadata for several PDFs, especially into an already populated library. If more than one is incorrect, under the current system they all get added to the library, and it is easy to forget which ones needed correcting and then have to trawl through the library to find them again.
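Roughly what I have in mind, as a minimal Python sketch. I don't know how Zotero represents any of this internally, so the `Candidate` class and the `apply_confirmed` function are hypothetical names made up purely for illustration:

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One PDF plus the metadata retrieved for it (hypothetical structure)."""
    pdf_name: str
    retrieved_title: str
    accepted: bool = False  # stands in for the check box next to each item


def apply_confirmed(candidates):
    """Split candidates into items to create and PDFs to leave standalone."""
    to_create = [c for c in candidates if c.accepted]
    left_standalone = [c.pdf_name for c in candidates if not c.accepted]
    return to_create, left_standalone
```

Only the ticked candidates would become new items; anything left unticked stays a standalone PDF, so nothing incorrect ends up buried in the library.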
Another potentially useful feature on a similar theme would be the ability to find metadata for an existing item from an identifier. For example, if you add a PDF and then use the 'create parent item' option to produce a blank item, there should be an easy way to fill in those details if you have a DOI or other identifier. Similarly, items with incorrect automatically retrieved metadata could be corrected simply by entering the correct identifier.
It's still a bit annoying and I'd be OK with a confirm dialog as you suggest, but in the meantime this should help.
Also, the retrieve metadata function is designed to minimize such false positives, so if you have PDFs for which Zotero retrieves the wrong data we'd be interested. If you can link to them somehow that'd be great.
Some version of this is planned. Probably not going to happen super soon, but definitely on the agenda.
The most recent article (and the only one I can still remember) I've had incorrect information for is:
Birks, H.J.B. (2005). Mind the gap: how open were European primeval forests? Trends in Ecology & Evolution 20, 154–156. DOI: 10.1016/j.tree.2005.02.001
which comes out as:
Malik, H. (2005). finds centromeres in the driver’s seat. Trends in Ecology & Evolution 20, 151–154. DOI: 10.1016/j.tree.2005.01.014
(as a side-note, that Malik paper should be entitled 'Mimulus finds centromeres in the driver’s seat' - possibly the retriever has a problem with the italic formatting on Mimulus.)
I guess the problem might be that the article PDF has the end of the Malik paper at the top of the page, above the start of the Birks paper, so the retriever finds the DOI from there. But this isn't an uncommon feature of journal articles, so it should ideally be taken into account by the retriever.
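To illustrate the guess above, here is a minimal Python sketch. It is not Zotero's actual retrieval code, which I haven't looked at; it simply assumes a retriever that trusts the first DOI-like string in the page text. The regex, the function names and the sample page text are all made up for illustration:

```python
import re

# Loose pattern for DOI-like strings; an assumption, not Zotero's real pattern.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")


def find_dois(page_text):
    """Return every DOI-like string in the text, in order of appearance."""
    return [m.rstrip(".,;)") for m in DOI_PATTERN.findall(page_text)]


def first_doi(page_text):
    """Naive heuristic: trust the first DOI found on the page."""
    dois = find_dois(page_text)
    return dois[0] if dois else None


# Hypothetical first page: tail of the previous article, then the new one.
page = (
    "...end of the preceding article. doi:10.1016/j.tree.2005.01.014\n"
    "Mind the gap: how open were European primeval forests?\n"
    "H.J.B. Birks\n"
    "doi:10.1016/j.tree.2005.02.001\n"
)
```

With this sample, `first_doi(page)` picks up the preceding article's DOI, while `find_dois(page)` returns both. A retriever that noticed more than one candidate on the first page could at least flag the item for checking.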
Thanks for the quick response!
See here for how this should look: http://imgur.com/fL6NA4m
(colors will differ depending on your OS)
I can reproduce the two other issues, thanks, will take a look.
The Malik article is just wrongly entered in CrossRef (where we - and anyone else - query for metadata from DOIs):
http://www.crossref.org/openurl/?pid=zter:zter321&url_ver=Z39.88-2004&&rft_id=info:doi/10.1016/j.tree.2005.01.014&noredirect=true&format=unixref
IIRC, publishers deposit data directly with CrossRef, so this should be reported to Elsevier, but I'm not sure. In any case, as you can see if you look at the XML, there's nothing we can do about that on the Zotero side.
I don't think it's that rare to have an article starting on the same page as the end of the previous one. However, it's also fairly normal for the PDF of an article to come with a cover sheet when downloaded, in which case the correct information should be retrieved from there in preference to the end of the preceding article. So hopefully it shouldn't be a problem too often.
OK, thanks for looking into this.