'Confirm' option for automatic metadata retrieval

On a few occasions, the automatic metadata retrieval function has produced incorrect results, leading to the PDF in question being attached to an incorrect item. As far as I can tell, there is no easy way to undo this and 'un-attach' the PDF so that it is once more standalone in my library. I either have to delete the whole item and re-add the PDF from wherever it came from, or else create the correct item there and then, drag the PDF from the incorrect item to the correct one, and delete the incorrect item.

It seems to me that the most suitable solution would be to have the option to confirm the automatically retrieved metadata before creating the new item(s). Shouldn't be too hard to implement, just a check box next to each item.

This may also help when retrieving the metadata for several PDFs, especially into an already populated library. If more than one is incorrect, under the current system they all get added to the library, and it is easy to forget which ones needed correcting and then have to trawl through the library to find them again.

Another potentially useful feature on a similar theme is the ability to find metadata for an already existing item from an identifier. For example, if you add a PDF and then use the 'create parent item' option to produce a blank item, there should be an easy way to fill in those details if you have a DOI or other identifier. Similarly, items with incorrect automatically retrieved metadata could be corrected simply by entering the correct identifier.
  • As far as I can tell, there is no easy way to undo this and 'un-attach' the pdf so that it is once more standalone in my library.
    There is a relatively easy way to undo this: just drag the PDF away from the parent item to turn it into a "standalone PDF" again. It's a bit finicky, so no wonder you missed it, but once you know it's possible it shouldn't be hard to do.

    It's still a bit annoying and I'd be OK with a confirm dialog as you suggest, but in the meantime this should help.

    Also, the retrieve metadata function is designed to minimize such false positives, so if you have PDFs for which Zotero retrieves the wrong data we'd be interested. If you can link to them somehow that'd be great.
    Another potentially useful feature on a similar theme is the ability to find metadata for an already existing item from an identifier.
    Some version of this is planned. It's probably not going to happen super soon, but it's definitely on the agenda.
  • edited October 20, 2013
    I had tried dragging pdfs away, but it doesn't seem to be working for me. I can drag them into a different item, but not just to become a standalone pdf again. Can you confirm it does work for you? - in which case I guess it's just a problem at my end. But glad to know this feature should exist.

    The most recent article (and the only one I can still remember) I've had incorrect information for is:

    Birks, H.J.B. (2005). Mind the gap: how open were European primeval forests? Trends in Ecology & Evolution 20, 154–156. DOI: 10.1016/j.tree.2005.02.001

    which comes out as:

    Malik, H. (2005). finds centromeres in the driver’s seat. Trends in Ecology & Evolution 20, 151–154. DOI: 10.1016/j.tree.2005.01.014

    (as a side-note, that Malik paper should be entitled 'Mimulus finds centromeres in the driver’s seat' - possibly the retriever has a problem with the italic formatting on Mimulus.)

    I guess the problem might be that the article PDF has the end of the Malik paper at the top of the page, above the start of the Birks paper, so it finds the DOI from there. But this isn't an uncommon feature of journal articles, so should ideally be taken into account in the retriever method.

    Thanks for the quick response!
  • edited October 20, 2013
    I had tried dragging pdfs away, but it doesn't seem to be working for me. I can drag them into a different item, but not just to become a standalone pdf again.
    It definitely works for me. You need to drag the PDF so that it drops between two items in the list.
    See here for how this should look: http://imgur.com/fL6NA4m
    (colors will differ depending on your OS)

    I can reproduce the two other issues, thanks, will take a look.
  • It gets the wrong article because the DOI of the preceding article (i.e. the Malik piece) is towards the top of the page. I don't know if we can/want to do much about that, I'd hope it's pretty rare.

    The Malik article is just wrongly entered in CrossRef (where we - and anyone else - query for metadata from DOIs):
    http://www.crossref.org/openurl/?pid=zter:zter321&url_ver=Z39.88-2004&&rft_id=info:doi/10.1016/j.tree.2005.01.014&noredirect=true&format=unixref

    IIRC, publishers deposit data directly with CrossRef, so this should probably be reported to Elsevier, though I'm not sure. In any case, as you can see if you look at the XML, there's nothing we can do about that on the Zotero side.
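
To illustrate the false positive described above, here is a minimal Python sketch of a "take the first DOI found on the page" heuristic. It is not Zotero's actual retrieval code: the regular expression, the function names, and the mock page text are all assumptions made for illustration. Only the two DOIs are taken from this thread.

```python
import re

# Simplified DOI pattern (an assumption for illustration, not Zotero's regex).
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def find_dois(page_text):
    """Return every DOI-like string in reading order, trimming trailing punctuation."""
    return [m.group(0).rstrip(".,;)") for m in DOI_PATTERN.finditer(page_text)]


def first_doi(page_text):
    """Naive heuristic: trust the first DOI that appears on the page."""
    dois = find_dois(page_text)
    return dois[0] if dois else None


# Mock first page (hypothetical layout): the tail of the preceding article,
# including its DOI, sits above the start of the article we actually want.
page = (
    "... end of the preceding article. doi:10.1016/j.tree.2005.01.014\n"
    "Mind the gap: how open were European primeval forests?\n"
    "H.J.B. Birks\n"
    "doi:10.1016/j.tree.2005.02.001\n"
)

print(find_dois(page))  # both DOIs, preceding article's first
print(first_doi(page))  # the preceding article's DOI, i.e. the wrong one
```

Because both DOIs are found, a retriever could in principle notice that a page carries more than one candidate and ask for confirmation instead of silently picking the first, which is roughly what the 'confirm' option proposed at the top of this thread would allow.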
  • I see - yep, dragging PDFs between items does work. I was dragging to the blank space below the other items, which seemed more natural to me.

    I don't think it's that rare to have an article starting on the same page as the end of the previous one. However, it's also fairly normal for the PDF of an article to come with a cover sheet when downloaded, in which case the correct information should be retrieved from there in preference to the end of the preceding article. So hopefully it shouldn't be a problem too often.

    OK, thanks for looking into this.
  • I don't think it's that rare to have an article starting on the same page as the end of the previous one.
    True, but to trigger the false positive the previous article also has to have its DOI listed at the bottom. I'd hope the combination is quite rare. If it turns out it's not, we'll need to revisit this.