Duplicate PDF management

I recently added a very messy collection of PDFs to Zotero. I used "Get PDF metadata" to turn them into proper Zotero items, and then cleaned them up under "Duplicate items" by doing "merge items" for each one. There were many duplicates, because the collection contained multiple PDF copies of many of the articles. As a result, I now have many items in my library with two or more identical PDFs attached to them. This seems confusing and wasteful. Is there a way to automatically make each item have only one PDF?
Thank you!
  • No, sorry. This has been discussed in the past and would generally be desirable, but it's not a trivial problem: PDFs are rarely _exactly_ identical (e.g. more and more publishers print timestamps and provenance on them), and in those cases Zotero wouldn't know which PDF to remove. In the worst case, the one deleted could be the one annotated by the user.
  • That makes sense. Is there at least a way to easily find entries with duplicate PDFs? (Sorting by number of attachments?) Right now I'll not only need to manually delete duplicates, but also manually find them first in a library of thousands of items.
  • The tag #duplicate_attachments might help you
  • Sorry, I'm not sure what you mean. Could you explain more?
  • Open the tag selector (https://www.zotero.org/support/collections_and_tags) and select the tag #duplicate_attachments. It might help you find duplicate PDFs.
  • I see - there is nothing tagged #duplicate_attachments in my library. Possibly because all the items were created from PDFs stored on my computer, not from the web?
  • Possibly yes, sorry it didn't help.
  • edited April 1, 2019
    #duplicate_attachments isn't a tag that Zotero adds, so I'm not sure what you're referring to there.
  • edited April 2, 2019
    @dstillman I never created any tag, but this tag exists in my standalone version of Zotero.
    [edit] It seems related to the Zotero Storage Scanner; I cannot reproduce it with a fresh installation of standalone Zotero.
  • @poettli it's the Zotero Storage Scanner that creates those tags. Do you happen to know if we can delete all duplicates at once?
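
As a rough illustration of the "exactly identical" point raised above, here is a minimal Python sketch that groups byte-identical files per item by comparing SHA-256 hashes. It does not use Zotero or the Zotero Storage Scanner in any way: the `attachments_by_item` mapping (item label to list of PDF paths) and both function names are hypothetical, and you would have to build that mapping yourself. It only reports candidates and never deletes anything, and PDFs that differ by even a timestamp will not be grouped together.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def exact_duplicates(attachments_by_item):
    """Map each item to its groups of byte-identical attachment files.

    attachments_by_item is a hypothetical mapping of an item label to a
    list of file paths. Items with no identical files are left out.
    """
    report = {}
    for item, paths in attachments_by_item.items():
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[sha256_of(path)].append(Path(path))
        # Keep only hashes shared by two or more files.
        groups = [group for group in by_hash.values() if len(group) > 1]
        if groups:
            report[item] = groups
    return report
```

Each group in the result lists files that are safe to treat as copies of one another at the byte level; deciding which copy to keep (for example, an annotated one) is still a manual step.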