"Convert linked files to stored files" made two copies of every attachment

I think I may have triggered this by clicking the command twice. Does anyone have suggestions for removing the duplicate files from my database without going through every single item and clicking "move to trash" on the second attachment?
  • (My storage directory is now doubled in size from 6 GB to 12 GB, which is also synced to the cloud.)
  • Quick question. It looks like I can use the Python API to iterate through my library and look at the children of each item, which I see already have an md5sum calculated, then delete one of them if it's a duplicate. Any quick feedback on whether this is problematic?

    Thanks!
  • I don't know if this will miss items (it shouldn't, given that they're exact duplicates, but I may be overlooking something), but I can't see how it'd be problematic -- just make sure you only delete the attachment, not the top-level item ;)
  • Thanks for the speedy reply. If everything works, I'll post a code snippet later!
  • An alternative might be to go back to one of the last automatic backups of the zotero.sqlite database. You might then need to sort your folders in Zotero's storage by date to remove the duplicated attachments. There is also this add-on that might help: https://github.com/retorquere/zotero-storage-scanner.
  • edited April 29, 2020
    Success! 6 GB saved. I'm posting the code as a gist b/c I can't figure out how to edit these comments for code.
  • Unfortunately, white space is being deleted. Here's a gist:

    https://gist.github.com/ckemere/44178399f9b104d9fb35e2b8d7c7cd20
  • One thing that would be helpful: when I pull the items, I need to fetch all the children of each one to find out which are attached PDFs. Is there a way to avoid this?
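For anyone who lands here later, this is a rough sketch of the md5-based approach discussed above. It is not the code from the gist. It assumes each item is a dict shaped like the Zotero Web API JSON, with `itemType`, `parentItem`, and `md5` under `data`; the function name `find_duplicate_attachments` is made up for illustration. It only ever returns child attachments, never top-level items.

```python
def find_duplicate_attachments(items):
    """Return attachments that repeat an earlier (parentItem, md5) pair.

    `items` is assumed to be a list of dicts shaped like Zotero Web API
    JSON: {"key": ..., "data": {"itemType": ..., "parentItem": ..., "md5": ...}}.
    The first attachment seen for each pair is kept; later ones are returned.
    """
    seen = set()
    duplicates = []
    for item in items:
        data = item.get("data", {})
        if data.get("itemType") != "attachment":
            continue  # never touch top-level items or notes
        parent = data.get("parentItem")
        md5 = data.get("md5")
        if not parent or not md5:
            continue  # skip standalone attachments and files without a hash
        pair = (parent, md5)
        if pair in seen:
            duplicates.append(item)
        else:
            seen.add(pair)
    return duplicates


# Possible usage with pyzotero (untested sketch; LIBRARY_ID and API_KEY
# are placeholders you would supply yourself):
#
#   from pyzotero import zotero
#   zot = zotero.Zotero(LIBRARY_ID, "user", API_KEY)
#   attachments = zot.everything(zot.items(itemType="attachment"))
#   for dup in find_duplicate_attachments(attachments):
#       print(dup["key"], dup["data"].get("filename"))
#       # zot.delete_item(dup)  # uncomment only after checking the list
```

Requesting `itemType="attachment"` directly might also address the last question, since the attachments come back in one listing with their `parentItem` keys instead of requiring a children lookup per item. Printing the list first and running against a backed-up library seems prudent before deleting anything.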