Finding and deleting/merging duplicates

Hi,

I had a few problems recently with syncing, and then my Zotero database got corrupted because of a disk problem. The upshot is that, in fixing things, I've ended up with an average of 4 copies of every entry in my database. Is there a tool for finding and deleting/merging duplicates in the library? (If there isn't, I'd like to suggest one.)

In my case I know that most of the time I can delete 3 of the 4 entries that appear at the top level of My Library. Unfortunately, only one of those 4 entries is filed in a subcollection, and it's impossible to tell which one from the top-level view. If I go into the subcollections, the entries I see there are exactly the ones I don't want to delete. My best solution would be a way to delete all entries that appear only at the top level and not in any subcollection.

Any suggestions please?
Dave.

PS. I should add that these multiple entries are carrying multiple copies of my PDFs, so this is also a disk space/speed issue for me.
  • Could you export the subcollections to RDF? (I think that maintains the links to the PDFs, but you'll want to check that!) Then zap everything and import the subcollections. I expect you'd need to rename them. And of course zip up and back up the sqlite file and the storage folder before you start any of this!
  • There are plans to add duplicate detection, but it is a rather tricky issue. For the time being you will need to sort by title and delete the duplicates by hand.

    That said, to see all the collections an item is in, select it and hold down the “Option” key on Macs or the “Control” key on Windows. This will highlight all collections that contain the selected record, which should make it quicker to see which copy is in a collection.

    One other option: if you run an advanced search for "collection is not" and specify each collection in your library, with the match set to "any" instead of "all", you could identify all the items in your library which are not in a collection. Then save the search as a smart collection. You could then select all the items in that smart collection and delete them.
  • Thanks Tjowens, I was really looking for this solution.

    Just a minor comment: shouldn't it be "all" instead of "any"? If you set "any", you will delete all items except those belonging to ALL collections.

    ALL: collection is not MyCollection1 AND collection is not MyCollection2 ... AND
    collection is not MyCollectionX => only items that belong to none of 1, 2 ... X are identified.

    ANY: collection is not MyCollection1 OR collection is not MyCollection2 ... OR
    collection is not MyCollectionX => every item missing from at least one of 1, 2 ... X is identified.

    I know it is confusing but I'm (almost) sure I'm right.
  • edited October 14, 2009
    Thanks for the correction, julioraffo. I do believe you're right. I must have gotten tripped up in the any/all, is/is not situation.
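
For anyone still unsure about the any/all point above, here is a small sketch of the logic in Python. It is only a toy model, not Zotero's search code or API: the item names, the two collection names, and the helper functions `match_all` and `match_any` are all made up for illustration. Each item is represented by the set of collections it sits in, and each "collection is not X" condition becomes a membership test.

```python
# Toy model of the "collection is not ..." search (hypothetical data, not Zotero's API).
# Each item maps to the set of collections that contain it.
library = {
    "paper_a": {"MyCollection1"},
    "paper_a_copy": set(),  # duplicate sitting only at the top level
    "paper_b": {"MyCollection1", "MyCollection2"},
    "paper_b_copy": set(),  # duplicate sitting only at the top level
}
collections = ["MyCollection1", "MyCollection2"]


def match_all(item_collections, names):
    """'Match all': every 'collection is not X' condition holds."""
    return all(name not in item_collections for name in names)


def match_any(item_collections, names):
    """'Match any': at least one 'collection is not X' condition holds."""
    return any(name not in item_collections for name in names)


# Items in no collection at all: the ones that are safe to delete here.
uncollected = sorted(k for k, v in library.items() if match_all(v, collections))

# Items missing from at least one collection: far too broad.
too_broad = sorted(k for k, v in library.items() if match_any(v, collections))

print(uncollected)
print(too_broad)
```

With "all", only the two top-level copies are matched. With "any", `paper_a` is matched as well, because it is missing from `MyCollection2` even though it is filed in `MyCollection1`, so deleting the "any" results would remove an entry you wanted to keep.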