Find and consolidate/remove duplicates automatically
Sometimes when I import a library (say when syncing my libraries between two computers) I end up with a lot of duplicates.
As a zeroth-order fix to this problem, I want a feature that sequesters all duplicates in their own folder for review. Then I'd like some way to delete the ones I want to delete and have them simultaneously removed from the main library.
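To make the "sequester for review" idea concrete, here is a minimal sketch of how duplicates might be grouped. It is only an illustration: the item dicts, the `duplicate_key` rule (same title ignoring case and spacing, same year) and the function names are all assumptions of mine, not anything Zotero actually does.

```python
from collections import defaultdict


def duplicate_key(item):
    """Hypothetical matching rule: title (case- and whitespace-insensitive) plus year."""
    title = " ".join(item.get("title", "").lower().split())
    return (title, item.get("year"))


def find_duplicate_groups(items):
    """Return lists of items that share a key; singletons are left out."""
    groups = defaultdict(list)
    for item in items:
        groups[duplicate_key(item)].append(item)
    return [group for group in groups.values() if len(group) > 1]


# Made-up library entries, for illustration only.
library = [
    {"id": 1, "title": "On Merging", "year": 2009},
    {"id": 2, "title": "on  merging", "year": 2009},
    {"id": 3, "title": "Unrelated Paper", "year": 2010},
]

# Each group is one set of suspected duplicates to show in the review folder.
review_folder = find_duplicate_groups(library)
```

Because the groups hold references to the same item objects as the library, deleting an item chosen during review could be done by id against the main library.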
As a higher-order feature, I'd like it to simply merge all the dupes as intelligently as possible, place the merged item in the main library, and delete the others. I note that in most cases this operation will be unambiguous: duplicates are most likely to differ in notes and attachments, which will rarely be in conflict. In the rare cases where two items do disagree, say they show different publication dates or URLs, the automatic merge can be deferred.
As the highest-order feature, I'd like it to be easy to handle cases where there is a conflict in the merge, by being able to select one of the duplicates and say that I want this one to win in any dispute, while still merging the other content, like notes and non-duplicate attachments.
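The merge behaviour described above could be sketched roughly as follows. This is a sketch under assumptions, not a real implementation: the dict layout, the field names (`date`, `url`, `notes`, `attachments`) and the `merge_items` function with its `winner` argument are all hypothetical. Notes and attachments are unioned, empty fields are filled from whichever duplicate has a value, and a disagreement on any other field defers the merge unless a winner has been chosen.

```python
LIST_FIELDS = ("notes", "attachments")


def merge_items(items, winner=None):
    """Merge duplicate items.

    Returns (merged, conflicts). If fields disagree and no winner settles
    them, merged is None and conflicts maps each disputed field to the
    competing values, so the merge can be deferred for manual review.
    `winner` is the index of the item that wins any dispute.
    """
    merged, conflicts = {}, {}

    # Collect the scalar field names in first-seen order.
    fields = []
    for item in items:
        for name in item:
            if name not in LIST_FIELDS and name not in fields:
                fields.append(name)

    for name in fields:
        # Distinct non-empty values for this field across all duplicates.
        values = []
        for item in items:
            value = item.get(name)
            if value not in (None, "") and value not in values:
                values.append(value)
        if len(values) <= 1:
            merged[name] = values[0] if values else None
        elif winner is not None and items[winner].get(name) not in (None, ""):
            merged[name] = items[winner][name]
        else:
            conflicts[name] = values

    # Notes and attachments are combined, dropping exact repeats.
    for name in LIST_FIELDS:
        combined = []
        for item in items:
            for entry in item.get(name, []):
                if entry not in combined:
                    combined.append(entry)
        merged[name] = combined

    if conflicts:
        return None, conflicts
    return merged, {}


# Made-up duplicates that disagree only on the date.
a = {"title": "On Merging", "date": "2009", "url": "",
     "notes": ["read ch. 2"], "attachments": ["paper.pdf"]}
b = {"title": "On Merging", "date": "2010", "url": "http://example.org/x",
     "notes": ["read ch. 2", "cite this"],
     "attachments": ["paper.pdf", "slides.pdf"]}

deferred, disputes = merge_items([a, b])       # no winner: merge is deferred
resolved, _ = merge_items([a, b], winner=0)    # item a wins the date dispute
```

With no winner the call reports the `date` dispute and merges nothing; with `winner=0` the date comes from `a`, while the URL, the extra note and the extra attachment still come across from `b`.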
This discussion has been closed.
I also have lots of duplicates since I started library syncing. Until this problem is resolved in Zotero itself, could the developers release a small tool to rectify the symptom, if not the cause?
I.e., a tool that looks directly at your Zotero DB and removes all but one of each set of identical records. It could require Firefox to be closed and operate on the SQLite DB directly.
Any chance?
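For what it's worth, the core of such a tool might look something like the sketch below. Big caveat: the single `items` table and its columns are a toy schema I made up for illustration. The real Zotero database is laid out differently, so this would not work on it as written, and anything like it should only ever be tried on a backup copy with the application closed.

```python
import sqlite3


def remove_identical_records(conn):
    """Keep the lowest-id row of each set of identical records; return rows deleted.

    Assumes a hypothetical table items(id, title, creator, date, url).
    """
    cur = conn.execute(
        """
        DELETE FROM items
        WHERE id NOT IN (
            SELECT MIN(id) FROM items
            GROUP BY title, creator, date, url
        )
        """
    )
    conn.commit()
    return cur.rowcount


# In-memory demo database with made-up rows; no real library is touched.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    "id INTEGER PRIMARY KEY, title TEXT, creator TEXT, date TEXT, url TEXT)"
)
conn.executemany(
    "INSERT INTO items (title, creator, date, url) VALUES (?, ?, ?, ?)",
    [
        ("On Merging", "Doe", "2009", "http://example.org/x"),
        ("On Merging", "Doe", "2009", "http://example.org/x"),
        ("On Merging", "Doe", "2009", "http://example.org/x"),
        ("Unrelated Paper", "Roe", "2010", ""),
    ],
)

removed = remove_identical_records(conn)
remaining = [row[0] for row in conn.execute("SELECT title FROM items ORDER BY id")]
```

This only catches records that match exactly on every column, which is the "identical" case asked about; near-duplicates would still need the kind of review or merge discussed earlier in the thread.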
I have so many duplicates that they have added perhaps 30-40% to my library, enough to bump me up to the 100 MB limit. As I contemplate whether to stump up some money for storage, I see an economic motivation for Zotero NOT to have a duplicate removal option!