barrier to entry: no duplicate detection
I'm migrating a group of 10-15 people onto Zotero and, in the process, importing a pretty redundant old system of files and folders containing PDFs. As we bring in old folders and tag them, we're discovering lots of duplicates. One of the great things about Zotero is the non-exclusivity of tags and collections, which allows one paper to live in multiple collections. The problem is that as the library grows, manually pruning duplicates becomes impossible, and more so as individual users bring in their personal collections. Even a rudimentary ability to find duplicates for manual merging or deletion would be a huge help. My understanding from the four-year-old thread linked below is that such a function might be available as a hidden preference, but I can't find it. Any advice on how to activate this feature, or any info on when it might become available, would be great. This is obviously a crucial feature for many.
http://forums.zotero.org/discussion/42/2/duplicate-detection/
extensions.zotero.debugShowDuplicates
Set it to "true", and you should find the Show Duplicates option in the gear menu. It's said to be slow to run, but it should narrow down the listing to mostly duplicate items.
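For anyone who would rather script the change than set it by hand, here is a minimal sketch. It assumes Zotero is running as a Firefox extension (as it is for posters in this thread) and relies on Firefox reading `user_pref(...)` lines from a `user.js` file in the profile folder at startup. The helper names and the profile path you pass in are hypothetical; only the preference name comes from the post above.

```python
from pathlib import Path

# Preference name as given in this thread.
PREF_NAME = "extensions.zotero.debugShowDuplicates"


def pref_line(name: str, value: bool) -> str:
    """Format a boolean preference in Firefox's user.js syntax."""
    return f'user_pref("{name}", {"true" if value else "false"});'


def enable_show_duplicates(profile_dir: Path) -> Path:
    """Append the preference to user.js in profile_dir unless it is already there.

    profile_dir is an assumption: point it at your own Firefox profile folder.
    """
    user_js = profile_dir / "user.js"
    line = pref_line(PREF_NAME, True)
    existing = user_js.read_text(encoding="utf-8") if user_js.exists() else ""
    if line not in existing:
        # Keep whatever is already in the file and add our line at the end.
        if existing and not existing.endswith("\n"):
            existing += "\n"
        user_js.write_text(existing + line + "\n", encoding="utf-8")
    return user_js
```

Close Firefox before editing the file and restart it afterwards so the preference is picked up; the Show Duplicates entry should then appear as described above.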
I ran it on my duplicate-free library of 826 items and it came up with 26 false positives, for a specificity of 97%, which is perfectly acceptable to then allow me to manually prune the duplicates. Of the 26 false positives, the heuristic, if that's the word, behind each was clear--patents with nearly identical titles, papers with the same first author or nearly identical titles, etc.
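For reference, the 97% figure follows from treating every item in a duplicate-free library as a true negative unless it was flagged. A quick sketch of that arithmetic (the function name is just for illustration):

```python
def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of non-duplicates that were correctly left unflagged."""
    return true_negatives / (true_negatives + false_positives)


total_items = 826        # library size, known to contain no duplicates
false_positives = 26     # items wrongly flagged as duplicates
true_negatives = total_items - false_positives  # 800 items correctly left alone

rate = specificity(true_negatives, false_positives)
print(f"{rate:.1%}")  # 96.9%, i.e. roughly 97%
```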
as rudimentary as it is, it's still a huge help to us in our importing of libraries.
many many thanks for this.
I'm really at my wits' end on this one--I've patiently waited for about two years now for Zotero to implement some sort of duplicate detection, and hopefully deletion, and still nothing functional has been released. I can't even use my library anymore.
Sure, I could spend 6 hours manually deleting everything, and I've tried that. Then I discovered that not every entry could find the snapshot attached to it, so really I'd need to go through each entry and pull up the snapshot to make sure the link hasn't been broken.
Please do something about this duplicates problem!
But we're aware of the demand for this feature. Doesn't really help you now, but for what it's worth, this only happens if you sync separate upgraded-from-1.0 libraries. Zotero sync never creates duplicates by itself—it can't, in fact.
I really appreciate you responding to my comment. I know you guys have put a lot of work into this and are very busy.
I guess my point is, people have been asking for duplicate detection for three years, and so far there's still no solution. I've read the forums--and I understand you don't want to break links, etc., but frankly without this feature Zotero is severely crippled.
I created the option, but I cannot find the menu.
However, on top of my wish-list is to have the duplicate check performed at the time of import. I noticed that I can still import refs that are already in the database. Hopefully that will be included in the released version.
Again, thanks for sharing the hidden feature! Even in this not-yet-perfect state it's helping me a lot.
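To make the import-time wish above concrete, here is a rough sketch of what such a check could look like. This is not Zotero's code or its actual heuristic; the item fields (`title`, `first_author`), the function names, and the 0.9 similarity threshold are all assumptions chosen for illustration, loosely based on the "same first author or nearly identical titles" behaviour described earlier in the thread.

```python
import re
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase a title and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def is_probable_duplicate(new: dict, existing: dict, threshold: float = 0.9) -> bool:
    """Flag a pair when the first authors match and the titles are nearly identical."""
    same_author = new["first_author"].lower() == existing["first_author"].lower()
    ratio = SequenceMatcher(
        None, normalize(new["title"]), normalize(existing["title"])
    ).ratio()
    return same_author and ratio >= threshold


def import_items(library: list, incoming: list) -> list:
    """Add incoming items to library, returning the ones held back as probable duplicates."""
    flagged = []
    for item in incoming:
        if any(is_probable_duplicate(item, old) for old in library):
            flagged.append(item)   # leave for manual review instead of importing
        else:
            library.append(item)
    return flagged
```

Comparing every incoming item against the whole library is slow for large libraries, which fits the remark that the existing hidden option is slow to run. Holding flagged items back for manual review rather than deleting them keeps false positives harmless.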
Using FF 3.6.13, Mac Snow Leopard.
Tried restarting, etc.