Accept all duplicates

I have looked through a few threads and haven't found any reference to an existing option to accept all duplicates, though several people over the last few years have requested such a feature. If it already exists then please let me know and apologies for not finding it already. If not...

I realise that one line of advice given to such people is to minimise duplicates at the point of upload to Zotero, but for some (like me) this may not be an option. My situation is that I need to catalogue the records returned by 50 individual (heavily overlapping) searches, but also want to treat the combined records returned by all searches as a single collection (without duplicates). The subcollections facility is fantastic in this regard (I have uploaded the results from each search into a separate subcollection of one big parent collection, and confirmed that merging duplicates in the parent collection still leaves each record in all relevant subcollections). Given that Zotero can automatically detect duplicates, an option to simply accept and merge all of them with one click of a button would be massively helpful.

Thanks,

Mark
  • simply accept and merge all of them with one click of button would be massively helpful.
    That's not so simple though. What strategy should Zotero use for merging?

    When a field is only populated for one item, Zotero could be greedy and populate that field on merge (usually that is the desirable result).

    What if the field is populated in more than one item and the values do not match? Which value should Zotero pick for automatic merging? There are some heuristics that we could use to determine whether one value is better than the other, but, generally, Zotero needs the user to decide which value should be retained.

    So, we can improve this a little bit, but, overall, I don't think that we can avoid user interaction. A similar problem exists for sync conflicts as well.
  • Thanks for your reply. Where two entries are not perfectly identical I agree that automated detection of duplicates will always be difficult. However, detection isn't an issue here - all duplicates I want to get rid of are PERFECT duplicates, and from what I can see in the duplicates selection pane Zotero already "knows" what records to merge. All I want is a button to enable me to merge them all with one click. Even for users dealing with non-perfect duplicates, an option to select which fields must be identical before finding duplicates, followed by the ability to check through putative duplicates before an automated, one-click merge, would put the onus for getting the process right squarely on the user. This would save many users a lot of time.
  • No, the issue is not with finding duplicates, it's precisely with merging duplicates that have different values for the same fields. In any case, I agree with you that we can make the process much easier for duplicates that are perfectly identical.
  • With regard to merging of non-unique fields, you could have a few basic options that would work well for many situations, e.g. keep the field with the most characters, prompt the user whenever the fields aren't all identical, keep the oldest entry in the library, or keep the newest. One of these should suit all users at least as well as the current situation of manually approving each and every merge. Cheers, Mark
  • The issue of deciding which of the duplicates gets to keep its information is not an issue at all. You could either stop the automatic merging whenever there is a decision to be taken and ask the user, or just be greedy (and show the user a warning before starting the merging process). This really is straightforward and I wonder why Zotero devs haven't implemented this already.
  • For those of us who are careful about the quality, uniformity, accuracy, and completeness of the record data, merging is not often straightforward.
  • @aurimas re comment of June 25, 2015: at least for perfect duplicates, as you agreed, why is this still not available then? There have been loads of threads on the Zotero forums about a merge-all function, going back years and years.

    Is it a matter of technical possibility, manpower, willingness, etc? I believe in matters like this, developers need to be accountable and answerable to user needs.
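
The merge strategies discussed in this thread (greedy fill when only one item has a value, stopping to ask the user on a conflict, or applying a fixed policy such as "keep longest", "keep oldest" or "keep newest") can be sketched roughly as follows. This is only an illustrative Python sketch, not Zotero's code or API: the dictionary item format, the "dateAdded" bookkeeping field, and every function name here are assumptions made up for the example.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical item format: field name -> string value.
Item = Dict[str, str]
Candidate = Tuple[Item, str]  # (source item, its value for the field)
Resolver = Callable[[str, List[Candidate]], str]

# Assumed bookkeeping field (ISO date string); never compared or merged.
META_FIELDS = ("dateAdded",)


class MergeConflict(Exception):
    """A field has differing values and no policy was supplied."""


def keep_longest(field: str, candidates: List[Candidate]) -> str:
    # "Keep field with most characters".
    return max(candidates, key=lambda c: len(c[1]))[1]


def keep_oldest(field: str, candidates: List[Candidate]) -> str:
    # Take the value from the item that was added first.
    return min(candidates, key=lambda c: c[0]["dateAdded"])[1]


def keep_newest(field: str, candidates: List[Candidate]) -> str:
    # Take the value from the item that was added last.
    return max(candidates, key=lambda c: c[0]["dateAdded"])[1]


def merge_items(items: List[Item], resolve: Optional[Resolver] = None) -> Item:
    """Merge duplicate items field by field.

    Fields populated in only one item, or identical everywhere, are
    merged without asking. Differing values go to `resolve`; with no
    resolver, MergeConflict is raised so the caller can ask the user.
    """
    # Collect field names in order of first appearance.
    fields: List[str] = []
    for item in items:
        for field in item:
            if field not in META_FIELDS and field not in fields:
                fields.append(field)

    merged: Item = {}
    for field in fields:
        # Only non-empty values count as "populated".
        candidates = [(it, it[field]) for it in items
                      if it.get(field, "").strip()]
        distinct = {value for _, value in candidates}
        if not distinct:
            continue  # empty in every item
        if len(distinct) == 1:
            merged[field] = candidates[0][1]  # greedy fill / perfect match
        elif resolve is None:
            raise MergeConflict(f"{field}: {sorted(distinct)}")
        else:
            merged[field] = resolve(field, candidates)
    return merged
```

Under these assumptions, merge_items(items) with no resolver handles the "perfect duplicates" case without any prompts and stops at the first real conflict, while merge_items(items, keep_longest) applies one policy to every conflicting field. Whether any fixed policy is safe for carefully curated records is exactly the point disputed above.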