Merge all duplicates

Hello, first-time Zotero user here. I'm a bit baffled that I can't ask the program to merge all duplicates, instead of having to click "merge" on each individual case.

As I have just combined some old and new reference libraries of mine, I have several hundred (if not thousands) of duplicates from the different libraries, and I'm not in the mood for clicking "merge", waiting 5 seconds, and clicking "merge" again. It would take me hours.

Any solution to this?

/thanks
  • No solution atm. Are the references identical? (except for date added, date modified timestamps) Otherwise, I'm not sure how Zotero is supposed to automatically figure out which metadata you want to keep and which you want to discard.
  • (Maybe I should add that I'm a biologist, so it's mainly journal references that I'm talking about, in the context of writing my own manuscripts etc.)

    Thanks for your reply.
    Yes, next to identical at least. Since I have so many of them anyway, I'm not going to bother choosing one over the other; a simple merge using the newest item as "master" would suffice.

    If there isn't a solution to this, I will basically have to choose one of my libraries and discard all of the others, since there is a big overlap. It also prevents me from searching ISI for a topic/organism/method/etc. I'm currently interested in and writing about, downloading all the citations, and then putting them into my library, because that would in almost all cases lead to a huge number of duplicates.

    Actually, simply deleting the oldest duplicate would be perfectly fine too, if merging is computationally intensive.
  • If deleting is ok and the duplicates are coming from separate imports (e.g. A1, B1, C1, D1 were imported first, then you imported A2, B2, C2, D2), then the following may work out well.

    1. Select one of the items that is duplicated and tag it with a "duplicate" tag. The tag should now appear in the tag selector in the bottom left.

    2. Open the duplicates collection. Select all items with Ctrl/Cmd + A and drag-drop them onto the "duplicate" tag in the tag selector. This will tag all duplicate items.

    3. Go to My Library. In the middle pane add the Date Added column. Use the column to sort all of your items. Now select the "duplicate" tag in the tag selector. This will display all duplicate items in your library sorted by Date Added. (e.g. A2, B2, C2, D2, A1, B1, C1, D1).

    4. Now, using Shift + click you can select all items up to some date (e.g. the timestamp when you imported the first time) and delete those items.
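
The selection logic behind steps 3 and 4 can be sketched in plain JavaScript. This is only an illustration under stated assumptions: the item shape (`id`, `dupKey`, `dateAdded`) and the function name `olderDuplicates` are hypothetical and are not part of the Zotero API, and the sketch assumes you want to keep the newest copy of each duplicate set.

```javascript
// Hypothetical item shape: { id, dupKey, dateAdded }, where dupKey identifies
// a duplicate set and dateAdded is an ISO 8601 string. Not the Zotero API.
function olderDuplicates(items) {
  // Group items by their duplicate-set key.
  const groups = new Map();
  for (const item of items) {
    if (!groups.has(item.dupKey)) groups.set(item.dupKey, []);
    groups.get(item.dupKey).push(item);
  }

  const toDelete = [];
  for (const group of groups.values()) {
    if (group.length < 2) continue; // not duplicated, so keep it
    // Sort newest first; everything after the first entry is an older copy.
    const sorted = [...group].sort(
      (a, b) => Date.parse(b.dateAdded) - Date.parse(a.dateAdded)
    );
    toDelete.push(...sorted.slice(1));
  }
  return toDelete;
}

// Two imports (A1/B1 first, A2/B2 later) plus one item that has no duplicate.
const library = [
  { id: "A1", dupKey: "A", dateAdded: "2014-09-01T10:00:00Z" },
  { id: "B1", dupKey: "B", dateAdded: "2014-09-01T10:00:00Z" },
  { id: "A2", dupKey: "A", dateAdded: "2014-09-17T10:00:00Z" },
  { id: "B2", dupKey: "B", dateAdded: "2014-09-17T10:00:00Z" },
  { id: "E1", dupKey: "E", dateAdded: "2014-09-01T10:00:00Z" },
];
const idsToDelete = olderDuplicates(library).map((item) => item.id).sort();
console.log(idsToDelete);
```

Note that the unique item E1 shares an import date with A1 and B1 but is never selected, which is the reason for restricting the view to the "duplicate" tag before deleting by date.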
  • I tried following your advice, but I don't see how this would allow me to delete *only* the duplicates imported at that time, and not all the references imported at that time. Or do you mean I have to add the "duplicate" tag to all duplicate items? Because that kind of defeats the purpose of not having to click "merge" a thousand times in the duplicates list.

    I found a "solution" by installing EndNote in trial mode, exporting the Zotero library to RIS, importing the RIS file into EndNote, and choosing "remove duplicates" when prompted during the import. But then I'm left wondering if I should just stick with EndNote instead, although I don't particularly like it.
  • "Or do you mean I have to add the "duplicate" tag to all duplicate items?"
    Yes.
    "2. Open the duplicates collection. Select all item with Ctrl/Cmd + A and drag-drop them onto the "duplicate" tag in the tag selector. This will tag all duplicate items."
    That's a single step.
  • This is a quick and dirty hack for automating the merging of duplicate Zotero items.
    To use it, you need to have your Zotero library open in Firefox and have a set of duplicate items selected in the duplicates pane.

    Open a browser tab and enter: chrome://zotero/content/include.js


    Then open the "Web Console" from the developer tools (there are other options, such as Execute JS, too).


    Then paste the following lines into the console:


    var Zot_Dup_Pane = Components.classes["@mozilla.org/appshell/window-mediator;1"]
        .getService(Components.interfaces.nsIWindowMediator)
        .getMostRecentWindow("navigator:browser")
        .Zotero_Duplicates_Pane;

    // Note: this will fail on duplicates that are not the same item type, and it may crash if you have a large library.
    // In that case I export subsections of the library and merge them in smaller sets, importing them into a new library at the end.
    // You may want to change the while loop into a for loop with fewer iterations so you can control how many records are merged in each go.

    while (true) {
        Zot_Dup_Pane.merge();
    }
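
The comments above suggest swapping the endless loop for a bounded one. A minimal sketch of that idea follows, written so it runs outside Zotero: `mergeInBatches` and `fakeMerge` are hypothetical names, and the real merge call (for instance a wrapper around `Zot_Dup_Pane.merge()`) would have to be passed in by you. Whether the real call throws on failure is an assumption that has not been checked here.

```javascript
// mergeOnce is a caller-supplied function. Inside Zotero it might wrap
// Zot_Dup_Pane.merge() from the post above (an assumption, not tested here).
function mergeInBatches(mergeOnce, maxMerges) {
  let done = 0;
  for (let i = 0; i < maxMerges; i++) {
    try {
      mergeOnce();
    } catch (err) {
      // Stop at the first failure instead of looping forever.
      console.log(`Stopped after ${done} merges: ${err.message}`);
      break;
    }
    done++;
  }
  return done;
}

// Stand-in for the real merge call: pretends only 3 duplicate sets remain.
let remaining = 3;
const fakeMerge = () => {
  if (remaining === 0) throw new Error("nothing left to merge");
  remaining--;
};
const merged = mergeInBatches(fakeMerge, 10);
console.log(`merged ${merged} sets`);
```

Capping the count per run makes it easier to check the library between batches, which matters if a large library is prone to crashing.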
  • I ran into this problem and created a Sikuli script to automate clicking on the 'merge' button. I just click 'run' and leave it to run overnight (or over the weekend):

    https://github.com/escaped-echidna/zotero_merge_all_duplicates

    Also I noted that there was an 'AutoHotKey' solution posted earlier by a user called 'Graham_MTM.' I haven't used it but it could work:

    https://forums.zotero.org/discussion/30816/semi-automate-merging-duplicates

  • Speaking as a non-coding geek, do any of these scripts work for macintosh?
  • Sikuli(x) does run on a Mac; in principle that solution should work, though I haven't tested it.
  • I should have clarified that I used my mac to both create and test the Sikuli script. I had to restart it a few times to merge all of the duplicates, because it would hang up for some reason - maybe because it didn't recognize all of the non-article icons.
  • Any solution in Zotero for this? Clicking "merge" for every duplicate is very impractical. Zotero already shows a list of items to choose the "master" prior to merging, perhaps the first item in each list can be chosen as "master" automatically for all duplicates? Thank you.

    The issue has been raised a few times already:
    https://forums.zotero.org/discussion/68501/merge-auto-function
    https://forums.zotero.org/discussion/40457/merge-all-duplicates
    https://forums.zotero.org/discussion/29381/merging-many-duplicates-in-zotero-standalone
    https://forums.zotero.org/discussion/50221/accept-all-duplicates
    https://forums.zotero.org/discussion/31565/how-do-i-merge-12-000-duplicates
    https://forums.zotero.org/discussion/36391/tips-for-speeding-up-duplicate-merging
  • @rachelmurray25 that was very helpful! I can make it work, but I run into trouble when I have duplicates of different item types (e.g., six conference papers and one journal article), which are recognized as duplicates but cannot be merged unless they have the same type. Could you manage to automate that process too?
  • I sort by Notes and that keeps the auto highlighting in Duplicate collection steady. I then can just click several hundred times to merge my datasets... good enough for me.
  • I found a way to (kind of) automate it, which works for me so far.

    Open up the Duplicate Items folder. Go to Tools > Developer > Run JavaScript

    and use this Snippet:

    var DupPane = Zotero.getZoteroPanes();
    // Repeat up to 100 times, pausing one second before each merge.
    for (var i = 0; i < 100; i++) {
        await new Promise(r => setTimeout(r, 1000));
        DupPane[0].mergeSelectedItems();
        Zotero_Duplicates_Pane.merge();
    }

    Basically, this clicks the "Merge X items" button 100 times, waiting one second between clicks. This worked well for me, as I had to merge some thousands of items after importing multiple files. Maybe that helps someone. As I am not really aware of any implications or side effects this function may have, use it with caution!
  • @marcelparciak Could you specify in which application this script should be run, and whether the web version and/or the desktop version of Zotero should be running?
  • The desktop version has the Tools > Developer > Run JavaScript menu item.
  • This gives an error from time to time.

    `TypeError: oldestItem is undefined`

    One can fix this by fixing dates on any items with stuff like "In press".
  • Perhaps I am mistaken, but I think the reason that you get `TypeError: oldestItem is undefined` is that you have not selected any of the entries (i.e., you clicked on the Duplicate Items *collection* but did not click on any of the items that appeared within the resulting view). "Fixing dates" only appears to solve the problem because in the process of doing that you incidentally selected an entry.
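
If the explanation above is right, a merge loop could check for a selection before each merge and stop cleanly when there is none. The following is a sketch of that guard only, runnable outside Zotero: `mergeWhileSelected`, `getSelectedCount`, and `mergeOnce` are hypothetical names, and how to read the selection inside Zotero is deliberately left to the caller.

```javascript
// Both callbacks are hypothetical hooks supplied by the caller; they are not
// Zotero API functions.
async function mergeWhileSelected(
  getSelectedCount,
  mergeOnce,
  { maxMerges = 100, delayMs = 1000 } = {}
) {
  let done = 0;
  while (done < maxMerges) {
    // Nothing selected: stop instead of calling merge and hitting an error.
    if (getSelectedCount() === 0) break;
    await mergeOnce();
    done++;
    // Give the interface time to move on to the next duplicate set.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return done;
}

// Stand-in run: two duplicate sets are "selected" one after the other.
let fakeSets = 2;
mergeWhileSelected(
  () => fakeSets,
  () => { fakeSets--; },
  { delayMs: 10 }
).then((count) => console.log(`merged ${count} sets`));
```

The returned count tells you how far the run got, so a run that ends early is distinguishable from one that reached the cap.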
  • edited June 15, 2020
    @Aurimas, re your suggestion dated September 17, 2014 above: my library does not seem to record 'date added' as the date an item was added to Zotero, but rather the date it was added to my previous reference manager (I am migrating from Mendeley).

    ('date accessed' also does not reflect the date items were added to Zotero)

    I imported references on several occasions, each with its own collection.

    Any ideas on mass de-duplication in this case?

    Thank you
  • "I imported references on several occasions, each with its own collection."
    Well, what do you mean by that? If you use Zotero's official Mendeley importer multiple times, it won't create duplicates. If you imported from a file multiple times, those would have distinct Date Added values.
  • @dstillman yes, I used the Zotero Mendeley importer.

    I had to import several times as the import process failed several times.

    It creates duplicates.

    Files do not have distinct date added values.
  • @aurimas re comment of September 17, 2014: I cannot seem to add multiple items from the 'Duplicate Items' special collection to a tag in the lower left panel.
  • @Chomplainer re comment of September 17, 2014: until there's something more workable for when 'date added' is not correct and batch tagging duplicate items (and sorting by tag) is difficult, I have resorted to your suggestion, and it works for me. Thank you.
  • @marcelparciak
    Your trick seems to be working for me. Running it now. No problems so far.
  • The applet "autoclicker.exe" is another useful resource in this situation, at least for Windows 10. (It does exactly what the name implies.)
  • re: @aurimas September 17, 2014

    "...Open the duplicates collection. Select all item with Ctrl/Cmd + A and drag-drop them onto the "duplicate" tag in the tag selector. This will tag all duplicate items..."

    I notice the 'Select All' shifts back to only one set of duplicates (A1+A2 (+A3 and so on where applicable)) based on the item where the cursor was last located to begin the drag action. i.e. it will not drag ALL items in the 'duplicate items' collection to the duplicate tag.

    It seems 'selection' in the 'duplicate items' collection does not behave like it does in other collections.

    Does anyone else experience this? Is there a way to drag ALL duplicate items to the tag as suggested in this solution?
  • edited June 10, 2021
    @a.abdul-rahman

    You can use Zutilo to copy a tag to all selected items.
  • I can't believe Zotero has not found a way to merge all duplicates with one click of a button. I have 876 items in the duplicates folder. There must be a way in 2021?
  • Thank you @marcelparciak for your script.

    It works for a few items, then I get the message: TypeError: oldestItem is undefined

    Any idea why?