Error when merging duplicates

Hey,

I'm writing my thesis and I'm struggling with the merge function for my duplicates.

Before merging, you can control which articles get included and choose the "master item", but the majority of the duplicate sets that Zotero has identified include irrelevant articles. Some even suggest merging up to 40 items, where the articles are all different.

I will appreciate it like crazy if there is a way to solve this.

Thank you in advance.
Kindest regards,
Ida
  • Can you take a screenshot showing what you're describing, upload it somewhere (Google Drive, Dropbox, etc.), and provide a link here?

    Items shouldn't be detected as duplicates unless they share the same DOI or ISBN or a very similar title, and in that case that's just bad metadata in your library.
  • Here's the link: https://drive.google.com/file/d/10AS42475MJdBu8Ft4NqE69ROkZn2v_Vh/view?usp=sharing

    The data is downloaded from acknowledged databases.

    Am I doing something wrong?
  • edited November 18, 2022
    Where did you import from? If you look at those items, you'll see that they all have the same DOI, which is obviously a mistake — DOIs should be unique, which is why Zotero uses them for duplicate detection.

    (You can look at one of the items in the library root and copy its DOI into the search field in All Fields & Tags mode to see all the items with the same DOI.)
  • Articles are imported from Pubmed, Cochrane library and Embase.

    I used the Zotero connector plugin for Pubmed and Cochrane, but for Embase I imported the articles as a RIS file.

    You're right! They do have the same DOI. That's very strange.

    So something must be wrong with how I imported the articles? I just don't know what.

    The DOI is the same for articles from both Embase and Cochrane, and I used different methods importing them.

    I'm very grateful for your help and advice!
  • edited November 18, 2022
    If you can provide exact Steps to Reproduce for a couple different items that end up with the same DOI, we can take a look. (In other words, you'll need to actually reproduce the problem and confirm that you end up with two new items with the same DOI, and then tell us exactly how we can reproduce that ourselves.)
  • https://drive.google.com/drive/folders/1GSmZ1mv2Q-uQSKiewuq4wU8Rtq0yDNcB?usp=sharing


    I went through the entire process again, and unfortunately the results are the same.

    I've sent a link to folders of screenshots; the file names explain each step of the process. I hope they make sense.
  • edited November 18, 2022
    Sorry, but those aren't steps to reproduce. See the linked page for an example. Steps to reproduce are, literally, go to this site, type this exact text into the search box, check these two checkboxes, click this button, etc., and then look at the two new items with these titles and see that they have this same DOI… We could try to suss all that out from the edges of your screenshots but really we just need you to tell us exactly what to do to reproduce this on our computers.
  • (And it only needs to be for two or three items, not hundreds. The point is to tell us the minimal steps to reproduce it so that we can fix it, not to add hundreds more items to your library for no reason.)
  • I've decided to eliminate the duplicates manually.

    But thank you for your swift aid.
  • edited November 18, 2022
    OK, but the point is that we'd like to fix this for everyone if it's something we can fix. All we're asking for is for you to tell us the 30-second process to generate two items with the same DOI.
  • I find the instructions slightly confusing.

    So I'm not able to provide a Report ID, but you want me to use the "BAD Report ID: 1892199645"?

    In the title or the text?
  • You're really overthinking this. The "Good" sections — not the "Bad" sections — are the examples from the page you're supposed to follow, but I'm just asking you how you're reproducing this. Just tell us what site you're going to, what you're searching for, what results you're exporting, and how you're exporting them. All we want to do is reproduce what you're seeing.
  • I'm sorry. I'm an overthinker. Especially around thesis-times, and I've struggled so much with this so I'm pretty exhausted.

    Apologies for taking so long. I had to investigate which database the duplicates came from.


    1. Go to Embase
    2. Search for "Impact of transobturator tension free vaginal tapes on quality of life and sexual function in women with mixed urinary incontinence"
    3. Tick the box of the first article
    4. Click on the "export" button and choose RIS format (Mendeley, EndNote).
    5. Export
    6. Download the RIS file
    7. Go to Zotero
    8. Go to the file tab and press import
    9. Import the RIS file from EMBASE

    10. Go to Embase
    11. Search for "The relationship between age & the impact of pelvic floor symptoms: 'The 4,000 women study'"
    12. Tick the box for the first article
    13. Click on the "export" button and choose RIS format (Mendeley, EndNote).
    14. Export
    15. Download the RIS file
    16. Go to Zotero
    17. Go to the file tab and press import
    18. Import the (second) RIS file from Embase

    19. Check the DOI of both articles.
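
The DOI check in step 19 can also be done on the exported RIS files before importing them. The following is only a rough sketch, assuming the export stores the DOI under the `DO` tag and the title under `TI` (or `T1`); other exporters may use different tags. The `SAMPLE` text is hand-written for illustration and is not an actual Embase export, and `parse_ris` and `shared_dois` are hypothetical helper names.

```python
import re
from collections import defaultdict

# RIS lines look like "TY  - JOUR": a two-character tag, two spaces, a hyphen, then the value.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")


def parse_ris(text):
    """Split RIS text into records, each a dict mapping tag -> list of values."""
    records, current = [], defaultdict(list)
    for raw in text.splitlines():
        match = TAG_LINE.match(raw.strip())
        if not match:
            continue  # blank and continuation lines are ignored in this sketch
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":  # "ER" closes the current record
            if current:
                records.append(dict(current))
            current = defaultdict(list)
        else:
            current[tag].append(value)
    if current:  # tolerate a final record with no closing ER line
        records.append(dict(current))
    return records


def shared_dois(records):
    """Return {doi: [titles]} for every DOI that appears on more than one record."""
    by_doi = defaultdict(list)
    for record in records:
        # Assumption: the DOI lives in the "DO" tag.
        doi = (record.get("DO") or [""])[0].lower()
        title = (record.get("TI") or record.get("T1") or ["(no title)"])[0]
        if doi:
            by_doi[doi].append(title)
    return {doi: titles for doi, titles in by_doi.items() if len(titles) > 1}


# Hand-written sample using the two titles and the DOI mentioned in this thread.
SAMPLE = """\
TY  - JOUR
TI  - Impact of transobturator tension free vaginal tapes on quality of life and sexual function in women with mixed urinary incontinence
DO  - 10.1002/nau.20973
ER  -

TY  - JOUR
TI  - The relationship between age & the impact of pelvic floor symptoms: 'The 4,000 women study'
DO  - 10.1002/nau.20973
ER  -
"""

for doi, titles in shared_dois(parse_ris(SAMPLE)).items():
    print(doi)
    for title in titles:
        print("   ", title)
```

Running this over the concatenated contents of the two exported files would list any DOI that more than one record carries, which is the condition that makes Zotero group unrelated items together.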
  • Could it possibly just be my computer not cooperating?

    I checked my duplicates folder again, and the two articles that previously had the same DOI (the ones I sent to you) have now been grouped with other articles. I really don't understand anything.

    I hope it's just my computer/software/technological curse acting up.
  • Yeah, so if you go to each record and scroll down to the bottom, you can see that they have the same DOI (10.1002/nau.20973), which is also included in the RIS file for each. It looks like these are all from the same conference proceedings (pp. 839-840 and pp. 844-845), and at the time (in 2010) they didn't generate distinct DOIs for them.

    For a future version, we can try to improve duplicate detection to ignore DOI matches when the title or various other things are completely different. Ultimately, though, DOIs are the most reliable way to identify sources for all sorts of contexts (not least of which is persistent links to the individual sources), and journals shouldn't do this. And they mostly don't — this isn't something we see regularly.

    For now, your best bet is probably to just ignore these in the duplicate view. In a future version, we'll also make it possible to mark items as non-duplicates, so you'd be able to hide these at that point.
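
The idea of ignoring a DOI match when the titles are completely different could look roughly like the sketch below. This is not Zotero's implementation: the function names, the `doi` and `title` dictionary keys, the word-overlap measure, and the 0.5 threshold are all assumptions chosen for illustration.

```python
import re


def title_tokens(title):
    """Lowercase the title and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def title_overlap(a, b):
    """Jaccard overlap of the two titles' word sets, between 0.0 and 1.0."""
    ta, tb = title_tokens(a), title_tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def probably_duplicates(item_a, item_b, min_overlap=0.5):
    """Treat a shared DOI as a duplicate only if the titles also overlap.

    min_overlap is an arbitrary illustrative threshold, not a tuned value.
    """
    doi_a = (item_a.get("doi") or "").strip().lower()
    doi_b = (item_b.get("doi") or "").strip().lower()
    if not doi_a or doi_a != doi_b:
        return False  # this sketch only covers the shared-DOI case
    return title_overlap(item_a.get("title", ""), item_b.get("title", "")) >= min_overlap


abstract_1 = {
    "doi": "10.1002/nau.20973",
    "title": "Impact of transobturator tension free vaginal tapes on quality of "
             "life and sexual function in women with mixed urinary incontinence",
}
abstract_2 = {
    "doi": "10.1002/nau.20973",
    "title": "The relationship between age & the impact of pelvic floor "
             "symptoms: 'The 4,000 women study'",
}

print(probably_duplicates(abstract_1, abstract_2))  # False: same DOI, unrelated titles
```

The two abstracts from this thread share only the words "impact", "of", and "women", so they fall well below the threshold and would not be offered for merging, while two copies of the same record with the same title still would be.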
  • Thank you for everything.

    Looking forward to seeing the adjustments.
  • (the DOI is for the full conference program, which has just abstracts, not full papers -- Embase just appears to index every individual abstract and then includes the DOI to each of them, which isn't great, but also not completely unreasonable)