please give us a duplicate items pop up warning

hello,
Overall, Zotero is fantastic. I hope you can add a duplicate-items pop-up warning (a small pop-up that appears as soon as a duplicate is being downloaded).
The reason is that I don't see any good reason to have a duplicate item, so duplicates are always a headache for me. The problem is that I never know whether I'm downloading a duplicate. By the time I discover some, there are normally five or ten already piled up in that folder. I create many different folders (more than a thousand), and very often an item goes into three or four of them, or even more. Say I have one article that should go into seven folders, A through G. With a single copy, that is not a problem. With two copies, the first goes to folders A, B, C, and D and the second goes to folders E, F, and G; by the time I discover the duplication, I am usually unable to consolidate the article and move it to all seven folders. This is because an item doesn't show which folders it belongs to (which is another feature I'd like to suggest: for example, when we right-click an item, ideally a drop-down window, maybe a second layer, would show all the folders this article belongs to).
Thanks a lot
  • For the second request:
    https://www.zotero.org/support/collections_and_tags#identifying_collections_an_item_is_in

    For the first request, there is an add-on that kind of does this if you're _not_ running Standalone (i.e. just the Firefox version). If you search the forums for "prevent duplicates" you should be able to find more info on that.
  • Hi Adamsmith,

    Thank you so much for your quick reply. I ran into problems with both solutions.

    For the first request, I was able to get everything ready to install the add-on in Firefox, but the new Firefox blocks it (they have updated their security policy to block any unverified add-ons). If you did a poll, my suspicion is that most people would find duplicates unnecessary. If 99% of people (I just made that number up, but you can ask researchers at your school) find duplicates a waste of energy, why not figure out a way to prevent them once and for all?

    For the second request, imagine that I have 1000 collections. Zotero would lock the screen to the first folder (alphabetically), so I wouldn't be able to see any other folders unless they show within that one screen.

    Any suggestions? thanks a lot!
  • For the first one: there's a hidden option in Firefox to disable that check. It should be described in at least one thread on the add-on. The reason this isn't implemented in Zotero isn't that people don't think it'd be useful, but that it just hasn't been done (the add-on is a provisional solution with several limitations, including the fact that it won't work for Standalone).

    I don't have much else on the collections. I think displaying them in the right-hand panel is generally planned, but I wouldn't expect that super soon.
  • Might you have any suggestions for deduping a huge set without going one by one? I am using the Duplicate Items view and merging items title by title. I am working with a really big set of citations, however (over 20K), and the citations are duplicated by title but are not exact duplicates in terms of citation. The searches are easy to run, so if I could prevent pulling in duplicates, that would be ideal, but I am using Standalone, so I can't use the add-on and hidden option you've suggested above, Adam. The real issue is that the same citation is indexed by several databases (PubMed, MEDLINE Ovid, Web of Science, etc.), so the citations are not exact duplicates, while the title most often is. I am going through all 20K title by title and I am dying over here with it. I am open to suggestions. Please.
  • There's nothing obvious/out of the box. bwiernik is most likely to have clever ideas for solutions, given the work he does. I have nothing, I'm afraid.
  • I am guessing that you are preparing a systematic review. I will offer some unsolicited advice and a couple of comments. None of this will be directly related to Zotero more than to any reference management software. Duplicate identification is as much art as science.

    First, know that _everything_ in Medline is also in PubMed, but PubMed has many records that are not in Medline. If you pull records from both, you will have duplicates of each record you retrieved from Medline. Note also that, for some topics, you must take great care in selecting the index terms and text-words you use in your search. Any search of PubMed that doesn't also include text-words is certain to omit relevant records. (The articles that aren't in Medline do not have MeSH terms assigned, and although there is an algorithm that can helpfully map MeSH terms to text-words, it is imperfect.)

    If you look closely at the records imported from different databases, there will be differences in the completeness and the character of the references depending upon the database. Unless you examine each duplicate item by item, you will not know which record is the most complete. Some databases provide only author initials and not the full name. Others don't provide the DOI. Some databases delineate title from subtitle with some sort of punctuation while others do not. Some databases, most notably Web of Knowledge / (Social) Science Citation Index, actually make additions to the punctuation and even the spelling of title words: "Traumatic Brain Injuries" becomes "Traumatic Brain-Injury" or "Traumatic-Brain Injury". Prepositions are changed. Articles are changed. These changes act as nihilartikel or Mountweazel entries, alerting the database's management when identical items in another database were captured or plagiarized from the parent database. This makes duplicate-finding more complicated. It also complicates using _all_ bibliography management software, because the citation can have subtle inaccuracies. At one time the database record for some journal articles added or subtracted a page from the listed pagination.

    Also know that databases and journal publishers differ in how they treat letters to the editor and corrections that concern an article. You will want to identify these adjuncts to the articles that meet your inclusion criteria. Sometimes these show up as duplicates; other times they may not be found at all. Quite a few times in reviewing manuscripts I have pointed out that a cited article had a table that was corrected in a later issue, and one time that the article had been withdrawn.
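To make the title differences described in this post concrete, here is a minimal Python sketch of how near-identical titles could be compared. It is not how Zotero detects duplicates; the function names `title_key`, `similarity`, and `likely_duplicates` and the 0.9 threshold are assumptions chosen for illustration, and any pair it flags still needs a human look.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def title_key(title: str) -> str:
    """Reduce a title to lowercase letters and digits separated by single spaces."""
    # Split accented characters into base letter + combining mark, then drop the marks.
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Hyphens, colons, and other punctuation all become a single space.
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def similarity(a: str, b: str) -> float:
    """Similarity of two titles after normalization, between 0.0 and 1.0."""
    return SequenceMatcher(None, title_key(a), title_key(b)).ratio()


def likely_duplicates(a: str, b: str, threshold: float = 0.9) -> bool:
    """Flag a pair for manual review; the threshold is an arbitrary starting point."""
    return similarity(a, b) >= threshold


if __name__ == "__main__":
    variants = [
        "Traumatic Brain Injuries",
        "Traumatic Brain-Injury",
        "Traumatic-Brain Injury",
    ]
    for other in variants[1:]:
        print(variants[0], "|", other, "|", likely_duplicates(variants[0], other))
```

Punctuation-only variants collapse to the same key, while spelling changes such as "Injuries" versus "Injury" are only caught by the fuzzy comparison, which is why a threshold below 1.0 is used. A lower threshold catches more variants but also flags more unrelated titles.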
  • Do you have attachments already for each item? If using Medline/PubMed, I'm guessing not. In that case, if you have 20k articles that need to be merged and you don't want to do it one by one in Zotero, I'd recommend exporting all the references to CSV, opening the CSV in Excel, using Excel's Remove Duplicates feature, then re-importing to Zotero (after deleting the original exported items). In Excel, you will want to first sort by article title, then by DOI or PMID, to ensure that the record that is retained is the most complete one.
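If you would rather script the CSV step than do it in Excel, the same idea can be sketched in Python. This is only a sketch under assumptions: the column names (`Title`, `Author`, `DOI`), the helper names, and the sample rows (including the placeholder DOI) are made up for illustration, and "most complete" is approximated as "most non-empty fields".

```python
import csv
import io
import re


def title_key(title):
    """Lowercase the title and collapse punctuation so cosmetic variants match."""
    return re.sub(r"[^a-z0-9]+", " ", (title or "").lower()).strip()


def completeness(row):
    """Count the non-empty fields in a row (a rough proxy for 'most complete')."""
    return sum(1 for value in row.values() if isinstance(value, str) and value.strip())


def dedupe_rows(rows, title_column="Title"):
    """Keep one row per normalized title, preferring the row with the most fields filled."""
    best = {}
    for index, row in enumerate(rows):
        key = title_key(row.get(title_column, ""))
        if not key:
            # Rows without a title are never merged with each other.
            key = ("untitled", index)
        if key not in best or completeness(row) > completeness(best[key]):
            best[key] = row
    return list(best.values())


# Made-up sample data standing in for an exported CSV file.
sample = io.StringIO(
    "Title,Author,DOI\n"
    "Traumatic Brain-Injury,Smith J,\n"
    'Traumatic-Brain Injury,"Smith, Jane",10.1000/example\n'
    "Another Article,Lee K,\n"
)
rows = list(csv.DictReader(sample))
unique = dedupe_rows(rows)

if __name__ == "__main__":
    for row in unique:
        print(row["Title"], "|", row["Author"], "|", row["DOI"])
```

For a real export you would replace the `io.StringIO` sample with `open(path, newline="", encoding="utf-8")` and write the result with `csv.DictWriter`. Matching on exact normalized titles will miss spelling variants and can wrongly merge different articles that share a title, so spot-check the output before deleting anything.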