Duplicates

A now-closed discussion on duplicate detection in version 3.0b1 (http://forums.zotero.org/discussion/19230/30b1-duplicate-detection/) elicited the response that it was not possible to confirm detected items as non-duplicates. Is there any advance on this? It really is necessary, in my view. Zotero mistakes far too many items for duplicates - different editions of the same work, separate volumes of the same series, separate reviews of the same book, subsequent publications by the same author with similar or identical titles to earlier works, and so on. The work-around proposed in the earlier thread was to suspend duplicate detection for offending items. This is unattractive as an option, and I'm not sure it's available anyway. Essentially, whenever there is an algorithm that checks criteria to make a prediction, there is the possibility of the prediction being wrong, and there needs to be the option to correct it, surely?
  • I should add apologies if I have missed either (1) a false-positive option in 3.0, or (2) another thread dealing with this matter since the beta version one was closed. I don't think I have, but could be wrong.
  • Nothing new.
    I don't think I quite understand why that's such a big problem, though. So there are a bunch of items in the duplicates folder that shouldn't be there.
  • I realise I could just ignore the Duplicate Items list. But that seems a waste when presumably an awful lot of work went into this feature; and if it worked or was manageable, it would be a really useful one for me.
  • A lot of the work went into the ability to merge duplicates, which should still be useful.
    How many false positives are we talking about here? If it's 10-20, I don't see how that makes the feature useless. If it's 100-200, the problem is mainly that the algorithm still performs poorly.
  • It's more a matter of proportions of true to false positives than absolute numbers. There isn't a single genuine duplicate in the first twenty items in my list. The false ones are the huge majority, and discourage one from looking at the list.

    I'm sure the algorithm will be improved. But no algorithm will ever be sensitive enough to detect duplicates absolutely unerringly in all the many different areas of academic and professional publication that Zotero gets used for. Nor would I expect unerring accuracy. But I would like to be able to put it right in relation to individual errors. It is irksome and messy that the programme thinks there are unresolved duplication issues in my material when there aren't, just as it is when a word processing programme wrongly thinks I've made a spelling error - but in the latter case I can add my spelling to its lexicon.
  • I agree with Clive. The Duplicates mechanism has great potential to help managing my collection, but I have tried and given up on using the Duplicates folder to weed out duplicate items. This is because it is much harder to systematically do so when you cannot identify items as "not duplicates". So, every time you go back, the same false positives are there.

    Thanks,
    Tom
  • I (regrettably) now use Mendeley for this task. It seems to have superior duplication detection and removal abilities.
  • Personally I've had very good experience with the Duplicates function - I suppose it may depend on the nature of your citation database (I don't usually keep multiple versions of an item, e.g. conference paper, working paper, journal paper, and people in my field tend to modify the titles each time).

    I had an idea that I could devise a workaround using "Advanced Search", e.g. tagging the "false positives" and then searching in Duplicates only for those items whose tag did not match that tag. But I notice that "Duplicates" isn't listed as a Collection, although Saved Searches are a type on which you can specify a query. Maybe somebody else is more knowledgeable or inventive with Advanced Search?

    Alan.
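
The tag-based workaround described above can be sketched in a few lines. This is only an illustration of the idea, not Zotero's code or API: the `Item` class, the `not-duplicate` tag name, and the `unreviewed_duplicates` function are all hypothetical, and the candidate sets are assumed to come from whatever detection step already exists.

```python
from dataclasses import dataclass

# Hypothetical tag a user would apply to items wrongly flagged as duplicates.
NOT_DUPLICATE_TAG = "not-duplicate"


@dataclass(frozen=True)
class Item:
    """Minimal stand-in for a library item (not Zotero's data model)."""
    key: str
    title: str
    tags: frozenset = frozenset()


def unreviewed_duplicates(candidate_sets):
    """Drop tagged items from each candidate set and keep only the sets
    that still contain at least two items to compare."""
    result = []
    for group in candidate_sets:
        remaining = [item for item in group if NOT_DUPLICATE_TAG not in item.tags]
        if len(remaining) >= 2:
            result.append(remaining)
    return result


# Two editions of the same work, both marked as a false positive by the user,
# and one genuine pair of duplicates.
first_edition = Item("A1", "Collected Essays", frozenset({NOT_DUPLICATE_TAG}))
second_edition = Item("A2", "Collected Essays", frozenset({NOT_DUPLICATE_TAG}))
copy_one = Item("B1", "A Short Paper")
copy_two = Item("B2", "A Short Paper")

pending = unreviewed_duplicates([[first_edition, second_edition], [copy_one, copy_two]])
```

In this sketch only the untagged pair remains in `pending`, so the same false positives would not reappear on each visit. Tagging whole items is coarser than recording specific "not duplicate" pairs, since a tagged item would also be hidden from any genuine duplicate added later.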
  • edited October 17, 2013
    I don't know if this is the correct thread for inquiring about 100% duplicates, but I will give it a shot...

    When one imports items that are 100% duplicates of existing entries, do these duplicates go straight to the trash, or does one need to "weed" them out manually?


    *******

    Seeing that the last active post is 2012, and not so explicit in terms of its relevance to Zotero 4.0+, I decided to open a new thread:

    https://forums.zotero.org/discussion/32538/duplicate-detection-for-zotero-40-/
  • You need to merge them manually. If Zotero recognizes them as duplicates, they're in the "Duplicate Items" folder, where you can quickly merge them.
  • Ah, sorry, I might now have to delete the other thread...

    But on this point, does one have to do them one by one?

    Some suggested sets to merge seem to be quite unrelated; how do I indicate that something is not a duplicate?

    And any batch methods?
  • No batch mode currently, and nothing has changed on marking non-duplicates since the discussion above. Typically there is a good reason for a match, though; e.g., do they have the same DOI?
  • Like for different chapters of the same book.
  • With different titles? That's odd.
  • I am a bit confused by this myself - they all have different titles and different authors, though I haven't systematically checked whether they have the same DOI.

    Initially, I had mistakenly assumed the duplicates were second and extra copies of what's in my library - so I trashed the whole lot. But now it seems I need to go through them one by one.

    It would be nice if there were one folder for 100% duplicates which can be safely deleted.
  • Can you export the false duplicates as Zotero RDF and post them on http://gist.github.com ?
  • edited October 21, 2013
    Sorry, I will need to look into this another time - I have some deadlines to meet.
  • I am a newcomer here and am having some difficulties...
    1 - when a citation (Podichetty ....) appears again in the text, it acquires a new reference number (10) and not the one (4) it was given when it first appeared in the text. How to proceed? Thanx a million!
  • What citation style are you using? Are you inserting the second reference from the "Cited" section of the dropdown list?
  • Where can I find the citation style I'm using?
    Yes, I am inserting the second reference from the "Cited" section.
  • I'm using Chicago Manual of Style 16th edition (note).

    And now?
  • Change to a different style like Nature or Vancouver. Chicago Manual of Style (note) uses footnotes, which are supposed to be numbered sequentially and not repeat themselves.
  • Adam, when I change to Nature or Vancouver, the footnotes disappear. Where can I find them? What can I do to find them? (I'm a medical doctor from Sao Paulo, Brazil, and it was a friend from the US who wrote to me about Zotero. I'm "suffering" a little bit as a beginner. Thanx for helping.)
  • Just completing: when I change to Nature or Vancouver, the footnotes disappear and I cannot find the references in the paper. Where are they?
  • Nature and Vancouver are both numeric styles, so they only have a marker in the text, like [1]. To add a bibliography, you have to use the "Zotero Insert Bibliography" button (see http://www.zotero.org/support/word_processor_plugin_usage ).
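
The difference between the two kinds of style discussed above can be illustrated with a small sketch. It is not how Zotero or any citation processor is implemented; the function names and the citation keys are made up for the example. It only shows the numbering rule: a numeric style reuses the number assigned at first citation, while a note style gives every footnote the next number in sequence.

```python
def numeric_markers(citations):
    """Numeric style (e.g. Vancouver): a source keeps the number it
    received when it was first cited."""
    numbers = {}
    markers = []
    for key in citations:
        if key not in numbers:
            numbers[key] = len(numbers) + 1
        markers.append(numbers[key])
    return markers


def note_markers(citations):
    """Note style (e.g. Chicago note): every citation is a new footnote,
    so the numbers simply count upward."""
    return list(range(1, len(citations) + 1))


# Hypothetical citation keys in the order they appear in the text;
# "podichetty" is cited twice.
order = ["smith", "podichetty", "jones", "podichetty"]

numeric = numeric_markers(order)
notes = note_markers(order)
```

Here `numeric` gives both "podichetty" citations the same number, whereas `notes` gives the second one a new footnote number, which matches the behaviour described in the question.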