Duplicate detection false positives
I've encountered an error in duplicate detection--items with the same or very similar titles are identified as duplicates even though they have different author/year/publication in one case, and different patent number and issue date in another. IIRC, the older duplicate detection algorithm tended to be overly stringent with title matching--items whose titles differed only slightly in punctuation or capitalization eluded detection. Perhaps the current problem is an unintended consequence of solving that issue?
If there were any way to manually mark non-duplicates (https://forums.zotero.org/discussion/23682/marking-nonduplicates/) that would be one way to solve the problem. Another would be to have the algorithm look to a secondary field to confirm duplicate identity.
example 1:
http://www.ncbi.nlm.nih.gov/pubmed/?term=12152921
http://www.nap.edu/catalog.php?record_id=12875
example 2:
http://www.google.com/patents/about?id=XNMDAAAAEBAJ
http://www.google.com/patents/about?id=moYXAAAAEBAJ
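To make the "secondary field" idea concrete, here is a rough sketch in Python. This is not Zotero's actual algorithm, and the field names (`year`, `first_author`, `patent_number`) and the sample values are made up for illustration. The title is still matched loosely (ignoring case and punctuation), but a match is only confirmed if every secondary field that both items fill in also agrees.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


# Hypothetical secondary fields, not Zotero's real field names.
SECONDARY_FIELDS = ("year", "first_author", "patent_number")


def is_duplicate(a: dict, b: dict) -> bool:
    """Loose title match, confirmed by any secondary field both items share."""
    if normalize_title(a.get("title", "")) != normalize_title(b.get("title", "")):
        return False
    shared = [f for f in SECONDARY_FIELDS if a.get(f) and b.get(f)]
    if not shared:
        # Nothing to confirm against: fall back to the title-only decision.
        return True
    return all(
        str(a[f]).strip().lower() == str(b[f]).strip().lower() for f in shared
    )


# Dummy data: same title, different patent numbers, so not a duplicate.
patent_a = {"title": "Method of Making X.", "patent_number": "0000001"}
patent_b = {"title": "method of making x", "patent_number": "0000002"}

# Dummy data: punctuation and case differ, but year and author agree.
article_a = {"title": "Sleep: A Review", "year": "2002", "first_author": "Smith"}
article_b = {"title": "sleep a review", "year": 2002, "first_author": "smith"}
```

With a rule like this, `is_duplicate(patent_a, patent_b)` is false while `is_duplicate(article_a, article_b)` is true, so the loose title matching is kept without flagging items that clearly differ elsewhere.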
Item 1 is an edited book.
Item 2 is a chapter in that book. The chapter has the same title as the book (really!); the chapter author is not one of the editors of the book.
All other information in the two items (editors, place of publication, publisher, and date) is the same.
I can reproduce this by making two new dummy items like Items 1 and 2 above. They show up as duplicate items only if Item 2 includes the name of the book editor(s); if Item 2 does not include the name of the book editor(s), then the two items do not show up as duplicates.
Thanks.
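For anyone trying to reproduce the book/chapter case, the dummy items can be written out as below. The matching rule is only a guess that happens to be consistent with the behavior described (same title plus at least one creator in common); I don't know what the real algorithm does. `naive_match` and `type_aware_match` are hypothetical helpers, and the item data is invented.

```python
def creator_names(item: dict) -> set:
    """Creator names, lowercased, regardless of role (author or editor)."""
    return {name.strip().lower() for name in item.get("creators", [])}


def naive_match(a: dict, b: dict) -> bool:
    """Guessed rule: identical title and at least one creator in common."""
    same_title = a["title"].strip().lower() == b["title"].strip().lower()
    return same_title and bool(creator_names(a) & creator_names(b))


def type_aware_match(a: dict, b: dict) -> bool:
    """Same guessed rule, but items of different types never match."""
    return a.get("item_type") == b.get("item_type") and naive_match(a, b)


# Item 1: the edited book (dummy data).
book = {
    "item_type": "book",
    "title": "Dummy Title",
    "creators": ["Editor One", "Editor Two"],
}

# Item 2: the chapter, entered with the book editors included.
chapter_with_editors = {
    "item_type": "bookSection",
    "title": "Dummy Title",
    "creators": ["Chapter Author", "Editor One", "Editor Two"],
}

# Item 2 again, entered without the book editors.
chapter_without_editors = {
    "item_type": "bookSection",
    "title": "Dummy Title",
    "creators": ["Chapter Author"],
}
```

Under the guessed rule, `naive_match(book, chapter_with_editors)` is true and `naive_match(book, chapter_without_editors)` is false, which matches what I see. Also comparing the item type, as in `type_aware_match`, would keep a book and a chapter in it apart.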
Can I do anything to correct this mistake?
CrossRef does assign DOIs not just to articles, but also to journal titles, volumes, issues, etc., but I don't really think they should be part of the article data.