Can ISBN import be improved?
I regularly use the "add item by identifier" function. Apart from a small and easily fixable UI problem, I find it works well for DOIs, but not for many books.
Current problems with ISBN book import:
1. All creators are imported as authors (editor/author distinction is lost)
2. Multiple creators are not supported, only the first creator is imported (always as author, cf. #1)
3. Complex names are truncated ("van der Zee, Emile" becomes "Zee, Emile")
4. Multiple places (e.g. Oxford/New York) are imported with messy punctuation in between (Oxford;;New York)
5. Publishers are inconsistent (e.g. some "John Benjamins", others "John Benjamins Pub. Co.")
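To make problems 3 and 4 concrete, here is a rough Python sketch of the behavior I would expect. It is not how Zotero or any existing translator does it; the function names and the returned field names are made up for illustration.

```python
import re

def split_inverted_name(name):
    """Split 'Last, First' at the first comma so particles like 'van der' stay in the last name."""
    last, _, first = name.partition(",")
    return {"lastName": last.strip(), "firstName": first.strip()}

def split_places(raw):
    """Split a place string on runs of ';' or '/' and drop empty pieces."""
    parts = re.split(r"[;/]+", raw)
    return [p.strip() for p in parts if p.strip()]
```

With this, "van der Zee, Emile" keeps "van der Zee" as the last name, and "Oxford;;New York" becomes two separate places.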
Many of these problems may be due to the source repository Zotero import is relying on, but unfortunately it seems the user has no say in what repository is used (Worldcat? Google Books?).
If it were possible to get the Library of Congress data, that would be near-perfect; in my experience they have not only the most extensive collection but also the highest-quality metadata.
My own experience with LOC data has not been as you describe. It is perhaps better than most, but I find many differences in publisher names and places. I also find that author names are not very consistent -- especially when books are released by different publishers.
I have given up hope that this will be fixed for items published in the past, and I have little hope that there will be consistency for future publications in my lifetime. (Who would establish the gold standard?) My own experience with operating the SafetyLit database is that the metadata we are fed from publishers isn't consistent even from the same publisher. We add only 700 records a week and are able to hand edit to improve the consistency of publisher names and places. This work is done by volunteers. It requires lots of time -- an unnecessary cost to publishers when the goal is to list items for sale and not to blend their products with those of other publishers to facilitate a comprehensive search or a listing of works by an author.
I believe it should also be possible to query LoC for ISBNs and get much better data where it's available, but that would involve a lot more work.
That should be done soon-ish. But editors still won't work - they could, but Worldcat seems to only know authors.
Ajlyon - wouldn't SRU work really well for LoC?
http://z3950.loc.gov:7090/voyager?version=1.1&operation=searchRetrieve&query=dc.resourceIdentifier=9780199286546&maximumRecords=1
gives us marcxml for any ISBN
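For anyone who wants to poke at this outside Zotero, here is a small Python sketch that builds the same query URL and pulls subfields out of a MARCXML response. The endpoint and query parameters are the ones from the link above; the helper names are mine, and this is not what the translator does. Note that `urlencode` percent-encodes the `=` inside the query value, which I am assuming the server accepts.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

LOC_SRU_BASE = "http://z3950.loc.gov:7090/voyager"  # endpoint from the link above
MARC_NS = "{http://www.loc.gov/MARC21/slim}"          # MARCXML namespace

def loc_sru_url(isbn, max_records=1):
    """Build the searchRetrieve URL for one ISBN (hypothetical helper)."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": f"dc.resourceIdentifier={isbn}",
        "maximumRecords": str(max_records),
    }
    return f"{LOC_SRU_BASE}?{urlencode(params)}"

def marc_subfields(xml_text, tag, code):
    """Return the text of every subfield `code` inside datafields with the given tag."""
    root = ET.fromstring(xml_text)
    values = []
    for field in root.iter(f"{MARC_NS}datafield"):
        if field.get("tag") != tag:
            continue
        for sub in field.findall(f"{MARC_NS}subfield"):
            if sub.get("code") == code and sub.text:
                values.append(sub.text.strip())
    return values
```

In MARC, 245 $a is the title, 100 and 700 $a are personal names, and 700 $e carries a relator term such as "editor" when the cataloguer supplied one, so this may be one way to recover the editor/author distinction where the record has it.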
To test, download this file:
https://github.com/adam3smith/translators/raw/worldcat/Open%20WorldCat.js
and place it in the translator folder in your Zotero data directory
http://www.zotero.org/support/zotero_data#locating_your_zotero_library
replacing the existing one with the same name.
Restart Firefox/Zotero Standalone.
Don't expect too much - Worldcat's RIS isn't that great either - but you should see some improvements in data quality, most notably multiple creators.
If you have any issues or observations, let us know.
(Also - Worldcat RIS provides multiple ISBNs - the current translator imports all of them - what would be the desired behavior?)
1. Name of (last) author imported with a period
2. Some ISBNs don't work (e.g. 0313304483)
(How do I revert to the old translator?)
Yes, it should.
I'll check for the period.
The ISBN - it seems like the translator only works for 13-digit ISBNs, not for 10-digit ones. I don't think it's actually getting called for the 10-digit ones, so I'm not sure how my changes could cause that.
Revert by using "reset translators" from the advanced tab of the Zotero preferences. For good measure, follow up with "update translators" from the general tab. You may have to restart FF/ZSA.
Note that multi-author import works fine indeed with the new one.
If you embed the ISBN parsing script at https://github.com/ajlyon/identifiers-js, you could check if they're the same and drop the ISBN-10 in that case, but that'd certainly be more than any other translator does.
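The check itself is small. Here is a Python sketch of the idea, not the identifiers-js code: convert each ISBN-10 to its 978-prefixed ISBN-13 form with the standard check-digit rule, and drop the ISBN-10 when that ISBN-13 is already in the list. The function names are made up, and no validation of the incoming check digits is attempted.

```python
def clean_isbn(raw):
    """Keep only digits and 'X', so hyphens and spaces are ignored."""
    return "".join(ch for ch in raw.upper() if ch.isdigit() or ch == "X")

def isbn10_to_isbn13(isbn10):
    """Prefix 978 to the first nine digits and recompute the ISBN-13 check digit."""
    core = "978" + isbn10[:9]
    # ISBN-13 weights alternate 1, 3, 1, 3, ... across the first twelve digits.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)

def drop_redundant_isbn10(isbns):
    """Drop any ISBN-10 whose ISBN-13 equivalent is also present; keep the rest in order."""
    cleaned = [clean_isbn(raw) for raw in isbns]
    thirteens = {isbn for isbn in cleaned if len(isbn) == 13}
    kept = []
    for isbn in cleaned:
        if len(isbn) == 10 and isbn10_to_isbn13(isbn) in thirteens:
            continue  # same book as an ISBN-13 we already have
        if isbn not in kept:
            kept.append(isbn)
    return kept
```

An ISBN-10 with no matching ISBN-13 in the list is left as it is, so nothing is lost for older records.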
A new version is up under the same link.
All ISBNs (as long as they're in Worldcat) should now work, and the periods after author names are removed. Further testing much appreciated.