Query Alternate Sites (Library of Congress) when Retrieving Metadata for PDF

Is there a way to manually change or add to the list of sites Zotero uses to retrieve metadata for top-level PDFs?

I understand it uses any ISBN found in the opening text of a document to query WorldCat, but I'd be interested in specifically changing the queried site for this process to the Library of Congress.

WorldCat is great, but it's often not as exhaustive as LOC. It often omits publisher, date, and other key information needed for bibliographies, etc.

Thanks for any help with this.

~Tim
  • You can't easily, no, but we could look into doing this.
  • Please do look into this. More and more, I've found that WorldCat has multiple records for any given ISBN. (This is with a query outside of the Zotero look-up.) A majority of the OCLC records have duplicates that have errors or omissions. I can understand that the database can have multiple entries for a given title because there can be multiple editions of the book each with its own ISBN.

    This is shameful. Library membership in OCLC is not cheap. It seems that the database could benefit greatly from additional curation -- even if it were limited to an automated bot. In adverts, the OCLC has been going on about the number of records instead of the quality of its records. I feel pain when I see that a duplicate record with incomplete metadata has been accepted to the database.

    In my experience, the WorldCat database is far more complete than the LOC. Many published books aren't ever included in the LOC.
  • OCLC does have a proper API that's hugely better than OpenWorldCat. It requires (paid-for) API access, and it'd be tricky for Zotero to use that even if they gave us the key for free, because we couldn't store it publicly.
  • Actually, I was expressing my frustration with the API results as well as the _paid_ WorldCat.
  • edited April 5, 2017
    rant

    For example, a single journal title can have several OCLC accession numbers (print, electronic, microfilm editions); that is reasonable and proper. Alas, each of these formats often has several near-duplicates, each with its own catalog number, even when they have identical ISSNs. Thus, a single item "should" have two to four records -- one for each of its publication formats. It should not have numerous records for each format. These near-duplicate records each have minor or major differences in the metadata provided. They can have slightly different titles (subtleties such as "&" vs. "and" or ":" vs. "-") or places of publication, but have the same ISSN. Other near-duplicate records are distinguished only by prominent typographical errors. A place of publication such as Moscow, Москва, or Moskva will often have one or more records for each spelling, even when the journal has the same ISSN.

    Title differences can be as simple as sentence case vs title case or whether the title ends with a period or not. Typos and misspellings are tolerated (teh/the, nad/and).

    /rant
  • @adamsmith

    Thank you. It'd definitely be helpful to have additional options available for this.

  • So I checked, and Retrieve Metadata already goes through our standard queue of ISBN databases (LoC, GBV, WorldCat, in that order). Getting PDF metadata via ISBNs from LoC works for me, so if you're getting the ISBN import from WorldCat, Zotero didn't find the ISBN in the Library of Congress.
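
To illustrate the ordered queue described in the last reply, here is a minimal sketch of a "first source that answers wins" lookup. It is not Zotero's actual code: the function `first_match`, the in-memory catalogs, and the sample ISBN are all hypothetical stand-ins, and no real catalog API is called.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, str]
Lookup = Callable[[str], Optional[Record]]


def first_match(isbn: str, sources: List[Tuple[str, Lookup]]) -> Optional[Tuple[str, Record]]:
    """Try each source in order; return (source name, record) for the first hit."""
    for name, lookup in sources:
        record = lookup(isbn)
        if record is not None:
            return name, record
    return None  # no source knew this ISBN


# Hypothetical in-memory catalogs standing in for the real databases.
LOC_CATALOG: Dict[str, Record] = {}
GBV_CATALOG: Dict[str, Record] = {}
WORLDCAT_CATALOG: Dict[str, Record] = {
    "0000000000000": {"title": "Example Title", "publisher": ""},
}

# Same order as in the thread: LoC, then GBV, then WorldCat.
SOURCES: List[Tuple[str, Lookup]] = [
    ("LoC", LOC_CATALOG.get),
    ("GBV", GBV_CATALOG.get),
    ("WorldCat", WORLDCAT_CATALOG.get),
]

result = first_match("0000000000000", SOURCES)
missing = first_match("1111111111111", SOURCES)
```

In this sketch the record comes back from WorldCat only because the two earlier catalogs have no entry for the ISBN, which mirrors the explanation above: a WorldCat result implies the earlier sources returned nothing.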