Zotero cannot find PDFs for items imported from eric.ed.gov .nbib files

Hi,

I'm working on a search for a scoping review and the primary database where I'm conducting my search is ERIC, via the public website at eric.ed.gov. (I have access via EBSCOhost but I'm not a fan of their interface nor how they index ERIC fields.) I was very surprised to discover that Zotero is unable to find any PDFs at all in a set of 439 references exported in .nbib format. 287 of these have full text available directly from ERIC.

The Browser Connector doesn't seem to have this problem, but given that ERIC doesn't permit changing the number of results per page, it's not a feasible alternative for retrieving PDFs at this volume. Plus, I notice that the Browser Connector imports everything from ERIC with a Book type, while the .nbib items are all imported as Journal Articles. (Both are suboptimal -- most of the results are Reports -- but Journal Article is a better type than Book, at least.)

Is this a bug or are eric.ed.gov users simply out of luck when it comes to PDFs? And is there any possibility of paying closer attention to the publication types to get more accurate item types on import?

I am on a Mac running macOS 12.6 and Zotero 6.0.15. Please let me know if you need any more details about my system or my search.
  • edited October 7, 2022
    Mac 12.6 / Zotero 6.0.16-beta.3+29dd0cf5d / Firefox 105.01

    I do not have a problem with ERIC items that are journal articles importing as books. ERIC report types seem to all import as report and books as books on my similar system. I doubt if translators are different for those of us who use the Zotero beta.

    If the records I suggest below will import as books you will indeed need to provide more detail.

    I almost always follow the "Direct Link" to the publisher to obtain the metadata and the PDF. I find that the ERIC metadata is sometimes less complete than that obtained from the publisher. This is especially a problem with article abstracts and pagination. Sometimes the ERIC record doesn't have the DOI and sometimes it does. Thus, I depend on ERIC only for identification but go to the publisher for the metadata.

    A Zotero person will need to answer whether the ERIC translator is sufficiently "smart" to follow the multiple links needed to get to the publishers' sites to find the PDF. I haven't tried that in a long long time.

    I download to Zotero ERIC journal articles as journal article type. Search for this on ERIC: Are Bullying and Reproduction of Educational Inequality the Same Thing? Towards a Multifaceted Understanding of School Violence.

    That imports as a journal article using either the individual item icon or as a selection from the "group folder" icon.

    I often don't get the article DOI from ERIC. See:

    Interpersonal Predictors of Suicide Ideation and Attempt among Middle Adolescents
    Sallee, Emily; Cazares-Cervantes, Abraham; Ng, Kok-Mun
    Professional Counselor, v12 n1 p1-16 2022

    However, I can sometimes get the DOI from ERIC:

    Proximal or Peripheral: Temporality and Spatiality in Young People's Discourses on Gender Violence in Sweden
    Joelsson, Tanja; Bruno, Linnéa
    Gender and Education, v34 n2 p167-182 2022
  • Note that Zotero wouldn't automatically try to get PDFs when importing an .nbib file -- you'd have to run "Find Available PDF" on all the imported items after that. Did you do that?
  • That's exactly what I did.
  • edited October 7, 2022
    Here is an example of an incomplete author list from an ERIC record:

    Learners' Perspectives on School Safety in Johannesburg
    Hochfeld, Tessa; Schmid, Jeanette; Errington, Sheri
    South African Journal of Education, v42 n1 Article 1936 Feb 2022

    http://files.eric.ed.gov/fulltext/EJ1344128.pdf

    If I was Shaheda Omar and I reviewed your citation without my name I wouldn't be very pleased.

    With ERIC it is always important to verify your metadata and edit appropriately. In this example the ERIC record also omits the article number.

    Again, ERIC is great for identifying literature but it is not a dependable metadata source.
  • DWL-SDCA, thanks for your thoughts and examples for investigation. The article you suggested does import as a Journal Article, so it may be that all the books I'm seeing are reports. All my browser-based imports have been from the search results, which means I'm only seeing the folder icon and not the book icon.

    I can't use publisher links in this situation: there are 439 search results. This is a systematic search for a scoping review. I actually had to pare down my initial search strategy because it exceeded the character limit on the web page. I reached out to their support people and they suggested I use the API instead, but it doesn't support quoted keyword phrases in searches and you can't download results in a format that can be imported into Zotero. I might have to do this in EBSCOhost instead.

    I too have noticed a lack of DOIs in ERIC records.

    Some other potentially relevant details for whoever might be reading: I have the following add-ons installed and enabled:
    Better BibTex
    ZotFile
    Zutilo

    I also have a number of disabled add-ons installed:
    DOI Manager
    scite
    Storage Scanner
    AutoIndex
    Citation Counts Manager

    Finally, I notice today that the Find Available PDFs experience is different -- instead of the dialog with the progress bar, I get a popup in the lower right hand corner of the screen that disappears after a few seconds, similar to the popup ZotFile gives you when it's renaming PDF attachments. Hm, maybe I should disable ZotFile and see if that changes any behavior.
  • edited October 7, 2022
    Yes, disable all plugins and try again.

    If you're still having trouble, provide a Debug ID for trying to find a PDF for a single item.
  • Disabling plugins and restarting did not change behavior.

    Debug ID generated, it's D476183896.
  • That doesn't show an attempt to use Find Available PDF.
  • Weird. Here's another one: D1643601980

    And here's a screen recording:

    https://www.loom.com/share/0af63b71b8c347429865b7b0644af4ef
  • Find Available PDF only works on items with DOIs or URLs and that don't already have PDFs attached.
  • Ahh, I see. These do have URLs but they are attached to the items as "Catalog Links".
  • Looking again at the above description and ERIC's behavior, the issue has very little to do with ERIC and much more to do with the fact that you're focussing on reports. Finding PDFs works almost exclusively for journal articles (and other resources with a DOI). Getting a PDF for a report or other gray literature is generally rare. You'll find that for journal articles in ERIC, the function performs pretty well (and should be close to 100% for articles with full text in ERIC).

    As for the item type -- ERIC's description of item types is a mess -- e.g., lots of items are tagged as both reports and journal articles. But yes, I think we can do better than what we do right now, though it likely won't be super quick.

  • I mean, I'm not intentionally seeking out reports, it's the nature of the search that so many get returned. Many of them do have PDFs in ERIC, too.
  • @adamsmith: But from the screen recording, @marijane is trying on journal articles. There's just no URL in the URL field, because the .nbib translator is creating Catalog Link linked-URL attachments instead of putting the URL in the URL field.

    E.g., for "Cultural Reciprocity Aids Collaboration with Families", with the ERIC URL in the URL field, it will find and attach the PDF.
  • (Or maybe that should be a Report, but regardless, Find Available PDF would work on it with a URL in the URL field.)
  • Does the Browser Connector find the PDFs because it does a better job with the URLs?
  • edited October 7, 2022
    No — the issue is just that Find Available PDF needs to know where to go to try to find a PDF, and without a URL or DOI it doesn't have anywhere to go. A "Catalog Link" attachment isn't any sort of consistent, meaningful thing in Zotero.

    But the fact that the Connector can save a PDF does mean that Zotero knows how to find the PDF on an ERIC page (which would allow Find Available PDF to work in the first place), and the Connector is also saving the ERIC URL to the URL field, which suggests that it's the kind of URL that belongs there.

    So the question is just whether our .nbib translator (which can of course be used for data from multiple sites) should be populating the URL field instead of creating a link. I don't know enough about .nbib to know whether it should.
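
    To make the rule above concrete, here is a rough sketch in Python. The dict layout, field names, and the function name are hypothetical stand-ins for illustration only, not Zotero's real data model or code. It only shows why a URL sitting on a "Catalog Link" attachment doesn't count, while the same URL in the item's own URL field does.

```python
def is_pdf_lookup_candidate(item):
    """Return True when the item has a DOI or URL field and no PDF attached yet.

    `item` is a plain dict invented for this sketch, not a real Zotero object.
    """
    has_locator = bool(item.get("DOI") or item.get("url"))
    has_pdf = any(
        att.get("contentType") == "application/pdf"
        for att in item.get("attachments", [])
    )
    return has_locator and not has_pdf


# As imported from .nbib: the ERIC URL exists only on a linked-URL attachment.
# "<ERIC ID>" is a placeholder; the real ID is not given in this thread.
nbib_item = {
    "title": "Cultural Reciprocity Aids Collaboration with Families",
    "DOI": "",
    "url": "",
    "attachments": [
        {"title": "Catalog Link", "url": "https://eric.ed.gov/?id=<ERIC ID>"}
    ],
}

# Same item after copying the attachment's URL into the item's URL field.
fixed_item = dict(nbib_item, url=nbib_item["attachments"][0]["url"])

print(is_pdf_lookup_candidate(nbib_item))   # the attachment URL is ignored
print(is_pdf_lookup_candidate(fixed_item))  # the URL field gives a place to look
```

    Copying the URL by hand like this is what made the lookup work for the example item mentioned earlier in the thread.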
  • I'm testing out the connector in a different library and I notice that the URL it saves is not the same as the Catalog Link attachments, it's grabbing the entire search results URL with the ERIC document ID &ed onto the end of it, which is also the URL for the attached page snapshot. Interesting difference.
  • The web translator saves the URL into the URL field if and only if the PDF is on ERIC -- that's the correct and desirable behavior.
    Unfortunately, the .nbib file just doesn't provide that information and so I don't think we would do this (recall that the format is originally used on and created for PubMed, which has no full text at all).
  • Yeah, so there's no real fix here other than special-casing attached ERIC URLs in Find Available PDF, which…I'm not super eager to start doing.
  • edited October 7, 2022
    I mean, are ERIC Numbers (OID in the .nbib) a thing? Should we be setting those in Extra and using them in Find Available PDF?
  • "I mean, are ERIC Numbers (OID in the .nbib) a thing?"
    That kind of depends how broadly you want to start saving PIDs -- they are the ERIC IDs that are also used in the API and to construct the URLs, so they're definitely a thing and I don't think it'd be unreasonable to save them the same way we're saving some other IDs (like OCLC) even though they're never used in citations.

    That should probably go along with some sort of long-term idea of what to do with those types of IDs in Zotero (which you may already have).
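
    As a rough illustration of how the OID value could be pulled out of a record and turned into URLs, here is a Python sketch. Assumptions: the record uses MEDLINE-style "TAG - value" lines with indented continuation lines, the sample record is hand-written for this example (it is not copied from an ERIC export), the full-text pattern follows the files.eric.ed.gov link posted earlier in this thread, and the record-page pattern is my guess at the usual ERIC link. The function names are made up.

```python
import re

# Matches tagged lines such as "OID - EJ1344128" or "TI  - Some title".
TAG_LINE = re.compile(r"^([A-Z]{2,4})\s*-\s?(.*)$")
ERIC_ID = re.compile(r"E[DJ]\d+")


def parse_nbib_record(text):
    """Map each tag to a list of values; indented lines continue the last value."""
    fields = {}
    last_tag = None
    for line in text.splitlines():
        match = TAG_LINE.match(line)
        if match:
            last_tag = match.group(1)
            fields.setdefault(last_tag, []).append(match.group(2).strip())
        elif line.startswith(" ") and last_tag:
            fields[last_tag][-1] += " " + line.strip()
    return fields


def eric_urls(fields):
    """Build candidate URLs from the first OID that looks like an ERIC ID."""
    for value in fields.get("OID", []):
        if ERIC_ID.fullmatch(value):
            return {
                # Assumed record-page pattern.
                "record": f"https://eric.ed.gov/?id={value}",
                # Pattern taken from the link posted earlier in the thread.
                "fulltext": f"http://files.eric.ed.gov/fulltext/{value}.pdf",
            }
    return None


# Hand-written sample record (illustrative only).
SAMPLE = """\
TI  - Learners' Perspectives on School Safety in
      Johannesburg
OID - EJ1344128
"""

fields = parse_nbib_record(SAMPLE)
print(eric_urls(fields))
```

    The full-text URL is only a candidate: plenty of ERIC records have no PDF on ERIC, so anything built this way would still need to be checked.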
  • edited October 7, 2022
    OK, then I think saving them would be good. I have no problem updating Find Available PDF and other functions to use any specific ids (whether in Extra or real fields later). I guess there often wouldn't be a PDF, but it seems worth checking. Issue created.
  • While we're doing this, any reason not to add search translation to ERIC.js? I assume we could then look for ED[number] or EJ[number] in both Add Item by Identifier and Retrieve Metadata for PDF.
  • No real reason, no, and I think using E[DJ]\d+ should work nicely, but if we do that, we should be polite & run that through their API, which is very straightforward, but uses JSON (or XML) for results & doesn't have an .nbib option.
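
    For what it's worth, the identifier matching itself is tiny. Here is a Python sketch of the E[DJ]\d+ idea with word boundaries added. This is just an illustration, not the ERIC.js translator code; the function name is made up, and the API lookup that would follow is left out.

```python
import re

# ED or EJ followed by digits, not embedded in a longer word.
ERIC_ID_PATTERN = re.compile(r"\bE[DJ]\d+\b")


def find_eric_ids(text):
    """Return the distinct ERIC-style IDs found in text, in order of appearance."""
    found = []
    for candidate in ERIC_ID_PATTERN.findall(text):
        if candidate not in found:
            found.append(candidate)
    return found


# EJ1344128 comes from the link earlier in the thread; ED000000 is a made-up placeholder.
print(find_eric_ids("See http://files.eric.ed.gov/fulltext/EJ1344128.pdf and ED000000."))
```

    A pattern this loose will also match unrelated strings that happen to look like "EJ12", so a match should be confirmed against ERIC before it's treated as a real ID.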
