How does "Find Available PDF" work?

How does the Find Available PDF feature work? I get questions from patrons about why it can't locate a PDF known to be available. I can't seem to find any documentation except for the blog post on the Unpaywall integration. Does this feature only work if Unpaywall has the PDF?
  • The relevant bit is at the bottom of the Unpaywall blogpost:
    Zotero can also now take better advantage of PDFs available via institutional subscriptions. When you use “Add Item by Identifier” or “Find Available PDF”, Zotero will load the page associated with the item’s DOI or URL and try to find a PDF to download before looking for OA copies. This will work if you have direct or VPN-based access to the PDF. If you use a web-based proxy, only open-access PDFs will be automatically retrieved using this new functionality, but you can continue to save items with gated PDFs from the browser using the Zotero Connector.
    This does mean that Zotero wouldn't be able to find PDFs your institution has access to where access doesn't go through their primary publisher (i.e. the target of the DOI/URL) such as those hosted on content aggregators like EBSCO, ProQuest, or JSTOR. Also note the on campus/VPN requirement.
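A rough sketch of the lookup order described above: try the page behind the item's DOI/URL first, then fall back to open-access copies. This is not Zotero's actual code. The resolver functions, the item fields (`ip_authorized`, `has_oa_copy`), and the returned strings are all hypothetical stand-ins for illustration.

```python
from typing import Callable, List, Optional, Tuple

Item = dict
# A resolver takes an item and returns a PDF location, or None if it finds nothing.
Resolver = Callable[[Item], Optional[str]]


def publisher_page(item: Item) -> Optional[str]:
    """Stand-in for loading the DOI/URL page and looking for a PDF there.

    Assumption: this only succeeds with direct (on-campus or VPN) access,
    modeled here by a hypothetical `ip_authorized` flag.
    """
    has_target = bool(item.get("doi") or item.get("url"))
    if has_target and item.get("ip_authorized"):
        return "pdf-from-publisher-page"  # placeholder, not a real URL
    return None


def open_access(item: Item) -> Optional[str]:
    """Stand-in for an open-access lookup such as Unpaywall."""
    return "pdf-from-oa-source" if item.get("has_oa_copy") else None


# Order matters: the publisher page is tried before open-access copies.
RESOLVERS: List[Tuple[str, Resolver]] = [
    ("publisher page", publisher_page),
    ("open access", open_access),
]


def find_available_pdf(
    item: Item, resolvers: List[Tuple[str, Resolver]] = RESOLVERS
) -> Optional[Tuple[str, str]]:
    """Return (resolver name, PDF location) from the first resolver that succeeds."""
    for name, resolve in resolvers:
        found = resolve(item)
        if found is not None:
            return name, found
    return None
```

In this model, a gated article reached through a web-based proxy (no IP authorization, no OA copy) returns `None`, which matches the behavior patrons are reporting.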
  • Thank you. I do understand how it functions now.
  • Quick follow up -- am I correct in understanding that, for gated PDFs, when I use the "find available PDF" command, *if* I have set up my institution's link resolver in the Zotero preferences, then the "find available PDF" command will work pretty well even if I am using a web-based proxy?
    Thanks!
  • edited April 5, 2023
    @kate.nyhan: No, the OpenURL resolver has no connection to this — that's just for the Library Lookup option in the Locate menu.

    The Zotero app itself (as opposed to the Zotero Connector) doesn't currently have any way of accessing gated PDFs except via IP-based (so on-campus or VPN) authorization.

    We'd love to add support for web-based proxies in the app, but the problem is that Zotero would need to send URLs through a configured web-based proxy (after showing the login page in a web view), and it doesn't have any way of knowing which URLs are actually supported by that proxy. Ideally EZproxy would have an endpoint for authenticated users that shared the list of supported domains (or even the config.txt file directly), but we're not aware of any such endpoint (and didn't hear back from OCLC when we asked).

    An alternative might be to have institutions with EZproxy subscriptions submit the same config file to us, but we'd much rather people's clients get authoritative, up-to-date information directly from the EZproxy server.
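To make the missing piece concrete, here is a minimal sketch of the decision the app would have to make. It assumes the common EZproxy starting-point URL form (`.../login?url=` followed by the target) and a hypothetical `supported_hosts` set. That set is exactly the information the app has no authoritative way to obtain today; the hostnames below are made up.

```python
from typing import Optional, Set
from urllib.parse import urlsplit


def proxied_url(target: str, login_prefix: str, supported_hosts: Set[str]) -> Optional[str]:
    """Rewrite `target` to go through a web-based proxy, if the proxy covers its host.

    `login_prefix` is assumed to look like "https://ezproxy.example.edu/login?url=".
    Returns None when the host is not known to be supported, since sending an
    arbitrary URL through the proxy would fail or produce an error page.
    """
    host = urlsplit(target).hostname  # lowercased by urlsplit; None if missing
    if host is None or host not in supported_hosts:
        return None
    return login_prefix + target


# Hypothetical configuration for illustration only.
PREFIX = "https://ezproxy.example.edu/login?url="
SUPPORTED = {"journals.publisher.example"}

print(proxied_url("https://journals.publisher.example/article/123", PREFIX, SUPPORTED))
print(proxied_url("https://other.example/article/456", PREFIX, SUPPORTED))
```

Without a trustworthy `supported_hosts` list, the second case cannot be told apart from the first, which is why an endpoint exposing the proxy's configured domains would help.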
  • Picking up this old conversation to see if I have it right this time (thanks in advance for the fact checking!):
    1. If I'm adding records to my Zotero library via the connector, and I have the "automatically attach associated PDFs when saving items" setting turned on, Zotero will generally be able to save paywalled PDFs for me, whether I'm using IP-based authorization or a proxy server. Question: Does it matter whether I'm using the connector on the publisher website versus a bibliographic database website? Does it matter whether I'm adding via the connector one record or multiple records?
    2. If I have some records in my Zotero library *already* that don't have associated PDFs, and I use the "Find available PDF" command, it'll try IP-based authorization and Unpaywall (not sure about the sequence). So, it's wise to be logged in to my university's VPN in this situation.
    3. If the Find Available PDF command didn't work, I could open the Locate menu with the straight arrow button in the Zotero pane, and choose Library Lookup, in which case Zotero will use WorldCat's OpenURL resolver, or whatever university library OpenURL resolver I've chosen in my Zotero settings. This won't download a PDF for me, but it will (if the library has set things up properly) get me to a webpage where I can access the full text via my library's subscriptions, potentially via a proxy server. This command works on one record at a time.
  • 1. Yes. Multiple vs. single doesn't matter. Publisher vs. database does matter in most cases: Zotero will only get the PDF if it's hosted on the same platform (i.e. it'll get it on EBSCO where they have the full text, but never on PubMed).
    2. Exactly
    3. Correct
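For point 3, a link resolver lookup boils down to opening a URL that carries the item's metadata. The sketch below builds a generic OpenURL 1.0 (Z39.88-2004) key/value query from a DOI. The resolver base URL and the DOI are made up, and the exact set of fields Zotero sends is not stated in this thread, so treat this as an illustration of the idea rather than Zotero's request.

```python
from urllib.parse import urlencode


def library_lookup_url(resolver_base: str, doi: str) -> str:
    """Build a minimal OpenURL query identifying an item by DOI.

    `resolver_base` is the institution's resolver address (hypothetical here).
    """
    params = {
        "url_ver": "Z39.88-2004",      # OpenURL 1.0 version marker
        "rft_id": f"info:doi/{doi}",   # the referent, identified by DOI
    }
    # Append to an existing query string if the base already has one.
    separator = "&" if "?" in resolver_base else "?"
    return resolver_base + separator + urlencode(params)


print(library_lookup_url("https://resolver.example.edu/openurl", "10.1000/example"))
```

The result is a web page to open in a browser, not a file to download, which is why Library Lookup gets you to the full text but does not attach a PDF.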
  • Thanks!
    As my colleague said, "user forums can be magical places," especially when the rapid response team adamsmith and dstillman are involved!
  • OK, back with more questions about "Find Available PDF."
    I just tried that command on a record that I had in my Zotero library already, without a PDF. The record had the DOI and a ScienceDirect URL. I'm on campus and ScienceDirect recognizes our IP range. The Find Available PDF command didn't work.

    Then I opened the article webpage in my browser and used the browser plugin to add it to my Zotero again. I have the "automatically attach associated PDFs" preference turned on. It couldn't add the PDF but it did add an html snapshot of the full text of the article. (Which, incidentally, is kind of great, because it reflows when I resize the browser window.) This happened in two different browsers (no PDF, yes html full text snapshot).

    I was able to download the PDF manually from the article website. I had to click through a CAPTCHA though. I don't know if that's because I had just tried to get the PDF via Zotero several times, or if the download problems were because of a CAPTCHA that I would have run into anyway....

    Any idea why Zotero isn't retrieving the PDF in this situation? Is there anything I should do as an individual user to try and fix it? Or anything I should ask colleagues at my library to do?

    Thanks
  • It might be this: https://forums.zotero.org/discussion/comment/451390/#Comment_451390

    In short, if you install the Zotero 7 beta, the CAPTCHA window now appears in the Zotero app, allowing you to save the PDF while using the browser plugin.