Libkey and Custom PDF resolver

aarontaysmu · September 7, 2021

I was asked about whether our library link resolver supports the custom PDF resolver https://www.zotero.org/support/kb/custom_pdf_resolvers

My understanding is that it works with things like Unpaywall and Sci-Hub because they return PDFs or links to PDFs. Since our library link resolver doesn't always return a PDF and often just drops you on the landing page (depending on the provider), it won't work well.

I was thinking of other possibilities, and the one that came to mind is Third Iron's LibKey infrastructure. In particular, I was thinking of libkey.io (there might be other similar one-click-to-PDF systems out there, but we only subscribe to LibKey). For example, you could open

https://libkey.io/libraries/646/10.1080/12294659.2016.1147753 and it would show you whether a PDF link is available. My thinking is that I could plug https://libkey.io/libraries/{doi} into the custom PDF resolver and then use selectors to grab the PDF link.

This is what I came up with (646 is for my institution; your institution will have a different number).

{
  "name": "libkey",
  "method": "GET",
  "url": "https://libkey.io/libraries/646/{doi}",
  "mode": "html",
  "selector": ".article-pdf-option",
  "attribute": "href",
  "automatic": true
}

But it doesn't work. Any thoughts? Does it matter that the download link eventually gets you to the full text via the proxy?

dstillman · September 7, 2021

That wouldn't work, for a couple different reasons.

1) It looks like LibKey pages are rendered client-side via JavaScript. The "Find Available PDF" feature uses only the HTML that comes over the wire — it doesn't load pages in a browser. We could consider supporting that, maybe as an optional flag, but it would be slower. Another option would be to figure out (using browser devtools) whether there's an API request that LibKey pages are using to fetch data. If so, and it's a JSON API, you might be able to use that directly.

2) If LibKey requires a login in your browser to give you access, it's not something that would work non-interactively from within Zotero. If it uses IP-based access, and you're on campus or connecting via a system-wide VPN, that could work. (Same goes for publisher pages that you have IP-based access to files on, which would already be supported by Find Available PDF.)
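
To illustrate point 1, here is a minimal Python sketch of why a selector finds nothing when the links are added by JavaScript. It is not Zotero's code: the two HTML snippets are hypothetical stand-ins (the real LibKey markup may differ), and the class lookup is a simplified substitute for a real CSS selector engine.

```python
from html.parser import HTMLParser


class ClassHrefFinder(HTMLParser):
    """Collect href values from tags that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Attribute values can be None (e.g. bare attributes), so guard with "or".
        classes = (attr_map.get("class") or "").split()
        if self.css_class in classes and attr_map.get("href"):
            self.hrefs.append(attr_map["href"])


def find_hrefs(html, css_class):
    """Return every href found on elements with css_class in the raw HTML."""
    finder = ClassHrefFinder(css_class)
    finder.feed(html)
    return finder.hrefs


# Hypothetical markup: a page where the server already includes the PDF link.
SERVER_RENDERED = (
    '<a class="article-pdf-option" href="https://example.org/paper.pdf">PDF</a>'
)
# Hypothetical markup: an app shell whose links only appear after JavaScript runs.
CLIENT_RENDERED = '<div id="app"></div><script src="/assets/app.js"></script>'

print(find_hrefs(SERVER_RENDERED, "article-pdf-option"))
print(find_hrefs(CLIENT_RENDERED, "article-pdf-option"))
```

The second call returns an empty list because the link is simply not present in the HTML that comes over the wire, which is the situation described for LibKey pages.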

aarontaysmu · September 8, 2021

Thanks.

About the API: right, I forgot about that. So we could use

https://thirdiron.atlassian.net/wiki/spaces/BrowZineAPIDocs/pages/65699928/Article+DOI+PMID+Lookup+Endpoint+LibKey

Then follow the Unpaywall route for the json output?
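
A rough Python sketch of what the JSON route might involve, working on an already-parsed response rather than making a request. The endpoint shape, the `access_token` parameter, and the `fullTextFile` and `contentLocation` field names are assumptions on my part and should be checked against the Third Iron documentation linked above; the sample payload is made up.

```python
from urllib.parse import quote

# Assumed endpoint shape; verify against the Third Iron API documentation.
API_TEMPLATE = (
    "https://public-api.thirdiron.com/public/v1/libraries/{library_id}"
    "/articles/doi/{doi}?access_token={token}"
)


def build_lookup_url(library_id, doi, token):
    """Build the lookup URL, percent-encoding the DOI (including its slash)."""
    return API_TEMPLATE.format(
        library_id=library_id, doi=quote(doi, safe=""), token=quote(token, safe="")
    )


def pick_pdf_url(payload):
    """Return a direct PDF link if present, else a landing page link, else None.

    'fullTextFile' and 'contentLocation' are assumed field names.
    """
    data = payload.get("data") or {}
    return data.get("fullTextFile") or data.get("contentLocation") or None


# Made-up response for illustration only.
sample = {
    "data": {
        "type": "articles",
        "fullTextFile": "https://example.org/full-text-file",
        "contentLocation": "https://example.org/content-location",
    }
}

print(build_lookup_url(646, "10.1080/12294659.2016.1147753", "YOUR_TOKEN"))
print(pick_pdf_url(sample))
```

Even if the lookup works, the link it returns may still point through the proxy, so the access question in #2 applies to this route as well.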

Leaving aside #1, can you clarify #2?

So are you saying it won't work even if I have already signed in once through my web browser for access via EZproxy for the session?

dstillman · September 8, 2021

Your web browser session is completely unrelated to this feature in Zotero.

As I say, this would be no different from "Find Available PDF" being able to retrieve PDFs from publisher pages. From the announcement blog post:

Zotero can also now take better advantage of PDFs available via institutional subscriptions. When you use “Add Item by Identifier” or “Find Available PDF”, Zotero will load the page associated with the item’s DOI or URL and try to find a PDF to download before looking for OA copies. This will work if you have direct or VPN-based access to the PDF. If you use a web-based proxy, only open-access PDFs will be automatically retrieved using this new functionality, but you can continue to save items with gated PDFs from the browser using the Zotero Connector.

aarontaysmu · September 8, 2021

Ah okay.

I guess the user wanting this is envisioning a scenario where he imported a bunch of references via RIS etc., so it wouldn't go through the Zotero Connector?

But from what you described, it won't be useful for my setup then, since we always channel everything via the proxy (even when on campus), and we don't use VPNs.

Other institutions might benefit though.

dkufner · December 28, 2022

I am not exactly sure what you want to achieve, but using this in engines.json gets me the full-text PDF via LibKey:

{
  "_name": "LibKey",
  "_alias": "LibKey",
  "_description": "LibKey Search",
  "_icon": "https://libkey-app.thirdiron.com/images/logo-libkey-io-inverted-54455c1721960b0bbbb1bec5e61239d7.png",
  "_hidden": false,
  "_urlTemplate": "https://libkey.io/libraries/3148/openurl?query={z:DOI?}",
  "_urlParams": [],
  "_urlNamespaces": {
    "rft": "info:ofi/fmt:kev:mtx:journal",
    "z": "http://www.zotero.org/namespaces/openSearch#",
    "": "http://a9.com/-/spec/opensearch/1.1/"
  },
  "_iconSourceURI": "https://libkey-app.thirdiron.com/images/logo-libkey-io-inverted-54455c1721960b0bbbb1bec5e61239d7.png"
}
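
For reference, a small Python sketch of how a template like the `_urlTemplate` above might expand for one item. It assumes plain substitution of the `{z:DOI?}` placeholder with a percent-encoded DOI; Zotero's actual expansion logic may differ, and `expand_template` is a hypothetical helper, not part of Zotero.

```python
import re
from urllib.parse import quote

URL_TEMPLATE = "https://libkey.io/libraries/3148/openurl?query={z:DOI?}"


def expand_template(template, fields):
    """Replace each {z:NAME} or {z:NAME?} placeholder with a percent-encoded value."""

    def substitute(match):
        # In this sketch, any field missing from `fields` expands to an empty string.
        return quote(fields.get(match.group(1), ""), safe="")

    return re.sub(r"\{z:(\w+)\??\}", substitute, template)


print(expand_template(URL_TEMPLATE, {"DOI": "10.1080/12294659.2016.1147753"}))
```

The resulting URL is opened in the browser, so any proxy login happens there rather than inside Zotero.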

aarontaysmu · December 28, 2022

Thanks, this works, but it's not quite what I was looking for, since this affects only the lookup engines, which is a different feature.