Zotero Connector not saving PDFs from HeinOnline
When I try to save PDF articles from HeinOnline, the connector doesn't get the full-text PDF; it just shows a red X next to "Full Text PDF". I can download the PDF separately and drag it into Zotero, but when I click the connector button in Chrome, it thinks for a moment and then the text turns red with the red X icon. Debug output is D1770913059. This seems like the same problem that user "hbwhbwhbw" posted about a couple of days ago. It has been happening for me for about the same amount of time.
https://heinonline-org.ezproxy.lib.ucalgary.ca/HOL/Page?public=true&handle=hein.journals/crmcj12&div=23&start_page=239&collection=journals&set_as_cursor=0&men_tab=srchresults
I access it through my university's library, but that's how I've always used Hein, and it worked until a couple of days ago.
The permalink to the above page is:
https://heinonline.org/HOL/P?h=hein.journals/crmcj12&i=226&a=dWNhbGdhcnkuY2E
Let me know if there's anything else I can do to help troubleshoot.
I'm with HeinOnline, can reproduce the problem, but not sure whether anything changed on our end to cause this issue. If there is any insight Zotero can provide that I can pass along to our dev team, I'll be happy to do so.
<META HTTP-EQUIV="Refresh" CONTENT="0; URL=" PDFsearchable?handle=hein.journals/crmcj12&collection=journals&section=23&id=&print=section&sectioncount=1&ext=.pdf&nocover=&display=0">
The space at the beginning of the URL is breaking Zotero import of the PDF. We can work around this easily enough, but I'm also wondering why it's there?
<html>
<head>
<title>Redirecting...</title>
<script>function sleep(millis,callback){setTimeout(function(){callback();},millis);}function foobar_cont(){window.close();};sleep(25000,foobar_cont);</script>
<script type="text/javascript">window.location="PDFsearchable?handle=hein.aallar/spectrum0025&collection=journals&section=18&id=&print=section&sectioncount=1&ext=.pdf&nocover=&display=0";</script>
<META HTTP-EQUIV="Refresh" CONTENT="0; URL='PDFsearchable?handle=hein.aallar/spectrum0025&collection=journals&section=18&id=&print=section&sectioncount=1&ext=.pdf&nocover=&display=0'">
</head>
<body>
Please wait while your request is being processed. Due to the size of the requested file, the download may take a few minutes to complete.
<br><br>
<span id="newlink" name="newlink"></span>
</body>
</html>
The translator currently looks for a double quote after URL=:
var m = pdfPage.match(/<META.*URL="([^"]+)/);
We can change that to accept any of single, double, or no quotes, if you prefer. (No quotes seems to be the standard recommendation these days.) Otherwise I would think you would need to use single quotes for the CONTENT parameter and a double quote for the URL (which maybe is what you did).
<META HTTP-EQUIV="Refresh" CONTENT="0" URL="PDFsearchable?handle=hein.journals/alterlj18&collection=journals&section=22&id=&print=section&sectioncount=3&ext=.pdf&nocover=&display=0">
@adamjtramp: It looks like this line is actually now incorrect HTML. URL is a parameter for the CONTENT attribute. It's not an attribute itself. The only reason the redirect would be working at all now is because there's also a JS redirect on the page. Otherwise it would just reload the current page, due to the absence of a URL.
I assume you changed this for compatibility with the translator, which was looking for double-quotes after URL=, but I believe that was for a previous version of the site that used single-quotes for URL. The proper fix here is for you to move the URL back into CONTENT and just remove the quotes altogether, as per spec, and we'll update the translator to properly use the URL parameter.
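For anyone following along, here is a rough sketch of what a more tolerant match could look like on the translator side. It is not the actual translator code: the function name extractRefreshURL and the shortened sample URLs are made up for illustration. The idea is only to accept single, double, or no quotes after URL=, and to skip any stray whitespace in front of the URL.

```javascript
// Hypothetical helper for illustration; not the actual Zotero translator code.
// Pulls the redirect target out of a <meta http-equiv="refresh"> tag,
// tolerating single quotes, double quotes, or no quotes after URL=,
// plus optional whitespace before the URL itself.
function extractRefreshURL(html) {
  // [^>]*?          stay inside the <meta ...> tag
  // \bURL\s*=\s*    the URL parameter, case-insensitive because of the i flag
  // ["']?\s*        optional opening quote, then optional leading whitespace
  // ([^"'>\s]+)     the URL, up to the next quote, whitespace, or end of tag
  // Limitation: a URL that itself contains a quote character would be cut short.
  var m = html.match(/<meta[^>]*?\bURL\s*=\s*["']?\s*([^"'>\s]+)/i);
  return m ? m[1] : null;
}

// Shortened sample tags modeled on the variants quoted in this thread.
var samples = [
  '<META HTTP-EQUIV="Refresh" CONTENT="0; URL=" PDFsearchable?handle=x&ext=.pdf">',
  '<META HTTP-EQUIV="Refresh" CONTENT="0; URL=\'PDFsearchable?handle=x&ext=.pdf\'">',
  '<META HTTP-EQUIV="Refresh" CONTENT="0; URL=PDFsearchable?handle=x&ext=.pdf">',
  '<META HTTP-EQUIV="Refresh" CONTENT="0" URL="PDFsearchable?handle=x&ext=.pdf">'
];

samples.forEach(function (html) {
  console.log(extractRefreshURL(html));
});
```

All four variants should yield the same relative URL, which would still need to be resolved against the page URL before fetching the PDF.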