I have access to a PDF, but Zotero cannot automatically add it

Hi everybody, I often have access to an online resource, but the Chrome plugin does not extract the PDF automatically, so I need to download it manually and then add the PDF to my Zotero library. It mostly happens with journals in the ScienceDirect repository. Is there a solution? Thank you
  • Can you provide an example URL and a Debug ID from Zotero (not the Connector) for trying to save and not getting a PDF?
  • I was experiencing the same problem and tried to replicate it to get a Debug ID. It turned out that the PDFs were then downloaded automatically. The only thing I did in the meantime was manually download a PDF and add it to Zotero. Afterwards, this and other PDFs from the same journal were retrieved automatically.
    Here is the Debug ID anyways: D1976779603
  • edited June 24, 2022
    Elsevier has some over-zealous anti-bot defenses on ScienceDirect that make it much harder for Zotero to download PDFs than on other sites, even when you have subscription-based access. We have a special method specifically for this purpose, but it may fail intermittently and potentially needs a longer timeout. @JCricchio, if you reload the page, make sure nothing else is running on your computer (including a Zotero sync), and try again, does it fail every time? Is this a particularly slow computer?

    See below.
  • edited June 24, 2022
    @JCricchio, @jelkebosma, @evelo, @liulu0227, and anyone else experiencing this:

    In the latest Zotero beta, we've increased the timeout somewhat for the special handling we need to use for ScienceDirect. You can try that and let us know if it works for you. If it doesn't, go to the Config Editor in the Advanced pane of the Zotero preferences and try increasing downloadPDFViaBrowser.onLoadTimeout to a higher value — e.g., 2000 or 3000 (2 or 3 seconds) — and then try again. You'll need to reload the page in the browser before trying again.

    See below.
  • @dstillman it still doesn't work.
  • @liulu0227: You'll need to say more than that. Did you try 2000 or 3000? Increase it until it works. If it doesn't work at 10000, provide a Debug ID for that.
  • It didn't work for me. I tried 3000, 6000, 10000... even restarting Zotero and the browser.

    Debug ID: D773189815
  • It looks like our previous method to get around ScienceDirect's anti-bot measures may no longer be working. We're investigating.
  • Hi, any update on this? It slows down my workflow so much to have to save and re-upload PDFs.
  • Registering my interest in this too!
  • We believe this is fixed in the latest Zotero beta. You can try that and let us know if it's working for you.
  • Yes, now it works for me! Thanks!
  • Will the beta version eventually be available for Chrome too?
  • Those are Zotero app betas, not connector betas, so this applies to all browsers.
  • This should be fixed now in Zotero 6.0.9.
  • edited June 26, 2022
    After upgrading to 6.0.9, PDF download on ScienceDirect still doesn't work for me.

    I checked the debug log and found that the PDF URL had actually been resolved, with HTML from ScienceDirect like:
    <h1 class="title">Preparing your download</h1>
    However, it later reported:
    Error: downloadPDFViaBrowser: Loading PDF via browser timed out on the JS challenge page after 1500ms
    Error: downloadPDFViaBrowser: Loading PDF via browser timed out on the JS challenge page after 1500ms
    onLocationChange@chrome://zotero/content/xpcom/attachments.js:1211:42
    From previous event:
    Zotero.Attachments</this.downloadPDFViaBrowser@chrome://zotero/content/xpcom/attachments.js:1192:35
    Zotero.Attachments</this.downloadFile@chrome://zotero/content/xpcom/attachments.js:1130:20
    This reminded me of the parameter downloadPDFViaBrowser.onLoadTimeout that @dstillman mentioned earlier in this discussion. After I changed it from 1500 to 15000, the download succeeded. Maybe this parameter should have a larger default value, considering the different network conditions worldwide.
  • edited June 27, 2022
    @rnicrosoft: How low can you set it and have a download still work?
  • @dstillman On my computer with a wired connection, 2000 or lower definitely fails, 2500 has a fifty-fifty chance, and 3000 is okay. I leave it at 6000 now, since it is only a timeout, not a delay that I have to wait for every time.
  • OK, thanks.
    since it is only a timeout, not a delay that I have to wait for every time
    Well, it's sort of a delay. If you don't have access to a ScienceDirect PDF (and some people never will), it's the delay before Zotero will move on to trying to find an open-access PDF for the article, and while I believe that should happen even if you go to a different page in your browser, it affects the progress shown in Zotero's save popup there, so it would likely cause people to wait. So we can't just increase it to any amount.

    We'll see if there's anything we can do to better detect systems where the page is still loading.
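The timeout-versus-delay point above can be illustrated with a small sketch. This is not Zotero's actual code (which is JavaScript in attachments.js); it is a minimal Python model, and the names `wait_for_page_load`, `try_browser_download`, and `run` are hypothetical. It assumes only what the thread describes: a page that loads in time returns immediately, while a page that never delivers a PDF costs the full timeout before any fallback can start.

```python
import asyncio


async def wait_for_page_load(load_time_ms: int) -> str:
    """Hypothetical stand-in for a page that finishes loading after load_time_ms."""
    await asyncio.sleep(load_time_ms / 1000)
    return "pdf"


async def try_browser_download(load_time_ms: int, timeout_ms: int) -> str:
    """Return 'pdf' if the page loads within timeout_ms, otherwise 'fallback'."""
    try:
        # Returns as soon as the page is ready; the timeout is only an upper bound.
        return await asyncio.wait_for(
            wait_for_page_load(load_time_ms), timeout=timeout_ms / 1000
        )
    except asyncio.TimeoutError:
        # The whole timeout has elapsed before we get here, so a fallback
        # (e.g. looking for an open-access copy) starts that much later.
        return "fallback"


def run(load_time_ms: int, timeout_ms: int) -> str:
    """Synchronous convenience wrapper around the coroutine above."""
    return asyncio.run(try_browser_download(load_time_ms, timeout_ms))
```

In this model, `run(2500, 6000)` returns after roughly 2.5 seconds, not 6, which matches the "only a timeout" observation. A page that never loads, however, holds things up for the entire 6 seconds before the fallback begins, which is why the default cannot simply be raised to an arbitrary value.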
  • @dstillman I had the same problem until I found this thread. I also set the timeout to 6000, following @rnicrosoft's solution, and the PDF download problem never showed up again. As you mentioned in your last comment, I hope Zotero can automatically detect whether the download page is still loading instead of relying on a fixed timeout.
    Thanks for the great work.
  • @yxhuang: Same question. What's the lowest you can set it to? Does 3000 work reliably?
  • @dstillman I just tried around 50 times (in increments of 100 ms). For me, 2600 ms seems to be enough (2500 failed once in 50 trials). I think 3000 ms might be a safe choice for most people.
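As a rough aid for reading trial counts like the ones above, the following Python sketch computes the observed failure rate and, for a run with no failures, the largest true failure rate that is still plausible at a given confidence level. The function names are hypothetical, the trials are assumed to be independent, and the only figure taken from the thread is "1 failure in 50 trials at 2500".

```python
def failure_rate(failures: int, trials: int) -> float:
    """Observed fraction of failed downloads."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    return failures / trials


def zero_failure_upper_bound(trials: int, confidence: float = 0.95) -> float:
    """Largest true failure rate still consistent with 0 failures in `trials`.

    Solves (1 - p) ** trials = 1 - confidence for p, assuming independent trials.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    return 1 - (1 - confidence) ** (1 / trials)


# From the thread: 2500 ms failed once in 50 trials.
rate_at_2500 = failure_rate(1, 50)

# Assumption for illustration: a setting that never failed in 50 trials.
bound_for_clean_run = zero_failure_upper_bound(50)
```

Under these assumptions, a clean run of 50 trials still leaves room for a failure rate of a few percent, so choosing a default with some margin above the lowest working value, as suggested in the post, is a reasonable precaution.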
  • 8000 ms for me
  • We've set this to 3000 by default in Zotero 6.0.10, available now. We're still going to investigate whether we can better detect the page load without a fixed timeout.
  • @dstillman Thanks for the update!
  • I'm facing a similar problem. I'm not able to download open-access PDFs even after setting the timeout to 15000. I'm on Zotero 6.0.26 with the extension on Chrome. Any other suggestions?
  • @hradini: Can you provide a Debug ID from Zotero for an attempt that fails?
  • @dstillman Here is one that fails: D1479700638
    It seems to fail with eLife but not with Journal of Cell Science. Both are open-access papers.
  • Another one: D1112252044.
    Any update on that?