Zotero downloads then deletes PDF

andrew.weisman · February 26, 2013

For about a year now, Zotero has appeared not to download PDFs from the pages I was reading, even on sites that have working translators, such as acs.org. I ignored the problem, but I just noticed something: Zotero appears to download and then immediately delete the PDF! That is, when I click the download icon in the address bar, both the snapshot and the PDF appear to be downloading and I see the icons/lines for these in Zotero, but the PDF immediately disappears and I'm left with just the snapshot. Any ideas? I apologize if I missed this issue in a previous post.

Thanks,
Andrew

adamsmith · February 26, 2013

Which site? What you're seeing happens when the file that Zotero downloads is not, in fact, a PDF.

andrew.weisman · February 27, 2013

Hi Adam, thanks for your fast reply.

Here is an example of a NOT working article (PDF does not download):
http://pubs.acs.org/doi/abs/10.1021/jp310110r

Here is an example of a WORKING article (PDF downloads as expected):
http://pra.aps.org/abstract/PRA/v87/i2/e020302

Please let me know what other information you need from me!

Thank you,
Andrew

adamsmith · February 27, 2013

ACS PDFs download fine for me.

- Can you open the PDF w/Links from the URL above?
- Are you using Zotero in Firefox or with another browser and connector?
- Are those the URLs as you see them, or is there an institutional proxy in the URL (something like http://pubs.acs.org.ezproxy.yourinstitution.edu/doi/abs/10.1021/jp310110r)?
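
To illustrate the last question: an EZproxy-style rewritten URL keeps the publisher's hostname as the start of a longer hostname. Below is a minimal Python sketch of that check. The function name `is_proxied` and the list of publisher hosts are assumptions made for illustration, and this is not how Zotero itself detects proxies.

```python
from urllib.parse import urlsplit

# Hypothetical list of publisher hosts, taken from URLs mentioned in this thread.
PUBLISHER_HOSTS = ("pubs.acs.org", "pra.aps.org", "pubs.rsc.org")


def is_proxied(url: str, publisher_hosts=PUBLISHER_HOSTS) -> bool:
    """Return True if the host is a publisher host with a proxy suffix appended."""
    host = (urlsplit(url).hostname or "").lower()
    # "pubs.acs.org.ezproxy.example.edu" starts with "pubs.acs.org.",
    # while the plain "pubs.acs.org" does not.
    return any(host.startswith(p + ".") for p in publisher_hosts)
```

A plain publisher URL such as http://pubs.acs.org/doi/abs/10.1021/jp310110r would not be flagged, while the same path behind an institutional proxy hostname would be.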

olunet · March 2, 2013

Confirmed. See also http://forums.zotero.org/discussion/9494/bug-pdf-is-added-and-then-deleted/

That is a bit annoying. I can download the article and add it manually, but I would love to rely on an automatic solution.

I've tried using a VPN and proxies from different institutions, without success.

adamsmith · March 2, 2013

Confirmed for ACS or for another site?
The PDF disappearing is not itself a bug; it is Zotero doing what it's supposed to do when the downloaded file isn't a PDF. That's exactly what happened in the thread you link to.

The problem is always site/translator specific, so we need sample URLs. Also, see again the question on whether this is using Firefox or Standalone.

olunet · March 2, 2013

It is the standalone version.
I'm using Firefox and Chrome with the connector (with a proxy and with a VPN).
I've tested the two sites above.
For the articles I searched for, I also could not get PDFs from the APS and ACS sites. This is systematic.
I have a feeling that this is related to Zotero and how it parses the pages.
Please let me know what kind of feedback or information you need.

dstillman · March 2, 2013

"Have a feeling that this is related to zotero and how it parse the pages."

Unless you're familiar with the code, don't make statements like this if you actually want help—you're just going to annoy the people who would otherwise help you. You have no idea what the problem is.

As adamsmith said, we need sample URLs where this happens—not just site names—in order to debug this further. A Debug ID for a save attempt would also be helpful.

adamsmith · March 2, 2013

This is a different problem, then: the user above _does_ get PDF downloads from APS.
Start a new thread and say whether you have ever gotten any PDF to attach, e.g. on open access sites such as
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2738645/
Also include the information from points 5, 6, and 11 here: http://www.zotero.org/support/troubleshooting_translator_issues

Edit: as Dan says, specific sample URLs and a Debug ID would be helpful as well, but please do start a new thread.

olunet · March 2, 2013

Dear Dan, thank you for pointing me to the Debug ID.
Zotero Standalone 3.0.14, ZotFile 2.3.1, three journals:
1) http://pubs.rsc.org/en/content/articlelanding/2013/ra/c3ra23239e
"Full Text PDF" is added and then disappears
2) http://jcp.aip.org/resource/1/jcpsa6/v138/i3/p034703_s1
A webpage snapshot is added, which is not expected, because the corresponding option is unticked in the General preferences
3) http://pubs.acs.org/doi/abs/10.1021/ja310640b
"Full Text PDF" is added and then disappears

This is the same as reported by Frisbee. The Debug ID is D1106735880.

olunet · March 2, 2013

Until now, I've tested Zotero Standalone on Linux.
Just updated translators and turned off addons.
1) http://pubs.rsc.org/en/content/articlelanding/2013/ra/c3ra23239e
"Full Text PDF" is added and then disappears
2) http://jcp.aip.org/resource/1/jcpsa6/v138/i3/p034703_s1
Webpage snapshot added, which is still unexpected and strange
3) http://pubs.acs.org/doi/abs/10.1021/ja310640b
PDF attached!!!

The Debug ID is D1314614383.

olunet · March 2, 2013

to adamsmith, points 5 and 6 checked -- all fine: http://www.zotero.org/support/troubleshooting_translator_issues

olunet · March 2, 2013

Similar on a Windows PC: latest Zotero Standalone, nearly clean installation, no plugins, translators up to date:
1) http://pubs.rsc.org/en/content/articlelanding/2013/ra/c3ra23239e
"Full Text PDF" is added and then disappears
2) http://jcp.aip.org/resource/1/jcpsa6/v138/i3/p034703_s1
Webpage snapshot added, but should not be
3) http://pubs.acs.org/doi/abs/10.1021/ja310640b
"Full Text PDF" is added and then disappears

The Debug ID is D283072362.

adamsmith · March 2, 2013

Double-check RSC with add-ons disabled in Chrome; that should really work.
AIP PDFs won't attach with connectors, so that's as expected.

dstillman · March 2, 2013

From Standalone:

(5)(+0000063): CookieSandbox: Managing cookies for
 http://pubs.rsc.org/en/content/articlepdf/2013/ra/c3ra23239e

(5)(+0000000): CookieSandbox: Added cookies for request to
 http://pubs.rsc.org/en/content/articlepdf/2013/ra/c3ra23239e

(5)(+0000179): CookieSandbox: Managing cookies for
 http://pubs.rsc.org/en/content/articlepdf/2013/ra/c3ra23239e

(5)(+0000000): CookieSandbox: Slurped cookies from
 http://pubs.rsc.org/en/content/articlepdf/2013/ra/c3ra23239e

(5)(+0000002): CookieSandbox: Managing cookies for
 http://pubs.rsc.org/en/content/articlelanding/2013/ra/c3ra23239e

(5)(+0000000): CookieSandbox: Added cookies for request to
 http://pubs.rsc.org/en/content/articlelanding/2013/ra/c3ra23239e

(5)(+0000128): CookieSandbox: Managing cookies for
 http://pubs.rsc.org/en/content/articlelanding/2013/ra/c3ra23239e

(2)(+0000206): Downloaded PDF did not have MIME type
 'application/pdf' in Attachments.importFromURL()

(3)(+0000000): 
<!DOCTYPE html SYSTEM "http://pubs.rsc.org/Content/dtd/custom.dtd">


<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="

(3)(+0000000): Deleting item 2475

dstillman · March 2, 2013

And same deal with ACS. A lot of whitespace, and then this:

<!DOCTYPE HTML>
<html>

<script

(We only include snippets in the debug output, so I can't tell you what the pages say, but clearly it's getting HTML pages back.)
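
The behavior visible in these logs (an HTML page comes back where a PDF was expected, and the attachment is then deleted) can be sketched roughly as follows. This is an illustrative Python sketch, not Zotero's actual code; the log above only shows that the MIME type is checked in `Attachments.importFromURL()`. The function names `looks_like_pdf` and `keep_attachment` are hypothetical, and checking the `%PDF-` file signature in addition to the MIME type is an assumption of this sketch.

```python
PDF_SIGNATURE = b"%PDF-"


def looks_like_pdf(body: bytes) -> bool:
    """Return True if the body starts with the PDF signature (ignoring leading whitespace)."""
    return body.lstrip().startswith(PDF_SIGNATURE)


def keep_attachment(content_type: str, body: bytes) -> bool:
    """Decide whether a downloaded file should be kept as a PDF attachment."""
    # Drop parameters such as "; charset=utf-8" before comparing the MIME type.
    mime = content_type.split(";", 1)[0].strip().lower()
    return mime == "application/pdf" and looks_like_pdf(body)
```

With the ACS response described above (whitespace followed by `<!DOCTYPE HTML>`), `keep_attachment` returns False under either check, which corresponds to the "Deleting item" line in the log.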

dstillman · March 2, 2013

Also, in that second debug output, it doesn't look like ACS actually worked—I see the same problem.

adamsmith · March 2, 2013

I'm trying both RSC and ACS with the Chrome (actually Chromium) connector on Linux and getting PDF attachments, which is why I suspect something else is going on. You have no other Chrome extensions?

andrew.weisman · April 11, 2013

I apologize for not following up with this, but thanks olunet for doing so!

Any progress with this? I was using the Firefox version on Linux, but the problem continues to occur on Windows. I am indeed using a proxy: ezproxy.cul.columbia.edu.

URLs for which it occurs (in Windows I get a red X for the PDF):

http://apl.aip.org.ezproxy.cul.columbia.edu/resource/1/applab/v102/i8/p082109_s1
http://apl.aip.org.ezproxy.cul.columbia.edu/resource/1/applab/v102/i8/p083305_s1
http://jcp.aip.org.ezproxy.cul.columbia.edu/resource/1/jcpsa6/v138/i3/p034703_s1
http://pubs.rsc.org.ezproxy.cul.columbia.edu/en/content/articlelanding/2013/ra/c3ra23239e
http://pubs.acs.org.ezproxy.cul.columbia.edu/doi/abs/10.1021/ja310640b

Working URLs:

http://pra.aps.org.ezproxy.cul.columbia.edu/abstract/PRA/v87/i2/e020302
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2738645/
http://pubs.acs.org.ezproxy.cul.columbia.edu/doi/abs/10.1021/jp310110r

Note that the last links in each section above are nearly identical but only one works!

Through my proxy I certainly have permissions to get the PDF, and clicking on the PDF link (e.g., PDF w/links) definitely always brings up the PDF in Firefox.

Thanks to everybody,
Andrew

adamsmith · April 11, 2013

Do you happen to use private browsing mode or "Do not Remember History" in Firefox? There are currently problems using that with Zotero and a proxy.

If not, we'd like a debug ID for one of the failing downloads from you, too.
http://www.zotero.org/support/debug_output

gianlucabertaina · July 2, 2013

Hi,
I have the same problem with APS online journals with the Zotero plugin on Ubuntu. I have access to the journals through my University.
For example, the pdf link in:
http://prl.aps.org/abstract/PRL/v110/i16/e165302
Maybe the problem is related to the check that APS performs before granting access to the PDF (one has to click on a picture of A. Einstein).
For me, a workaround is to open the PDF link in the browser and click the image to get access to the PDF, then go back to the initial page and save to Zotero as usual. For a few minutes afterwards Zotero has no problem importing APS PDFs, because the "Einstein check" is no longer performed.

Hope this helps,
Gianluca

adamsmith · July 2, 2013

We won't be able to fix that. If a site requires clicking something to show that you're human or to agree to terms of service before getting a PDF, Zotero won't get the PDF (it deletes whatever it downloads because it is not, in fact, a PDF).