Save to Zotero fails for pdf behind login firewall - latest zotero 5 standalone

rafaqz · March 13, 2017

PDF files supplied by a university (behind a login page) or through a proxy can't be imported into Zotero when saved from the PDF reader in Firefox. Saving the citation from the journal page directly attaches the PDF successfully, but many PDFs don't have that option.

Firefox shows "An error occurred while saving this item. Try again, and if the issue persists see..."

And here is the console error message:
```
Error: 401

Forbidden

That action is not authorized. Please ensure that you are authenticated.

(1)(+0000004): Error: Downloaded PDF did not have MIME type 'application/pdf' in Attachments.importFromURL()

Error: Downloaded PDF did not have MIME type 'application/pdf' in Attachments.importFromURL()
Zotero.Attachments</this.importFromURL</externalHandlerImport<@chrome://zotero/content/xpcom/attachments.js:336:12
From previous event:
Zotero.Attachments</this.importFromURL</process@chrome://zotero/content/xpcom/attachments.js:410:11
Zotero.Attachments</this.importFromURL<@chrome://zotero/content/xpcom/attachments.js:414:11
From previous event:
Zotero.Server.Connector.SaveSnapshot.prototype.init<@chrome://zotero/content/xpcom/server_connector.js:486:11
From previous event:
Zotero.Server.DataListener.prototype._processEndpoint<@chrome://zotero/content/xpcom/server.js:448:23
From previous event:
Zotero.Server.DataListener.prototype._bodyData@chrome://zotero/content/xpcom/server.js:325:3
Zotero.Server.DataListener.prototype._headerFinished@chrome://zotero/content/xpcom/server.js:297:3
Zotero.Server.DataListener.prototype.onDataAvailable@chrome://zotero/content/xpcom/server.js:201:4

(5)(+0000001): HTTP/1.0 500 Internal Server Error
X-Zotero-Version: 5.0-beta.167+b732a82
X-Zotero-Connector-API-Version: 2
Content-Length: 0
```
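
For readers following the trace: the 401 response presumably carried an error page rather than the file, which would explain the MIME type complaint in `Attachments.importFromURL()`. Below is a minimal Python sketch of that kind of check. It is not Zotero's actual code; `looks_like_pdf` is a hypothetical helper, and the signature test on the body is an extra safeguard that the log above does not mention.

```python
def looks_like_pdf(content_type: str, body: bytes) -> bool:
    """Return True only if the response plausibly is a PDF.

    Hypothetical helper for illustration; not part of Zotero.
    """
    # Strip parameters such as "; charset=utf-8" and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    # PDF files begin with the "%PDF-" signature.
    return mime == "application/pdf" and body.startswith(b"%PDF-")


# A login or error page served in place of the file fails the check.
print(looks_like_pdf("text/html; charset=utf-8", b"<html>401</html>"))
print(looks_like_pdf("application/pdf", b"%PDF-1.4\n"))
```

A check like this rejects the download early instead of attaching an HTML login page with a `.pdf` name.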

adomasven · March 13, 2017

Can you provide the URL of the page you are trying to save exactly as you see it in the browser?

rafaqz · March 15, 2017

Here's one:

http://ac.els-cdn.com.ezp.lib.unimelb.edu.au/S0743016701000031/1-s2.0-S0743016701000031-main.pdf?_tid=54a5ee78-0993-11e7-ba6d-00000aacb35f&acdnat=1489591584_526bd6e44decaf8560184a05d9e5abb7

adamsmith · March 15, 2017

(That's the PDF from http://www.sciencedirect.com/science/article/pii/S0743016701000031 )

Just to be clear, while this should work -- and I think we've come across the error of saving PDFs from the connector before -- the recommended way of importing this item would still be to go through
http://www.sciencedirect.com.ezp.lib.unimelb.edu.au/science/article/pii/S0743016701000031

rafaqz · March 15, 2017

Totally, that's what I'm doing.

The problem is really when it's a PDF from a page with the same issue, but with no other page with metadata to download from. Think of any papers uploaded to a university LMS, or sites that require clicking on Einstein before download, etc. Then it's download + drag-and-drop time, which really isn't the efficient experience I'm used to in Zotero.

rafaqz · March 15, 2017

Something like this:

https://app.lms.unimelb.edu.au/bbcswebdav/pid-5643447-dt-content-rid-21291811_2/courses/NRMT90014_2017_SM1/Readings/Wiley 2007 Cultures of landscape.pdf

adamsmith · March 15, 2017

OK, makes sense. @adomasven would need to look at this.

adomasven · March 16, 2017

Regarding ScienceDirect I couldn't reproduce the problem with the direct PDF link - seems to work fine for me. As adamsmith said, ideally you would be using the publication page rather than PDF page anyway.

Regarding content on the LMS: university proxies usually use cookies, but the LMS uses Basic Auth, which is kinda weird to work with as it's handled transparently by the browser without being directly exposed for developer access. I have some ideas about how to make it work, but it's a weird corner case, so no promises for now.
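
To make the cookie vs. Basic Auth distinction concrete, here is a small Python sketch of the two request headers involved. It only illustrates the HTTP mechanics (Basic Auth as defined in RFC 7617); the function names, credentials and cookie values are made up for the example and have nothing to do with the connector's code.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the Authorization header a browser sends for Basic Auth."""
    # Credentials are joined with ":" and base64-encoded (encoded, not encrypted).
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}


def cookie_header(cookies: dict) -> dict:
    """Build the Cookie header used by a cookie-based proxy session."""
    return {"Cookie": "; ".join(f"{name}={value}" for name, value in cookies.items())}


# Hypothetical values, for illustration only.
print(basic_auth_header("user", "pass"))
print(cookie_header({"ezproxy": "abc123"}))
```

With a cookie session, a second client can replay the same `Cookie` header. With Basic Auth, the browser attaches the `Authorization` header itself on each request, so a separate download made outside the page has to be given the credentials some other way.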

rafaqz · March 16, 2017

I think the point is being missed here.

Downloading a PDF to Zotero from the PDF viewer does not work from **any location** that needs my university proxy, be it the LMS, ScienceDirect, or any other journal. The journals with human verification are the most annoying, as the Zotero button on the publication page doesn't download the PDF either.

The link is from the pdf itself, but if you click it you will get a redirect to the parent science direct page. But re-clicking on the pdf and downloading from the viewer ***will not work*** through that proxy with the latest zotero 5.0.

Downloading from the ScienceDirect publication page of course works! This is the third time I've said that in this thread. And it's interesting, because it places the issue in your code, not the proxy. If downloading the PDF from the identical URL works fine from the publication page, why not from the viewer?

With the move to standalone only we have also lost the old "download to zotero" option. That makes this a workflow breaking bug.

Finally, you're talking to a coder like I'm a total newb. I wouldn't post an issue if there wasn't a real, consequential problem with your code.

adamsmith · March 16, 2017

> Regarding ScienceDirect I couldn't reproduce the problem with the direct PDF link - seems to work fine for me.

That's the crucial part you're overlooking, though -- if the issue isn't reproducible on our end, that makes it tricky to fix (and also less clear where the location of a potential bug is).

rafaqz · March 16, 2017

I'm not overlooking that at all! If you read back, you guys asked me for a URL that I had said was behind a proxy; I was never sure how it would help.

So how to proceed? Do you need login details to test this thing?

I doubt it is just a unimelb specific problem as this all worked in 4.0, and just breaking one proxy seems unlikely.

rafaqz · March 16, 2017

Ok this might help:

Works fine in Chrome. Broken in Firefox. Somehow the connectors are handling this differently.

adamsmith · March 16, 2017

The URL had two purposes -- one, to check the format of your proxy (which it turns out is perfectly ordinary) and two, to make sure we're looking at the same page for testing. I'm assuming Adomas is testing this on his own proxy setup, so this should be analogous, but I'll try this with my proxy when I get a chance.
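
For reference, the "perfectly ordinary" format here is hostname rewriting: judging by the URLs posted earlier in the thread, the proxy appends its own domain (`ezp.lib.unimelb.edu.au`) to the original host. A rough Python sketch of mapping a proxied URL back to the original follows; `unproxy` is a hypothetical helper written for this example, not the connector's actual proxy handling.

```python
from urllib.parse import urlsplit, urlunsplit


def unproxy(url: str, proxy_suffix: str) -> str:
    """Strip a hostname-rewriting proxy suffix from a URL.

    Hypothetical helper; returns the URL unchanged if it is not proxied.
    Any port or userinfo in the original URL is dropped for simplicity.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    suffix = "." + proxy_suffix
    if not host.endswith(suffix):
        return url
    original_host = host[: -len(suffix)]
    return urlunsplit(
        (parts.scheme, original_host, parts.path, parts.query, parts.fragment)
    )


proxied = (
    "http://www.sciencedirect.com.ezp.lib.unimelb.edu.au"
    "/science/article/pii/S0743016701000031"
)
print(unproxy(proxied, "ezp.lib.unimelb.edu.au"))
```

Applied to the proxied ScienceDirect link above, this gives back the plain `www.sciencedirect.com` address for the same article.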

Different behavior with Chrome & Firefox connector might be a clue -- they're different versions of the connector; the Chrome one is the old code (assuming here you didn't build it from the repo), so unsurprising that it behaves like 4.0.

Just to be sure -- you are using the beta Firefox connector linked to here, right?
https://forums.zotero.org/discussion/comment/267549/#Comment_267549 (I don't think there's another way to connect 5.0 beta to Firefox, but just making sure)

Also, could you try this with all Firefox add-ons, especially privacy-related ones, disabled if you haven't already? And you've made sure that you have fairly permissive cookie settings in Firefox?

Also, non-proxied PDFs are saving correctly from Firefox, right?

adomasven · March 16, 2017

> Finally, you're talking to a coder like I'm a total newb. I wouldn't post an issue if there wasn't a real, consequential problem with your code.

Most people posting on the forums are not developers, so it is generally a fair assumption to make. No patronising intended and even then, being as clear as possible is just a good plan.

> The link is from the pdf itself, but if you click it you will get a redirect to the parent science direct page. But re-clicking on the pdf and downloading from the viewer ***will not work*** through that proxy with the latest zotero 5.0.

That's exactly what I tested - clicking Download PDF and saving from there - worked behind a testing proxy we used for development. Of course, your university proxy does something differently, presumably uses Basic Auth, which is why it doesn't work.

> With the move to standalone only we have also lost the old "download to zotero" option. That makes this a workflow breaking bug.

We are aware of this, but the new extension API makes it hard to keep the old behaviour. There are some alternative ways we might be able to support previous behaviour, but once again, no immediate promises.

> I doubt it is just a unimelb specific problem as this all worked in 4.0, and just breaking one proxy seems unlikely.

The reason for this is that Zotero is forced to move away from providing a fully-fledged Firefox extension, which handled Basic Auth automatically. We are as unhappy about this change as you are, but Firefox has nearly phased out the XUL extension framework from their browsers and we are adapting for it as quickly as we can. E.g. proxy support was only added to the connector a few months ago.

If you could provide Debug IDs from both Connector side and Standalone side for attempting to save

http://ac.els-cdn.com.ezp.lib.unimelb.edu.au/S0743016701000031/1-s2.0-S0743016701000031-main.pdf?_tid=54a5ee78-0993-11e7-ba6d-00000aacb35f&acdnat=1489591584_526bd6e44decaf8560184a05d9e5abb7

we can take a look, but I am fairly convinced that the problem is Basic Auth, which the connector currently does not support. I've created a ticket to track the issue, as it is likely that when 5.0 goes public more people will encounter it. For now this is the first report, and we are aware of people successfully using standard cookie-based proxies with 5.0 without problems.

adamsmith · March 16, 2017

but wouldn't basic auth also break in the Chrome connector?

adomasven · March 16, 2017

Yeah, I've managed to reproduce it on Firefox with the latest connector, so this is something Firefox specific. Thanks for doing additional debugging @rafaqz, will take a look.

rafaqz · March 16, 2017

Sure, that all makes sense. I wasn't aware these changes were mostly pushed from the Firefox end; thanks for working around them so well. Apologies for getting frustrated over a miscommunication.

The debug code is D1564969086.

adomasven · March 16, 2017

Okay, thanks again for helping with debugging. The PDF saving issue from ScienceDirect is specific to Firefox -- sorry for not catching that. We'll push out an updated version of the connector and Standalone sometime soon.

I'm still not sure if that will help with the LMS. Would be interesting to hear back from you whether that got fixed or not.

rafaqz · March 16, 2017

Thanks guys, I will test this thoroughly when the update comes through. I have a lot of pdf readings coming from everywhere at the moment.

dstillman · March 17, 2017

New Zotero and Firefox connector beta builds are out, so you should upgrade to those and let us know if you're still having trouble.

rafaqz · March 21, 2017

After some testing, the update seems to work perfectly. Journal and LMS PDFs are all saved to Zotero.