Zotero Connector saves PDF attachments even when disabled

bradrn · January 16, 2021

I personally do not want to save snapshots, attachments etc. when saving a website to Zotero. For this reason, I have deselected ‘Automatically take snapshots when creating items from web pages’ and ‘Automatically attach associated PDFs and other files when saving items’ in my Zotero preferences. Yet when I attempt to save a PDF from my web browser (Firefox), Zotero still downloads a local copy of the PDF to the Zotero storage directory. Is this a bug, or a problem with my configuration?

I am using Zotero Connector version 5.0.78 with Zotero 5.0.95.

dstillman · January 16, 2021

I'm not sure I understand what you're saying here.

If you save a PDF to Zotero…it saves a PDF to Zotero. If you're viewing a PDF, you're not saving a webpage. Are you expecting to click "Save to Zotero" while on a PDF, have it retrieve metadata for that PDF, and then delete the PDF? That doesn't and won't happen.

bradrn · January 16, 2021

Are you expecting to click "Save to Zotero" while on a PDF, have it retrieve metadata for that PDF, and then delete the PDF? That doesn't and won't happen.

Not quite. To illustrate, let’s say I’ve found an interesting article online in the form of a PDF; https://www.rhenderson.net/resources/papers/pluractionality_and_distributivity.pdf will do. I want to manage this PDF in Zotero, but for whatever reason don’t want to save it locally. (This, at least, is perfectly consistent with my other Zotero settings: I’m not saving attachments or webpage snapshots either.) Thus, when I click “Save to Zotero”, I want it to retrieve metadata, save it in whatever collection I specify, add a link to the URL, etc., but I don’t want it to automatically download the PDF and save it in my Storage folder, which is what’s happening right now.

dstillman · January 16, 2021

It sounds like you're asking for exactly what I describe, just with a linked-URL attachment as well. (Zotero can't retrieve metadata for a PDF without the PDF, so the PDF has to be downloaded, and then it would have to be deleted.)

Those settings control what happens when you save from a webpage — whether Zotero then goes and saves a snapshot of the page and tries to download a PDF. You're not saving from a webpage, so they don't apply, and the text of the settings doesn't imply that they would — it's not an "associated PDF" being automatically attached to an item you're saving if it's the very thing you're saving manually.

The PDF setting in no way prevents you from simply adding PDFs to Zotero, either by dragging them to Zotero or by saving them directly from your web browser. The latter is what you're doing here.

bradrn · January 16, 2021

It sounds like you're asking for exactly what I describe, just with a linked-URL attachment as well.

I’m pretty new to Zotero; what exactly do you mean by a ‘linked-URL attachment’?

(Zotero can't retrieve metadata for a PDF without the PDF, so the PDF has to be downloaded, and then it would have to be deleted.)

Huh, interesting; why is this? (Not that I’m arguing; I’m simply curious from a technical perspective as to why this is the case.)

Those settings control what happens when you save from a webpage … You're not saving from a webpage, so they don't apply, and the text of the settings doesn't imply that they would — it's not an "associated PDF" … if it's the very thing you're saving manually.

Ah, now I see; thanks for explaining this! In that case, I suppose what I’m looking for is a way to get Zotero to treat a PDF hosted online as a webpage. (After all, they’re both accessed as URLs via HTTP; they just have different file formats.)

dstillman · January 16, 2021

what exactly do you mean by a ‘linked-URL attachment’?

Attachment items that are just links to URLs. It's the thing you asked for. You can also create them manually from within Zotero.

> (Zotero can't retrieve metadata for a PDF without the PDF, so the PDF has to be downloaded, and then it would have to be deleted.)

why is this? (Not that I’m arguing; I’m simply curious from a technical perspective as to why this is the case.)

Because Zotero has to download the file and extract its text to be able to retrieve metadata for it. The Zotero Connector doesn't have access to the contents.

There are a few, rare sites where we're able to use translators from the PDF URL, in which case you'll see a different icon — e.g., a journal article icon. In that case it'd be equivalent to saving from the article page and you wouldn't get the PDF with your settings. But that's only possible on sites where there's an identifier in the PDF URL and the site makes it possible to download metadata based on the identifier. Most sites, and therefore most Zotero translators, don't work that way. And if there's no translator for the site to begin with, it certainly doesn't apply.
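To make the "identifier in the PDF URL" condition concrete, here is a rough TypeScript sketch that checks whether a PDF URL contains something that looks like a DOI. This is only an illustration: `findDoiInUrl` and `DOI_PATTERN` are hypothetical names, the pattern is a common heuristic for DOIs, and none of it is taken from Zotero's translators.

```typescript
// Hypothetical heuristic: a DOI usually looks like "10.<4-9 digits>/<suffix>".
// This is an assumption for illustration, not Zotero's own detection logic.
const DOI_PATTERN = /10\.\d{4,9}\/[^\s?#]+/;

function findDoiInUrl(pdfUrl: string): string | null {
  let decoded = pdfUrl;
  try {
    // Decode percent-escapes so "10.1234%2Fabcd" is also recognized.
    decoded = decodeURIComponent(pdfUrl);
  } catch {
    // Malformed escapes: fall back to the raw URL.
  }
  const match = decoded.match(DOI_PATTERN);
  if (match === null) {
    return null;
  }
  // Drop a trailing ".pdf" if the site appends it to the identifier.
  return match[0].replace(/\.pdf$/i, "");
}
```

A URL such as the rhenderson.net one earlier in the thread carries no identifier of this kind, so a check like this finds nothing, and the only remaining route to metadata is the PDF's own text.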

I suppose what I’m looking for is a way to get Zotero to treat a PDF hosted online as a webpage. (After all, they’re both accessed as URLs via HTTP; they just have different file format.)

It's just not technically equivalent. Zotero translators run within the context of webpages. PDFs have to be downloaded and processed.

And technical details aside, the user intent wouldn't be clear — again, there's no real reason to think that just because someone doesn't want a PDF automatically attached from a journal article page it means they don't want to save a PDF when they actually click the save button while viewing a PDF.

(For what it's worth, I can't recall anyone else ever asking for this.)

bradrn · January 16, 2021

Attachment items that are just links to URLs. It's the thing you asked for. You can also create them manually from within Zotero.

Yep, that does sound like what I’m looking for. But is there any difference between a link-only attachment and the URL field of an item, at least with regards to how they get used?

Zotero has to download the file and extract its text to be able to retrieve metadata for it. The Zotero Connector doesn't have access to the contents.

That makes sense… it’s inconvenient for me, but I can certainly accept this limitation.

Somewhat hypothetical question: in my case, I’m actually not particularly interested in any metadata other than the PDF filename and URL — I simply want to be able to use Zotero to manage these links. Given this situation, might it be possible for me to write a bookmarklet which retrieves the name and URL and saves them to Zotero? And if so, is there any documentation I can read to give me some idea as to how to proceed?

And technical details aside, the user intent wouldn't be clear — again, there's no real reason to think that just because someone doesn't want a PDF automatically attached from a journal article page it means they don't want to save a PDF when they actually click the save button while viewing a PDF.

I hadn’t really considered it this way, but this actually is true — I had been conflating the two situations.

(For what it's worth, I can't recall anyone else ever asking for this.)

Huh, that seems odd to me… I would have imagined this to be a fairly common request. (Maybe it’s because I originally tried to use a custom workflow based around org-mode, and only switched to Zotero later — perhaps I wouldn’t have needed to do this if I had just used Zotero from the very beginning.)

dstillman · January 16, 2021

But is there any difference between a link-only attachment and the URL field of an item, at least with regards to how they get used?

The URL field gets cited.
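
As a rough sketch of that difference, the TypeScript below builds both pieces from nothing but a PDF's filename and URL: a parent item whose `url` field would be used in citations, and a link-only attachment that just points at the same address. The field names (`itemType`, `linkMode: "linked_url"`, `title`, `url`) follow the item JSON used by the Zotero Web API as I understand it, so treat them as assumptions to check against the API documentation. `fileNameFromUrl` and `buildLinkOnlyEntry` are hypothetical helpers, and nothing is actually sent to Zotero.

```typescript
// Assumed shapes, modelled on Zotero's item JSON; verify before relying on them.
interface ParentItem {
  itemType: "webpage";
  title: string;
  url: string; // the item's URL field: this is what ends up in citations
}

interface LinkedUrlAttachment {
  itemType: "attachment";
  linkMode: "linked_url"; // a link only; no file is stored locally
  title: string;
  url: string;
}

// Hypothetical helper: take the last path segment as the filename.
function fileNameFromUrl(pdfUrl: string): string {
  const withoutQuery = pdfUrl.split(/[?#]/)[0] ?? pdfUrl;
  const segments = withoutQuery.split("/").filter((s) => s.length > 0);
  return segments[segments.length - 1] ?? "";
}

// Hypothetical helper: build the two records without downloading anything.
function buildLinkOnlyEntry(pdfUrl: string): {
  parent: ParentItem;
  attachment: LinkedUrlAttachment;
} {
  const title = fileNameFromUrl(pdfUrl);
  return {
    parent: { itemType: "webpage", title, url: pdfUrl },
    attachment: { itemType: "attachment", linkMode: "linked_url", title, url: pdfUrl },
  };
}
```

For the example PDF from earlier in the thread, the title of both records would be `pluractionality_and_distributivity.pdf`. Saving them would still need a separate step, such as creating the item manually in Zotero or making an authenticated API write request.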