Given the PDF file of an article, how can you find out its URI?

edited June 28, 2019
In the Mac Zotero application, entries in a Zotero library have item links of the form "zotero://select/items/1_LEU73EB3". That lets you point from outside of Zotero directly to a specific entry in Zotero. From within Zotero, you can also use "Show file" to display the PDF file attached to an entry.

Question: given this PDF file on disk in a Zotero database, how can I find out its item link, or at least the Zotero library entry to which the PDF is attached?

I would like to create back links, somehow, from the PDF file to the entry to which it is attached. The file on disk is located in a folder such as .../database/storage/8Q9QS6P8, but I can't figure out how to map that identifier to something that corresponds to the entry in Zotero. Is there a way to do this? If there is not a ready-made solution, then I could program one if someone can point me in a starting direction.
  • If you paste 8Q9QS6P8 into the quick search bar in Zotero in "All fields and tags" mode, Zotero will show the PDF in context, i.e. attached to its parent item.

    There is a way to programmatically get from the attachment to the parent item via both the server and the local API, but that's a bit more involved and may be overkill for your purposes.
  • Thanks for the tip!

    Regarding the programming options: among other things, I'm a software developer (https://github.com/mhucka), so I'm game :-) With many thousands of papers in my Zotero database, I really have no choice but to find some way to automate some of my processes....
  • You can copy zotero://select links using the third-party Zutilo plugin.
  • Maybe dstillman's suggestion helps, but I'm also not quite sure what you're after. Could you give us a more explicit use case?
  • @dstillman, thanks for trying, but your suggestion answers the opposite of what I'm trying to get at. I already use Zutilo and it's great for getting the link _from within Zotero_. But _outside_ of Zotero, if you are looking at the PDF file in the local database, it doesn't help.

    Let me try to explain it another way. There's a pile of PDF files in the local Zotero database on my disk. (In my configuration, I keep them in "~/databases/zotero-bibliography", where Zotero creates a "storage" subdirectory.) Now suppose you are browsing the "storage" subdirectory. Given one of the PDF files found there, is there any way to _directly_ find out which Zotero entry it corresponds to?

    If there is not currently a way to do that, is there any plugin that can write the item link into the PDF file (perhaps as a file property or Finder comment or other mechanism)? And if none exist, where can I look to get started with figuring out the APIs that might be relevant to implementing my own utility to do such a thing?

    (In my case, I index the PDF files using DEVONthink, so the problem I'm trying to solve is going from a PDF file in DEVONthink to the correct entry in Zotero, but the question is independent of the software involved. The problem is basically this: is there anything in the PDF file that points back to the Zotero entry?)
  • I mean, the folder containing the PDF is named after the attachment key, so I guess I'm not sure what else you're looking for?

    (The current, recommended URL structure to select an item, by the way, is zotero://select/library/items/:itemKey for personal libraries and zotero://select/groups/:groupID/items/:itemKey for group libraries. There's no way to know which is which from outside without a DB or API lookup, since attachment files are currently mixed together.)
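The mapping described in this reply can be sketched in a few lines of Python. This is only an illustration, not part of Zotero or Zutilo: the function names are hypothetical, and the key pattern (8 uppercase letters or digits) is an assumption inferred from the keys quoted in this thread. As noted above, the path alone cannot tell you whether the file belongs to a personal or a group library, so the group ID has to come from elsewhere.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Assumption: attachment keys are 8 uppercase letters or digits, as in the
# examples quoted in this thread (8Q9QS6P8, 8WC72JJ5).
KEY_PATTERN = re.compile(r"^[A-Z0-9]{8}$")


def attachment_key_from_path(pdf_path: str) -> Optional[str]:
    """Return the attachment key if the file sits in .../storage/<KEY>/, else None."""
    parts = PurePosixPath(pdf_path).parts
    # We need at least three components: "storage", the key folder, the file name.
    if len(parts) < 3:
        return None
    storage_dir, key = parts[-3], parts[-2]
    if storage_dir == "storage" and KEY_PATTERN.match(key):
        return key
    return None


def select_link(key: str, group_id: Optional[int] = None) -> str:
    """Build a zotero://select link in the newer syntax given in this thread."""
    if group_id is None:
        # Personal library form.
        return f"zotero://select/library/items/{key}"
    # Group library form; the group ID cannot be derived from the file path.
    return f"zotero://select/groups/{group_id}/items/{key}"
```

For a file such as ".../storage/8Q9QS6P8/paper.pdf" (the file name is made up), `attachment_key_from_path` returns "8Q9QS6P8" and `select_link` turns it into "zotero://select/library/items/8Q9QS6P8". For a path that is not inside a "storage/<KEY>" folder, the first function returns None rather than guessing.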
  • Ah-ha! So that's what I didn't know: that there is a URL form that gets you the entry from the directory name. When I look at a Zutilo link, it has the form "zotero://select/items/1_8WC72JJ5" and the identifier at the end is not the same as the directory name.

    Thanks!
  • (the top-level item and the attached file(s) have different identifiers; for your purposes that's irrelevant, but it does matter elsewhere, of course)
  • edited June 28, 2019
    > When I look at a Zutilo link, it has the form "zotero://select/items/1_8WC72JJ5" and the identifier at the end is not the same as the directory name.
    No, it's the same — you're just copying the parent item instead of the attachment item. If you generate a link for that attachment item with Zutilo, it will match the directory name.

    The format I gave is just a newer syntax of the URIs you're getting from Zutilo. The functionality is exactly the same.
  • I have used this excellent feature ("the folder containing the PDF is named after the attachment key") to implement Zowie [1], a tool that looks through the PDFs of a local Zotero database and writes Zotero select links into file metadata. This is working great, but a user reported it fails for them and now I understand why: they're using Zotero's "Linked Attachment Base Directory" feature, and the PDF files in that location apparently do not get stored by Zotero in a way that includes the attachment key in the directory path. Instead, their files look like this:

    /Users/personslogin/Box Sync/2021/Author – 2010 – Article title.pdf

    Is there any way to recover the attachment key for the files in this case?

    [1] https://mhucka.github.io/zowie/
  • Not without reading the database (and even then not uniquely, because there could theoretically be more than one linked-file attachment pointing at the same file, though that wouldn't usually be the case). Linked files don't "get stored by Zotero" at all — those are files that already exist somewhere on the disk and are simply linked to.

    To be clear, this would just be about linked files in general, not the "Linked Attachment Base Directory" setting. That setting only affects whether the file paths are stored as absolute or relative paths in the database.
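For anyone who wants to try the database route for linked files, here is a rough sketch using Python's sqlite3 module. It rests on assumptions that should be checked against your own zotero.sqlite before relying on it: that there is an `items` table with `itemID` and `key` columns, an `itemAttachments` table with `itemID`, `linkMode`, and `path` columns, and that `linkMode` 2 marks a linked file. The function name is hypothetical. The path must be passed in exactly the form the database stores it (absolute or relative, per the explanation above). The function returns a list because more than one attachment can point at the same file.

```python
import sqlite3
from typing import List

# Assumption: linkMode 2 marks a linked-file attachment in itemAttachments.
LINK_MODE_LINKED_FILE = 2


def keys_for_linked_file(conn: sqlite3.Connection, stored_path: str) -> List[str]:
    """Return the keys of all linked-file attachments whose stored path matches.

    The table and column names are assumptions about the Zotero schema;
    verify them against your own database first.
    """
    rows = conn.execute(
        """
        SELECT items.key
        FROM itemAttachments
        JOIN items ON items.itemID = itemAttachments.itemID
        WHERE itemAttachments.linkMode = ?
          AND itemAttachments.path = ?
        ORDER BY items.key
        """,
        (LINK_MODE_LINKED_FILE, stored_path),
    ).fetchall()
    return [row[0] for row in rows]
```

It is probably safest to run this against a copy of zotero.sqlite opened read-only, since Zotero may hold a lock on the live database while it is running. An empty result means no linked-file attachment stores that exact path; more than one result is the non-unique case mentioned above.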
  • Gotcha. Thanks for the explanation.

    I'll document this as a limitation in Zowie, and will also try to find a way to detect the situation and tell users at run-time.