Import reference for PDFs via DOI

arnegj · February 18, 2008

Newer PDF files often have the DOI or an HTML link to the article's web page on the front page.
A nice feature would be a translator for PDF files that works like this:
detectWeb:
-uses the pdftotext function on the front page of the PDF
-searches (regex) the resulting text file for a DOI or a link to the full-text web page
-loads the full-text page in the background and runs Zotero on this page
-if the full-text page has a translator, returns a Zotero import icon
doWeb:
-gets the reference from the full-text page and adds the PDF as an attachment.

The tools to do this seem to be there in the Zotero code, but unfortunately my coding abilities are not good enough to make a hack demonstrating this.
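
For the regex step alone, a minimal sketch might look like the following. It is written in Python rather than as a Zotero translator, and it assumes the front-page text has already been extracted (for example by pdftotext). The pattern and the name find_doi are illustrative assumptions, not anything taken from the Zotero code, and the pattern only approximates the usual "10.<registrant>/<suffix>" shape of a DOI.

```python
import re

# Assumption: a loose approximation of the common DOI shape
# "10.<4-9 digit registrant>/<suffix>". Real DOIs can be more varied.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def find_doi(text):
    """Return the first DOI-like string found in text, or None.

    find_doi is a hypothetical helper name used only for this sketch.
    """
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Trailing punctuation usually belongs to the surrounding sentence.
    # This can wrongly trim a DOI that really ends in one of these characters.
    return match.group(0).rstrip('.,;:)]')
```

Called on text such as "doi:10.1000/xyz123." this returns the identifier without the sentence-final period; a link to the full-text page would need a separate pattern.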

dstillman · February 18, 2008

The translator architecture probably isn't ideal for this, but it could be implemented as a context-menu option for existing standalone PDFs, perhaps with a dialog window to confirm the metadata selection before creating the parent item.

Matthias · February 18, 2008

Alternatively, all basic metadata (i.e. the fields that are usually required to cite a journal article) can now be fetched directly from the CrossRef OpenURL resolver. Alf Eaton writes in his blog:

...the CrossRef OpenURL resolver/metadata server has a new parameter: format=unixref, that returns the full metadata for an item...

More info:

HubLog: Full OpenURL metadata from CrossRef
HubLog: CrossRef Citation plugin
CrossTech: Added XML format parameter to CrossRef's OpenURL resolver
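
As a rough sketch of what such a request could look like, the Python snippet below only builds the query URL and does not send it. Only format=unixref comes from the quote above; the endpoint address, the pid (registered e-mail) and noredirect parameters, and the function name crossref_unixref_url are assumptions that should be checked against CrossRef's own documentation.

```python
from urllib.parse import urlencode

# Assumption: address of the CrossRef OpenURL resolver; verify before use.
CROSSREF_OPENURL = "https://www.crossref.org/openurl"


def crossref_unixref_url(doi, pid):
    """Build an OpenURL query asking for the full metadata of one DOI.

    crossref_unixref_url is a hypothetical helper name for this sketch.
    pid is assumed to be the e-mail address registered with CrossRef.
    """
    params = {
        "pid": pid,              # assumed: account identifier for the query
        "id": "doi:" + doi,      # assumed: DOI passed with a "doi:" prefix
        "format": "unixref",     # from the quote: return the full metadata
        "noredirect": "true",    # assumed: return metadata instead of redirecting
    }
    # urlencode percent-escapes characters such as ":", "/" and "@".
    return CROSSREF_OPENURL + "?" + urlencode(params)
```

The resulting URL could then be fetched with any HTTP client and the returned XML parsed for authors, title, journal and so on.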

Matthias

hubtero · February 19, 2008

The way that Papers does this is nice.

asplundj · March 14, 2008

I really like this idea and hope it will be possible soon.

jonny44 · March 11, 2014

Hi there,

I just installed Zotero and like it very much! But I wondered whether there is a tool like the Citavi pdf picker? A tool to "extract" the information from PDFs (author, title) and transfer it into Zotero and reference lists?

Kind regards,
jonny

aurimas · March 11, 2014

This is somewhat unrelated to the topic, so for further questions, please start a new thread, but...

You can just drag and drop PDFs into Zotero, then select however many you want, right-click, and select retrieve metadata. This scans each PDF for a DOI or ISBN and retrieves metadata based on that. If nothing is found, it tries to work out the metadata from the PDF contents and Google Scholar search results (the accuracy is fairly good, but not 100%).

Note that if you have author or title info in the PDF metadata or in the file name, it will not be read.

adamsmith · March 11, 2014

(documentation for retrieve metadata: http://www.zotero.org/support/retrieve_pdf_metadata )