Import reference for PDFs via DOI

Newer PDF files often have the DOI or an HTML link to the article's web page on the front page.
A nice feature would be a translator for PDF files that works like this:
detectWeb:
- runs pdftotext on the front page of the PDF
- searches the resulting text (with a regex) for a DOI or a link to the full-text web page
- loads the full-text page in the background and runs Zotero on it
- if a translator exists for the full-text page, returns a Zotero import icon
doWeb:
- gets the reference from the full-text page and adds the PDF as an attachment.
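
The regex step above could be sketched roughly as follows. This is only an illustration in Python, not Zotero translator code: it assumes the front-page text has already been extracted (for example with pdftotext), and the patterns and the function names `find_doi`, `find_url` and `locate_record` are hypothetical.

```python
import re

# Assumed patterns: a DOI is taken to look like "10.<registrant>/<suffix>",
# and a link is anything starting with http:// or https://.
DOI_RE = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')
URL_RE = re.compile(r'https?://[^\s"<>]+', re.IGNORECASE)


def _strip_trailing_punctuation(value):
    # Sentence punctuation often sticks to the match ("doi:10.1000/xyz123.").
    # Limitation: a DOI that really ends in one of these characters is cut short.
    return value.rstrip('.,;:)]}')


def find_doi(text):
    """Return the first DOI-like string in text, or None."""
    match = DOI_RE.search(text)
    return _strip_trailing_punctuation(match.group(0)) if match else None


def find_url(text):
    """Return the first http(s) link in text, or None."""
    match = URL_RE.search(text)
    return _strip_trailing_punctuation(match.group(0)) if match else None


def locate_record(front_page_text):
    """Prefer a DOI, fall back to a link, else return None."""
    doi = find_doi(front_page_text)
    if doi:
        return ("doi", doi)
    url = find_url(front_page_text)
    if url:
        return ("url", url)
    return None
```

Calling `locate_record` on the text of the first page would then tell the rest of the process whether to resolve a DOI or load a web page, and returning None would mean no import icon is shown.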

The tools to do this seem to be there in the Zotero code, but unfortunately my coding abilities are not good enough to make a hack demonstrating this.
  • The translator architecture probably isn't ideal for this, but it could be implemented as a context-menu option for existing standalone PDFs, perhaps with a dialog window to confirm the metadata selection before creating the parent item.
  • edited February 19, 2008
    Alternatively, all basic metadata (i.e. the fields that are usually required to cite a journal article) can now be fetched directly from the CrossRef OpenURL resolver. Alf Eaton writes in his blog:
    ...the CrossRef OpenURL resolver/metadata server has a new parameter: format=unixref, that returns the full metadata for an item...
    More info:

    HubLog: Full OpenURL metadata from CrossRef
    HubLog: CrossRef Citation plugin
    CrossTech: Added XML format parameter to CrossRef's OpenURL resolver

    Matthias
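
For illustration, a request URL for such a lookup might be built as below. Only the `format=unixref` parameter comes from the quote above. The endpoint address and the `pid`, `id` and `noredirect` parameters are assumptions about the resolver's query interface, and `crossref_unixref_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed endpoint of the CrossRef OpenURL resolver.
CROSSREF_OPENURL = "https://www.crossref.org/openurl/"


def crossref_unixref_url(doi, pid):
    """Build a query URL asking for the full metadata of one DOI.

    pid is assumed to be the e-mail address registered with CrossRef.
    """
    params = {
        "pid": pid,               # assumption: account identifier
        "id": "doi:" + doi,       # assumption: DOI passed with a "doi:" prefix
        "format": "unixref",      # parameter named in the quoted blog post
        "noredirect": "true",     # assumption: return XML instead of redirecting
    }
    return CROSSREF_OPENURL + "?" + urlencode(params)
```

The returned string could be fetched with any HTTP client, and the XML response parsed for authors, title, journal and year.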
  • The way that Papers does this is nice.
  • I really like this idea and hope it will be possible soon.
  • Hi there,

    I just installed Zotero and like it very much! But I wondered whether there is a tool like the Citavi PDF picker? A tool to "extract" the information out of PDFs (author, title) and transfer it into Zotero and reference lists?

    Kind regards,
    jonny
  • This is somewhat unrelated to the topic, so for further questions, please start a new thread, but...

    You can just drag and drop PDFs into Zotero, then select however many you want, right-click, and choose "retrieve metadata". This scans the PDF for a DOI or ISBN and retrieves metadata based on that. If nothing is found, it tries to work out the metadata from the PDF contents and Google Scholar search results (the accuracy is fairly good, but not 100%).

    Note that author and title information stored in the PDF's embedded metadata or in the file name will not be read.
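
The lookup order described above (DOI first, then ISBN, then a search fallback) might look something like this as a sketch. It is not how Zotero implements the feature: the patterns and the `choose_lookup` function are hypothetical, and the ISBN check only counts digits without verifying the checksum.

```python
import re

# Assumed patterns, for illustration only.
DOI_RE = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')
ISBN_RE = re.compile(
    r'\bISBN(?:-1[03])?:?\s*([0-9][0-9Xx-]{8,15}[0-9Xx])',
    re.IGNORECASE,
)


def choose_lookup(text):
    """Decide which lookup to try for the extracted PDF text.

    Returns ("doi", value), ("isbn", value) or ("search", None).
    """
    doi = DOI_RE.search(text)
    if doi:
        return ("doi", doi.group(0).rstrip('.,;:'))

    isbn = ISBN_RE.search(text)
    if isbn:
        digits = isbn.group(1).replace('-', '').upper()
        # Accept only ISBN-10 or ISBN-13 lengths; no checksum validation here.
        if len(digits) in (10, 13):
            return ("isbn", digits)

    # Nothing usable found: fall back to guessing from the contents
    # and searching, as described in the post above.
    return ("search", None)
```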
  • (documentation for retrieve metadata: http://www.zotero.org/support/retrieve_pdf_metadata )