Take pdf's existing metadata
I posted something related a long while ago. There was little response, so here's a second try.
Wouldn't it be nice if Zotero had a JabRef-like PDF metadata extraction feature? (I think Mendeley does this as well; cb2bib does it.) What's that all about? There are two ways of associating metadata (author, keywords, etc.) with a given PDF: either by placing it into the "Document Information Dictionary" (that's old school) or by adding it as an XMP stream. Either way, the metadata stays embedded in the file and travels with the document. In JabRef, if you drag a PDF with metadata onto the program, it automatically extracts the author names and so on and creates a new bibliographic entry. With something like this, the Zotero feature "Retrieve metadata for PDF" would gain completely new value.
Cheers
Stefan
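For anyone wondering what reading the XMP stream could look like in practice, here is a minimal Python sketch. It assumes the XMP packet is stored uncompressed in the file (common, but not guaranteed) and that the fields use the Dublin Core names dc:title, dc:creator and dc:subject. The function names are made up for illustration, and this is not meant to show how JabRef or cb2bib do it.

```python
import re
import xml.etree.ElementTree as ET

# Namespaces used by Dublin Core fields inside an XMP packet.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# An XMP packet is delimited by <?xpacket begin=...?> and <?xpacket end=...?>.
XPACKET = re.compile(rb"<\?xpacket begin=.*?<\?xpacket end=.*?\?>", re.DOTALL)


def extract_xmp_packet(pdf_bytes):
    """Return the first XMP packet found in the raw bytes, or None.

    Assumption: the packet is not compressed or encrypted inside the PDF.
    """
    match = XPACKET.search(pdf_bytes)
    return match.group(0) if match else None


def read_dublin_core(xmp_packet):
    """Pull title, authors and keywords out of an XMP packet."""
    root = ET.fromstring(xmp_packet)

    def items(tag):
        # Values sit in rdf:li elements below dc:<tag> (rdf:Alt, rdf:Seq or rdf:Bag).
        found = root.findall(f".//dc:{tag}//rdf:li", NS)
        return [li.text.strip() for li in found if li.text and li.text.strip()]

    titles = items("title")
    return {
        "title": titles[0] if titles else None,
        "authors": items("creator"),
        "keywords": items("subject"),
    }


def read_pdf_xmp(path):
    """Hypothetical convenience wrapper: read a file and return its fields, or None."""
    with open(path, "rb") as handle:
        packet = extract_xmp_packet(handle.read())
    return read_dublin_core(packet) if packet else None
```

If no packet is found, read_pdf_xmp returns None, and a caller could then fall back to whatever lookup it already uses.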
skreisel wants to use (and/or write) tags stored in the PDFs themselves. That has been discussed before. I think one of the reasons this isn't done is that there is no uniform standard for bibliographic tags, so whatever went into the DocInfo or XMP stream would be a bit ad hoc. But I'm not sure. Since this has been discussed before, if you want to get into this more, please search around and find the old thread so we don't go in circles.
Here are threads on the topic:
http://forums.zotero.org/discussion/3079/importing-and-associating-pdf-files-with-references/
http://forums.zotero.org/discussion/8635/rename-file-update-document-metadata/
AFAIK, it is still the case that Zotero would require executables outside of Firefox to do this (as PDF extraction etc. currently does). Reading some types of metadata should be relatively easy to do without adding large dependencies. Writing metadata may require more development, namely an external tool that is both small and cross-platform.
Just to get back to you:
rquiroga - sure, if the PDF extraction (including the DOI) works properly, then Zotero retrieves metadata effectively.
adamsmith - "I think one of the reasons this isn't done is that there is no uniform standard for bibliographical tags [...]"; while I'd agree, BibTeX is a good candidate - even Google Scholar outputs BibTeX. So if the PDF metadata complies with the BibTeX "standard" (there aren't a whole lot of fields, just the basics like author, keywords etc.), why not use it?
noksagt - Most of the threads go into discussing if and how metadata could be written to a PDF. Even though that's an important topic, I see the point that relying on external software would undermine Zotero's independence. I've "outsourced" both the "Retrieve metadata for PDF" step and the writing of metadata to the PDF to cb2bib - it's simply much more flexible (e.g. it also looks up metadata in PubMed).
So basically what it boils down to is: could Zotero read PDF metadata (i.e. from the "Document Information Dictionary" or the XMP stream) and use it to create new bibliographic items?
Cheers
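To make the BibTeX suggestion a bit more concrete, here is a rough Python sketch of turning a handful of basic fields into a BibTeX entry. The function name, the citation key and the choice of fields are assumptions for illustration only; it does no escaping of special characters and is not taken from any of the tools mentioned.

```python
def to_bibtex(key, fields, entry_type="article"):
    """Format a dict of basic fields as a BibTeX entry (no escaping is done)."""
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        if not value:
            continue  # skip empty or missing fields
        if isinstance(value, (list, tuple)):
            # BibTeX separates multiple authors with " and "; other lists use commas.
            separator = " and " if name == "author" else ", "
            value = separator.join(value)
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example values, just to show the shape of the output.
    print(to_bibtex("doe2009", {
        "author": ["Doe, Jane", "Roe, Richard"],
        "title": "An Example Title",
        "keywords": ["pdf", "metadata"],
    }))
```

The point is only that the basic fields map onto BibTeX with very little code; anything beyond the basics would need a proper BibTeX library.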
https://www.zotero.org/trac/ticket/1695
I don't think so either, and one can't expect that there will ever be a consensus with thousands of journals around.
However, as I pointed out, there are workarounds to force the relevant things into the PDF using e.g. either cb2bib or JabRef (both stick to BibTeX). It won't be enough to populate the "Document Information Dictionary", simply because it's limited to title, author, subject, keywords, some copyright information and date codes - so the XMP stream has to be used. I'm not sure if xpdf's pdfinfo can read the latter to its full extent. cb2bib uses ExifTool.
Subject: Nature Reviews Neuroscience 10, 670 (2009). doi:10.1038/nrn2698
yet these PDFs typically fail with "Retrieve metadata" :-( Yes, I can use "Add item by identifier" with this DOI manually, but that will not link the new item to the PDF in question.
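As a rough illustration of how a DOI could be picked out of a Subject field like the one above and kept together with the PDF it came from, here is a small Python sketch. The regular expression is a common heuristic rather than an official DOI grammar, and the function names and the returned dictionary are hypothetical, not anything Zotero provides.

```python
import re

# Heuristic: "10.", a 4-9 digit registrant code, a slash, then a suffix
# that runs until whitespace, a quote or an angle bracket.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def find_doi(text):
    """Return the first DOI-looking string in text, or None."""
    match = DOI_PATTERN.search(text or "")
    if not match:
        return None
    # Drop punctuation that often trails a DOI at the end of a sentence.
    return match.group(0).rstrip(".,;)")


def item_from_pdf_subject(pdf_path, subject):
    """Pair the DOI found in the Subject field with the PDF it came from."""
    doi = find_doi(subject)
    if doi is None:
        return None
    # A real importer would look up the full record by DOI at this point;
    # this sketch only keeps the identifier and the file together.
    return {"DOI": doi, "attachment": pdf_path}
```

For the Subject line quoted above, find_doi picks out 10.1038/nrn2698, so the lookup and the link to the PDF could in principle happen in one step.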