Save metadata to PDF files
I think it would be really useful to have an option to save the author / title / keywords information directly to the PDF metadata. For some reason, most scientific papers don't use that information properly and it's not uncommon to find the DOI as the title of the document.
The same way most music organizers rely on metadata to sort and search files, I think that adding the option to correct metadata in PDF files would make managing PDFs much easier.
However, I would keep that feature optional, as many people wouldn't like to have their PDF files altered in any way.
Right now there's an option to rename the PDF file according to the author and title. Why not a similar option to write the author and title metadata into the PDF file?
So I just don't see the point of implementing this. What am I missing? Why do you go through the trouble of doing this manually?
The problem is that Recoll depends, to a certain extent, on the metadata stored inside PDF files; without it, the results are confusing, e.g. showing the DOI instead of the article's title.
I am aware that Zotero's database is just for Zotero and is not designed to be used by any other software, but it works pretty well nevertheless. I think it's one of the benefits of using free software that you have full control of what happens to your data.
And I also think that this shouldn't be hard to implement. I might do it myself if I don't find any "official" tool soon. In the end, it's just a matter of being able to organize my data properly.
So the answer to "why?" would be: To provide better interoperability between tools that manage or use PDF documents.
As you say, it's free software, so by all means implement it if you want to (I'd guess Dan would accept a patch if it's well done), but between Qnotero for quick access and Zotero's built-in full-text search I don't really see much of a need.
The Kobo is just a plain disaster for searching files: it relies only on its proprietary library and the PDF metadata. It cannot search or browse filesystem names; otherwise my issue would be solved by Zotero / Zotfile's local file renaming scheme. Indeed, the metadata in original PDFs is generally gibberish. So it would be useful to be able to batch rewrite just those 3 tags to something useful.
Best,
Jan
https://github.com/zaro/pdf-metadata-editor
https://code.google.com/p/pdf-meta/
http://sourceforge.net/projects/exiftool/
JabRef does have XMP-metadata support btw:
http://jabref.sourceforge.net/help/de/XMPHelp.php
https://github.com/JabRef/jabref
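For anyone who wants to script the ExifTool route from the list above, here is a minimal Python sketch. It only builds and runs an `exiftool` command that sets the Title, Author and Keywords tags. The function names, and the choice to join keywords into one comma-separated string, are my own assumptions, not something Zotero or any of the linked tools prescribes.

```python
import subprocess


def build_exiftool_command(pdf_path, title, author, keywords, keep_backup=True):
    """Return the argument list for an exiftool call that rewrites three tags.

    Hypothetical helper: keywords are joined into a single string, which is
    an assumption about how you want them stored.
    """
    cmd = ["exiftool", f"-Title={title}", f"-Author={author}"]
    if keywords:
        cmd.append("-Keywords=" + ", ".join(keywords))
    if not keep_backup:
        # Without this flag exiftool leaves a "<name>_original" copy behind.
        cmd.append("-overwrite_original")
    cmd.append(str(pdf_path))
    return cmd


def write_metadata(pdf_path, title, author, keywords, keep_backup=True):
    """Run exiftool; raises CalledProcessError if exiftool reports failure."""
    cmd = build_exiftool_command(pdf_path, title, author, keywords, keep_backup)
    subprocess.run(cmd, check=True)
```

Batch editing would then just be a loop over (file, title, author, keywords) tuples calling `write_metadata`. Keeping the default backup on seems the safer choice until you trust the results.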
Why? Reading on a mobile device (!!!), sharing of pdfs, re-use of pdfs with other tools with different functionality (e.g. visualisation, personal data mining) - generally I use a multitude of tools on the same base for different tasks (vive free and open software).
Functionality? Basic Author, Title, Keywords would be enough, but adding and removing (part of) the metadata or batch editing the metadata would be a bonus, so I can for example add a copyright notice, use a variety of programs with the same file base, etc.
Thoughts: This, strangely, is a feature that has often been requested by users of a variety of bibliography software, but has not been implemented... Citavi has had this task on its to-do list since 2012 (Task# 6518)... Correction: it seems Citavi 5 has it as a test feature in their beta, without batch options...
Bias: I really would love it, as it would complement my workflow - that's all I can really say :)
Batch processing would be really useful, so the service should be able to handle several references at once, but even if it's one-at-a-time it would already be much better than copy-paste...
Linking DevonThink with BibDesk https://www.youtube.com/watch?v=Dso3z0M6z7I
I believe this is the script: http://www.organognosi.com/how-to-connect-a-pdf-file-inside-devonthink-with-its-record-in-bibdesk/#codesyntax_1
I myself use DEVONthink + Zotero + ZotFile. Having the metadata in the PDF would be very nice.
I'm currently using http://broken-by.me/pdf-metadata-editor/ (featured in Gego's list above), to edit one-at-a-time, copy-pasting the basic fields from Zotero.
It's a hassle, but it works so my Kobo reader can retrieve the needed documents.
Also, I tried thinking of ways to create something using Automator on Mac (though I am new to Automator) but hit a wall when looking at how to get metadata from an entry within Zotero. Any ideas?
All you need to do is run the script with the path to your Zotero library as argument. Please make a backup beforehand as this will overwrite your PDFs.
https://github.com/dthirst/write_title_to_pdf_metadata
I also made a standalone script that:
1. searches for an item in your local Zotero database (search by DOI or author, year)
2. updates the metadata of a PDF of your choice.
I'm not a regular SQLite user, so the trickiest part was figuring out how to read the database while Zotero is using it (because it's locked).
https://gitlab.com/GullumLuvl/bibloids : pdfmetadata_from_zotero.py
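One way around the lock, sketched here under assumptions, is to copy the database file and query the copy. This is not necessarily what the linked script does; `open_snapshot` is a hypothetical helper name, and a copy taken while Zotero is in the middle of a write could in principle be inconsistent.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def open_snapshot(db_path):
    """Copy the database to a fresh temp directory and open the copy.

    The original file is only read, never written. The snapshot may be
    slightly stale compared to what a running Zotero shows.
    """
    tmp_dir = Path(tempfile.mkdtemp(prefix="zotero_snapshot_"))
    snapshot = tmp_dir / Path(db_path).name
    shutil.copy2(db_path, snapshot)
    return sqlite3.connect(str(snapshot))
```

Usage would be something like `con = open_snapshot("/path/to/zotero.sqlite")`, where the path is a placeholder for wherever your Zotero data directory lives. The temp directory is not cleaned up here, so remove it yourself when done.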
People who store PDFs in systems that make use of the metadata need it, because articles are very often badly annotated. The example already mentioned: an e-reader like the Kobo, where the title displayed in the file browser is taken from the metadata, so it had better be correct.
Regarding the statement "database should be used by Zotero only": well no, what is wrong with retrieving data from it in a read-only manner? The problem with Zotero's official API is that it appears to query the online DB... For a task such as retrieving a file's metadata, it's nonsense to require an internet connection. Also, I found it easier to understand the DB schema and run SQL statements than to understand the API.
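To illustrate the read-only SQL approach, here is a sketch that opens the database with SQLite's `mode=ro` URI option and looks up the title and authors by DOI. The table and column names (`items`, `itemData`, `itemDataValues`, `fields`, `creators`, `itemCreators`) are what I believe recent Zotero versions use, but treat them as assumptions and check them against your own `zotero.sqlite`; the helper names are made up for this example.

```python
import sqlite3
from pathlib import Path

# Assumed Zotero schema: field values live in itemDataValues and are linked
# to items through itemData; field names live in the fields table.
TITLE_BY_DOI = """
SELECT i.itemID, titleVal.value
FROM items i
JOIN itemData doiData ON doiData.itemID = i.itemID
JOIN fields doiField ON doiField.fieldID = doiData.fieldID
                    AND doiField.fieldName = 'DOI'
JOIN itemDataValues doiVal ON doiVal.valueID = doiData.valueID
JOIN itemData titleData ON titleData.itemID = i.itemID
JOIN fields titleField ON titleField.fieldID = titleData.fieldID
                      AND titleField.fieldName = 'title'
JOIN itemDataValues titleVal ON titleVal.valueID = titleData.valueID
WHERE doiVal.value = ?
"""

AUTHORS_BY_ITEM = """
SELECT c.firstName, c.lastName
FROM itemCreators ic
JOIN creators c ON c.creatorID = ic.creatorID
WHERE ic.itemID = ?
ORDER BY ic.orderIndex
"""


def connect_read_only(db_path):
    """Open the database so that SQLite refuses any write through this handle."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def find_by_doi(con, doi):
    """Return {'title': ..., 'authors': [...]} for the DOI, or None if absent."""
    row = con.execute(TITLE_BY_DOI, (doi,)).fetchone()
    if row is None:
        return None
    item_id, title = row
    authors = [
        " ".join(part for part in (first, last) if part)
        for first, last in con.execute(AUTHORS_BY_ITEM, (item_id,))
    ]
    return {"title": title, "authors": authors}
```

If Zotero is running and holds a lock, the query may still fail with "database is locked", in which case working on a copy of the file is a fallback. No internet connection is involved at any point.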