I work for an academic journal, and here is the situation:

We have more than 600 articles.
- The older ones are searchable images.
- The newer ones are PDFs generated from text, so they are fully searchable and "lighter".
- None of them has ANY metadata. (-_- yeah, I know)

We want to...
1- Add metadata to the 600+ PDFs for the best possible indexing/referencing.
2- Publish all of them on a Zotero-ready web site (probably using embedded RDF).
3- Store all the articles in our own Zotero database.

As I understand it, there are many ways to add metadata to a PDF (we have Adobe Pro), but I wonder what type of metadata would maximize the indexing potential: the fields in the document description (which are quite limited), XML-based metadata, or something else?

In which order should we proceed to reach goals 1, 2 and 3 most effectively?
My question is admittedly very broad, but I was actually wondering whether we could enter all of the files into Zotero, then fill in all the fields of the article item type for each article, then export to RDF and use the RDF file for the remaining steps. The point is that we don't want to fill in all the fields for every article two or three times. We can do it once, but more than that would take the whole summer.

  • Just as a note: currently Zotero does absolutely nothing with the metadata included in PDFs. The "Retrieve Metadata" feature tries to find a DOI and, if it finds one, consults CrossRef; otherwise it uses text from the beginning of the paper to search Google Scholar. I think it would still be great to include metadata. I believe Mendeley actually does look at the metadata tags, and Zotero might in the future, but I thought it'd be worth pointing this out.
    Zotero also currently doesn't _write_ metadata tags. Again, check Mendeley; they might do that already.
  • Okay, thanks for the info.
    So the best procedure would be to write the metadata tags into the PDFs themselves first.
    Then put the PDFs on a Zotero-ready web site, and finally grab all the info with Zotero without having to re-enter everything.
    Is that right?
  • Sounds right to me, but wait for one of the real specialists to confirm.
  • Yes, it would be great to serve the PDF on a page with unAPI, using a format like BibTeX or RIS; just specify the PDF path as an attached file in the BibTeX or RIS. See

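To make the last suggestion a bit more concrete, here is a rough sketch of building an RIS record from a single metadata record, with the PDF listed as an attached file. It assumes the metadata for each article is entered once (here as a Python dict) and reused from there. The field names, the `to_ris` helper and the example URL are all made up for illustration. The `L1` tag is the RIS "file attachment" tag, which, as far as I know, Zotero picks up as an attachment on import. This only builds the record; serving it through unAPI is a separate step not shown here.

```python
# Hypothetical single source of truth for one article.
# All keys and values are placeholders, not real journal data.
article = {
    "authors": ["Doe, Jane", "Roe, Richard"],
    "title": "An Example Article",
    "journal": "Example Journal",
    "year": "2009",
    "volume": "12",
    "start_page": "1",
    "end_page": "15",
    "pdf_url": "https://journal.example.org/pdf/example-article.pdf",
}


def to_ris(record):
    """Render one article record as an RIS entry with the PDF attached.

    RIS lines have the shape: two-character tag, two spaces, hyphen, space, value.
    """
    lines = ["TY  - JOUR"]  # record type: journal article
    lines += [f"AU  - {name}" for name in record["authors"]]  # one AU line per author
    lines.append(f"TI  - {record['title']}")
    lines.append(f"JO  - {record['journal']}")
    lines.append(f"PY  - {record['year']}")
    lines.append(f"VL  - {record['volume']}")
    lines.append(f"SP  - {record['start_page']}")
    lines.append(f"EP  - {record['end_page']}")
    lines.append(f"L1  - {record['pdf_url']}")  # L1: link to the attached PDF
    lines.append("ER  - ")  # end of record
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_ris(article))
```

The same dict could in principle feed whatever writes the tags into the PDFs, so the fields are only typed once for each article.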