How can I guarantee the best possible Zotero metadata retrieval through the PDF design of journal articles?

PHollenstein · September 15, 2021

Hi there,
I have been tasked with redesigning a number of journals for a university. I wonder how I could guarantee the best possible metadata retrieval in Zotero based on the layout, design, structure, position, etc. of the metadata on the cover page of the articles.

I would love to see the PDFs of the articles imported into Zotero and read perfectly, so that no manual correction of metadata is necessary.

Thanks for any advice!

adamsmith · September 15, 2021

Zotero principally relies on DOIs to retrieve metadata, so the best way to do this is to make sure that the article DOI is prominently included on the first page of the article and that metadata is properly registered with CrossRef.

Anything else is going to be more error-prone: Zotero doesn't do much with data from the PDF itself beyond its title.

Also note that the principal way of getting data into Zotero is not via PDFs but via the browser connector, so following best practices for the web portal through which the journal is published matters more than the PDF: https://www.zotero.org/support/dev/exposing_metadata (if you can use something like OJS, that would pretty much solve this for you).
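
To make the web-portal advice a bit more concrete, here is a minimal sketch of generating embedded metadata for an article landing page. It assumes the Highwire Press style `citation_*` meta tags (the convention also used by Google Scholar), which as far as I know is one of the formats covered by the page linked above. The function name `render_meta_tags`, the dictionary keys, and the sample article are all hypothetical; a platform like OJS would emit such tags for you.

```python
from html import escape


def render_meta_tags(article):
    """Build Highwire Press style <meta> tags for an article landing page.

    `article` is a hypothetical dict; the keys are illustrative only.
    """
    tags = [("citation_title", article["title"])]
    # One citation_author tag per author, in byline order.
    tags += [("citation_author", name) for name in article["authors"]]
    tags += [
        ("citation_journal_title", article["journal"]),
        ("citation_publication_date", article["date"]),
        ("citation_doi", article["doi"]),
        ("citation_pdf_url", article["pdf_url"]),
    ]
    # Escape values so characters like & and " cannot break the HTML.
    return "\n".join(
        f'<meta name="{name}" content="{escape(value, quote=True)}">'
        for name, value in tags
    )


# Made-up sample data, for illustration only.
sample = {
    "title": "Layout & Metadata in Journal Design",
    "authors": ["Doe, Jane", "Roe, Richard"],
    "journal": "Journal of Examples",
    "date": "2021/09/15",
    "doi": "10.1234/example.2021.001",
    "pdf_url": "https://journal.example.org/article/1/article.pdf",
}
print(render_meta_tags(sample))
```

The output would go into the `<head>` of each article's landing page, so that the browser connector can pick up the metadata without touching the PDF at all.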

PHollenstein · September 15, 2021

Thanks, this helps a great deal!

PHollenstein · September 15, 2021

Just a follow-up question: what do you mean by "prominently included"? Does the position of the DOI on the cover page influence its "readability"?

Thanks in advance!

adamsmith · September 15, 2021

No, it doesn't. Zotero just converts the first couple of pages of the PDF to text and uses a regex to figure out where the DOI is, so "prominently" just means it should be there on the first page and not be obscured in some unusual way.
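
As a rough illustration of that approach, here is a minimal sketch of pulling a DOI out of already-extracted page text. The pattern is adapted from the one Crossref has published for matching modern DOIs; it is an assumption for illustration, not Zotero's actual regex, and `find_doi` and the sample text are hypothetical.

```python
import re

# Assumed pattern (adapted from Crossref's guidance), NOT Zotero's own regex.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Z0-9]+", re.IGNORECASE)


def find_doi(page_text):
    """Return the first DOI-like string in the text, or None if none is found."""
    match = DOI_PATTERN.search(page_text)
    if match is None:
        return None
    # Trailing punctuation usually belongs to the sentence, not the DOI.
    # (This simplification would clip a DOI that really ends in ")".)
    return match.group(0).rstrip(".,;:)")


# Made-up first-page text, as it might look after PDF-to-text conversion.
first_page = (
    "Journal of Examples 12(3), 45-67. "
    "https://doi.org/10.1234/example.2021.001. Received 1 June 2021."
)
print(find_doi(first_page))  # prints 10.1234/example.2021.001
```

Under this kind of matching, where the DOI sits on the page makes no difference, but a DOI that is split across lines, set with unusual characters, or embedded only as an image would not be found in the extracted text.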

PHollenstein · September 15, 2021

Great! Thank you so much.