Retrieving metadata from PDF

I'm just starting to use Zotero, so this is a real neophyte's question. I have imported several PDF files from my computer, and I am unable to retrieve metadata or create citations for two of them. Am I doing something wrong? I hope so, because it seems the alternative is to enter the citation manually, which defeats my main reason for using Zotero.
Thanks for any help!
  • If retrieving metadata works for the others, you're very likely not doing anything wrong. Zotero is just not able to find suitable metadata for those PDFs.
    There can be all sorts of reasons: the PDFs don't contain OCR'd text; you OCR'd the text yourself, so it doesn't match what's searchable on Google Scholar (which is what Zotero relies on); the article isn't archived on Google Scholar; or Zotero picked unsuitable search phrases from the article when trying to find it.

    You can look up the metadata online, import it via the Save to Zotero icon, and then manually attach the PDF to the new item. Otherwise, yes, you'll have to enter the data manually. Depending on the nature of the PDFs you're importing, I'd expect that to be the case for anywhere between 5 and 30% of files. It still saves you a bunch of time.
  • (Also, note that once your own PDFs are imported, the recommended way of getting items into Zotero is the Save to Zotero icon in your browser, which will give you much higher-quality data overall.)
  • Thanks so much for responding so quickly. I suspected there would be times when I'd have to enter citation info manually; I just wish there weren't! The two files that haven't been cooperative so far are likely not archived in Google Scholar.
  • If "saving to Zotero" from my browser is better, would you recommend I retrieve them all over again? It wouldn't be so hard, and certainly better than checking each citation for accuracy in the final days!
  • edited January 18, 2016
    It kind of depends on how many items you're looking at and what's important to you. One of the key disadvantages of Retrieve Metadata is that you don't get abstracts and you rarely get DOIs. The former are rarely used in citations, but many people really want to have them. The latter are an increasingly common citation requirement (notably in APA style, for example).

    Otherwise, Google has been getting better about the data, and it's pretty good, but not perfect.

    Edit: and the downside, of course, is that if you currently only have the PDFs, doing this in your browser is kind of laborious: search for every article in a suitable database (don't use Google Scholar, obviously, since that won't improve on the PDF), then import.
  • Again, many thanks. I was primarily attempting to use Zotero to create citations for a research paper using APA style. I see that it has a lot more to offer, including a very handy place to make notes for my lit review. With so much to do, trying to learn a new program at the same time is a nuisance! So I very much appreciate your help.
  • "so much to do, so little time, so little patience"
  • About the non-OCR'd files: is there a way to identify them other than checking one by one after retrieving metadata? In my case, I have to verify this for a BIG library (2,000+ PDFs), and doing it by hand is quite impractical.

    My suggestion is a plugin or piece of code that marks the files somehow after the metadata task. I imagine a new column alongside Author, Title, etc., where one could see which files need attention.
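  • Until something like that exists inside Zotero, a rough check can be run outside it. What follows is only a minimal sketch, not a Zotero feature: it walks a folder of PDFs (for example a copy of Zotero's storage directory) and flags files whose extracted text is nearly empty, which usually means there is no OCR'd text layer. The names has_text_layer and find_pdfs_needing_ocr, the extract_text callback, and the 200-character threshold are all assumptions made for illustration. You would supply extract_text yourself, for instance as a wrapper around whatever PDF text-extraction tool you already use.

```python
from pathlib import Path
from typing import Callable, List, Union


def has_text_layer(text: str, min_chars: int = 200) -> bool:
    """Return True if the extracted text has at least min_chars non-whitespace characters.

    The 200-character default is an arbitrary assumption; tune it for your library.
    """
    non_whitespace = sum(1 for ch in text if not ch.isspace())
    return non_whitespace >= min_chars


def find_pdfs_needing_ocr(
    folder: Union[str, Path],
    extract_text: Callable[[Path], str],
    min_chars: int = 200,
) -> List[Path]:
    """List PDFs under folder whose extracted text looks empty.

    extract_text is a hypothetical, caller-supplied function that returns
    the text of one PDF (an empty string if nothing can be extracted).
    """
    flagged = []
    # rglob also descends into subfolders, one of which Zotero creates per attachment.
    for pdf in sorted(Path(folder).rglob("*.pdf")):
        if not has_text_layer(extract_text(pdf), min_chars):
            flagged.append(pdf)
    return flagged
```

    Calling find_pdfs_needing_ocr(folder, your_extractor) returns the paths that probably need OCR or manual entry, so a 2,000-file library turns into a short list to review. It is only a heuristic: a file with a very short text layer gets flagged too, and a badly OCR'd file can still pass.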