Get pdf metadata fails every time D50177058

No matching references found, ever, anymore.

It was finding about 75% at first, when I installed it earlier, then it started to fail every time.

I re-installed it and tried again. Nada.
  • For how long has it been failing?
    Google Scholar will lock you out after a number of fast automated requests. Zotero will display a more helpful error message for that in the next version, due out soon.
  • You're likely being blocked by Google Scholar, which the function uses. Zotero used to detect this, but it's not working at the moment. Zotero 3.0.12, which will be out this week, will detect this properly and also space out requests to Google Scholar to try to avoid getting blocked.

    In the meantime, if you wait a bit—possibly until tomorrow—and do it in small batches you should have better luck.
  • There are several things that could be happening (I don't have access to Debug IDs, so maybe Dan or Simon can provide more insight).

    1) Your PDFs are not OCR'ed (but that's probably not the problem you're experiencing)

    2) You are locked out of Google Scholar. Google Scholar blocks users for some time after receiving multiple search requests (which is what Zotero does). So if you try to extract PDF metadata from one article in an hour or so, you can see if that was the problem.

    3) Google Scholar cannot find any articles based on what Zotero extracts from the PDF (I think this is probably it).

    We've made some improvements that address (3), and they will be coming out with Zotero 3.0.12.

    Additionally, PDF metadata extraction is a bit crippled in Zotero 3.0.11. It currently only searches Google Scholar and does not consider DOIs, which are a much more reliable method. This is also fixed in 3.0.12.

    You can try installing the 3.0 Branch dev XPI, which is very close to what 3.0.12 will be. (To avoid continuing to get dev versions, you can either switch back to 3.0.11 afterward or make a note to do so once 3.0.12 is out.)
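
To illustrate why a DOI is a more reliable hook than a text search: if the text extracted from a PDF contains a DOI, it can be picked out with a simple pattern and used as an exact identifier. The snippet below is only a rough sketch in Python, not Zotero's actual code; the pattern and the `find_doi` helper are assumptions made for illustration.

```python
import re

# Assumed pattern for modern DOIs ("10.", a 4-9 digit prefix, "/", then a suffix).
# This is an illustrative guess, not the rule Zotero itself applies.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def find_doi(text):
    """Return the first DOI-looking string in `text`, or None if there is none."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Drop punctuation that commonly trails a DOI in running text.
    # (A real DOI may legitimately end in ")", so this is a simplification.)
    return match.group(0).rstrip(".,;)")


if __name__ == "__main__":
    sample = "Journal of Examples 12(3). doi:10.1000/xyz123. Received 2012."
    print(find_doi(sample))
```

A PDF with no DOI in its text would return `None` here, which is the case where a fallback such as a search-engine lookup would still be needed.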
  • Changing IPs fixed it for now, so I think I just asked for too much too fast. I was getting the odd "not OCR'd" at first, but those vanished too. Thanks for your help, I'll look forward to the next release with throttling and/or improved error reporting.

    ATB, /Steve
  • edited January 29, 2013
    Simon informs me that the changes to avoid lockouts 1) aren't on the 3.0 branch and 2) are not currently effective. (The code to at least display a better message is also not on the 3.0 branch currently, but we might be able to add that before 3.0.12.)

    As aurimas says, 3.0.12 will have improved detection, including restoring the use of DOIs, which should alleviate lockouts somewhat.
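
For anyone scripting their own lookups in the meantime, the advice in this thread (small batches, with pauses between requests) can be sketched roughly as follows. This is a generic Python illustration, not how Zotero does it; `process_in_batches`, `lookup`, and the default delays are all hypothetical, and nothing here guarantees that a given service will not still block you.

```python
import time


def batches(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_in_batches(items, lookup, batch_size=5, delay=2.0,
                       batch_pause=60.0, sleep=time.sleep):
    """Call `lookup` on each item, pausing between items and between batches.

    `lookup` is a caller-supplied function (hypothetical here). The delay
    values are arbitrary placeholders, not known-safe limits for any service.
    """
    results = []
    chunks = list(batches(items, batch_size))
    for chunk_index, chunk in enumerate(chunks):
        for item_index, item in enumerate(chunk):
            results.append(lookup(item))
            # Short pause between requests inside a batch.
            if item_index < len(chunk) - 1:
                sleep(delay)
        # Longer pause between batches, skipped after the last one.
        if chunk_index < len(chunks) - 1:
            sleep(batch_pause)
    return results
```

The `sleep` parameter exists only so the pauses can be swapped out when trying the function without actually waiting.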