New Query Limit issue

megan.reif · April 16, 2013

I hit the query limit after retrieving metadata in Standalone 4.0 (Windows 7, 64-bit) for about 50 PDFs. I logged in the next day and had no problems, but after 4 PDFs it started saying I'd hit the query limit again.

I discovered that it tells me I hit the query limit only when files in the group selected are old/scans, books or other files without good metadata.

It seems like it would be better to have an error that indicates that the pdf does not contain retrievable metadata, rather than that I'd hit the query limit. If I check the file first and it comes from a recent journal article, I am able to retrieve the citations.

Is it possible to tell somewhere whether the PDF is OCR-able or not, so one doesn't have to click individually on each PDF to see?

megan.reif · April 16, 2013

I reproduced the problem here:
The Debug ID is D1125785050

Retrieval stops after the first non-OCR'd file.

adamsmith · April 16, 2013

Zotero won't even try to retrieve metadata for non-OCR items.
There are two reasons this happens more quickly for old PDFs:
1) Zotero often finds a DOI for new articles and looks those up in a different database.
2) Zotero makes up to three(?) attempts for each article on Google Scholar. If many articles aren't found, it makes those three attempts for every single item and hits the limit more quickly.

dstillman · April 16, 2013

I think you're misdiagnosing this. Metadata retrieval can use a number of different sources. If an ISBN or DOI is found in the file, Zotero will use databases for those, and those lookups aren't subject to query limits. Only Google Scholar blocks you, and it's entirely up to them when that happens and for how long the block remains in place.

If the file isn't OCR-able, Zotero doesn't try to look it up, so that wouldn't affect the query limit. For OCRed files, the only real differences between old and new files would be that the former might not have ISBNs or DOIs and might be less likely to be in Google Scholar, which would increase the number of requests (since Zotero tries a few times with different parts of the text). The example in your debug output didn't have an ISBN or DOI, and Google blocked you after a few queries.
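
To make the arithmetic concrete, here is a rough Python sketch of the behavior described above. It is not Zotero's actual code: the `Pdf` class, the field names, and the function names are all made up for illustration, and the cap of three Google Scholar attempts is only the uncertain "up to three(?)" figure mentioned earlier in this thread. It also assumes the best case for a file that is in Google Scholar, namely that the first attempt matches.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: the thread only says "up to three(?)" attempts per item.
MAX_SCHOLAR_ATTEMPTS = 3


@dataclass
class Pdf:
    """Hypothetical stand-in for a PDF in the library (not a Zotero type)."""
    name: str
    has_text: bool                     # False for scans with no OCR text layer
    identifier: Optional[str] = None   # DOI or ISBN found in the text, if any
    in_scholar: bool = False           # would Google Scholar match the text?


def scholar_queries_needed(pdf: Pdf) -> int:
    """Estimate how many Google Scholar requests one file would cost."""
    if not pdf.has_text:
        return 0  # no text layer: no lookup is attempted at all
    if pdf.identifier is not None:
        return 0  # DOI/ISBN lookups go to other databases instead
    if pdf.in_scholar:
        return 1  # best case: the first attempt matches
    return MAX_SCHOLAR_ATTEMPTS  # every attempt is used up without a match


def total_scholar_queries(pdfs: List[Pdf]) -> int:
    """Sum the estimated Google Scholar requests for a batch of files."""
    return sum(scholar_queries_needed(p) for p in pdfs)


batch = [
    Pdf("scan_without_ocr.pdf", has_text=False),
    Pdf("recent_article.pdf", has_text=True, identifier="10.1000/example"),
    Pdf("old_article_found.pdf", has_text=True, in_scholar=True),
    Pdf("old_article_missing.pdf", has_text=True),
]
print(total_scholar_queries(batch))
```

Under these assumptions, only the two OCRed files without a DOI or ISBN cost any Google Scholar requests, and the one that is never found costs the most. A batch made up mostly of such files would reach the block much sooner than a batch of recent articles.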

[Edit: What adamsmith said, basically.]

megan.reif · April 21, 2013

Thanks for the explanation. That makes sense.