Query Limit reached?

  • Thanks so much for the quick reply! I followed your link, but I'm a little confused (I'm sorry!). Is there something I am supposed to actually download? Or were you just giving me access to a forum that explains that this issue is being worked on and is nearing a solution?

    Thanks for your patience.
  • Or were you just giving me access to a forum that explains that this issue is being worked on and is nearing a solution?
    This. You can't really install this yet, not even as a beta, but that could happen very soon.
  • edited January 13, 2014
    This is wonderful news -- namely, a solution to Google's frustrating "Query Limit Reached" warning. I can't thank Zotero enough for this fix, if it soon comes to pass.

    One quick two-part question, though, that I asked in a previous post, but has been left unanswered: namely, how does Google keep blocking my requests for metadata extracts even though 1) I can use a VPN and therefore change my IP address multiple times, and 2) I clear out all cookies and the like using CCleaner and other tools, but am still blocked?
  • We don't really know the answer to that. The patch doesn't circumvent Google's lock-out; it makes it less likely to occur, and when it does occur it lets you authenticate via a CAPTCHA and doesn't fail on items it can still import (e.g. via DOI).
  • Many sites block large swaths of IPs that are known VPNs or common VPS hosting (such as AWS IP blocks) as a matter of course specifically to prevent bypassing controls the way you're attempting to.
  • Thanks, aurimas, for your idea: I was blocked from retrieving metadata, then I opened scholar.google.com; they checked whether I was a robot, as you guessed, and once they finished checking I could retrieve metadata again. Thank you!
  • Shpeet, how does Scholar.google.com check if you are a robot? Please be more specific.

    Second, has Zotero made any progress in trying to solve this frustrating "Query Limit Reached" problem?
  • edited February 7, 2014
    The next version of Zotero introduces considerable improvements to this issue. It essentially pops up a CAPTCHA window inside Zotero and AFAICT prevents "permanent" Google Scholar lockouts. A beta version (for Firefox only atm) is available here: http://www.zotero.org/support/dev_builds#zotero_40_beta (not sure when this will be officially released)
  • That's good news indeed. We are eagerly awaiting your next version.
  • tfjern: when I opened scholar, they asked to recognise letters and numbers (two times) and then it was unlocked. But I guess it's resolved with the new version anyway.
  • as aurimas says above, you'll still have to fill out the captcha, but Zotero shows it to you directly, so you don't need to poke around scholar.google.com to find it.
  • I'm really happy to see Zotero is working on this. I have Standalone 4.0.17, but I don't see the behavior aurimas describes above. When I hit my query limit I still have to go to scholar.google.com to enter the CAPTCHA. Am I doing something wrong? Also, once I've entered the CAPTCHA I can search in Scholar, but I still get the query limit reached warning in Zotero.

    I also use the Firefox version (just installed today). In that version, when the query limit is reached I just get a message from Scholar saying they think my computer is sending automated queries and they can't process my requests. No CAPTCHA...the search page just doesn't show at all.

    This is all in Windows 7 (most recent version of FF).

    Any settings I should change or things I should try? Thanks.
  • The change will be in 4.0.18. You can try the 4.0 Beta if you want to test it now.
  • Per others on the thread, FWIW: after a day and a half of being query-limited (which followed successfully retrieving metadata for about 150 PDFs late one night, the highest number I've ever achieved), I used CCleaner to clear cookies, logged on to Google Scholar through lib.umich.edu, and used Zotero for Firefox rather than Standalone.

    This seemed to work and I've retrieved about 100.

    It occurred to me that since my pdfs were in alphabetical order, I kept trying with the same set, so I changed the sort order and did a search for the topics for which I most need articles, which means that it is calling different titles out of alphabetical order, and I don't try again with the same set after a query limit notice.

    I have no idea which of these changes is making things work more smoothly, but I wonder whether (a) the nature/type of the documents (author name, discipline, etc.) is something included in Google's bot detector; and/or (b) those who log in with a university account and use the Firefox version have better luck. If so, I wonder if there would be a way to integrate university accounts with Zotero.


    On a more general note, I've read this thread and am thrilled about the new release. I switched from EndNote to Mendeley (which crashed, had terrible accuracy, was too expensive, etc., etc.) to Zotero, used EndNote for mass editing of citations for some old citations and imported PDFs that were my own field notes, etc., and have now permanently settled on Zotero. Some of the complaints people have are easily solved with add-ins like Zotfile.

    I just want to note that I second the point about frequently needing multiple imports for huge numbers of citations (I have 4000 that still need metadata retrieved). This is due to working with large research groups and coauthors who have massive Dropbox folders of PDFs and handle everything manually. One can't click on every PDF or look at their bibliographies and search for each item. I am also working on systematic reviews and meta-analyses, and using individual databases and importing/searching on work collected over many years prior to this kind of functionality is not a great solution; dedicated systematic review software is very expensive, with a steep learning curve, and quite unnecessary if one uses Zotero effectively. I love so many features about Zotero; this is the only one that slows me down. (Well, that and having to drag into subcollections rather than being able to right-click and move to subcollections.)
  • (a) the nature/type of the documents (author name, discipline, etc.) is something included in Google's bot detector; and/or
    The type of document makes a big difference. If Zotero finds a DOI (or, less commonly, an ISBN) it doesn't even query Google Scholar, and the fewer documents Zotero is _unable_ to find metadata for, the fewer requests to Google Scholar it makes (it tries three strings, i.e. three separate GS requests, before giving up).
    (b) those who log in with a university account and use the Firefox version have better luck. If so, I wonder if there would be a way to integrate university accounts with Zotero.
    Unlikely, but I'd also view this as pretty much solved with the next release (it still requires occasional CAPTCHA entry, but that seems fine). aurimas tested this with a very substantial set of PDFs.
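    To illustrate the retrieval order described above, here is a rough Python sketch. This is not Zotero's actual code (which is JavaScript); `find_doi`, `pick_query_strings`, and the lookup callbacks are all invented names for illustration:

```python
import re

def find_doi(text):
    """Very rough DOI pattern match, for illustration only."""
    m = re.search(r'\b10\.\d{4,9}/[^\s"<>]+', text)
    return m.group(0) if m else None

def pick_query_strings(text, n):
    """Take the first n reasonably long lines as candidate search strings."""
    lines = [ln.strip() for ln in text.splitlines() if len(ln.strip()) > 20]
    return lines[:n]

def retrieve_metadata(pdf_text, lookup_doi, query_scholar, max_queries=3):
    """Try a cheap DOI lookup first; only then fall back to Google Scholar,
    issuing at most max_queries separate Scholar requests."""
    doi = find_doi(pdf_text)
    if doi:
        return lookup_doi(doi)            # no Google Scholar request at all
    for query in pick_query_strings(pdf_text, max_queries):
        result = query_scholar(query)     # one Scholar request per string
        if result:
            return result
    return None                           # give up after max_queries misses
```

    The point being: the more items that resolve via DOI (or ISBN), the fewer Scholar requests are made, which is why the mix of document types matters.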
  • The beta version is working much better--I still get a query limit after entering the CAPTCHA 4 times, but I can retrieve metadata for many more citations -- maybe 250 at a time instead of 30-50. It also seems to go faster through items that do not have OCR text, and apparently continues on with other API calls even if Google Scholar reaches its limit. Thank you!
  • I still get a query limit after entering the CAPTCHA 4 times, but I can retrieve metadata for many more citations
    Do you mean 4 times in a row or 4 times with metadata being retrieved in-between? If the CAPTCHA pop-up re-appears immediately, that means you didn't enter the CAPTCHA correctly (or maybe something else is wrong and Google doesn't like the response we send them). If you enter it incorrectly 3 times, it should then permanently fail until you restart the process.

    If you did mean 4 separate times, then could you visit Google Scholar after getting the "Query Limit Reached" message and tell us if Google Scholar displays some odd page saying that you're a robot?
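    The retry behavior described here can be sketched as a small loop. This is a hypothetical illustration, not Zotero's real implementation; `prompt_user` and `submit_captcha` are invented callbacks:

```python
MAX_ATTEMPTS = 3

def solve_captcha(prompt_user, submit_captcha, max_attempts=MAX_ATTEMPTS):
    """Allow up to three CAPTCHA attempts, then fail until a restart."""
    for _ in range(max_attempts):
        answer = prompt_user()            # re-show the CAPTCHA pop-up
        if submit_captcha(answer):        # Google accepted the response
            return True
    # Three wrong answers: give up permanently for this session.
    raise RuntimeError("CAPTCHA failed %d times; restart required" % max_attempts)
```

    So a pop-up that reappears immediately means the previous answer was rejected, while a pop-up that reappears after more metadata has been retrieved is a fresh, separate CAPTCHA.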
  • I mean 4 times in a row WITH METADATA being retrieved in-between (50-100 items). I never made a mistake with the CAPTCHA pop-up. I finished the retrievals over 2 days (instead of many -- THANK YOU!!). I didn't see anything strange when I visited Google Scholar. In fact, I logged in via the University of Michigan with no problem. Last night it ran for a very long time with only CAPTCHA messages coming up and no query limit -- it was about 10pm Mountain time (midnight ET), so perhaps it has something to do with time of day. In any case, it's a vast improvement and I am finally close to organizing my library completely after the Mendeley and EndNote crashes back in 2009. Grateful to Zotero!
  • After updating to Standalone version 4.0.19, I was asked for the CAPTCHA a few times and allowed to keep going, so I was able to get the metadata for around 100 files (only a guess, because it was getting some from CrossRef, too).

    But now I'm stuck again. The message is a little more sophisticated: "Google Scholar query limit reached."

    I guess I'll just keep trying. Like many others here, I'm delighted with Zotero, but the problem is importing ~1000 PDFs that I've already got stored away.
  • Same question as above:
    If you did mean 4 separate times, then could you visit Google Scholar after getting the "Query Limit Reached" message and tell us if Google Scholar displays some odd page saying that you're a robot?
  • edited March 31, 2014
    The latest Zotero update (4.0.19) is a substantial improvement regarding the dreaded "Query Limit Reached" message. After four or so CAPTCHA entries things do bog down, but I guess this is the best that can be done.

    Usually I can have 50 or so PDFs processed, and as I said this is an improvement, so thanks for the updates.
  • I honestly cannot reproduce this. I just ran 595 PDFs through Google Scholar for metadata retrieval and everything worked as expected without any "Google Scholar Query Limit Reached" messages. I got over 10 CAPTCHAs on the way and it took me about 30 minutes (it's probably a very generous estimate of how long this should normally take, because DOI and ISBN were not considered at all for this).

    For everyone who reported above and any future reporters, if you're using Zotero Standalone, I would highly recommend helping us debug this using Zotero in Firefox.

    Once you're using Firefox and you run into the "Google Scholar Query Limit Reached" message, please go to http://scholar.google.com/ (in Firefox), and if you see anything but the usual Google Scholar page, take a screenshot, post it on an image hosting website like imgur.com, and link it here. If you don't see anything unusual, retry retrieving metadata. If you continue to get the "Google Scholar query limit reached" message, please submit a Debug ID (post it here). Then try restarting your browser. If that doesn't help, try clearing your cookies.

    Let us know what (if anything) fixes this.
  • To follow up, it eventually worked. Google Scholar asked me for a new CAPTCHA a whole bunch of times, and I got every possible/valid bit of metadata.

    I didn't change anything at all in the interim. I just "tried again [a couple of days] later." A Google mystery?
  • This is not a 'Google mystery'. It is a historical problem (dating back to 2010) in Zotero, which is not able to circumvent the daily limit on fetching data from Google Scholar. More than 24 hours have passed, I have cleared all the cookies from Firefox, and I still have that silly CAPTCHA coming up all the time, even when I fill it in correctly.
    Anyway, I will post the Debug ID as requested.
  • This is the Debug ID: D1149623799. And this is a screenshot taken when I try to fetch data from a PDF: http://i58.tinypic.com/11kh2yu.png
  • edited June 17, 2014
    As Dan told you in the other thread, the Zotero code changed significantly early this year and the Google issue is mostly solved, so no, this is not a historical problem.
    If you read the posts right above you, this has been tested with multiple hundreds of PDFs in a day. The mystery that janew refers to is the fact that on one day she got stuck after 100 PDFs, but a couple of days later she was able to retrieve data for multiple hundreds, exactly the way this should work and does in our tests.

    When you say the captcha appears all the time - what exactly do you mean by that? For massive retrieve attempts, you'd likely have to fill out a couple of captchas, but there'd be progress in between doing so. Or are you saying it immediately pops back up? That would suggest that either you're not filling it out correctly or somehow it gets lost in translation.
  • Also, to repeat aurimas debugging instructions from above:
    Once you're using Firefox and you run into the "Google Scholar Query Limit Reached" message, please go to http://scholar.google.com/ (in Firefox), and if you see anything but the usual Google Scholar page, take a screenshot, post it on an image hosting website like imgur.com, and link it here. If you don't see anything unusual, retry retrieving metadata.
    your screenshot is from Zotero, not from scholar.google.com
  • This is the screenshot of Google Scholar (at 17:20): http://i57.tinypic.com/27y516t.png
    Here are other screenshots, taken to show you that the problem is still present, even if you want to deny it.

    Trying to fetch metadata for a NEJM PDF article:

    http://i57.tinypic.com/zt7rjt.png (1st try)
    http://i58.tinypic.com/108767q.png (2nd try)
    .......
    And after two more tries, we have the final result:
    http://i61.tinypic.com/35a3ww7.png
  • Yeah, that's what I said: either you're entering the CAPTCHA incorrectly or for some reason it's not passed on to Google.
    Dan might be able to tell which by looking at the debug output.
  • RecognizePDF: No CAPTCHA entered