Thanks so much for the quick reply! I followed your link, but I'm a little confused (I'm sorry!). Is there something I am supposed to actually download? Or were you just giving me access to a forum that explains that this issue is being worked on and is nearing a solution?
This is wonderful news -- namely, a solution to Google's frustrating "Query Limit Reached" warning. I can't thank Zotero enough for this fix, if it soon comes to pass.
One quick two-part question, though, that I asked in a previous post, but has been left unanswered: namely, how does Google keep blocking my requests for metadata extracts even though 1) I can use a VPN and therefore change my IP address multiple times, and 2) I clear out all cookies and the like using CCleaner and other tools, but am still blocked?
We don't really know the answer to that. The patch doesn't circumvent Google's lock-out; it makes it less likely to occur, and when it does occur it helps you authenticate via CAPTCHA and doesn't fail on items it can still import (e.g. via DOI).
Many sites block large swaths of IPs that are known VPNs or common VPS hosting (such as AWS IP blocks) as a matter of course specifically to prevent bypassing controls the way you're attempting to.
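To illustrate why switching VPN endpoints may not help: a site can test each incoming address against whole network ranges rather than individual IPs. This is only a minimal sketch of the general technique, not a description of what Google actually does. The ranges below are reserved documentation networks used as stand-ins, and `BLOCKED_RANGES` and `is_blocked` are hypothetical names.

```python
import ipaddress

# Hypothetical block list. These are reserved documentation ranges,
# used here only as placeholders for "known VPN / hosting" networks.
BLOCKED_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(address: str) -> bool:
    """Return True if the address falls inside any blocked range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in BLOCKED_RANGES)


# Every address in a blocked range is rejected, so hopping between
# servers of the same provider does not change the outcome.
print(is_blocked("203.0.113.7"))
print(is_blocked("203.0.113.200"))
print(is_blocked("192.0.2.1"))
```

Because the check works on ranges, a new IP from the same provider is likely to land in the same blocked network.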
thanks aurimas for your idea: I was blocked with retrieving Metadata, then I opened scholar.google.com, they checked if I was a robot as you guessed, once they finished checking I could retrieve Metadata again. Thank you!
The next version of Zotero introduces considerable improvements to this issue. It essentially pops up a CAPTCHA window inside Zotero and AFAICT prevents "permanent" Google Scholar lockouts. A beta version (for Firefox only atm) is available here: http://www.zotero.org/support/dev_builds#zotero_40_beta (not sure when this will be officially released)
tfjern: when I opened scholar, they asked to recognise letters and numbers (two times) and then it was unlocked. But I guess it's resolved with the new version anyway.
as aurimas says above, you'll still have to fill out the captcha, but Zotero shows it to you directly, so you don't need to poke around scholar.google.com to find it.
I'm really happy to see Zotero is working on this. I have Standalone 4.0.17, but I don't see the behavior aurimas describes above. When I hit my query limit I still have to go to scholar.google.com to enter the captcha. Am I doing something wrong? Also, once I've entered the captcha I can search in Scholar, but I still get the query limit reached warning in Zotero.
I also use the Firefox version (just installed today). In that version, when the query limit is reached I just get a message from Scholar saying they think my computer is sending automated requests and they can't process them. No captcha... the search page just doesn't show at all.
This is all in Windows 7 (most recent version of FF).
Any settings I should change or things I should try? Thanks.
Per others on the thread, FWIW: after a day and a half of query-limit messages (following a late-night run that successfully retrieved metadata for about 150 PDFs, the highest number I've ever achieved), I used CCleaner to clear cookies, then logged on to Google Scholar through lib.umich.edu, and used Zotero for Firefox rather than Standalone.
This seemed to work and I've retrieved about 100.
It occurred to me that since my pdfs were in alphabetical order, I kept trying with the same set, so I changed the sort order and did a search for the topics for which I most need articles, which means that it is calling different titles out of alphabetical order, and I don't try again with the same set after a query limit notice.
I have no idea which of these changes is making things work more smoothly, but I wonder if (a) nature of types of documents (author name, discipline, etc.) is something included in Google's bot detector; and/or (b) those who log in with a university account and use the Firefox version have better luck. If so, I wonder if there would be a way to integrate university accounts with Zotero.
On a more general note, I've read this thread and am thrilled about the new release. I switched from EndNote to Mendeley (which crashed, had terrible accuracy, was too expensive, etc.) to Zotero, used EndNote for mass editing of some old citations and imported PDFs that were my own field notes, etc., and have now permanently settled on Zotero. Some of the complaints people have are easily solved with add-ins like Zotfile, etc.
I just want to note that I second the point about frequently needing multiple import for huge numbers of citations (I have 4000 that still need to retrieve metadata). This is due to working with large research groups and coauthors who have massive Dropbox folders of PDFs and handle everything manually. One can't click on every PDF or look at their bibliographies and search for each item. I am also working on systematic reviews and meta-analysis, and using individual databases and importing/searching on work collected over many years prior to this kind of functionality is not a great solution. Dedicated systematic review software is very expensive, with a steep learning curve, and quite unnecessary if one uses Zotero effectively. I love so many features of Zotero; this is the only one that slows me down (well, that and having to drag into subcollections rather than being able to right-click and move to subcollections).
(a) nature of types of documents (author name, discipline, etc.) is something included in Google's bot detector; and/or
The type of document makes a big difference. If Zotero finds a DOI (or, less commonly, an ISBN), it doesn't even query Google Scholar, and the fewer documents Zotero is _unable_ to find metadata for, the fewer requests to Google Scholar it makes (it tries three strings, i.e. three separate Google Scholar requests, before giving up).
(b) those who log in with a university account and use the Firefox version have better luck. If so, I wonder if there would be a way to integrate university accounts with Zotero.
unlikely, but also I'd view this as pretty much solved with the next release (still requires occasional captcha entry, but that seems fine). Aurimas tested this with a very substantial set of PDFs.
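The lookup order described above (DOI first, then at most three Google Scholar queries) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Zotero's actual code: `find_doi`, `retrieve_metadata`, the DOI regular expression, the way candidate strings are chosen, and the two lookup callbacks are all hypothetical. ISBN handling is left out.

```python
import re
from typing import Callable, Optional, Tuple

# Assumed, simplified DOI pattern; real-world DOI matching is messier.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def find_doi(text: str) -> Optional[str]:
    """Return the first DOI-like string in the text, if any."""
    match = DOI_PATTERN.search(text)
    return match.group(0).rstrip(".,;)") if match else None


def retrieve_metadata(
    text: str,
    lookup_doi: Callable[[str], Optional[dict]],
    search_scholar: Callable[[str], Optional[dict]],
    max_scholar_queries: int = 3,
) -> Tuple[Optional[dict], int]:
    """Return (metadata or None, number of Scholar queries made)."""
    doi = find_doi(text)
    if doi is not None:
        # Assumption: a DOI hit never falls through to Google Scholar.
        return lookup_doi(doi), 0

    # Assumption: use the first few longer lines as search strings.
    candidates = [
        line.strip() for line in text.splitlines() if len(line.split()) >= 5
    ]
    queries = 0
    for candidate in candidates[:max_scholar_queries]:
        queries += 1
        record = search_scholar(candidate)
        if record is not None:
            return record, queries
    return None, queries  # gave up after the allowed number of queries
```

Under these assumptions, a PDF with a detectable DOI costs zero Scholar requests, while a PDF that cannot be matched costs the full three, which is why the mix of documents affects how quickly the limit is reached.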
The beta version is working much better -- I still get a query limit after entering the captcha 4 times, but I can retrieve metadata for many more citations -- maybe 250 at a time instead of 30-50. It also seems to go faster through items that do not have OCR text, and apparently continues on with other API calls even if Google Scholar reaches its limit. Thank you!
I still get a query limit after entering the captcha 4 times, but I can retrieve meta-data for many more citations
Do you mean 4 times in a row or 4 times with metadata being retrieved in-between? If the CAPTCHA pop-up re-appears immediately, that means you didn't enter the CAPTCHA correctly (or maybe something else is wrong and Google doesn't like the response we send them). If you enter it incorrectly 3 times, it should then permanently fail until you restart the process.
If you did mean 4 separate times, then could you visit Google Scholar after getting the "Query Limit Reached" message and tell us if Google Scholar displays some odd page saying that you're a robot?
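The retry behaviour described above (re-show the CAPTCHA after a wrong answer, fail for good after three wrong answers) could look something like this. It is a hedged sketch only, not the actual Zotero implementation: `query_with_captcha`, `QueryLimitReached`, and both callbacks are hypothetical names, and whether a correct answer resets the failure count is not stated in the thread (here it does not).

```python
from typing import Callable, Optional

MAX_CAPTCHA_FAILURES = 3  # taken from the description above


class QueryLimitReached(Exception):
    """Raised when the CAPTCHA was answered incorrectly too many times."""


def query_with_captcha(
    send_query: Callable[[], Optional[str]],
    solve_captcha: Callable[[], bool],
    max_failures: int = MAX_CAPTCHA_FAILURES,
) -> str:
    """Run a query, showing a CAPTCHA whenever the server demands one.

    send_query returns the result, or None when a CAPTCHA is required.
    solve_captcha returns True if the user's answer was accepted.
    """
    failures = 0
    while True:
        result = send_query()
        if result is not None:
            return result
        if solve_captcha():
            continue  # answer accepted, so retry the query
        failures += 1
        if failures >= max_failures:
            # Permanent failure until the user restarts the process.
            raise QueryLimitReached("Google Scholar query limit reached.")
```

In this model, a CAPTCHA that pops back up immediately corresponds to `solve_captcha` returning False, while CAPTCHAs separated by successful retrievals never increment the failure count.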
I mean 4 times in a row WITH METADATA being retrieved in-between (50-100 items). I never made a mistake with the Captcha pop-up. I finished the retrievals over 2 days (instead of many-- THANK YOU!!). I didn't see anything strange when I visited Google Scholar. In fact, I logged in via the University of Michigan with no problem. Last night it ran for a very long time with only Captcha messages coming up and no query limit--it was about 10pm Mountain time (midnight ET), so perhaps it has something to do with time of day. In any case, it's a vast improvement and I am finally close to organizing my library completely after Mendeley and EndNote crashes back in 2009. Grateful to Zotero!
After updating to the standalone version 4.0.19, I was asked for the captcha a few times and allowed to keep going, so I was able to get the metadata for around 100 files (only a guess, because it was getting some from CrossRef, too).
But now I'm stuck again. The message is a little more sophisticated: "Google Scholar query limit reached."
I guess I'll just keep trying. Like many others here, I'm delighted with Zotero, but the problem is importing ~1000 pdf's that I've already got stored away.
If you did mean 4 separate times, then could you visit Google Scholar after getting the "Query Limit Reached" message and tell us if Google Scholar displays some odd page saying that you're a robot?
The latest Zotero update (4.0.19) is a substantial improvement regarding the dreaded "Query Limit Reached" message. After four or so captcha entries things do bog down, but I guess this is the best that can be done.
Usually I can have 50 or so pdfs processed, and as I said this is an improvement, so thanks for the updates.
I honestly cannot reproduce this. I just ran 595 PDFs through Google Scholar for metadata retrieval and everything worked as expected without any "Google Scholar Query Limit Reached" messages. I got over 10 CAPTCHAs on the way and it took me about 30 minutes (it's probably a very generous estimate of how long this should normally take, because DOI and ISBN were not considered at all for this).
For everyone who reported above and any future reporters, if you're using Zotero Standalone, I would highly recommend helping us debug this using Zotero in Firefox.
Once you're using Firefox and you run into the "Google Scholar Query Limit Reached" message, please go to http://scholar.google.com/ (in Firefox) and if you see anything but the usual Google Scholar page, take a screenshot, post it on some image hosting website, like imgur.com, and link it here. If you don't see anything unusual, retry retrieving metadata. If you continue to get the "Google Scholar Query Limit Reached" message, please submit a Debug ID (post it here). Then try restarting your browser. If that doesn't help, try clearing your cookies.
This is not a 'Google mystery'. It is a historical problem in Zotero (dating back to 2010), which is not able to circumvent the daily limit on fetching data from Google Scholar. More than 24 hours have passed, I have cleared all the cookies from Firefox, and I still have that silly captcha coming up all the time even when I fill it in correctly.
Anyway, I will post the Debug ID as requested.
As Dan told you in the other thread, the Zotero code changed significantly early this year and the Google issue is mostly solved, so no, this is not a historical problem. If you read the posts right above you, this has been tested with multiple hundred PDFs in a day. The mystery that janew refers to is the fact that on one day she got stuck after 100 PDFs, but a couple of days later she was able to retrieve data for multiple hundreds in exactly the way that this should work and does in our tests.
When you say the captcha appears all the time - what exactly do you mean by that? For massive retrieve attempts, you'd likely have to fill out a couple of captchas, but there'd be progress in between doing so. Or are you saying it immediately pops back up? That would suggest that either you're not filling it out correctly or somehow it gets lost in translation.
Also, to repeat aurimas's debugging instructions from above:
Once you're using Firefox and you run into the "Google Scholar Query Limit Reached" message, please go to http://scholar.google.com/ (in Firefox) and if you see anything but the usual Google Scholar page, take a screenshot, post it on some image hosting website, like imgur.com, and link it here. If you don't see anything unusual, retry retrieving metadata.
your screenshot is from Zotero, not from scholar.google.com
This is the screenshot of google scholar (at 17.20) http://i57.tinypic.com/27y516t.png.
Here are other screenshots, taken to show you that the problem is still present even if you still want to deny it.
Trying to fetch a pdf article of NEJM
http://i57.tinypic.com/zt7rjt.png (1st try)
http://i58.tinypic.com/108767q.png (2nd try)
.......
And after 2 more tries we have the final result.
http://i61.tinypic.com/35a3ww7.png
yeah, that's what I said. Either you're entering the captcha incorrectly or for some reason it's not passed on to google. Dan might be able to tell which by looking at the debug.
Thanks for your patience.
Second, has Zotero made any progress in trying to solve this frustrating "Query Limit Reached" problem?
Let us know what (if anything) fixes this.
I didn't change anything at all in the interim. I just "tried again [a couple of days] later." A Google mystery?