Fixing "Google Scholar query limit reached. Try again later"

Hi! I'm uploading a lot of documents at once, and I know you're supposed to do them in small batches, but I'm still getting the message "Google Scholar query limit reached. Try again later". Is there a way to fix this so that I can retrieve metadata for more documents without having to wait days for Google to figure out I'm not a robot?
  • In effect, "a lot of documents at once" is what GS defines as robot-like behavior. I've seen discussions about a look-up option that once started will s-l-o-w-l-y process a list of requests. I don't recall the status of that idea. Maybe someone will develop a Zotero plug-in that could take a list of items and process it overnight.

    What is missing from Google Scholar pages compared with regular search pages? Adverts as a source of revenue. Google has astonishing computing power, but imagine if every person in the world had unfettered access to their system to make unlimited queries.
  • edited January 4, 2017
    If this is Zotero for Firefox, you can try clearing prefs, but that's only a temporary fix at best.

    We're hoping to decrease our reliance on Google Scholar in future versions.
  • Ok! Thank you both for your input. @DWL-SDCA you make a good point about the problem with everyone in the world having unlimited query capabilities.
  • edited January 5, 2018
    I've had this issue a few times. I understand why it happens, but I think that Zotero's response about it has been unsatisfactory.
    No offence to those who work for Zotero, but the response HAS been pretty lame.
    We get told that "it's not our fault, it's Google".
    And when people have suggested some workarounds, the replies have been pretty weak.

    If you are trying to use Zotero to organize a lot of existing PDFs, you need to query them. It's easy to have hundreds, if not several thousand. Importing them one by one would take a VERY long time.

    I really like Zotero, and understand that for me, it's free.
    Thank you.
    But someone IS paying the bills.
    So we are still the "customers".
    This is a legitimate issue.

    I think Zotero could do a few constructive things:
    (1) Put in a counter that shows how many queries you've made; if you are high (I think Google's main limit is 1,200 queries/24 hours), then it warns you.
    Once you get blocked it can be DAYS before you get unblocked.

    If Google uses the number of queries per unit of time, then add that too. Surely it is known how the Google robot decides you are a robot? Otherwise, just ask Google; it is a legitimate question.

    (2) Contact Google and ask if they can make an exception for Zotero. The number of Zotero clients who need it at any given time will be EXTREMELY small, but over time, if you use Zotero as your main archiver, it likely will affect you.

    And it's in Google's interests, because after all, Zotero itself says they are working on a way not to use Google. And Google wants you to use it

    Perhaps Zotero could request for a paid per day exemption from Google, that clients could use

    (3) Write a plug-in that queues queries and runs them over a Google-acceptable period.

    (4) Add to the FAQ about this, and make some workarounds that actually work when you are locked out. So far nothing that's been suggested works very well. Using VPN to create a new IP address would work I think: I'm going to try that

    This has been an issue for a very long time, I don't understand why nothing effective has been done to address it.
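
    As a rough illustration (not Zotero code, and not an official plug-in), ideas (1) and (3) above could look something like this. `fetch_metadata` is a made-up placeholder for whatever lookup the plug-in would perform; the delay numbers are guesses, since nobody knows Google's actual thresholds:

    ```python
    import random
    import time

    def fetch_metadata(title):
        # Placeholder: a real plug-in would query a metadata service here.
        return {"title": title}

    def drain_queue(titles, min_delay=30, max_delay=90, sleep=time.sleep):
        """Process queued titles one at a time, with a randomized pause
        between requests so the pattern looks less bot-like. With hundreds
        of PDFs this would run overnight, which is the point."""
        results = []
        for i, title in enumerate(titles):
            results.append(fetch_metadata(title))
            if i < len(titles) - 1:  # no need to wait after the last item
                sleep(random.uniform(min_delay, max_delay))
        return results
    ```

    A counter for idea (1) would just be `len(results)` checked against whatever limit turns out to apply.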
  • Unfortunately it's just not that easy.
    Google is perhaps the most sophisticated web service provider in the world. If they want to prevent a certain usage of their service, they can, and there's indeed nothing Zotero can do about it. I agree it's frustrating (and it's not like no work has gone into it -- there was significant progress a while back, but a lot of that was made obsolete when Google changed its algorithm), but I'm also not sure what type of reaction you're hoping for.

    Software development is resource constrained and takes time. Dan is the lead developer and has stated above that they're hoping to improve this by reducing reliance on google scholar, so it's not like anyone is claiming this isn't a legitimate issue. It's just not an issue with an easy, quick to implement solution, nor is it the only issue Zotero has to address.

    On your ideas:

    2) many people have tried, e.g. also for the popular Publish or Perish software, with no success whatsoever. This is partly because Google doesn't have to, partly because they in turn have to be

    1 and 3 assume that there is a fixed and known threshold that's acceptable to Google. That's not the case. Google's algorithm for flagging someone as a suspected bot is unknown and adaptive. You can be blocked after as few as 20 queries. So I'm not sure what could be done here that would actually be useful enough to justify the labor.

    4) We don't really know what works. Changing your IP has some effect, but it's going to be limited.
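
    Since no fixed threshold is known, about the only defensive pattern a hypothetical plug-in could fall back on is backing off aggressively once a block is detected. A minimal sketch, where every number is an assumption rather than a known GS limit:

    ```python
    def next_delay(failures, base=60.0, cap=6 * 3600):
        """Exponential backoff: double the wait after each detected block
        (CAPTCHA page or error response), capped at a few hours."""
        return min(base * (2 ** failures), cap)
    ```

    After the first detected block you would wait `next_delay(0)` seconds, then `next_delay(1)`, and so on, resetting the failure count once a query succeeds.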
  • edited January 5, 2018
    1) 1,200 queries/24 hours: GS doesn't have an established limit per day but regulates usage depending upon any or each of several factors -- time interval between queries; number of queries within a short period of time; total number of queries (ever) from an IP range; current load on the GS system, etc. GS will offer a CAPTCHA; if you repeatedly enter an incorrect response, or if you ignore the prompt, your IP will be labeled as a probable robot.

    2) Contact Google's interest...etc. -- There is currently no way to contact a person at GS (publishers and database operators are sometimes contacted by GS, but never the other way around). Google doesn't derive any revenue from GS, so it isn't "in Google's interest" for the service to be used. Google doesn't have a mechanism to collect user fees. Zotero is free and doesn't have a way to collect fees based on the level of demand a user places on a particular type of service. Collecting that kind of fee seems counter to the philosophy of offering Zotero as a free service.

    3) This has been discussed for a l-o-n-g time but no one has done it. I presume that the task isn't easy or straightforward.

    4) Unless there has been a recent change to Zotero documentation, I believe that there is already a warning about GS limits. Google frowns on attempts to sneak around their terms of service. Your VPN idea is likely to not only be ineffective but also cause you other problems.

    Alphabet offers GS as a public service. The engineering and operating costs of GS are quite large. The purpose of GS is to meet users' need to identify scholarly publications. A secondary benefit to us Zotero users is that we can sometimes get added value in the form of metadata missing from records in our library. Although GS offers item metadata for the listings on its pages, that GS metadata is often incomplete or even wrong. GS intends that its users follow the link to the publishers' sites, where the document (and definitive metadata) is available.
  • We will indeed have improvements in this area in the near future.
  • I seem to recall reading here that some new PDF feature would reduce this problem. Yes, I still keep getting the GS CAPTCHAs even when I am just trying to get citations and PDFs from sources (like EBSCO) that seem to have everything they need without having to query GS. Can somebody please explain?
  • Do you have the Google Scholar Citations plug-in installed?
  • I did. Disabled it now. Thanks!
  • Zotero no longer uses Google Scholar, so closing this discussion.
This discussion has been closed.