Google Scholar blues (blocked from GS during PDF metadata import)

myildi · March 10, 2011

Hi,

After following the corresponding Zotero screencast, I decided to import some PDF articles into my database. I followed the indicated procedure: drop a batch of PDFs (10-20) onto Zotero's window, select them, and ask Zotero to retrieve the metadata. This worked for the first two batches but then stopped. When I tried to connect to Google Scholar to check whether something strange was going on, I received a message saying that my computer was accessing GS as a robot and had been blocked. I have not been able to get a CAPTCHA to prove that I am human...
I do not know how long this block will last, but it is already quite painful. If one gets locked out of GS as soon as one tries to import metadata, this is a problem.

Could Zotero include a small delay to eliminate this problem? Manually, I can avoid it by importing files one by one, with some delay between them, but if Zotero could handle this automatically, that would be much better (especially during the initial import, when one needs to handle a large set of PDFs).

If Zotero were smarter than GS, we could drop a big set of PDFs on the Zotero window, ask it to import the metadata, and let it work alone all night... This seems impossible for now, since GS blocks the computer quite quickly.

Any suggestions?

myildi · March 10, 2011

By the way, I get the following error message from the GS page:

When Google detects that a computer on your network may be sending automated traffic to Google we may show the following message: "Our systems have detected unusual traffic from your computer network." Automated queries are against our Terms of Service.

The last sentence seems to imply that this function in Zotero violates the terms of service of GS. Hmmm...

myildi · March 10, 2011

Again, I do not know how it handles this, but Publish or Perish is able to avoid the problem when making large queries about an author's publications. So there must be a way to avoid it.

dstillman · March 10, 2011

An automatic delay is probably a good idea, though it might be hard to figure out the particular algorithm GS uses to determine blocks. But some trial and error might give us a general idea.
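
As a rough illustration of the automatic-delay idea, here is a minimal sketch. It is not Zotero's code: the function name `throttled_lookup`, the `Blocked` exception, and every timing value (10 s base delay, 600 s cap, 3 retries) are assumptions made up for this example, since the thresholds GS actually uses are unknown. It pauses a randomized interval between lookups and doubles the wait each time a lookup reports a block.

```python
import random
import time
from typing import Callable, Iterable, List, Optional


class Blocked(Exception):
    """Hypothetical signal: the remote service refused the request as automated."""


def throttled_lookup(
    items: Iterable[str],
    lookup: Callable[[str], str],
    base_delay: float = 10.0,   # assumed value, not a known GS threshold
    max_delay: float = 600.0,   # assumed cap on the backoff
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Optional[str]]:
    """Look up each item with a pause in between; back off when blocked."""
    results: List[Optional[str]] = []
    delay = base_delay
    for index, item in enumerate(items):
        if index > 0:
            # Randomize the pause so requests are not perfectly periodic.
            sleep(delay + random.uniform(0, delay / 2))
        for attempt in range(max_retries + 1):
            try:
                results.append(lookup(item))
                delay = base_delay  # a success resets the backoff
                break
            except Blocked:
                if attempt == max_retries:
                    results.append(None)  # give up on this item
                    break
                delay = min(delay * 2, max_delay)
                sleep(delay)
    return results
```

The `sleep` parameter is injectable only so the behavior can be tried without real waiting. Whether any fixed delay is enough to stay under the limit would still have to be found by trial and error.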

dstillman · March 10, 2011

Note that Google Scholar is only one of the methods Zotero uses to determine metadata. If the PDF has a DOI, it will use that first. There might be some other methods it can use before resorting to Google Scholar.
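
The "DOI first, Google Scholar last" order can be sketched as follows. This is an illustration only, not Zotero's implementation: `find_doi`, `resolve_metadata`, and the resolver callables are hypothetical names, and the regular expression is a commonly used approximation of DOI syntax rather than a complete one.

```python
import re
from typing import Callable, Optional, Sequence

# Approximate pattern for modern DOIs (an assumption; it will miss unusual ones).
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")

Record = dict


def find_doi(text: str) -> Optional[str]:
    """Return the first DOI-like string in the extracted PDF text, if any."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Drop punctuation that usually belongs to the sentence, not the DOI.
    return match.group(0).rstrip(".,;)")


def resolve_metadata(
    text: str,
    by_doi: Callable[[str], Optional[Record]],
    fallbacks: Sequence[Callable[[str], Optional[Record]]],
) -> Optional[Record]:
    """Try a DOI lookup first; use the fallbacks only if that yields nothing."""
    doi = find_doi(text)
    if doi is not None:
        record = by_doi(doi)
        if record is not None:
            return record
    for fallback in fallbacks:
        record = fallback(text)
        if record is not None:
            return record
    return None
```

With this ordering, a rate-limited service placed last in `fallbacks` is contacted only for PDFs where no DOI is found or the DOI lookup returns nothing.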

myildi · March 10, 2011

It seems that my PDFs rely on Scholar a lot :-( Scholar has now been blocked for me for more than 40 minutes (I stopped Zotero immediately, but every so often I try to refresh the Scholar page). In a few minutes I will shut down the computer and go to sleep. We will see if I am unblocked tomorrow morning. This is already very annoying :-(

sldn · April 6, 2011

I have the same problem, and it seems that Zotero relies only on GS. I have many papers that contain DOIs, but Zotero always fails and only works when GS allows it!

sammyatu · July 14, 2011

I've been having the same problem. Has anyone found a solution?

canmeral · August 26, 2011

Same problem.
Another possible solution: GS could present a CAPTCHA to Zotero, which Zotero then shows to the user. I would rather type in a few letters than be blocked.