Zotero being blocked by Anubis
Hi All,
We've been running a couple of servers and have had to install Anubis on some of them. We want to keep using Zotero; however, its requests are being blocked because they cannot perform the proof of work that Anubis challenges clients with.
We can't remove Anubis, since without it the servers go down under abusive crawler traffic. So I was hoping we could either get a list of the IP addresses that Zotero uses, or that Zotero requests could start identifying themselves with a distinctive User-Agent string, please!
Kind regards,
Ryan.
What I mean is from this previous discussion: https://forums.zotero.org/discussion/102332/identify-as-zotero-in-user-agent-header
Maybe I'm misunderstanding how it works, but without being able to identify itself, it's getting blocked by Anubis.
Kind regards.
If there's some particular problem you're experiencing with Zotero software, you should provide exact steps to reproduce it.
What we're running is a standard LAMP stack server with Anubis installed on top of it. Here is a link to the docs: https://anubis.techaro.lol/docs/
This software is designed to stop bots from hitting the server by putting a proof-of-work challenge in front of the site, which your computer needs to solve.
Unfortunately, the Zotero Browser Connector is getting blocked when we try to save referencing data from our website.
Reading the thread I linked to, this appears to go through WebDAV, which isn't explicitly providing a User-Agent string to identify itself. The connection is therefore met with the Anubis challenge, which it cannot solve, so it errors out and fails to work.
So what I'm asking is: can this connection identify itself with something in the User-Agent string to say it's Zotero, so we can add an exception to the rules to let it through to the server?
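For anyone in the same situation: Anubis's bot-policy file can already allow clients through by User-Agent. Below is a minimal sketch of such a rule, assuming Zotero were to send a User-Agent containing "Zotero" (the rule name and regex here are illustrative assumptions, not something Zotero currently sends).

```yaml
# Example Anubis bot policy entry that would let requests whose
# User-Agent matches "Zotero" bypass the proof-of-work challenge.
# Rule name and regex are assumptions for illustration.
bots:
  - name: allow-zotero
    user_agent_regex: "Zotero"
    action: ALLOW
```

This only helps once Zotero actually identifies itself, which is exactly what this thread is asking for.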
Regards.
We are in the process of implementing Anubis (https://anubis.techaro.lol/docs/) at https://erudit.org and we are facing the same problem.
Anubis is a proxy designed to block scraping bots by different means, including what they call challenges: https://anubis.techaro.lol/docs/admin/configuration/challenges/
The problem is that it's blocking Zotero, because Zotero cannot complete the challenges. This could be circumvented if there were a way to identify Zotero, e.g. by a specific User-Agent.
Béranger
PS: A solution like Anubis is a must for us since bot traffic has climbed to about 80% of our total traffic lately.
It looks like it's the Zotero application that's trying to fetch the PDF:
2025-09-17 10:15:10 45.44.169.84 GET /fr/revues/refuge/2018-v34-n1-refuge03925/1050854ar.pdf HTTP/2.0 - 80 - 45.44.169.84 "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0" "-" 200 11831
And it fails (don't be fooled by the 200 status; the byte count indicates that the Anubis challenge page was returned instead of the PDF).
Current versions of the Zotero Connector and Zotero should download PDFs from the browser, not the app.
Your pages don't include the citation_pdf_url meta tag in the head element, so the Zotero Connector fails to find the PDF on the page. The Zotero client, however, manages to find the PDF file via DOI resolvers. To fix this, your system administrator needs to add the citation_pdf_url meta tag to the pages. You can point them to https://www.zotero.org/support/dev/exposing_metadata
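For reference, the meta tag in question looks roughly like this (the URL below is a placeholder; see the exposing_metadata page for the full set of recognized tags):

```html
<head>
  <!-- Tells Zotero (and similar tools) where the article's full-text
       PDF lives. The content URL is a placeholder example. -->
  <meta name="citation_pdf_url"
        content="https://example.org/path/to/article.pdf">
</head>
```

It must appear in the head element of the article's landing page so the Connector can discover the PDF directly.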