Zotero being blocked by Anubis

Hi All,

We run a couple of servers and have had to install Anubis on some of them. We want to keep using Zotero, but its crawlers are being blocked because they cannot perform the proof of work that Anubis challenges clients with.

We cannot remove Anubis, because without it the server goes down under abusive crawlers. I was hoping we could either get a list of the IP addresses that Zotero uses, or that the Zotero crawlers could start identifying themselves with a user agent string.

Kind regards,
Ryan.
  • What crawlers? There are no Zotero crawlers.
  • Hello,

    What I mean is from this previous discussion: https://forums.zotero.org/discussion/102332/identify-as-zotero-in-user-agent-header

    Maybe I'm misunderstanding how it works, but without being able to identify itself, it's getting blocked by Anubis.

    Kind regards.
  • But as in that thread, you'd have to say what you're actually referring to. We don't know what kind of service you're running, what Zotero functions you're concerned about, etc.

    If there's some particular problem you're experiencing with Zotero software, you should provide exact steps to reproduce it.
  • Hello,

    So what we are running is a normal LAMP stack server with Anubis installed on top of it. Here is a link to the docs: https://anubis.techaro.lol/docs/

    This software is designed to stop bots from hitting the server by putting a proof-of-work challenge in front of the site, which your computer needs to solve.

    Unfortunately, the Zotero browser connector is getting blocked when we try to scrape referencing data from our website.

    Reading the thread that I linked to, this appears to go through WebDAV, which isn't explicitly providing a user agent string to identify itself. The connection is then met with the Anubis challenge, which it cannot solve, so it errors out and stops working.

    So what I am asking is: can this connection identify itself with a token in the User-Agent string saying it's Zotero, so we can add an exception to the rules to allow it through to the server?

    Regards.
  • Unfortunately, the Zotero browser connector is getting blocked when we try to scrape referencing data from our website.
    Again, if you'd like us to say more, we would need steps to reproduce a specific problem using Zotero software — URL, specific Zotero action, etc. The Zotero Connector is a browser extension — it doesn't have its own user agent separate from the browser.
    Reading the thread that I linked to, this appears to go through WebDAV, which isn't explicitly providing a user agent string to identify itself.
    What? WebDAV is for Zotero file syncing. If you're posting this on behalf of someone more familiar with Zotero, please just ask them to post here instead. I appreciate that you're trying to help make your service (though you haven't said what service or even what kind of service) compatible with Zotero, but if you're not actually familiar with Zotero functionality, I don't think we're going to get very far here.
  • Hello,

    We are in the process of implementing Anubis (https://anubis.techaro.lol/docs/) at https://erudit.org and we are facing the same problem.

    Anubis is a proxy designed to block scraping bots by different means, including what they call challenges: https://anubis.techaro.lol/docs/admin/configuration/challenges/

    The problem is that it's blocking Zotero because Zotero cannot complete the challenges. This could be circumvented if there was a way to identify Zotero, e.g. by a specific User-Agent.

    Béranger

    PS: A solution like Anubis is a must for us since bot traffic has climbed to about 80% of our total traffic lately.
  • Could you be more specific about what Zotero features are being blocked by Anubis? Are you not able to save items to Zotero from your site at all when Anubis is enabled? Or is it just blocking PDF or snapshot saving?
  • Sure! When I click on "Save to Zotero" in Firefox, the article's metadata and the snapshot are saved in Zotero, but not the PDF.

    It looks like it's the Zotero application that's trying to fetch the PDF:
    2025-09-17 10:15:10 45.44.169.84 GET /fr/revues/refuge/2018-v34-n1-refuge03925/1050854ar.pdf HTTP/2.0 - 80 - 45.44.169.84 "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0" "-" 200 11831

    And it fails. Don't believe the 200: the byte count indicates that the Anubis challenge page was returned instead of the PDF.
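The check described in the post above (a 200 status on a `.pdf` path with a suspiciously small body) can be sketched as a small log filter. This is only an illustration under assumptions: the log layout is inferred from the single line quoted above, with the last two fields taken to be the HTTP status and the response byte count, and `MIN_PDF_BYTES` is a hypothetical cutoff that would need tuning for a real site.

```python
import re

# Assumed layout, inferred from the one log line quoted in the thread:
# the last two fields are the HTTP status and the response byte count.
LOG_TAIL = re.compile(r'"\s+(?P<status>\d{3})\s+(?P<size>\d+)\s*$')
REQUEST = re.compile(r'\b(?P<method>GET|HEAD|POST)\s+(?P<path>\S+)\s+HTTP/')

# Hypothetical cutoff: a real article PDF is assumed to be larger than this.
MIN_PDF_BYTES = 50_000


def suspicious_pdf_hits(lines, min_bytes=MIN_PDF_BYTES):
    """Yield (path, size) for 200 responses to .pdf paths that look too small."""
    for line in lines:
        req = REQUEST.search(line)
        tail = LOG_TAIL.search(line)
        if not req or not tail:
            continue  # line does not match the assumed layout
        path = req.group("path")
        status = int(tail.group("status"))
        size = int(tail.group("size"))
        # Ignore any query string when checking the file extension.
        is_pdf = path.split("?", 1)[0].lower().endswith(".pdf")
        if is_pdf and status == 200 and size < min_bytes:
            yield path, size
```

Feeding it the access log, for example `suspicious_pdf_hits(open("access.log"))` with a hypothetical file name, would list PDF requests that were probably answered with a challenge page rather than the file.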
  • We can probably fix this once you have something live (even on a separate staging instance). We already handle Cloudflare captchas for PDF saving, and Anubis should work similarly.
  • We have something live, but we'll need to communicate privately. Can you email me? beranger.enselme at our domain.
  • What versions of Zotero and the Zotero Connector are you running?
  • It's a fresh install of the latest versions I could find yesterday.
  • Can you provide a Debug ID from the Zotero Connector for reloading the page and trying to save?

    Current versions of the Zotero Connector and Zotero should download PDFs from the browser, not the app.
  • To be more precise: Zotero 7.0.24 (64-bit) and Connector 5.0.181
  • Thanks! Here's a debugID, hope this helps: D1189258986
  • And another one: D2057127296
  • The Erudit website doesn't provide the citation_pdf_url meta tag in the head element, so the Zotero Connector fails to find the PDF on the page. But the Zotero Client manages to find the PDF file via DOI resolvers.

    To fix this, your system administrator needs to add the citation_pdf_url meta tag to the pages. You can point them to https://www.zotero.org/support/dev/exposing_metadata
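A site administrator could verify the fix with a quick check that a page actually exposes the tag. The sketch below uses only Python's standard `html.parser`. It looks for `<meta name="citation_pdf_url" content="...">` anywhere in the markup rather than strictly inside the head element, and the sample page and its URL are placeholders, not real Érudit markup.

```python
from html.parser import HTMLParser


class CitationPdfFinder(HTMLParser):
    """Collect the content of every <meta name="citation_pdf_url"> tag."""

    def __init__(self):
        super().__init__()
        self.pdf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name == "citation_pdf_url" and attr.get("content"):
            self.pdf_urls.append(attr["content"])


def find_citation_pdf_urls(html):
    """Return the PDF URLs advertised by citation_pdf_url meta tags."""
    finder = CitationPdfFinder()
    finder.feed(html)
    return finder.pdf_urls


# Hypothetical page: the URL is a placeholder for illustration only.
SAMPLE = """<html><head>
<meta name="citation_title" content="Example article">
<meta name="citation_pdf_url" content="https://example.org/articles/123.pdf">
</head><body></body></html>"""

if __name__ == "__main__":
    print(find_citation_pdf_urls(SAMPLE))
```

An empty result for an article page would suggest that the Connector has no PDF URL to pick up from the page itself.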