Beaver: Zotero AI plugin to chat with your library, organize, discover research, and read papers

edited yesterday at 9:40pm
Hi Everyone!

We are happy to introduce Beaver. Beaver is a research agent with native Zotero integration. What, another AI plugin for Zotero? Yes, indeed! But I think the feature set makes it compelling. There are other options and you should explore them all. Here are some of Beaver’s features:

1. Chat with your Entire Library: Ask questions about your research and get answers drawn from your entire library. Beaver searches across metadata, topics, and the full content of your PDFs to give grounded answers. Learn more

2. Reading Assistant: Beaver is integrated inside your PDF reader. Select complex equations or highlight text to get explanations. Need more context? Ask how a claim compares to the rest of your library without ever leaving the page. Learn more

3. Organize & Edit Your Library: Beaver can help you manage collections, add tags, fix metadata, and keep your library organized. All changes require your approval by default so you stay in control. Learn more about library organization and metadata editing.

4. Discover New Research: Search over 240 million scholarly works outside your Zotero library. Understand citation patterns and find papers to expand your collection. Learn more

5. Precise citations: Beaver supports page-level or sentence-level citations depending on the mode. Hover over a citation to see a preview, or click on it to open the PDF and highlight the relevant passage.

6. Continuous improvement via benchmarks and evaluations. We use benchmarks and evaluations to continuously improve the performance of Beaver including prompt and context engineering, retrieval pipeline, file processing, citation behavior and more.

7. Free version with unlimited use with your own API key. Supports access to frontier models from OpenAI, Anthropic and Google with your own API key (documentation). Other providers such as Deepseek, Z.ai or any compatible custom endpoints are supported as advanced options.

Preview: Beaver is in beta and available for free right now. You can learn more and sign up here. The frontend code is open source on GitHub. Remember that Beaver is in beta and you might run into bugs!

Please leave feedback or let us know about feature requests here or on GitHub.

System Requirements

Zotero 7 or 8
Internet connection for cloud features
Modern web browser for account management
  • I have no involvement with this tool, but as context for those interested in trying it -- Joscha (apart from being an eminent sociologist and data scientist) was the creator and long-term developer of ZotFile, which many people here may have used.
  • Looks great and seems to be working well. Are you planning to support group libraries?
  • Thanks poettli!

    And yes. Group libraries and OCR support are probably at the top of the list in terms of next features. Both are pretty far along.

    Please let me know if you run into any issues.
  • Sounds intriguing. Thank you, Joscha.

    I wanted to mention that I couldn't use the download button on the Beaver webpage on Firefox? "Save Link as" does not show up when right-clicking on the Download button, and the download triggered gets rejected by Firefox, since the .xpi file is not for Firefox (obviously). Had to switch to Chrome to download. Will be trying it out soon.
  • Thanks for the report, @enozkan!

    The Firefox download issue should have been resolved earlier today (4-5 hours ago) so I hope it's working now. Please let me know if you tried after that and still had the same issue.
  • Syncing fails with bigger libraries it seems? Mine has 22k PDFs.
  • @jeremyvancleve: yeah, I can imagine that syncing a library with 22k PDFs fails right now. We don't have a test library of that size, but I can guess the pain points (e.g. the "Loading library statistics" step probably takes forever). I am interested in making it work though. It might require some back and forth. If you are interested in that, please email contact@beaverapp.ai. If not, I hope you try again after the beta.
  • Hi, I just tested Beaver with a Google API key, and it is really great at searching my libraries based on my input. Of the tools I have tested, it has the easiest setup and the most user-friendly interface. The output was also great: it directs me to the exact file for easier tracing, which saves me time on research and reading for my school assignments.

    One thing I've noticed is that some journal articles are skipped because they are identified as 'insufficient text'. Is there a way around that?
    https://s3.amazonaws.com/zotero.org/images/forums/u13814293/k5gm55tvzpfcsy6xxp2e.png

    Also, will Beaver consider supporting API keys from OpenRouter or Ollama (which offer free options, unlike the three major providers)?
  • @jowinlee Glad that it is working well for you.

    On the PDFs: Would you be open to emailing me one of the PDFs? Either at contact@beaverapp.ai or my own address. If not, I can talk you through trying some things. As background: Beaver skips attachments when the extracted text is less than 150 characters. I don't think that applies in your case. My guess is that these PDFs do not have a text layer and require OCR, but are not recognized as such and are therefore misclassified as "insufficient text" rather than "Requires OCR". If OCR is the issue (and I am only guessing), support for that is coming but will still take a little while.
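
    To make the skip rule concrete, here is a minimal sketch of how such a classification could look. It is not Beaver's actual code: only the 150-character threshold and the two status labels come from the post above, while `ExtractedPdf`, `has_text_layer`, and `classify` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

MIN_TEXT_CHARS = 150  # threshold mentioned in the post above


@dataclass
class ExtractedPdf:
    """Hypothetical container for what a text extractor returned."""
    text: str
    has_text_layer: bool  # assumption: the extractor reports this flag


def classify(pdf: ExtractedPdf) -> str:
    """Return a processing status for one attachment (illustrative only)."""
    if not pdf.has_text_layer:
        return "requires_ocr"
    if len(pdf.text.strip()) < MIN_TEXT_CHARS:
        return "insufficient_text"
    return "ok"
```

    Under these assumptions, a scanned article whose only text layer is a short publisher stamp would report `has_text_layer=True` with very little text and end up as `insufficient_text` instead of `requires_ocr`, which matches the guess above.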

    On supporting other providers: yes. I thought about OpenRouter (because it covers so many models) and Mistral (as a European provider) next. I will revisit this after finishing a bigger change that is currently in the works. I do want to say that the frontier labs still have an edge for agentic applications, so the differences you see might be bigger than in simple question-response applications.
  • Fantastic tool. A couple of comments:
    - At the top, to the left of the user account symbol, is a symbol of stacked coins or a cake with dripping icing (??), but hovering over it does not show a title and clicking it does not do anything. Can you explain what this is supposed to do?
    - In my case, it did not scan something like 1700 PDFs because they were above the limit. That is fine as a limit, but it would be helpful if Beaver told me which ones are not included, or better, the logic of which ones are not included (does it run alphabetically by author name? Or start with the smallest files? Or by date added?).
    Ideally, the user could give Beaver a logic for scanning. In my case, for example, not all my items are in folders, but all those that are in folders are more important to scan than those that are not. Thus scanning items in folders, followed by those not in folders, would be far more helpful than scanning alphabetically. Alternatively, it could scan the items added to the library most recently first.
    best
    m

  • - Database Icon: That is supposed to show the database or sync status. The green dot turns red when there was a sync error. Clicking on it starts a manual sync. None of that is very clear and I just added a tooltip as a start (will be in next version). Later, I might remove it and only show when there was an error.

    - Large libraries: Yes, absolutely. I think the support for large libraries (about 5% of users right now) and how the limit is handled is not good right now. I am considering a special Beaver collection like "My Publications" or "Duplicate Items" (maybe with an option to add all recent items or add by collection during onboarding). I am not sure yet about the best implementation, but I will prioritize this after the next release.

    - Processing status: You should already be able to see which files were skipped or failed to process with clear reasons. You can also list them but with 1700 files that is a mess and pretty unusable to get an overview.
  • If anyone is thinking of installing this plugin but is worrying about whether their files will be indexed, I'm in the process myself and have had a progress message -

    https://s3.amazonaws.com/zotero.org/images/forums/u3113719/evem98e1vctfbj48lq5d.png
  • Hi Joscha,

    Sorry for the late reply, but regarding the processing status: Can you tell us what the default processing logic is at the moment? I am able to see the ones omitted, but it is impossible for me to understand the logic from looking at them, and without knowing the logic of exclusion it is hard to make sense of Beaver's results.
  • Hi!

    I added more details here so there is a dedicated documentation page that explains the process. Let me know if there are any open questions or anything unclear on that page.

    The short version:
    - Files are roughly processed by modification date of the Zotero attachment (only after ~Oct 20, no clear sorting before)
    - We increased the free page balance during beta from 75k pages to 125k pages (over 4,000 articles at 30 pages per article). This increase improves support for large libraries. Processing that many files cannot and will not remain free after the beta.
    - Support for very large libraries: When your library exceeds 125k pages, support during the beta is limited! Additional files are not processed. You can list unprocessed files but there is no good way to get a complete overview if there are a lot of files. I am open to suggestions here that would make the experience better but remember that this is (hopefully) only temporary during the beta.
    - Beaver supports restricting sync to specific libraries. At least right now, collection-level filtering is not planned. It adds a lot of complexity and degrades the user experience. Instead, the goal is to support extremely large libraries so you don't have to worry about granular controls. I recognize that we initially planned to support this feature and changed plans after thinking about different implementations. Still open to input and suggestions here.
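
    The points above can be sketched as a simple budget loop. This is only an illustration, not Beaver's implementation: the 125k page balance and the ordering by attachment modification date come from the list above, while `Attachment`, `plan_processing`, the newest-first direction, and the choice to skip a file that does not fit and keep going are all assumptions.

```python
from dataclasses import dataclass

PAGE_BALANCE = 125_000  # free beta page balance mentioned above


@dataclass
class Attachment:
    key: str
    pages: int
    modified: str  # ISO 8601 timestamp, so string order equals time order


def plan_processing(attachments, balance=PAGE_BALANCE, newest_first=True):
    """Split attachments into processed and skipped keys under a page budget.

    Assumptions (not confirmed by the post): newest-first order, and a file
    that does not fit is skipped while later, smaller files may still fit.
    """
    ordered = sorted(attachments, key=lambda a: a.modified, reverse=newest_first)
    processed, skipped = [], []
    for att in ordered:
        if att.pages <= balance:
            balance -= att.pages
            processed.append(att.key)
        else:
            skipped.append(att.key)
    return processed, skipped, balance


# Sample data for illustration only.
library = [
    Attachment("A", 30, "2025-01-01T00:00:00"),
    Attachment("B", 30, "2025-02-01T00:00:00"),
    Attachment("C", 30, "2025-03-01T00:00:00"),
]
processed, skipped, remaining = plan_processing(library, balance=60)
```

    The "over 4,000 articles" figure follows from the same arithmetic: `125_000 // 30` is 4166 articles of 30 pages each.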
  • Hi thanks, this is helpful but also confusing:
    - First of all, to help other users: To restrict and/or expand the libraries that are scanned, go to "settings" (which is under the user account, at the top right).
    - On the GitHub help page that you link above, it says: "If you signed up before this change, files that were not processed because of insufficient balance should be processed automatically when you sent a chat message." I am one who signed up before Oct 20th. So what do I need to do? Send *any* chat message in Beaver? Or a specific prompt to rescan the library?
    - I cannot speak for others, but I would imagine that it is good in principle that Beaver operates on the level of a whole library. But for those with large libraries, if Beaver has not scanned everything, it would be preferable to operate on defined collections.

    many thanks for your help!
    mm
  • 1. Thanks. You can select libraries during the onboarding process or at any time in Beaver under settings.

    2. Yes. Any message will do. It should have already happened if you sent any message in the last 3 weeks. If not, email the address on the help page and we will get it resolved. This is only relevant for users who signed up more than three weeks ago and ran into the page limit issue.

    3. Yes, completely agree. However, there are real tradeoffs. We explored it and decided against it for now. That might change and is a temporary issue during beta.
  • edited yesterday at 3:35am
    @Joscha How do I use API keys from other open-source AI model providers like Deepseek, siliconflow or Qwen?
  • @arenal : Other providers such as Deepseek, Z.ai or any compatible custom endpoint are supported as advanced options. Some models will not work because they don't support certain features. The error messages are not always clear so it might require some experimentation but there are plenty of users who rely on Deepseek or other open models.

    Let me also highlight two recently added features that we are pretty excited about:

    Organize & Edit Your Library: Beaver can help you manage collections, add tags, fix metadata, and keep your library organized. All changes require your approval by default so you stay in control. Learn more about library organization and metadata editing.

    Discover New Research: Search over 240 million scholarly works outside your Zotero library. Understand citation patterns and find papers to expand your collection. Learn more
  • What is the news on groups? All our content is in groups, the only reason to use the personal library here is for copyright management.
  • @sdflewrit783 : Group libraries have been fully supported since version 0.5 (late September). They work differently in the Free and Beta versions:

    - Free: Because your libraries do not sync with Beaver and file processing is local, Beaver can access everything on demand. You can restrict search to specific libraries by asking or by adding a library as a filter using the @ menu. Beaver also “knows” the currently open library so you can ask things like “can you organize the items in THIS library into collections”.
    - Beta: You select the libraries Beaver works with explicitly. Beaver can only access these libraries. The selected libraries sync for cloud-based file processing and search.

    You can read more here. Let me know if you have any questions about this or anything is missing in how we support libraries.
  • Thanks. Shall explore. With a large database of some gigabytes and private content, local processing seems the better option. What is the advantage of the pro plan supposed to be?

    Any word on the safety of this all? Our database is an effort by many people over many years. Definitely would not want to mess it up. E.g., generated collections sound fun but without versioning or undo also scary.
  • @sdflewrit783 : I am really happy about how the free version turned out. I honestly like it more at the moment. It's ready to use within minutes and supports semantic search even for very large libraries. Do keep in mind that your library does not sync and files do not upload, but that doesn’t mean everything is local (see details here). There are still key advantages of beta/pro: better document understanding because of cloud-based processing, sentence-level instead of page-level citations, and full-document search (e.g. find this paragraph, not just find an item on this topic).

    Safety for library organization and metadata edits was a top priority and is implemented in two main ways:
    - All edits to your Zotero database require your approval by default (e.g. adding a collection, moving items, tagging items, editing metadata). So nothing happens without you clicking "Approve" or "Approve All". You can set this to "Auto-apply" but that is your choice and of course can lead to changes you don't want or like.
    - All edits have an undo function that should revert all changes.
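
    As a rough mental model of these two safeguards, here is a minimal sketch of an approve-then-apply flow with an undo log. It is not Beaver's code and does not use the Zotero API: `TagEdit`, `Library`, `apply`, and `undo_last` are hypothetical names, and tagging stands in for any kind of edit.

```python
from dataclasses import dataclass, field


@dataclass
class TagEdit:
    """Hypothetical proposed edit: add one tag to one item."""
    item_key: str
    tag: str


@dataclass
class Library:
    """Toy in-memory library, standing in for the real Zotero database."""
    tags: dict = field(default_factory=dict)      # item_key -> set of tags
    undo_log: list = field(default_factory=list)  # applied edits, oldest first

    def apply(self, edit: TagEdit, approved: bool) -> bool:
        """Apply an edit only if the user approved it; record it for undo."""
        if not approved:
            return False  # nothing is written without approval
        item_tags = self.tags.setdefault(edit.item_key, set())
        if edit.tag in item_tags:
            return False  # nothing to change, so nothing to undo
        item_tags.add(edit.tag)
        self.undo_log.append(edit)
        return True

    def undo_last(self) -> bool:
        """Revert the most recently applied edit, if there is one."""
        if not self.undo_log:
            return False
        edit = self.undo_log.pop()
        self.tags[edit.item_key].discard(edit.tag)
        return True
```

    In this sketch, an "Auto-apply" setting would simply mean calling `apply` with `approved=True` for every proposed edit, which is why the undo log matters even more in that mode.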

    I also recommend starting smaller (organize/edit 10-40 instead of 200+ items). Always happy to hear what works well or what does not work well. One word of caution: Beaver is still in beta so there might always be issues. I think the safety net protects from that but nothing is absolute.