Beaver: Zotero AI plugin with agentic search over your library, sentence-level citations and more
Hi Everyone!
We are happy to introduce Beaver. Beaver is a research agent with native Zotero integration. What, another AI plugin for Zotero? Yes, indeed! But I think the feature set makes it compelling. There are other options and you should explore them all. Here are some of Beaver’s features:
1. Research Agent: Beaver uses agentic search over your entire Zotero library. It iteratively combines metadata, related reference search based on semantic similarity and full document search using keyword and semantic similarity (hybrid). Together, these tools allow the agent to find relevant references, documents, and even specific paragraphs.
2. Seamless Zotero integration: Beaver is a Zotero plugin that adds a sidebar in Zotero so you can use it directly from the library view or while you are reading a PDF. Beaver sees your current page and can respond based on your entire Zotero library.
3. Precise, sentence-level citations: Beaver supports sentence-level citations! Hover over a citation to see a preview, or click on it to open the PDF and highlight the relevant passage.
4. Unlimited use and access to frontier models from OpenAI, Anthropic and Google with your own API key.
5. Benchmarks and evaluations. We use benchmarks and evaluations to consistently improve the performance of Beaver including prompt and context engineering, retrieval pipeline, file processing, citation behavior and more.
Preview: Beaver is in beta and available for free right now (for a limited number of users). You can learn more and sign up here. The frontend code is open source on GitHub. Remember that Beaver is in beta and you might run into bugs!
Please leave feedback or let us know about feature requests here or on GitHub.
How does Beaver work?
Beaver is a cloud-based plugin. That means it syncs your Zotero data with our servers to process your files and provide all its functionality. We have a strict privacy policy (no training or other use of your data unless you explicitly opt in) and will implement additional privacy-focused features. We are also interested in developing a local version, but it is not the highest priority right now (follow discussion here).
Prefer a local‑only approach? Consider some of the current Zotero plugins like A.R.I.A. or Zotero MCP.
How does library search work?
Beaver uses agentic search: the AI can choose among different search tools, filter based on metadata and iterate to explore your Zotero library. Currently, Beaver supports three search tools:
1. Metadata Search: Finds items by metadata (author, year, title).
2. Related Reference Search (Semantic): Finds all references related to a specific topic.
3. Full-document Search (keyword and semantic): Beaver uses hybrid search with reranking to search the content of your documents and retrieve relevant passages. Hybrid search combines keyword and semantic search based on embeddings to find relevant passages even without exact terms.
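To make the idea of hybrid search more concrete, here is a minimal sketch of one common way to merge a keyword ranking and a semantic ranking: reciprocal rank fusion (RRF). This is only an illustration. The post does not say which fusion method Beaver uses, the function name and passage IDs are made up, and the reranking step is left out.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of passage IDs into one ranking.

    Hypothetical helper for illustration; not Beaver's actual code.
    Each passage scores 1 / (k + rank) in every list it appears in,
    and the scores are summed across lists.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, passage_id in enumerate(ranking, start=1):
            scores[passage_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


# Made-up results from a keyword index and an embedding index.
keyword_hits = ["p3", "p1", "p7"]
semantic_hits = ["p1", "p4", "p3"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Passages that rank well in both lists end up at the top, while a passage found only by the semantic index (no exact keyword match) still makes it into the fused list.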
During the preview, full-document search is free for up to 75,000 pages (~2,500 articles). After the preview, full-document search will likely be part of the paid version simply because of the cost associated with processing and storing the data. Metadata + related reference search will likely remain unlimited and free.
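As a rough back-of-the-envelope check, the two numbers above imply an average of about 30 pages per article. The snippet below is only a sketch of that arithmetic; the helper name and the example page counts are hypothetical, and how Beaver actually counts pages is not specified here.

```python
PAGE_LIMIT = 75_000          # preview limit stated above
ARTICLE_EQUIVALENT = 2_500   # approximate article count stated above

# Average article length implied by the two figures.
pages_per_article = PAGE_LIMIT / ARTICLE_EQUIVALENT


def fits_in_preview(page_counts):
    """Return True if the total page count stays within the preview limit.

    Hypothetical helper: page_counts is a list with one page count per PDF.
    """
    return sum(page_counts) <= PAGE_LIMIT


print(pages_per_article)
print(fits_in_preview([30] * 2_500))
print(fits_in_preview([30] * 4_200))
```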
How does pricing work after the preview?
Beaver is published by academic researchers at Harvard. The goal is not to make a profit. We are just having fun working on this and building a useful research tool. That means two things for pricing after the preview: a) We always want to offer a free version (with your own API key) and we are trying to pack as many features into it as we can. b) There will be a paid version with additional features priced to cover costs. For example, processing thousands of files in a way that supports sentence-level citations, generating embeddings, storing the data and making it searchable is not cheap.
System Requirements
Zotero 7.0 or later (including Zotero 8 beta)
Internet connection for cloud features
Modern web browser for account management
And yes, group libraries and OCR support are probably at the top of the list in terms of next features. Both are pretty far.
Please let me know if you run into any issues.
I wanted to mention that I couldn't use the download button on the Beaver webpage in Firefox. "Save Link as" does not show up when right-clicking on the Download button, and the download it triggers gets rejected by Firefox, since the .xpi file is not for Firefox (obviously). I had to switch to Chrome to download it. Will be trying it out soon.
The Firefox download issue should have been resolved earlier today (4-5 hours ago) so I hope it's working now. Please let me know if you tried after that and still had the same issue.
One thing I've noticed is that some journal articles are skipped because they are identified as having 'insufficient text'. Is there a way around this?
https://s3.amazonaws.com/zotero.org/images/forums/u13814293/k5gm55tvzpfcsy6xxp2e.png
Furthermore, would Beaver consider supporting API keys from OpenRouter or Ollama (which offer free options, compared to the three major providers)?
On the PDFs: Would you be open to emailing me one of the PDFs? Either at contact@beaverapp.ai or my own address. If not, I can talk you through trying some things. As background: Beaver skips attachments when the extracted text is less than 150 characters. I don't think that applies in your case. My guess is that these PDFs do not have a text layer and require OCR, but are not recognized as such and are therefore misclassified as "insufficient text" rather than "Requires OCR". If OCR is the issue (and I am only guessing), support for that is coming but will still take a little while.
On supporting other providers: yes. I thought about OpenRouter (because it covers so many models) and Mistral (as a European provider) next. I will revisit this after finishing a bigger change that is currently in the works. I do want to say that the frontier labs still have an edge for agentic applications, so the differences you see might be bigger than in simple question-response applications.
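The skip logic described above could be sketched roughly like this. It is an assumption-laden illustration, not Beaver's actual code: the function name, the status strings, and the `has_text_layer` flag are hypothetical, and only the 150-character threshold comes from the post. The sketch checks for a missing text layer first, so that scanned PDFs land in the OCR bucket instead of "insufficient text".

```python
MIN_TEXT_CHARS = 150  # threshold mentioned in the post


def classify_attachment(extracted_text: str, has_text_layer: bool) -> str:
    """Decide how to treat a PDF attachment (hypothetical sketch).

    has_text_layer is assumed to come from some earlier PDF inspection
    step; how Beaver detects it is not described in this thread.
    """
    if not has_text_layer:
        # A scanned PDF yields little or no text; it needs OCR,
        # not a "too short" verdict.
        return "requires_ocr"
    if len(extracted_text.strip()) < MIN_TEXT_CHARS:
        return "insufficient_text"
    return "ok"


print(classify_attachment("", has_text_layer=False))
print(classify_attachment("Short note.", has_text_layer=True))
print(classify_attachment("x" * 500, has_text_layer=True))
```

If the text-layer check is skipped or fails, a scanned article falls through to the length check and gets labeled "insufficient text", which would match the behavior reported here.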
- At the top, to the left of the user account symbol, is a symbol of stacked coins or a cake with dripping icing (??), but hovering over it does not show a title and clicking it does not do anything. Can you explain what this is supposed to do?
- In my case, it did not scan about 1,700 PDFs because they were above the limit. That is fine as a limit, but it would be helpful if Beaver told me which ones are not included or, better, the logic that decides which ones are not included (does it run alphabetically by author name? Or start with the smallest files? Or by date added?).
Ideally, the user could give Beaver a scanning order. In my case, for example, not all my items are in folders, but all those that are in folders are more important to scan than those that are not. So scanning items in folders first, followed by those not in folders, would be far more helpful than scanning alphabetically. Alternatively, it could start with the items added to the library most recently.
best
m
- Large libraries: Yes, absolutely. I think the support for large libraries (about 5% of users right now) and the way the limit is handled are not good right now. I am considering a special Beaver collection like "My Publications" or "Duplicate Items" (maybe with an option to add all recent items or add by collection during onboarding). I am not sure yet about the best implementation, but I will prioritize this after the next release.
- Processing status: You should already be able to see which files were skipped or failed to process, with clear reasons. You can also list them, but with 1,700 files the list is a mess and not much use for getting an overview.
https://s3.amazonaws.com/zotero.org/images/forums/u3113719/evem98e1vctfbj48lq5d.png