Search is slow and unreliable

Today, I noticed that I couldn't find a phrase that was in a document that I had just added to my library, and I decided to investigate

The way I use the search: basic search box, scoped to "Everything", search phrase between double quotes

First of all, I tried other search tools: both grep and Windows Search could find it, both in the document itself and in Zotero's full text cache file
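
The grep check against the full text cache can be approximated with a short Python script. This is only a sketch, assuming the cache files are plain text, are named `.zotero-ft-cache`, and sit in per-item folders under the storage directory. The function name `find_phrase_in_caches` is made up for illustration, and the matching rules (case-insensitive, whitespace-insensitive) are my own choice, not necessarily what Zotero does:

```python
from pathlib import Path


def find_phrase_in_caches(storage_dir, phrase):
    """Return the .zotero-ft-cache files under storage_dir that contain phrase.

    Assumption: each item folder holds a plain-text file named
    .zotero-ft-cache. Matching ignores case and collapses runs of
    whitespace; this is an illustrative choice, not Zotero's behavior.
    """
    needle = " ".join(phrase.lower().split())
    hits = []
    for cache in sorted(Path(storage_dir).rglob(".zotero-ft-cache")):
        # Read one file at a time, so memory use is bounded by the largest file.
        text = cache.read_text(encoding="utf-8", errors="replace")
        if needle in " ".join(text.lower().split()):
            hits.append(cache)
    return hits
```

Calling `find_phrase_in_caches(storage_dir, "some phrase")` with the path of the storage directory returns the matching cache files, which at least shows whether the text was extracted at all.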

Then, I tried to clean the search index, which failed with the error "Transaction timeout, most likely caused by unresolved pending work"; I made a note of it, because it's pretty disturbing that Zotero can turn a long wait (which I'm willing to endure) into a critical failure. I tried a couple of times, and always got the same error

Then, I tried to rebuild the search index. It took a long time, and it inexplicably failed to index a lot of files (a list would help, instead of a mysterious number labeled "Unindexed"). Among them were several corrupted PDFs that I had to re-download (it would help if Zotero alerted me that I had corrupted files in my library). The rebuild didn't help, either: search kept failing
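
Since Zotero doesn't list them, obviously damaged PDFs can be found with a rough check outside Zotero. The sketch below is a heuristic only: it assumes a healthy PDF has a `%PDF-` marker near the start and an `%%EOF` marker near the end, and it does not parse the file, so a PDF that passes can still be broken. The names `looks_corrupted` and `find_suspect_pdfs` are made up for illustration:

```python
from pathlib import Path


def looks_corrupted(pdf_path, tail_bytes=2048):
    """Heuristic: True if the file lacks a PDF header or an end-of-file marker."""
    path = Path(pdf_path)
    size = path.stat().st_size
    with path.open("rb") as f:
        head = f.read(1024)                 # header should appear near the start
        f.seek(max(0, size - tail_bytes))   # trailer should appear near the end
        tail = f.read()
    return b"%PDF-" not in head or b"%%EOF" not in tail


def find_suspect_pdfs(storage_dir):
    """Return every *.pdf under storage_dir that fails the heuristic above.

    Note: the *.pdf pattern is case-sensitive on some platforms.
    """
    return [p for p in sorted(Path(storage_dir).rglob("*.pdf")) if looks_corrupted(p)]
```

This catches truncated downloads and error pages saved with a `.pdf` extension, which are the kinds of files that tend to need re-downloading.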

Then, I noticed for the first time that, during a search, Zotero takes up literally all available RAM. If 8 GB is available, it will take 8 GB; if 10 GB is available, it will take 10 GB, almost instantly. My library isn't even that big (about 9000 entries, and a 400 MB database; the *entire* storage is a little over 14 GB). On a hunch, I freed as much memory as possible, and tried the search again: with about 10 GB available, search works again (but very, very slowly). Searching a subcollection instead of the entire library also works, although it, too, takes all available RAM and runs very, very slowly

This is pretty disturbing, because I rely on Zotero a lot, and I use full text search *a lot* to look for unexpected connections between articles that I haven't thought of putting in the same subcollection, and I can only wonder how many connections I've missed this way. Is there something wrong with my setup, storage or database, or is this expected behavior? If Zotero is hitting a limit, like a timeout, is there a way to increase it?

Finally, an unrelated issue: a small number of HTML documents just won't index. They aren't particularly big or complex, but their indexing status shows as "Unknown" and won't budge

I'm running 7.0.8 x64 on Windows 10. 7.0.9 beta has the same issue (I downgraded to make sure it wasn't an issue in the beta)

When search fails, an error is printed in the console: "Could not allocate buffer: NS_ERROR_OUT_OF_MEMORY". No other information, no call stack, nothing. Once it happens, search fails, most of the used memory isn't freed, and I have to restart Zotero

When reindexing one of the "unindexable" HTML documents fails, nothing is printed
  • 1) Make sure you're testing in Troubleshooting Mode (Help → "Restart in Troubleshooting Mode…"), which temporarily disables all plugins.

    2) Provide a Debug ID for a search that takes up lots of memory.
  • edited October 23, 2024
    D1461979477

    This was the original search query that I mentioned in the OP: it should match a document that I archived yesterday, but it runs for a long time, allocates several GBs of memory, and then throws an NS_ERROR_OUT_OF_MEMORY error. After this error, Zotero shows up in Task Manager with several GBs of allocated memory (less than the memory that was allocated at peak usage, but still a lot), and it never seems to get freed; I have to restart Zotero to get my memory back

    I don't know if it's relevant, but I see from the log that Zotero looks through lots of .zotero-ft-cache files, but none of them is the recently added document's
  • edited October 25, 2024
    The situation is pretty bad at the moment: search is basically unusable now. Any query takes a very long time, and it's a crapshoot whether it will actually search something or run out of memory and then leak a lot of memory

    This only started happening very recently, no earlier than last week, I believe. Nothing special happened, except that I had recently raised all full text indexing limits to the maximums allowed (2147483647 characters, 2147483647 pages), but no file in my library comes close to those limits
  • @hackbunny: Can you provide a second Debug ID for reproducing this, after a fresh Zotero restart?

    If you go to C:\Users\[…]\Zotero\storage\Y7J8K7QN\, what's the size of the .zotero-ft-cache file? (You may have to adjust your File Explorer settings to show hidden files. You can also just check the size of the folder and subtract the size of the PDF.)
  • @dstillman: the issue no longer reproduces. I don't know what changed, but that exact search is now almost instant, *and* it finds what it's supposed to. What I did was lower the full text cache limits (50000000 maximum characters, 10000 maximum pages) and rebuild the index. I looked in the relevant places in Zotero source code and I can't see how those values could possibly disrupt indexing and searching, so I suspect it's a red herring

    I even restored a backup of the Zotero directory from the day I originally reported the issue and tried again, but it still doesn't reproduce. A different, similar search string raises memory usage up to 7 GB and takes a lot of CPU, but it works on both the old and new database

    The only difference I can see is that I have more free RAM compared to when I reported the issue. Maybe running out of memory at some point during the search results in pathological behavior where even more memory is allocated? Disk I/O is much lower now, too: at the time, there were system tasks running in the background

    No idea, I'm stumped. Relieved that I can use Zotero normally again, but stumped

    Just in case, I submitted logs for both library versions and both search strings:

    Old library, search string #1: D1739371043
    New library, search string #1: D923264508
    Old library, search string #2: D284826724
    New library, search string #2: D1783320333

    In both libraries, storage file Y7J8K7QN\.zotero-ft-cache is 53722 bytes