Search is slow and unreliable
Today, I noticed that I couldn't find a phrase that was in a document that I had just added to my library, and I decided to investigate
The way I use the search: basic search box, scoped to "Everything", search phrase between double quotes
First of all, I tried other search tools: both grep and Windows Search could find it, both in the document itself and in Zotero's full text cache file
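For anyone who wants to repeat this check without grep, here is a minimal Python sketch. It assumes that each attachment sits in its own subfolder of the Zotero storage directory and that the extracted text is a plain-text file named .zotero-ft-cache in that subfolder. The function name and the whitespace normalization are my own choices for illustration, not anything Zotero does.

```python
from pathlib import Path


def _normalize(text):
    # Collapse runs of whitespace so a phrase broken across lines still matches.
    return " ".join(text.split()).casefold()


def find_phrase_in_ft_cache(storage_dir, phrase):
    """Return the names of storage subfolders whose cache file contains the phrase.

    Assumption: the cache file is plain text named ".zotero-ft-cache".
    """
    needle = _normalize(phrase)
    hits = []
    for folder in sorted(Path(storage_dir).iterdir()):
        cache = folder / ".zotero-ft-cache"
        if folder.is_dir() and cache.is_file():
            text = cache.read_text(encoding="utf-8", errors="replace")
            if needle in _normalize(text):
                hits.append(folder.name)
    return hits
```

Call it with the path to your own storage folder and the phrase you are looking for. An empty result means the phrase is not in any cache file, which would point at indexing rather than at search.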
Then, I tried to clean the search index, which failed with the error "Transaction timeout, most likely caused by unresolved pending work"; I made a note of it, because it's pretty disturbing that Zotero can turn a long wait (which I'm willing to endure) into a critical failure. I tried a couple of times, and always got the same error
Then, I tried to rebuild the search index. It took a long time and inexplicably failed to index a lot of files (a list would help, instead of a mysterious number labeled "Unindexed"), including several corrupted PDFs that I had to re-download (it would help if Zotero alerted me that I had corrupted files in my library). It was also useless, because search kept failing
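In the absence of such a list, a rough approximation can be built from the file system. This is only a sketch under two assumptions: every attachment has its own subfolder under the storage directory, and an indexed attachment has a .zotero-ft-cache file next to it. It will over-report, since attachments that Zotero never indexes (images, for example) also have no cache file, and the function name is hypothetical.

```python
from pathlib import Path


def folders_without_ft_cache(storage_dir):
    """List storage subfolders that contain no ".zotero-ft-cache" file.

    Assumption: a missing cache file means the attachment was not indexed.
    This is a file-system heuristic, not Zotero's own "Unindexed" count.
    """
    missing = []
    for folder in sorted(Path(storage_dir).iterdir()):
        if folder.is_dir() and not (folder / ".zotero-ft-cache").is_file():
            missing.append(folder.name)
    return missing
```

The returned folder names are the attachment keys, so each one can be opened directly in the storage directory to see which file it holds.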
Then, I noticed for the first time that, during a search, Zotero takes up literally all available RAM. If 8 GB is available, it will take 8 GB; if 10 is available, it will take 10, almost instantly. My library isn't even that big (about 9000 entries, and a 400 MB database; the *entire* storage is a little over 14 GB). On a hunch, I freed as much memory as possible, and tried the search again: with about 10 GB available, search works again (but very, very slowly). Searching a subcollection instead of the entire library also works, although it, too, takes all available RAM and runs very, very slowly
This is pretty disturbing, because I rely on Zotero a lot, and I use full text search *a lot* to look for unexpected connections between articles that I haven't thought of putting in the same subcollection, and I can only wonder how many connections I've missed this way. Is there something wrong with my setup, storage or database, or is this expected behavior? If Zotero is hitting a limit, like a timeout, is there a way to increase it?
Finally, an unrelated issue: a small number of HTML documents just won't index. They aren't particularly big or complex, but their indexing status shows as "Unknown" and won't budge
I'm running 7.0.8 x64 on Windows 10. 7.0.9 beta has the same issue (I downgraded to make sure it wasn't an issue in the beta)
When search fails, an error is printed in the console: "Could not allocate buffer: NS_ERROR_OUT_OF_MEMORY". No other information, no call stack, nothing. Once it happens, search fails, most of the used memory isn't freed, and I have to restart Zotero
When reindexing one of the "unindexable" HTML documents fails, nothing is printed
2) Provide a Debug ID for a search that takes up lots of memory.
This was the original search query that I mentioned in the OP: it should match a document that I archived yesterday, but it runs for a long time, allocates several GB of memory, and then throws an NS_ERROR_OUT_OF_MEMORY error. After this error, Zotero shows up in Task Manager with several GB of allocated memory (less than the memory that was allocated at peak usage, but still a lot), and it never seems to get freed; I have to restart Zotero to get my memory back
I don't know if it's relevant, but I see from the log that Zotero looks through lots of .zotero-ft-cache files, but none of them is the recently added document's
This did not happen until very recently: no earlier than last week, I believe. Nothing special happened, except that I had recently raised all full text indexing limits to the maximums allowed (2147483647 characters, 2147483647 pages), but no file in my library comes close to those limits
If you go to C:\Users\[…]\Zotero\storage\Y7J8K7QN\, what's the size of the .zotero-ft-cache file? (You may have to adjust your File Explorer settings to show hidden files. You can also just check the size of the folder and subtract the size of the PDF.)
I even restored a backup of the Zotero directory from the day I originally reported the issue and tried again, but it still doesn't reproduce. A different, similar search string raises memory usage up to 7 GB and takes a lot of CPU, but it works on both the old and new database
The only difference I can see is that I have more free RAM compared to when I reported the issue. Maybe running out of memory at some point during the search results in pathological behavior where even more memory is allocated? Disk I/O is much lower now, too: at the time, there were system tasks running in the background
No idea, I'm stumped. Relieved that I can use Zotero normally again, but stumped
Just in case, I submitted logs for both library versions and both search strings:
Old library, search string #1: D1739371043
New library, search string #1: D923264508
Old library, search string #2: D284826724
New library, search string #2: D1783320333
In both libraries, storage file Y7J8K7QN\.zotero-ft-cache is 53722 bytes
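For reference, the size check can also be done for the whole library at once, which makes an unusually large cache file easy to spot. This is a minimal Python sketch, assuming the same layout as the path above (one subfolder per attachment, cache file named .zotero-ft-cache); the function name is made up for illustration.

```python
from pathlib import Path


def ft_cache_sizes(storage_dir):
    """Return (size_in_bytes, folder_name) pairs for every cache file, largest first.

    Assumption: cache files are named ".zotero-ft-cache" and sit directly
    inside each attachment's subfolder of the storage directory.
    """
    sizes = []
    for folder in Path(storage_dir).iterdir():
        cache = folder / ".zotero-ft-cache"
        if folder.is_dir() and cache.is_file():
            sizes.append((cache.stat().st_size, folder.name))
    return sorted(sizes, reverse=True)
```

Reading the file size through Python also sidesteps the File Explorer hidden-files setting, since the file is addressed by name.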