(auto-) index

Hi,
I am confused about Zotero's indexing features and about the relationship between the bwiernik plugin called "auto-index" and Zotero's built-in indexing.
I recently realized that in the "Search" tab of my settings, around half of my items appear as not indexed. I searched the Zotero site and found a) the plugin mentioned above, but also b) a forum post, https://forums.zotero.org/discussion/comment/239221#Comment_239221, where a user is advised to uninstall that plugin to make indexing work.
Otherwise, there is no documentation of what this plugin is needed for. So I am confused as to:
a) why are my items only partially indexed?
b) would the plugin solve the issue, or rather the opposite?
Thanks for any help.
  • edited February 10, 2023
    I should also add that, confusingly, the Zotero support page https://www.zotero.org/support/searching says that indexing is automatic, while the plugin page https://www.zotero.org/support/plugins describes auto-index as "A Zotero extension which keeps the full-text index updated." But if indexing already works in the background, why would anyone need the extension? At a minimum, I suggest that if the extension merely duplicates Zotero's functionality, it be removed from the official plugin page. Or, if it adds functionality, that both the search support page and the plugin page clarify what the plugin does that Zotero does not already do.
    Also confusing: the search support page mentioned above says that items can only be indexed if they contain searchable text, which obviously makes sense. But the settings do not make clear whether the "unindexed" items are, or include, those that cannot be indexed because they simply do not contain any searchable text.
    In my specific case, I thought I might figure this out from the counts, but they do not add up: my library contains 11000ish items, of which the settings say 3700ish are indexed, 4600 unindexed and 234 partially indexed. That means around 2000 are missing from the count altogether. Is the answer that the 2000 missing are the ones that do not contain searchable text, that 3700 could be indexed and that for 4600 the indexing failed? Or is it that 3700 could be indexed, 4600 do not contain searchable text and 2000ish failed? Or something else?
  • edited February 10, 2023
    I have now tried to rebuild the index by selecting "index unindexed items". This worked well until it stopped at roughly 7000 indexed, 1000 unindexed and 400 partially indexed. So the total has not changed (which leads me to assume that the 2000ish missing from the count are the ones that do not contain searchable text?). But I receive no alert as to why the remaining 1000 unindexed items cannot be indexed. Repeatedly clicking "index unindexed items" changes nothing, and closing and restarting Zotero does not help either.
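
    To make the arithmetic behind these counts explicit, here is a minimal Python sketch that tallies the two snapshots quoted above. The figures are the rounded numbers from the posts (the library size of 11000 is itself approximate), and the names `LIBRARY_ITEMS`, `snapshots` and `summarize` are made up for illustration. It says nothing about how Zotero itself computes these statistics.

```python
# Rounded counts as quoted in the posts above; all values are approximate.
LIBRARY_ITEMS = 11_000

snapshots = {
    "before": {"indexed": 3_700, "unindexed": 4_600, "partial": 234},
    "after": {"indexed": 7_000, "unindexed": 1_000, "partial": 400},
}


def summarize(counts, library_items=LIBRARY_ITEMS):
    """Return how many items the statistics account for and how many they omit."""
    counted = sum(counts.values())
    return {"counted": counted, "uncounted": library_items - counted}


summaries = {name: summarize(counts) for name, counts in snapshots.items()}

for name, summary in summaries.items():
    print(name, summary)
```

    With these rounded inputs the statistics account for 8534 items before and 8400 after, so the totals are only roughly equal, and the uncounted remainder comes out nearer 2500 than 2000. Whether that remainder corresponds to items without searchable text is exactly the open question.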

    Another issue. It is not clear to me:
    a) Why are the max characters per item and the max pages set so low? Is there a reason for this? 100 pages is clearly less than most books, so for people who keep books as PDFs this does not make sense. Given that most people will never see these settings, this seems problematic.
    b) The defaults for max characters per item (500 000, i.e. roughly 200 pages) and max pages (100) seem very different. Why is this? And which one takes precedence? I.e., if a book has, say, 450 000 characters on 160 pages, will indexing stop at 100 pages? Or will the book not be indexed at all? Or is this the reason why items appear as "partially indexed"?
    Given the above, would it not make sense, first, to explain in more detail how this works and, second and more importantly, to set the max values higher and so that they roughly match each other, say max characters at 1 million and max pages at 400? Or is there a reason for these low numbers?
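
    To illustrate question b), here is a rough Python sketch of how two such limits could interact. It rests on assumptions that are NOT confirmed anywhere in this thread: that both caps apply to the same file, that indexing stops at whichever cap is reached first, and that text is spread evenly across pages. The names `Limits`, `binding_limit`, `DEFAULTS` and `PROPOSED` are hypothetical and are not part of Zotero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    max_chars: int
    max_pages: int


# Values taken from the post above: current defaults and the suggested ones.
DEFAULTS = Limits(max_chars=500_000, max_pages=100)
PROPOSED = Limits(max_chars=1_000_000, max_pages=400)


def binding_limit(total_chars, total_pages, limits):
    """Return "none", "pages" or "chars": the cap that would be hit first.

    ASSUMPTION: even text density and both caps applied independently.
    This is an illustration, not Zotero's documented behavior.
    """
    if total_chars <= limits.max_chars and total_pages <= limits.max_pages:
        return "none"  # the whole file fits under both caps
    chars_per_page = total_chars / total_pages
    pages_until_char_cap = limits.max_chars / chars_per_page
    return "pages" if limits.max_pages < pages_until_char_cap else "chars"


# The example book from the post: 450 000 characters on 160 pages.
book_chars, book_pages = 450_000, 160
print(binding_limit(book_chars, book_pages, DEFAULTS))
print(binding_limit(book_chars, book_pages, PROPOSED))
print(book_chars / book_pages * DEFAULTS.max_pages)  # characters in the first 100 pages
```

    Under these assumptions the example book would hit the page cap first, after about 281 000 characters, well below the 500 000 character cap, and would fit entirely under the suggested values. Which rule Zotero actually applies is what the post is asking.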

    Finally, it is unclear what happens if I increase the max characters/pages numbers. I thought that if I increased them, Zotero might index the unindexed items, in case they had been skipped for being too long. But nothing seems to happen. Maybe it's a bug, or maybe this is how it is intended, and Zotero simply indexes the rest of the text of already indexed items in the background? Or do I need to rebuild the entire index from scratch to have the rest of the already indexed items indexed?

    Again, some help/explanation would be good. And apologies for the long posts, but I thought others might be similarly confused.
  • As I have not received any comment, I would like to bump this.
  • I have the same questions. Have you found a solution to the problem in the meantime? If so, would you describe this solution here?
  • No, I have not found a solution, nor have I ever found an explanation. It is mystifying, in particular because this covers a whole page of settings, which are user-configurable. Any help from the devs would really be appreciated.
    I should also add that my library index has now moved to 57 partial and 1100 not indexed, and I have no idea why the number of partially indexed items has gone down.