Indexing stats

Is there a way to find out more details about what precisely has been and has not been or is being indexed?

I have a whole bunch of Zotero (private) group libraries. When I look in my preferences, I see my Index statistics. I understand that 'partial' means that the document is larger than the maximum number of pages to index per file, which I have just increased. As for 'Unindexed', I presume this may be related to the fact that I do not have the full text of many items. But when I browse through the storage folders, I sometimes see folders that only have a PDF and no .zotero-ft-cache. One such PDF is only 14 pages long and has no PDF restrictions. So is there a way to find out how many PDFs are in the libraries and how many have been indexed? Could it be that some PDFs are restricted or are image-only, and that this is why the text cannot be extracted? In that case, would it be possible to get a list of those 'problem' files in order to 'fix' them?

Finally, is there a way to visually see indexing progress somewhere?

Thanks much!

  • edited June 3, 2018
    'Unindexed' - I presume this may be related to the fact that I do not have the full text of many items.
    Unindexed means you have a PDF attachment that hasn't been indexed. A common reason for this is that it is a scan with no text layer, though indexing may fail for other reasons or may have been turned off at some time in the past.

    See https://forums.zotero.org/discussion/comment/255983/#Comment_255983 for a nice workaround to find unindexed PDFs.
  • Great! Thanks for that - I've asked an additional question there. So I see that this identifies the unindexed PDFs. I guess we'll now have to find a way to identify the categories of problems that led to those not having been indexed, and to find a solution for each. I'll report back if we find a way to do this.
  • A few more questions:


    • do the stats reflect both My library and Group libraries?

    • is there a way to see the breakdown (and even better: by library/sublibrary)?

    • is there a way to re-index only certain libraries and not everything (we're talking 40 gigs...)

    • do files like .zotero-ft-cache also get synced to all group members?

    Thanks!

  • do the stats reflect both My library and Group libraries?
    Yes
    is there a way to see the breakdown (and even better: by library/sublibrary)? ... is there a way to re-index only certain libraries and not everything (we're talking 40 gigs...)
    You can run a search that returns the unindexed PDFs in particular collections, save it as a saved search (smart folder), and then select all of the results and reindex them.
    do files like .zotero-ft-cache also get synced to all group members?
    No.
  • Very useful - thanks!
  • do files like .zotero-ft-cache also get synced to all group members?
    That file doesn't get synced directly, but full-text content does get synced if you have full-text content syncing enabled in the Sync prefs.
  • Thanks Dan. But so where is that information stored locally then on other members' computers?
  • When they have the full-text content syncing setting enabled on their computers.
  • But so where is that information stored locally then on other members' computers?
    It's written back to .zotero-ft-cache.
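
Since the extracted text ends up in a .zotero-ft-cache file next to each attachment, the manual check described in the first post (looking for storage folders that hold a PDF but no cache file) can be roughly scripted. The following is a minimal sketch, not a Zotero feature: the function name is made up for illustration, and it assumes the usual layout of one subfolder per attachment inside the storage directory. A missing cache file is only a hint that a PDF may be unindexed; the Index statistics in the preferences remain the authoritative numbers.

```python
from pathlib import Path

# Name of the per-attachment full-text cache file mentioned in this thread.
CACHE_NAME = ".zotero-ft-cache"


def split_pdfs_by_cache(storage_dir):
    """Split the PDFs under a Zotero storage directory into two lists.

    Hypothetical helper for illustration. It assumes each attachment
    lives in its own subfolder of `storage_dir`, and returns
    (pdfs_with_cache, pdfs_without_cache) as lists of Path objects.
    """
    with_cache, without_cache = [], []
    for folder in sorted(Path(storage_dir).iterdir()):
        if not folder.is_dir():
            continue  # ignore stray files at the top level
        pdfs = sorted(
            p for p in folder.iterdir()
            if p.is_file() and p.suffix.lower() == ".pdf"
        )
        if not pdfs:
            continue  # not a PDF attachment folder
        has_cache = (folder / CACHE_NAME).is_file()
        (with_cache if has_cache else without_cache).extend(pdfs)
    return with_cache, without_cache
```

Calling it on the storage folder inside your Zotero data directory and printing the second list gives a candidate list of 'problem' files to inspect, for example to check whether they are image-only scans.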