Feature Suggestion: Indexing status in Attachments column

This is a minor feature request: something that could be useful, but isn't urgent.
Currently, the Attachments column in Zotero can be in one of three states: blank (no attachment), solid blue dot (attachment present), unfilled blue dot (attachment missing). There is another piece of attachment-related information that could be conveyed by this column, with no additional space needed: the attachment's indexing status.

My suggestion is to use the colour of the dot to indicate the status: for example, red for "not indexed," orange for "partially indexed," and the current blue (or green) for "fully indexed." For the attachment itself, shown when the item is expanded, this is straightforward. For the parent item record, when there is more than one attachment, some logic is needed to summarize the indexing status: for example, red if any attachment is unindexed, orange if none is unindexed but at least one is only partially indexed, and blue if all attachments are fully indexed.

The presence or absence of the attachment could still be indicated by the filled/unfilled state of the dot.
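
To make the summarizing rule concrete, here is a minimal sketch in Python. It is only an illustration of the logic proposed above, not anything Zotero provides: the names `IndexStatus`, `summarize`, and `dot_for_item` are hypothetical, and the colours are just the ones suggested in this post.

```python
from enum import Enum


class IndexStatus(Enum):
    # Hypothetical status values; each maps to the dot colour suggested above.
    UNINDEXED = "red"
    PARTIAL = "orange"
    INDEXED = "blue"


def summarize(statuses):
    """Collapse the statuses of an item's attachments into one status.

    Returns None when there are no attachments (blank column).
    """
    statuses = list(statuses)
    if not statuses:
        return None
    # Any unindexed attachment makes the whole item red.
    if any(s is IndexStatus.UNINDEXED for s in statuses):
        return IndexStatus.UNINDEXED
    # Blue only when every attachment is fully indexed.
    if all(s is IndexStatus.INDEXED for s in statuses):
        return IndexStatus.INDEXED
    # Otherwise nothing is unindexed, but something is only partial.
    return IndexStatus.PARTIAL


def dot_for_item(statuses, file_present=True):
    """Return (colour, filled) for the column, or None for a blank cell.

    The filled/unfilled state still reports whether the file is present.
    """
    summary = summarize(statuses)
    if summary is None:
        return None
    return (summary.value, file_present)
```

For instance, `dot_for_item([IndexStatus.INDEXED, IndexStatus.PARTIAL])` would give an orange, filled dot, while a single unindexed attachment would turn the parent item's dot red regardless of the others.
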
  • I doubt we'd expose indexing status at this level of prominence. It just shouldn't really be relevant, and there's no need to distract people with the extra information. The PDF tools have been bundled with Zotero for years, so unless there's some system restriction that's preventing proper indexing, files that can be indexed should be indexed. (There's an issue, currently, where an item added online never gets indexed unless you trigger indexing manually after downloading it, but that's just something we need to address. It isn't something users should have to worry about.)

    Providing better ways to find unindexed files via an advanced/saved search (e.g., so that people could run OCR on a file) might be valuable, though.
  • I placed a comment on an earlier request that was the same as this one. I used the workaround for finding unindexed items suggested by Adam Smith in "Finding unindexed items" in April 2009.

    This method did not work for me. If I used a period, I got too many results. If I used 'a', I got results that included both indexed and unindexed items.

    For the ones that were partially indexed, when I searched for them and redid the index, they became fully indexed. This contrasts with dstillman's suggestion that there is some 'system restriction' on the file. That could be true, but whether it is a 'system' problem or a Zotero limitation could only be determined through testing.

    Out of 1100 entries in Zotero, 99 remain unindexed after removing and rebuilding my index. That's almost a tenth of the files. Granted, some of those will never be indexed for various reasons.

    My takeaway is that some percentage of PDFs are not being indexed, but once you find them individually they can be indexed.

    It would be great to be able to find them and index them from within the Advanced Search window.
  • (FWIW, the period method should really work and does for me -- follow up in the other thread with details)
  • I agree with adamsmith. Using the 'a' seemed to work for me, with some exceptions. For example, while searching for and re-indexing the partials, I found other partials that were not on the list.

    There were other exceptions, but I have to conclude that it is not worth going into great detail, because ultimately the problems came down to user error and a lack of communication from Zotero about what exactly it is doing when indexing; i.e., there is no feedback that indexing is even running, unless I've totally missed how to tell in the documentation.

    Anyway, the user error part was running indexing with a page limit of 10. At one time I set it to 100, but somewhere in all of the testing it got reset to 10. That explains the majority of the earlier partials (I think), and I now have to agree with dstillman (somewhat) that giving access to the indexing status is less of a concern, though I still would like to have that ability, and it is not a major change to the UI to do so (from a user POV).

    The main takeaway is that, because of the lack of feedback from Zotero, the user has no way of knowing whether indexing takes 5 minutes or 5 hours. I rebuilt my index (with max characters per file set to 500000 and max pages set to 100) and let it run all night. The next morning my numbers were much better: 745 indexed, 92 partial, and 99 unindexed, with 177K words indexed. These numbers seem reasonable to me.

    Most apps today will display some sort of 'processing icon' when taking more than a few seconds to complete a user command. A quick workaround would be to simply update the documentation to tell the user that when indexing is done the new indexing numbers will appear on the Search Preference screen, and that until then indexing is still running. In this case, because I was focused on indexing problems, I simply assumed that indexing wasn't running because there was no feedback saying otherwise.

    My apologies if this scenario could have been avoided by reading the doc more carefully.