disable PDF indexing?

asplundj · March 13, 2008

How do I disable PDF indexing?

It's annoying how many results the quick search yields when PDFs are indexed.

zeigerpuppy · August 18, 2008

Any solutions to this as yet?
I am finding that PDF indexing slows down Firefox too much and interrupts my workflow.

I guess deleting pdftotext and pdfinfo would work.... but there should be something a little more elegant.

dstillman · August 18, 2008

Set the max char and page values to 0 in the Search pane of the Zotero prefs and clear the index.

timtak · March 19, 2012

> Set the max char and page values to 0 in the Search
> pane of the Zotero prefs and clear the index.

Thank you for that information.

I wonder whether setting the max pages to 1 would still get the metadata.

Generally, so far at least, it is the metadata that I would like and if that is stored in the first page, then that would help.

Also (sorry if this is "hijacking"), like zeigerpuppy I am finding that indexing interrupts my workflow, so ideally I would like it done offline, when I am not using Zotero, overnight or over lunch.

So, if I set the max pages to a small number (zero or 1) and then reset it back to 100 before I go to lunch, would
a) Zotero know to re-index the files that it had not fully indexed,
or
b) would I have to click the green "re-index" arrows on all the files that had not been indexed before lunch? (I could move them to a special collection to facilitate this, I guess, but as noted elsewhere Zotero does not respond during indexing, so that would not allow me to have indexing done offline, since I would have to wait for each index to complete before I could press the next re-index button.)

Ideally there would be:

1) a feature (like the manual sync button) to only index on command, when one presses an "index now" button.

2) settings to allow automatic indexing for metadata and manual indexing for the full text of the pdf since I guess it is the latter that slows workflow, and metadata is the only thing I want indexed up-front.

Tim
Zoterowing full time recently

adamsmith · March 19, 2012

Indexing will be moved to the background during idle times in future Zotero versions (I don't have an exact timeline, but Dan has indicated it's planned).

Currently, Zotero will _not_ re-index partially indexed files, so setting the number of pages to 1 like you describe won't work - you'd have to manually re-index every one of them later.

timtak · March 19, 2012

Thank you for your response.

I did some experimentation.

Setting "maximum pages" to 1 allowed me to retrieve metadata, but the 1-page "partial" index still took about 25 seconds.

Fortunately, if I set max pages to 0 then indexing is not carried out, so there is no wait at all after the PDF downloads. However, I can still "Retrieve Metadata from PDF." It seems to take little more time to get the metadata than it does if a complete or partial index has been carried out.

So, in order to achieve "offline indexing" in the absence of background indexing, for the time being I think I can:

1) Set max pages to 0
2) Enjoy an even quicker Zotero experience, only retrieving metadata when one needs it
3) Delete the index
4) Reindex when need be, overnight

I am not sure if (3) is essential.

In any event, the max pages to 0 solution works for me.

Also, I think that my background Google Desktop Search - which I still use even though it has been discontinued - will index my PDFs and allow searching.