Index the end of long PDF files
Hello Zotero devs,
I'd like to recommend a change to the default behaviour of Zotero's PDF indexing.
For long files (those exceeding the default 50k-word/100-page indexing limits), the program should also index the final (e.g.) 10k words or 20 pages, so that the end of the file is covered. I think this would mean that, by default, the program captures the index of books stored in the library, which is arguably the most important thing to index in a textbook.
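To make the proposal concrete, here is a rough sketch of the selection logic, not Zotero's actual implementation. The function name `select_words_for_indexing` and its parameters are made up for illustration, and it assumes the tail is indexed in addition to the first 50k words rather than counted against that limit.

```python
def select_words_for_indexing(words, head_limit=50_000, tail_limit=10_000):
    """Pick the words to index: the start of the document plus its end.

    Hypothetical helper for illustration only; the limits mirror the
    numbers suggested in this post.
    """
    # Short documents fit within the head limit and are indexed in full.
    if len(words) <= head_limit:
        return list(words)

    # Start the tail no earlier than the end of the head, so that no word
    # is selected twice when the document is only slightly over the limit.
    tail_start = max(head_limit, len(words) - tail_limit)
    return list(words[:head_limit]) + list(words[tail_start:])
```

With these defaults, a 200k-word book would contribute its first 50k words and its last 10k words, and everything in between would be skipped.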
Regards,
sethmg
I'm not sure that's desirable?
This is possibly true, but in my experience documents of that length more often end with an index than with a bibliography.
@sethmg - I'm not sure your experience is generalizable. In the non-English-speaking world indexes are a lot less common, for example: I have a bunch of long PDFs in Spanish and German that end with a bibliography. Also, most documents by international organizations, many of them longer than 100 pages, don't have indexes.
@adamsmith: Sounds reasonable, I can certainly see that. With the current method the table of contents is captured, which is likely enough.
Thanks!
But, re. "the non-English speaking world": 'Bibliography' and 'Index' are Latin words, meaning the same in German, Spanish, Russian, and adopted by the rest of the world. One means 'Books' (etc.), while the other means, 'Let your index finger do the walking in a printed topical search database.' (And yes, you're correct, many documents, in any language, don't contain an index. Most, in any language, do contain either citations, a bibliography, or both.)
American Heritage Dictionary says, of Index: ETYMOLOGY:
Middle English, forefinger, from Latin; see deik- in Indo-European roots
So it probably dates to ancient Sanskrit database engineers, who didn't even have palmtops, so how could they know anything? Good luck.