Zotero (macOS) not actually indexing PDF contents

I am a recent convert to Zotero (from EndNote). After importing my library and making sure every imported PDF has been OCR'd, I have found that Zotero is not indexing the content of several PDF files. Each file shows as indexed when I check the status of its entry, but when I search for specific phrases or unique terms, the results do not include the file I took the phrase or term from.

Is there something special I have to do to ensure my files are properly indexed so I can use the search function effectively?
  • (If this only affects some individual PDFs, are you certain you can search them in another program? OCRed PDFs sometimes become corrupted when you modify them in certain programs, for example by highlighting in OS X's Preview app. When that happens they still appear to have text (you can select words, for example), but the character mapping is basically randomized, so it no longer corresponds to the actual text. Again, this may be unrelated to your situation, but if it's just a few PDFs while others work, I'd check on that.)
  • I have very few PDFs that I have personally OCR'd. The ones that I typically have to OCR are pre-2000 PDF files. 95% of the PDFs I add to my library are from databases such as IEEE Xplore, ProQuest, Emerald, and Sage and are already searchable when I obtain them. For example, I downloaded the proceedings for a conference from last year, which have text that is selectable, copyable, and fully searchable. When I added this 8-page document to my library, the word count in my index only increased by 300 words despite the document containing a couple of thousand words. Additionally, I can select a sentence or unique term directly from the PDF and search for it in Zotero, and the result I want will not appear.

    As an example, I can search for SCADA looking for a document that I know has SCADA in it, but the document does not show up in the results. When I search the document manually, the term shows up fine, and other tools like Mendeley and EndNote find it as well.
  • How are you searching? If you’re using the search bar, are you in “Everything” mode?
  • Yes, I am using the search bar in “Everything” mode. For example, I just searched for "targeting", a word taken from a PDF that is itself searchable and didn't require OCR. The file I took the text from did not show up in the results, but it shows as "indexed".
  • Can you provide a Debug ID for 1) clicking Reindex in the right-hand pane for the PDF and 2) searching?