Searching PDFs

Is it possible to search in Zotero and include the text contained in PDFs in the search? If so, is there a fast way to convert all PDFs into searchable PDFs if they are already in Zotero (attached to each item as a child item)?
  • edited September 18, 2012
    1- Quick answer: yes, see http://www.zotero.org/support/searching#fulltext_pdf_indexing

    2- But Zotero does not OCR your PDFs. It has to be done with a third-party program.
  • So I can search the PDFs within Zotero only? Not sure what you mean by #2 above.
  • Error report ID 743631462 when I tried to index. Maybe I have too many files (235 unindexed and 3 indexed)?
  • Meaning what? There are no Zotero errors in there. Are you actually getting an error message?

    http://www.zotero.org/support/reporting_bugs#provide_steps_to_reproduce
  • edited September 18, 2012
    No, not even close. Several thousand files are no problem.

    Not sure what you mean by "search the pdfs within Zotero only". Zotero doesn't change the content of your PDFs. If a PDF contains a text layer (standard for downloads from journals, but not for your own scans), Zotero indexes it so you can search its content from within Zotero. Other applications, such as Spotlight on the Mac, index all files on your hard drive and let you search them. That includes files you store in Zotero.

    edit: (removed request to post error to new thread).
  • Sorry, I did not know about the detailed procedure. Yes, I got an error message. Will do next time.
  • Seemed to work this time after a restart. Ended up with 191 indexed, 18 partial and 29 not indexed. Is there a way to figure out which ones were not indexed so I can try a third-party text recognition program, or will that not make a difference? Thanks.
  • Third-party tools that perform OCR will definitely make a difference.
    IIRC there is currently no way to search for unindexed files, though you can see whether a file/attachment is indexed in the right-hand column when it's selected.
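One possible workaround for listing unindexed files from outside Zotero, as a rough sketch only. It assumes Zotero keeps each attachment in its own subfolder of the data directory's `storage` folder and writes the extracted text to a `.zotero-ft-cache` file beside the PDF. Both the folder layout and the cache file name are assumptions about Zotero's on-disk format, and `find_unindexed_pdfs` is a hypothetical helper, not part of Zotero.

```python
from pathlib import Path


def find_unindexed_pdfs(storage_dir):
    """Return PDFs whose attachment folder has no full-text cache file."""
    unindexed = []
    for pdf in sorted(Path(storage_dir).rglob("*.pdf")):
        # Assumption: Zotero stores the extracted text next to the attachment
        # in a file named ".zotero-ft-cache".
        cache = pdf.parent / ".zotero-ft-cache"
        if not cache.is_file():
            unindexed.append(pdf)
    return unindexed
```

Called with the path to the `storage` folder, it returns the PDFs that lack a cache file, which would be the candidates for OCR. It cannot tell partially indexed files from fully indexed ones, since it only checks whether the cache file exists.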
  • ah yes, I see the indexed "Yes" on the side. thanks
  • Does "Yes" include the partial ones, or does it say partial? I looked around for a bit but couldn't find a partial one and don't want to check them all.
  • Partially indexed files say partial. Those are most often long files that exceed the page limit set under Preferences → Search.
