I get this message most of the times I try to 'retrieve metadata', and in most cases these are PDFs with real text, not images (i.e., I can search for and find text when I open them in a PDF reader). Also, these same PDFs *are* getting indexed, and I can find text within them using Zotero.

Is this a problem or limitation with pdftotext? As an aside, I also note that adding PDFs is the single slowest operation in my use of Zotero, taking minutes. Perhaps it's worth considering a different method of extracting PDF text.
  • I'm having this exact problem as well.

    I have many PDF books that were created directly from a software application (these PDFs were not scanned and then OCRed). Zotero can index these PDFs, but when I try to "retrieve metadata", a message pops up saying "PDF does not contain OCRed text".

    Without the metadata, I cannot select the function "Rename File from Parent Metadata". Is there a solution to this?
  • Is there a solution for this?
  • You'd have to provide a link to an example PDF for which you're seeing this for us to tell you more.
  • Just picking up on the previous conversation, since I would also like to resolve this issue. Below is a link to an example PDF that gives me the pop-up 'PDF does not contain OCRed text'.

  • That PDF indeed does not contain OCRed text. The easiest test is to check whether you can select and copy text in your PDF reader with the text selection tool. If not, there is no way Zotero will be able to read it (and even if you run OCR on such a file yourself, it'd be unlikely for Zotero to find metadata for the PDF).
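
For anyone who wants to run the select-and-copy test on many files at once, here is a rough Python sketch of the same idea. It assumes the `pdftotext` command-line tool is installed and on your PATH. The function names and the 50-character threshold are made up for illustration, and this only approximates the check; it is not how Zotero itself decides.

```python
import subprocess
import sys


def extract_text(pdf_path, max_pages=3):
    """Run the pdftotext CLI on the first few pages and return what it prints.

    Assumes pdftotext is installed; "-l" sets the last page to convert and
    the trailing "-" sends the extracted text to stdout.
    """
    result = subprocess.run(
        ["pdftotext", "-l", str(max_pages), pdf_path, "-"],
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=True,
    )
    return result.stdout


def has_text_layer(text, min_chars=50):
    """Heuristic: enough non-whitespace characters means a text layer exists.

    The threshold is an arbitrary assumption, not a value taken from Zotero.
    """
    visible = sum(1 for ch in text if not ch.isspace())
    return visible >= min_chars


if __name__ == "__main__":
    # Usage: python check_pdf_text.py file1.pdf file2.pdf ...
    for path in sys.argv[1:]:
        status = "has text" if has_text_layer(extract_text(path)) else "no text found"
        print(f"{path}: {status}")
```

A file reported as "no text found" is most likely a scanned image. Note that a file that does have text can still fail metadata retrieval for other reasons.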
