Some OCR PDF text not recognised in Zotero

Hello,

I have a previously scanned and OCR'd PDF in which some sections of the text cannot be selected for annotation in Zotero, although the same text is selectable in Acrobat. I have the (up-to-date) OCR plugin for Zotero and have run it on this PDF, but the same sections remain unselectable. If I highlight text in these problematic areas in Acrobat and then import the PDF into Zotero, the highlights appear as annotations, but the annotation content reads "No extracted text". Highlights in the non-problematic areas of the same PDF import correctly.

This seems to be a consistent problem with this particular PDF, and resaving it does not fix things. Has anyone else come across a problem like this?

Many thanks for your help!
Edward
  • If you email the PDF to support@zotero.org with a link to this thread, we can take a look.
  • I have a similar problem with a PDF I OCRed using PDF-Xchange. It's only short, so I can manually copy the annotations, but I wonder if there's a setting or something I should be aware of to avoid this in future?
  • If text is not aligned to the horizontal or vertical axis, it can't be selected or highlighted in the Zotero PDF reader. OCR software normally takes care of this.
  • @martynas_b I ran the file through Acrobat's OCR as well as through the Zotero OCR add-on, and both left the file with the same unselectable text areas when viewed in Zotero.
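One way to test the alignment explanation given above is to look at the writing direction of each text line in the PDF. Some PDF libraries report this as a (cos, sin) vector per line; PyMuPDF's `page.get_text("dict")`, for example, exposes it as `dir`. The sketch below only does the geometry on such vectors. The function names and the 0.5 degree tolerance are assumptions made for illustration; the threshold Zotero actually applies is not stated in this thread.

```python
import math

def rotation_degrees(direction):
    """Angle of a text line's writing direction in degrees, in [0, 360)."""
    cos_a, sin_a = direction
    return math.degrees(math.atan2(sin_a, cos_a)) % 360.0

def is_axis_aligned(direction, tol_degrees=0.5):
    """True if the direction lies within tol_degrees of 0, 90, 180 or 270.

    tol_degrees is an illustrative guess, not a value taken from Zotero.
    """
    offset = rotation_degrees(direction) % 90.0
    return min(offset, 90.0 - offset) <= tol_degrees

def skewed_lines(lines, tol_degrees=0.5):
    """Return the text of every (text, direction) pair that is not axis-aligned."""
    return [text for text, direction in lines
            if not is_axis_aligned(direction, tol_degrees)]
```

If the lines that cannot be selected in Zotero are the same ones this flags as skewed, deskewing the scan before running OCR is a reasonable next thing to try.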