Zotero 7 Beta: Annotated text is missing spaces between words when the page is slightly skewed

edited August 31, 2023
When I annotate text in a slightly skewed, OCRed PDF in the PDF reader, the annotated text has no spaces between words, both when viewed in the left-hand sidebar and when copied from it. In addition, the highlight itself on the page has visible gaps between words, although normally no such gaps appear when highlighting text. Other PDF readers, such as Evince on Ubuntu, annotate slightly skewed text correctly and keep proper spacing between words when the text is copied.

I've sent an example to support@zotero.org in which the first page is slightly skewed, so annotations behave as described above, while on the second page everything works fine.

Happens on Zotero 7.0.0-beta.38+b79e0b3d7 (64-bit) on Ubuntu 23.04.
  • I think I encounter this problem as well when working with PDFs that are slightly skewed. In my case, however, instead of dropping the spaces between words, Zotero inserts a space between each letter of a word. It also displays the handles for modifying the annotation at the top of the first word and the bottom of the last word instead of at the beginning and end of the selection. The problem does not occur in Firefox's PDF viewer with the same PDF.





  • Could you send an example PDF file to support@zotero.org with a link to this thread?
  • I just did, hope it helps!
  • I run into this on occasion as well, even with non-skewed texts. The odd spacing appears to be an artifact of the OCR, as I sometimes get it with the ABBYY FineReader OCR program. It usually goes away after redoing the OCR.
  • In this case, though, other readers (Firefox, Okular) are able to copy the text without the additional spaces (this is not to say that the OCR quality couldn't be better).
  • Yes, that is interesting, because I see the same spacing problem in Acrobat as well for those files. So there must be some difference in the way PDF readers treat the text layer.
  • I still encounter the problem of each letter being separated when annotating slightly skewed, OCRed texts. I've noticed, however, that this mostly affects text that skews downwards (from left to right). Another example:

  • A relatively small skew is already enough for the problem to occur:


    Other PDF viewers like Okular handle this more gracefully and let me copy the normal OCRed text. Firefox's pdf.js copies the text with every word on a new line, which is still easier to correct than Zotero's handling (every single letter separated by a space).
  • I still see this quite frequently in 7.0.12beta. It happens with almost any scanned PDF that shows the slightest downward skew. It's a bit of a hassle to fix each annotation manually.



