PDF annotation highlight text drops (some) spaces

Zotero 6.0.30 on Windows 10 (Home)

When I select text in the Zotero PDF viewer to highlight and annotate it, the quoted text drops spaces in some PDFs (this is the first time I have noticed it happening this often in a single document).

In Adobe Reader, selecting the same region produces text that contains the spaces.

Zotero: Machinesare definite: anythingwhichwas indefinite or infinite we shouldnot countas a machin

Adobe: Machines are definite: anything which was indefinite or infinite we should not count as a machine

Perhaps this is already resolved in Zotero 7, but I note it for future reference.

Another example:

  • Can you provide the URL from where you downloaded the file?
  • edited March 5, 2024
    I have tested this file and the last one you mentioned in Zotero 7:
    Lucas, J. R. (1961). Minds, Machines and Gödel. Philosophy, 36(137), 112–127.
    https://doi.org/10.1017/S0031819100057983
    https://www.jstor.org/stable/3749270

    The problems I can see are mostly due to bad OCR. Making a new OCR on the file fixed the problems I have tested.
  • Thanks for looking into it. You don't explicitly acknowledge the specific issue ("the problems I can see..."), but I assume you do include this particular issue (though "mostly" adds some perplexity).

    > Making a new OCR on the file fixed the problems I have tested.

    What OCR do you recommend for Windows 10, and how do I replace the OCR'd text in an existing PDF?

    More significantly, whilst the cause might be clear, this is not a problem that at least one other PDF reader has, so it could be addressed in the PDF handler, couldn't it? That would help everyone who might have "OCR" issues.
  • If you provide specific examples of problematic text, in specific PDF files, others can have a look at the behaviour in Zotero 7.
    You do not mention any specific problem in the second example you give.

    For the specific problem you mention in the first post (guessing that it is the correct paper), the paper has two different versions published by CUP or JSTOR.

    * The CUP version is fine in both Adobe Acrobat Pro and Zotero 7. So if that is the version you are using, it is already fixed in Zotero 7.


    * For the JSTOR version, I cannot observe the problem you mention about the missing spaces. So if that is the version you are using, it is already fixed in Zotero 7.
    It does still have other problems:
    - Zotero 7:

    - Adobe Acrobat Pro (I have copied the text produced by the selection in the comment):


    After OCR (I have used Adobe Acrobat Pro, but there are probably other ways to do it):
    - Zotero 7:

    - Adobe Acrobat Pro (I have copied the text produced by the selection in the comment):


    I wrote "mostly" because the PDF reader in Zotero could still have problems not observed in other readers. If you find such problems, please give precise examples. The only problems I could see are also observed in other PDF readers.
    So you can use Zotero 7 if you need to fix the problems you see in Zotero 6.
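
The suggestion earlier in the thread, that a PDF handler could compensate for a text layer with missing space characters, can be illustrated with a gap-based heuristic: insert a space wherever the horizontal gap between two glyphs is large relative to the glyph width. This is only a rough sketch of that general idea, not the code used by Zotero or by any other reader. The `Glyph` type, the `layout` helper, and the `gap_ratio` threshold are all hypothetical and chosen just for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Glyph:
    char: str
    x: float      # left edge of the glyph on the line
    width: float  # horizontal width of the glyph


def join_with_inferred_spaces(glyphs: List[Glyph], gap_ratio: float = 0.25) -> str:
    """Join glyphs into text, adding a space where the gap between two
    neighbouring glyphs exceeds gap_ratio times the average glyph width.
    gap_ratio is an illustrative threshold, not a value from any real reader."""
    if not glyphs:
        return ""
    avg_width = sum(g.width for g in glyphs) / len(glyphs)
    out = [glyphs[0].char]
    for prev, cur in zip(glyphs, glyphs[1:]):
        gap = cur.x - (prev.x + prev.width)
        # Only infer a space when the text layer did not already store one.
        if gap > gap_ratio * avg_width and " " not in (prev.char, cur.char):
            out.append(" ")
        out.append(cur.char)
    return "".join(out)


def layout(words: List[str], glyph_width: float = 5.0, word_gap: float = 3.0) -> List[Glyph]:
    """Hypothetical helper: position glyphs on one line the way a text layer
    without explicit space characters might, with a visual gap between words."""
    glyphs, x = [], 0.0
    for word in words:
        for ch in word:
            glyphs.append(Glyph(ch, x, glyph_width))
            x += glyph_width
        x += word_gap
    return glyphs


glyphs = layout(["should", "not", "count"])
print("".join(g.char for g in glyphs))    # text layer as stored: shouldnotcount
print(join_with_inferred_spaces(glyphs))  # with inferred spaces: should not count
```

A fixed threshold like this is fragile with tight kerning or justified text, which may be part of why different readers give different results on the same badly OCR'd file.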