Garbled text when exporting PDF annotations (highlighted text) to notes

edited April 3, 2022
Hello,

Congratulations to the developers on the new major version; it brings great improvements and new features!

Just wanted to report a small issue I encountered while testing the new version. With some PDFs, exporting a highlighted passage results in incorrect text (seemingly due to issues with spaces), while copy-pasting the same passage from the PDF reader in Firefox (and also in Evince) works.

This happened to me with a PDF from the following article (paywall, but I could share a copy privately to help with debugging, if needed):
https://muse.jhu.edu/article/270553

For instance, highlighting and exporting the first sentence results in:

ThomasEdison isahouseholdnam e,thesubjec tofcou ntless press articles and biographies,and theobjectofadulation asagreat American inventor, holder of1,093 patents.

Instead of:

Thomas Edison is a household name, the subject of countless press articles and biographies, and the object of adulation as a great American inventor, holder of 1,093 patents.
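
To check that the two versions differ only in their spaces, here is a minimal Python sketch. The helper name `differs_only_in_whitespace` is hypothetical (it is not part of Zotero), and the two strings are copied from the example above. It only compares the strings and says nothing about why the spaces go missing.

```python
def differs_only_in_whitespace(a: str, b: str) -> bool:
    """Return True if a and b are identical once all whitespace is removed."""
    def strip_ws(s: str) -> str:
        # str.split() with no arguments splits on any run of whitespace
        return "".join(s.split())
    return strip_ws(a) == strip_ws(b)


# Text exported from the highlight annotation (as reported above)
exported = (
    "ThomasEdison isahouseholdnam e,thesubjec tofcou ntless press articles "
    "and biographies,and theobjectofadulation asagreat American inventor, "
    "holder of1,093 patents."
)

# Text obtained by copy-pasting from another PDF reader
expected = (
    "Thomas Edison is a household name, the subject of countless press "
    "articles and biographies, and the object of adulation as a great "
    "American inventor, holder of 1,093 patents."
)

print(differs_only_in_whitespace(exported, expected))
```

If this prints `True`, the characters themselves are extracted correctly and only the word boundaries are wrong.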

If you need anything more from my side, I'd be happy to help make the PDF annotation workflow even better.

(By the way, it seems from the documentation that reporting bugs on the forums is preferred, but I would be happy to create an issue on GitHub.)
  • You say that opening that PDF file in Firefox allows you to copy/paste correct text. Whether I use the open-in-Firefox reader or Mac Preview, I get the same flawed text that you report. When I use PDF Pen the situation is somewhat improved but still far from perfect. I find this problem with several Muse-listed journals. I fear that this is a problem with the publishers' creation of the PDF file. If someone can suggest a fix, I too will be very pleased. For articles that only have abstracts in the PDF file and not in HTML on the web page, I use dictation to bring the abstract into Zotero.
  • I can reproduce that. Preview.app for me correctly copies the text, while Zotero PDF reader doesn't. It should be fixable. Thanks for reporting.
  • edited November 17, 2022
    I am having the same problem with another Muse journal (DOI: 10.1353/tech.0.0396).

    Zotero PDF reader gives me:

    > Soin1956“sourcematerials”includeduranium ore,which inturn seemednuclearenoughtotrump theincreasingly vocal opposition ofpostcolonialnationstotheapartheidsta

    Adobe Acrobat gives me:

    > So in 1956 “source materials” included uranium ore, which in turn seemed nuclear enough to trump the increasingly vocal opposition of postcolonial nations to the apartheid state.

  • Hello, I also experience this issue when highlighting in the Zotero app in this paper: https://pubs.acs.org/doi/10.1021/acs.jctc.3c00372. Looking forward to a resolution!