Retain formatting when copying from Zotero PDF Reader

edited December 29, 2022
Is there an option that allows choosing whether to retain the formatting of copied text from Zotero PDF Reader?

I tested Zotero 6.0.18 on Fedora 36. There, the reader behaves as if such an option were set to FALSE: formatting is not retained.

Related posts:
* https://forums.zotero.org/discussion/98239/some-questions-about-the-pdf-viewer

EXAMPLE:
When I copy any text using CTRL+C and then paste it into a text editor with CTRL+V, I get ugly one-liners like the following snippet, without any formatting retained.

1) snippet from Programming in Scala as copied from Zotero PDF Reader, formatting not retained
```
for file <- filesHere if file.isFile if file.getName.endsWith(".scala") do println(file)
```

while the text in PDF is formatted like this

2) correctly retained formatting
```
for file <- filesHere
if file.isFile
if file.getName.endsWith(".scala")
do println(file)
```
  • There's no option. It just depends on the PDF. If you provide a link to the PDF, or email it to support@zotero.org with a link to this thread, we can take a look.
  • @dstillman

    It happens with all PDFs. Let's take a look at this paper:
    * https://onlinelibrary.wiley.com/doi/10.1111/j.1475-3995.2009.00701.x

    A copy of the PDF uploaded from my computer (it should be identical to the one from the previous link):
    * https://drive.google.com/file/d/1nbAWhevhBEVzK287PEsa0NKDXuOi-DiU/view?usp=share_link

    For example, copying the first paragraph of "1. Introduction" with CTRL+C inside Zotero PDF Reader and pasting it with CTRL+V into the text editor produces

    1) incorrectly retained formatting

    ```
    Two-dimensional cutting and packing problems (C&P) are highly relevant in production and logistics. 2D cutting problems are found in customizing material in the glass, steel, wood and paper industries. 2D packing problems arise, for example, where goods have to be packed on pallets in horizontal layers. And the space-saving arrangement of adverts on the pages of newspapers, or the effective positioning of components on chips when designing integrated circuits, lead to 2D packing problems.
    ```

    2) However, it should contain line breaks. When I open the same PDF in Okular and use CTRL+C, the line breaks are correctly retained

    ```
    Two-dimensional cutting and packing problems (C&P) are highly relevant in production and
    logistics. 2D cutting problems are found in customizing material in the glass, steel, wood and
    paper industries. 2D packing problems arise, for example, where goods have to be packed on
    pallets in horizontal layers. And the space-saving arrangement of adverts on the pages of
    newspapers, or the effective positioning of components on chips when designing integrated
    circuits, lead to 2D packing problems.
    ```
  • That's by design. The line breaks being hard-coded is an artifact of the PDF format, but that doesn't mean they're desirable, any more than you would want line breaks when copying a paragraph from this webpage.
  • And to be clear, if Okular includes the line breaks, that's just because it's unsophisticated, not because it's correct. Most PDF readers will strip line breaks, because that's what most people want to happen.
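For illustration, the line-break stripping described above can be sketched in a few lines of Python. This is only a minimal sketch of the general idea, not Zotero's actual implementation: the function name `unwrap_paragraphs` is hypothetical, and it assumes that blank lines separate paragraphs and that words hyphenated across lines need no special handling.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines into one line per paragraph.

    Assumption for this sketch: a blank line marks a paragraph boundary,
    and every other line break is a layout artifact.
    """
    # Split on blank lines (possibly containing stray whitespace).
    paragraphs = re.split(r"\n\s*\n", text.strip())
    unwrapped = []
    for paragraph in paragraphs:
        # Drop the hard line breaks and rejoin the pieces with single spaces.
        lines = [line.strip() for line in paragraph.splitlines() if line.strip()]
        unwrapped.append(" ".join(lines))
    # Keep one blank line between paragraphs.
    return "\n\n".join(unwrapped)
```

Applied to the Okular output quoted earlier in the thread, this would produce the single-line paragraph that Zotero PDF Reader copies. It would also flatten a code snippet, which is exactly the problem being reported.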
  • @dstillman

    Thanks for your reply. That was what I was asking – is there an option inside Zotero PDF Reader to change this behavior? Now I see that there is not.

    So it will remain a huge pain for everybody who wants to copy anything from Zotero PDF Reader with its formatting retained, say a snippet of code. The workaround is to open another PDF reader that can do that.
  • I mean, I asked for an example for a reason. Stripping line breaks within a paragraph is pretty obviously "correct" for the vast majority of people, which is why most PDF readers do it (and, as far as I know, generally don't have options to change the behavior — I'm actually not aware of any that do). For something like a snippet of code, it's not desirable, so we'd want to look at an example and see if it's possible to better detect when this should or shouldn't be done. We could consider an option to control this, but that's not the first thing we'd do.
  • An example of a book with a lot of code snippets that get copied with stripped line breaks inside Zotero PDF Reader is
    * https://people.cs.ksu.edu/~schmidt/705a/Scala/Programming-in-Scala.pdf

    ... in case it ever comes to detecting code snippets and retaining their formatting inside Zotero PDF Reader, or to another feature that makes copying code snippets easier.
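To make the idea of detecting code snippets concrete, here is a rough Python sketch of one possible heuristic. Everything in it is an assumption made for illustration: the names `looks_like_code` and `copy_text`, the list of hint tokens and keywords, and the 0.6 threshold are hypothetical and have nothing to do with how Zotero works or might work. A real detector would probably also look at fonts and layout, which plain text does not carry.

```python
import re

# Hypothetical hints: punctuation and operators that are rare in prose,
# or a common keyword at the start of a line.
CODE_HINTS = re.compile(
    r"[{};=]|<-|=>|->|^\s*(?:for|if|while|def|val|var|do|class|import|return)\b"
)


def looks_like_code(lines, threshold=0.6):
    """Guess whether the selected lines are code rather than prose."""
    non_empty = [line for line in lines if line.strip()]
    if not non_empty:
        return False
    hinted = sum(1 for line in non_empty if CODE_HINTS.search(line))
    # Treat the selection as code when most lines carry a hint.
    return hinted / len(non_empty) >= threshold


def copy_text(lines):
    """Keep line breaks for code, strip them for ordinary paragraphs."""
    if looks_like_code(lines):
        return "\n".join(line.rstrip() for line in lines)
    return " ".join(line.strip() for line in lines if line.strip())
```

With the Scala snippet from the first post, every line starts with `for`, `if`, or `do`, so the line breaks would be kept. The paragraph from the packing paper has no such hints and would still be joined into one line. Prose that happens to begin several lines with "for" or "if" could be misclassified, which is one reason a check like this can only be a starting point.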