Strip newlines from PDF when pasting in Notes?
When I copy-paste some text from a PDF to the note editor, the pasted text contains a lot of newlines, e.g.
This is a
piece of
pasted text
instead of
This is a piece of pasted text
Would it be possible to automatically strip all these newlines - converted to breaks in HTML - upon pasting? A special 'Paste from PDF' would do as well :-)
I am sorry, but I am not sure I understand what you mean. Do you mean I should use a third-party tool like http://www.textfixer.com/tools/remove-line-breaks.php to remove the line breaks?
I use the copy-paste feature a lot to copy pieces of the PDF to the Note editor, so I can add my own comments to those pieces.
Or, for OS X users, Preview in Snow Leopard is much smarter about copying text from PDFs (supposedly—I haven't really tested it).
Thanks anyway.
(1) at the point of copying (by Adobe Reader, Sumatra, OS X's Preview, Evince or the like, when the copy command is invoked).
(2) at the system level, either automatically (which would have side effects you might not always want) or manually, by means of a keyboard shortcut that runs a linebreak-removing routine.
(3) at the point of pasting, by the app into which the text is pasted.
I agree with Dan that the best place for this to happen is not (3). And individual apps don't typically have this functionality. The last time I checked, Open Office and Word didn't do it either. I'd love to see it happen at (1), since copying from PDFs is almost the only time I run into this problem these days. Most times when you copy a bit from a PDF you *want* soft-wrapped text. It's the format that we work in in almost all our tools these days. But unfortunately I've never (in Linux and Windows) seen a PDF reader that does this.
Dan is suggesting (2). Perhaps we could use this thread to gather a few ideas for the various operating systems. The ideal, it seems to me, would be a tool that cleans up the text in the clipboard before pasting, activated by a global keyboard shortcut, for example.
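To make the idea of a cleanup routine concrete, here is a minimal Python sketch of the text-processing part only. It does not touch the clipboard or register a shortcut; that part depends on the OS. The function name `unwrap` is made up for illustration, and the rule that a blank line marks a paragraph break is an assumption about how the PDF reader copies text, not something every reader guarantees.

```python
import re


def unwrap(text: str) -> str:
    """Join hard-wrapped lines into paragraphs (hypothetical helper)."""
    # Normalise Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: one or more blank lines separate paragraphs.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Inside a paragraph, every run of whitespace (newlines included)
    # becomes a single space.
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)


print(unwrap("This is a\npiece of\npasted text"))
```

A hotkey tool would read the clipboard, pass the text through a function like this, and write the result back before pasting.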
Having said that, Zotero's notes application is a place where pasting from PDFs is probably a major activity. If there are not good system tools available, and since PDF viewers are still lame at this point, it doesn't seem completely unreasonable to think that Zotero might be made to 'do the friendly thing' and make up for the weaknesses of the PDF readers and clipboard tools. For the users' sake.
Once I got this script to do what I wanted I stopped tweaking it. I'm not that great with Autohotkey, and really bad with regular expressions, so this is far from a complete solution, but here it is in case it helps.
; REMOVES LINE BREAKS WHEN COPYING
#Persistent
#NoEnv ; Recommended for performance and compatibility with future AutoHotkey releases.
SendMode Input ; Recommended for new scripts due to its superior speed and reliability.
SetWorkingDir %A_ScriptDir% ; Ensures a consistent starting directory.
;Define hot key: control alt c
^!c::
Send ^c ;copy
sleep 100
; Note for following code that `r`n = newline
;Remove a dash followed by newline, since that's probably a single word across a linebreak
;(done first so it cannot eat the trailing dash of the paragraph break codes below)
StringReplace clipboard, clipboard, -`r`n, , All
;Code the paragraph breaks with special combinations (blank lines first, then lines ending in a period)
StringReplace clipboard, clipboard, `r`n`r`n, -*-, All
StringReplace clipboard, clipboard, .`r`n, -.-, All
;Replace a single newline with a space
StringReplace clipboard, clipboard, %A_Space%`r`n, %A_Space%, All
StringReplace clipboard, clipboard, `r`n, %A_Space%, All
;Replace multiple adjacent spaces with a single one
clipboard := RegExReplace(clipboard, "\s+" , " ")
;Replace the paragraph break codes with newlines
StringReplace clipboard, clipboard, -.-, .`r`n, All
StringReplace clipboard, clipboard, -*-, `r`n, All
return
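For anyone not on Windows, the same rules can be sketched in Python. This is a rough port and not a tested replacement for the script: the function name `clean_pdf_text` is hypothetical, it only transforms a string (getting the text in and out of the clipboard is left to whatever the OS provides), and it uses a NUL character as the temporary paragraph marker on the assumption that copied text never contains one. Like the script, it treats a line ending in a period as the end of a paragraph, which will sometimes be wrong.

```python
import re


def clean_pdf_text(text: str) -> str:
    """Rough port of the hotkey script's cleanup rules (hypothetical helper)."""
    text = text.replace("\r\n", "\n")
    # A hyphen at the end of a line is assumed to be one word split in two.
    text = text.replace("-\n", "")
    # Mark paragraph breaks: blank lines first, then lines ending in a period.
    text = re.sub(r"\n{2,}", "\x00", text)
    text = text.replace(".\n", ".\x00")
    # Every remaining newline is a soft wrap, so turn it into a space.
    text = text.replace("\n", " ")
    # Collapse runs of spaces and tabs into one space.
    text = re.sub(r"[ \t]+", " ", text)
    # Turn the markers back into real newlines, dropping stray spaces around them.
    text = re.sub(r" ?\x00 ?", "\n", text)
    return text.strip()


print(clean_pdf_text("This is a\r\npiece of\r\npasted text"))
```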