ZotFile extracts highlighted text from Skim, but not notes

gabrielledepooter · December 10, 2021

Hi,

I am using Zotero 5.0.96.3 with the latest ZotFile and Skim (downloaded yesterday). When I extract annotations made in a PDF using Skim, it extracts the highlighted text, but not the comments I made.

How do I fix this?

Thank you.

aaaaaaaaaaaaaaa · July 20, 2022

@gabrielledepooter I haven't been able to reproduce this in Zotero 6. I can't even get Zotero to extract the highlights I made with Skim. How are you extracting the annotations?

When I open one of my attachments from Zotero in Skim PDF reader and make annotations then save the file, if I right-click the item in Zotero and choose "Add Note from Annotations", none of the text highlights I made in Skim appear.

dstillman · July 20, 2022

@aaaaaaaaaaaaaaa: This thread predated Zotero 6, so it's no longer really relevant.

But Zotero, like ZotFile previously, should be able to extract any standard PDF annotations saved to the file. If the annotations are saved to a separate sidecar file, it won't (and ZotFile wouldn't have been able to either). You'd need to import them into the PDF file in Skim first.

aaaaaaaaaaaaaaa · July 20, 2022

@dstillman Got it, I think. Annotations made in Skim are saved in the extended attributes (EAs) of the PDF file as net_sourceforge_skim-app_notes#S, net_sourceforge_skim-app_rtf_notes#S, and net_sourceforge_skim-app_text_notes#S. I'm assuming Zotero doesn't look for these EAs, so in Skim I would have to Export as PDF with Embedded Notes for Zotero to detect my highlights and annotations.

dstillman · July 20, 2022

Oh, yeah, it looks like it. I have no idea why they do that — generally PDF readers either use embedded annotations or a sidecar file — but yes, you'd have to convert them to embedded annotations in Skim before importing them in Zotero.

aaaaaaaaaaaaaaa · July 20, 2022

@dstillman I can tell you exactly why they do that.
Quoting from: Why are the notes not stored in the PDF?

…
Moreover, in many cases it is nice not to change the PDF itself. In some cases, such as when the PDF is password-protected, this is not even possible. Another reason is that Adobe's PDF specifications do not allow a note such as Skim's Anchored Note, featuring rich text and an attached image. So saving notes in the PDF would always lead to data loss.

Have others shown interest in making Skim annotations more compatible with Zotero's native PDF reader? Is it on the roadmap to add support for importing annotations directly from EAs? I think it makes a lot of sense to improve the compatibility between Skim and Zotero's native PDF viewer because they have a lot in common in the way that they deviate from Adobe's PDF specification.

1. They both allow richer text styling in the comments of annotations.

2. Modifying annotations does not change the actual content of the PDF file. This makes saving annotations much faster and less likely to cause errors that could corrupt the PDF document.

EDIT: Here is a sample of the contents of the net_sourceforge_skim-app_notes#S Extended Attribute. It contains one annotation spanning two lines of text.

[
  0 => {
    "bounds" => "{{133.38999999999999, 536.8962555999999}, {207.94360279999995, 23.9866356}}"
    "color" => [
      0 => 1
      1 => 1
      2 => 0
      3 => 1
    ]
    "contents" => "This is a comment on my annotation!"
    "modificationDate" => 2022-07-20 22:51:14 +0000
    "pageIndex" => 0
    "quadrilateralPoints" => [
      0 => "{0.37800000000001432, 23.9866356}"
      1 => "{74.571971200000036, 23.9866356}"
      2 => "{0.37800000000001432, 12.031435600000009}"
      3 => "{74.571971200000036, 12.031435600000009}"
      4 => "{0, 9.9625999999999522}"
      5 => "{207.94360279999995, 9.9625999999999522}"
      6 => "{0, 0}"
      7 => "{207.94360279999995, 0}"
    ]
    "type" => "Highlight"
    "userName" => "aaaaaaaaaaaaaaa"
  }
]

(The code sample was generated by running this on the command line: xattr -px net_sourceforge_skim-app_notes#S "path/to/file.pdf" | xxd -r -p | plutil -p -)
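The same decoding can be sketched in Python. This is only a rough illustration, and it assumes the attribute payload is a property list containing a list of note dictionaries, as the plutil output above suggests. The function names (decode_xattr_hex, parse_skim_notes, parse_bounds, summarize) are made up for this example and are not part of Skim, ZotFile, or Zotero.

```python
import plistlib
import re


def decode_xattr_hex(hex_dump: str) -> bytes:
    """Turn the hex text printed by `xattr -px` back into raw bytes (the `xxd -r -p` step)."""
    return bytes.fromhex("".join(hex_dump.split()))


def parse_skim_notes(raw: bytes) -> list:
    """Parse the payload. Assumption: it is a plist holding a list of note dicts."""
    notes = plistlib.loads(raw)
    if not isinstance(notes, list):
        raise ValueError("expected a list of notes")
    return notes


_NUMBER = re.compile(r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?")


def parse_bounds(text: str) -> tuple:
    """Parse a string like "{{x, y}, {w, h}}" into four floats (x, y, width, height)."""
    values = [float(v) for v in _NUMBER.findall(text)]
    if len(values) != 4:
        raise ValueError(f"unexpected bounds string: {text!r}")
    return tuple(values)


def summarize(notes: list) -> list:
    """Reduce each note to (page index, annotation type, comment text)."""
    return [(n.get("pageIndex"), n.get("type"), n.get("contents", "")) for n in notes]


if __name__ == "__main__":
    # Stand-in data shaped like the sample above; a real run would start from
    # the hex text that `xattr -px` prints for an annotated PDF.
    sample = [{
        "type": "Highlight",
        "pageIndex": 0,
        "contents": "This is a comment on my annotation!",
        "bounds": "{{133.38999999999999, 536.8962555999999}, {207.94360279999995, 23.9866356}}",
    }]
    hex_dump = plistlib.dumps(sample, fmt=plistlib.FMT_BINARY).hex()
    notes = parse_skim_notes(decode_xattr_hex(hex_dump))
    for page, kind, comment in summarize(notes):
        print(page, kind, comment, parse_bounds(notes[0]["bounds"]))
```

The point of the sketch is that the comment text sits in the "contents" key next to the highlight geometry, so a tool that reads this attribute would get both the highlight and the note in one pass.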

ZVT · July 21, 2022

(I realize it's not a feature request, but I'd add a +1 if it were... I like Skim a lot and would be happy if it integrated more easily with Z.)

dstillman · July 21, 2022

I don't really see us adding support for the proprietary formats of individual PDF readers, any more than I would expect Skim to add support for importing annotations from Zotero's database or API. Embedded annotations are the universal format here and the logical format for transferring files between programs, even if they might be slightly lossy when it comes to program-specific features.

In any case, we don't currently have a good way of extracting extended attributes. We'd likely be better equipped to do that in Zotero 7 (which will have subprocess stdout-reading support), and if someone wants to work on importing of Skim annotations at that point, we could consider accepting a patch for it. But I doubt we'd work on it ourselves.
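For anyone curious what reading an extended attribute through a subprocess's stdout might involve, here is a minimal Python sketch. It is not Zotero code and says nothing about how Zotero 7 would do it; it simply wraps the xattr -px command quoted earlier in the thread. The read_xattr function and its run parameter are hypothetical names for this example, and the real command only exists on macOS.

```python
import subprocess

# Attribute name taken from the thread; Skim also writes rtf and text variants.
SKIM_NOTES_ATTR = "net_sourceforge_skim-app_notes#S"


def read_xattr(path, name=SKIM_NOTES_ATTR, run=subprocess.run):
    """Return an extended attribute's raw bytes by capturing `xattr -px` stdout.

    `run` is injectable (hypothetical design choice) so the function can be
    exercised without a macOS machine.
    """
    # An argument list avoids shell quoting issues with the "#" in the name.
    result = run(
        ["xattr", "-px", name, path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if xattr exits with an error
    )
    # xattr -px prints whitespace-separated hex pairs; strip the whitespace
    # and decode the hex back into bytes.
    return bytes.fromhex("".join(result.stdout.split()))
```

On macOS, read_xattr("path/to/file.pdf") would return the bytes that the earlier command pipes into plutil.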