Zotfile doesn't extract all annotations
Hi all,
I've been using Zotero & Zotfile for a while and, apart from a few weird PDFs that it didn't like, Zotfile has mostly worked really well.
However, recently it started showing a weird bug: it does not reliably extract all annotations anymore. With some texts, it extracts all of them just fine, but with many others, it behaves weirdly and leaves out roughly 60-70% of them. It always catches the first bit of highlighted text on a page, but usually not the last paragraph at the bottom, and only sometimes the highlights in the middle section of the page.
I waited to report this until a new update came out, but after updating Zotero yesterday, it's still doing the same thing today. I'm using Zotero 5.0.88 on a Mac and I use Preview to read and highlight the PDFs.
Any advice would be very much appreciated :)
Thanks in advance!
You can also try tweaking the ZotFile settings to use poppler instead of or in addition to pdf.js — I don't know the details of that, but see the ZotFile documentation.
If your PDF file is causing problems, this could be due to one of these issues:
1) There's no extractable text. You could check whether a proper text can be extracted with this online tool.
2) The PDF version is incompatible. If your file is PDF version 1.6 or higher, it could be that Zotfile has some issues with it. You can check the PDF version in the document properties. In Acrobat, they should be accessible with Cmd+D.
See also the suggestions in this discussion.
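If you'd rather not open each file in Acrobat, the version check from point 2 can be roughly scripted. This is only a minimal sketch: it reads the `%PDF-x.y` header at the start of the file, and the function names `pdf_header_version` and `may_trouble_zotfile` are made up for this example. Note that a PDF can also declare a newer version inside the document itself, which this simple header check does not see, so treat the result as a hint rather than a definitive answer.

```python
import re
import sys


def pdf_header_version(data: bytes):
    """Return (major, minor) from the '%PDF-x.y' header, or None if absent."""
    # The header normally sits at the very start; scan the first 1024 bytes
    # to tolerate files with a little junk in front of it.
    match = re.search(rb"%PDF-(\d)\.(\d)", data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def may_trouble_zotfile(data: bytes) -> bool:
    """True if the header version is 1.6 or higher (the threshold from point 2)."""
    version = pdf_header_version(data)
    return version is not None and version >= (1, 6)


if __name__ == "__main__":
    # Usage: python check_pdf_version.py file1.pdf file2.pdf ...
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            head = handle.read(1024)
        print(path, pdf_header_version(head), may_trouble_zotfile(head))
```

Running it on a handful of PDFs that extract well and a handful that don't should show fairly quickly whether the version is actually related to the problem.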
If someone wants to send an example PDF where not all highlights are extracted to support@zotero.org, with a link to this thread, we can take a look. ZotFile is using an ancient version of pdf.js, so I'd guess that updating pdf.js in ZotFile would help.
I'll send a PDF I've had lots of trouble with today via email. Thanks for offering to take a look! A functioning ZotFile would save a looot of time, so that would be wonderful.
When troubleshooting issues with poor highlight extraction by Zotfile, it could help to exclude potential issues related to the annotation tool. With Acrobat, PDF Expert (Mac), or PDF-XChange Editor (Windows), one can be fairly confident that the annotation tool is not causing issues.
BTW, Skim doesn't seem ideal for use with Zotfile; see this discussion.