ZotFile - Advanced PDF management for Zotero

  • edited June 26, 2015
    As a follow-up to my previous comment, I ran the same drill on my home computer. It also fixed the problem there.

    Please note that while my office computer connects to the NAS drive via WebDAV, my home computer connects via the default protocol Apple uses for local networks. I believe this is SMB2 for a computer running Yosemite.

    On both computers, I first tried running Disk Utility to inspect the hard drive and to fix permissions. This didn't seem to have any effect on the problem.

    So from this empirical evidence I conclude that there was nothing wrong with the pdf files themselves, the drives, the file systems, or with the network connections. Somehow the interface between the operating system's file access subsystem and Zotero or Zotfile got screwed up.

    This may be a hard problem to replicate, but careful inspection of the relevant code may be in order. When I searched for the error messages on the web, most discussions I found dealt with other software and pointed out that the code in question was not using the currently recommended methods to access files. Something similar may be going on here.

    Another thought is that on both machines I have Zotero & Zotfile installed as both Standalone and Firefox extension. Maybe the settings on one are affecting the other, although I would think they'd be independent of each other.

    Thankfully, if this method of fixing the problem continues to be effective in all cases, the problem is relatively easy to work around.
  • ZotFile/Zotero rely entirely on built-in mozilla functions to access and save files, so if there's a problem, it's there.
  • I have just upgraded to Zotero 4.0.27 (Firefox 38, Ubuntu) and Zotfile pdf comment extraction seems to be broken. Extraction process starts (green sign), though nothing happens... Fortunately I have a previous version of Zotero (standalone) with the same variant of Zotfile, which works OK.
  • Can't confirm. Extraction works for me on Zotero 4.0.27...
  • edited July 1, 2015
    With Firefox 39 it is working again. Some strange FF bug, maybe in my profile. Sorry...
  • Zotfile is great - but is it standard behaviour that it makes a new note with the extracted annotations every time you run "extract annotations"? I would expect the new note to overwrite the old one?
  • Yes, that is on purpose. You can edit the note with the extracted annotations (e.g. add headlines, your own thoughts, resort citations by topic etc) and zotfile doesn't want to overwrite your modifications.
  • Thanks Joscha, yes that makes sense for many people's workflows I guess. I never do that though - if I wanted to add something I would add it on the PDF itself and re-import. I often read several PDFs at a time and often return to them to read more or add more notes. I don't suppose there is a setting to make "overwrite the old note" the default behaviour?
  • edited July 6, 2015
    I just found this topic: https://forums.zotero.org/discussion/23036/reports-formatting-unreliable-often-html-tags-displayed-instead-/

    Is this issue known or has anything been done about it? I just noticed the same behavior with some extracted annotations.

    Same behavior meaning: the Zotero report shows the notes that have been extracted with ZotFile, but some of them are not formatted correctly; they are full of HTML tags. Technically they get wrapped up in "plaintext" <p>s, I think. If I change something (e.g., press space and delete it again) inside the notes in the Zotero editor, they get formatted correctly.
  • I am having trouble getting zotfile to extract annotations from pdfs in linux. I highlight my pdfs using Okular, and then right click on the pdf in zotero and click Manage Attachments->Extract Annotations.

    When I do this, the little zotero message box pops up in the right corner of my screen and has a little red 'x' next to the name of the pdf, to indicate that extraction failed, and I have no extracted annotations.

    This is in zotero standalone, although my zotero library is in my firefox folder.

    Any ideas on how I could fix this?
  • if you open those PDFs in Firefox directly, do you see the annotations?
  • No, they didn't show up.

    I found the solution though: I have to force Okular to overwrite the original pdf by clicking 'Save As' and then saving it with the same name. Otherwise I guess it saves the annotation information in some other file somewhere.

    When I do that, the notes show up in firefox, but the highlights do not. However, the pdf extraction works, including extracting the highlighted text.

    Annoying, but at least it works now.
  • Hey

    This may be a question already answered, but on a skim through I couldn't see it referenced.

    Is there a way of getting the extracted PDF highlights text file to automatically import or even better automatically sync into Evernote (or possibly devonthink / tinderbox?).

    That way the highlights / quotes database could be searched / interacted with.

    Any similar sync processes?
  • edited July 13, 2015
    Hi, is there a setting that removes punctuation characters from filenames? There is the "remove special characters" in the Advanced Settings, but it doesn't seem to handle characters like quotation marks or brackets which can cause problems when transferring to other file systems.
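
A minimal sketch of the kind of clean-up asked about here, written in Python and not taken from ZotFile. The function name `strip_punctuation` and the set of characters it removes are assumptions for illustration; this does not reproduce what the "remove special characters" option does.

```python
import re

# Quotes, brackets and characters that are often rejected by other file systems.
# This set is an illustrative assumption, not ZotFile's actual list.
_UNSAFE_CHARS = '<>:"/\\|?*\'[](){}“”‘’'
_UNSAFE = re.compile("[" + re.escape(_UNSAFE_CHARS) + "]")


def strip_punctuation(filename: str) -> str:
    """Remove unsafe punctuation from a file name, leaving the extension alone."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension present
        stem, ext = filename, ""
    cleaned = _UNSAFE.sub("", stem)
    # Collapse the runs of spaces that removed characters can leave behind.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return f"{cleaned}.{ext}" if ext else cleaned


print(strip_punctuation('Smith (2015) "A [new] title".pdf'))
```

A pass like this could be run over already-renamed files before copying them to a stricter file system.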
  • Hi,

    Is there a way to change the order in which Zotfile extracts annotations? Unfortunately, it extracts the notes and highlights in the order of their creation, but I'd rather have them in the order they appear in the document. For example, sometimes I read a later chapter before an earlier one, and the annotations then appear in that order instead of by page. Do I have to change something in extract.js?

  • I can't confirm this. With the pdf.js-based extraction, the annotations are in the order they appear in the pdf file NOT in the order I added them.
  • Hm, I checked it too. It works fine too. Seems I was judging too early. The problem only appeared in one pdf where the formatting turned out to be strange...
    So case closed. Thank you!
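
For a PDF where the extracted order does look wrong, the ordering described above can be sketched as a sort by page and then by position on the page. This is Python for illustration only, not code from extract.js; the `Annotation` class and its `page`, `top` and `text` fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    page: int    # 1-based page number in the PDF
    top: float   # distance from the top of the page (assumed: smaller is higher up)
    text: str


def in_reading_order(annotations):
    """Return annotations sorted by page, then from top to bottom of each page."""
    return sorted(annotations, key=lambda a: (a.page, a.top))


notes = [
    Annotation(page=5, top=100.0, text="read first"),
    Annotation(page=2, top=400.0, text="read third"),
    Annotation(page=2, top=50.0, text="read second"),
]
for a in in_reading_order(notes):
    print(a.page, a.text)
```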
  • zotfile.pdfExtraction.openPdfLinux does not seem to do anything. I am trying to get okular to open the annotation in the right page, but when I set `/usr/bin/okular -p` nothing happens and the PDF is not even opened. In fact, passing anything like just `/usr/bin/okular` or `/usr/bin/evince` leads to the same problem (and both okular and evince are in /usr/bin).

    I am on Linux with ZotFile 4.1.6, using Zotero Standalone, and I am modifying zotfile.pdfExtraction.openPdfLinux via about:config.

    What am I doing wrong?
  • RE: Zotfile Rename - Show File?


    I initially had the same problem with "Show File" not working on Windows 10. However, the problem was solved when I removed the leading backslash from the "Use subfolder defined by" option in ZotFile (ie. I used "%w\%y" instead of "\%w\%y"). Hopefully this is the cause of your issue too!
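
One possible explanation, offered as an assumption and not checked against ZotFile's source: under Windows path rules, a subfolder that starts with a backslash is rooted at the drive, so joining it to a base folder discards the base. Python's `ntpath` module follows the same rule and can be used to show it; the folder names below are made up.

```python
import ntpath  # Windows path rules, usable on any platform

base = r"C:\Zotero\Attachments"  # hypothetical ZotFile base folder

# Subfolder with a leading backslash: the base folder is dropped.
with_leading = ntpath.join(base, r"\Smith\2015")

# Subfolder without it: the result stays under the base folder.
without_leading = ntpath.join(base, r"Smith\2015")

print(with_leading)     # C:\Smith\2015
print(without_leading)  # C:\Zotero\Attachments\Smith\2015
```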
  • I had a number of files moved to a single-layer folder using zotfile. I've since moved the files to a new folder and changed the directory in zotfile. However, it doesn't recognize the files. I get this message:

    "The attached file could not be found.

    It may have been moved or deleted outside of Zotero"

    If I select the file in the new zotfile directory directly, it then works.

    Is there any way to update all the file links to the moved folder?

    Windows 10, Zotfile 4.16
  • Yeah, don't move files around outside of Zotero, that'll definitely break any existing links.
    You can use the Zutilo add-on to batch-change link paths. That should help you here.
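
The idea behind a batch change of link paths can be sketched as swapping an old folder prefix for a new one. This is a Python illustration only; it is not how Zutilo is implemented, and the function name `relink` and the example folders are hypothetical.

```python
from pathlib import PureWindowsPath


def relink(path: str, old_root: str, new_root: str) -> str:
    """Swap old_root for new_root at the start of path; leave other paths unchanged."""
    try:
        relative = PureWindowsPath(path).relative_to(PureWindowsPath(old_root))
    except ValueError:
        return path  # the file was not under old_root
    return str(PureWindowsPath(new_root) / relative)


links = [r"D:\Papers\Smith_2015.pdf", r"C:\Other\Jones_2014.pdf"]
for link in links:
    print(relink(link, r"D:\Papers", r"E:\Library\PDFs"))
```

Only links under the old folder are rewritten, so unrelated attachments are left alone.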
  • When I move an item to the trash, and its attachment was previously moved to another location with zotfile, the attached PDF is not deleted.
  • correct, Zotero doesn't delete linked files when you delete the parent item.
  • Long time user, first time poster. :)

    Zotfile is truly a dream come true for anyone wanting to annotate PDFs using a tablet, and the annotation extraction is a gift. I am wondering whether it might be possible to extract the logical page number. Currently, it simply extracts the absolute page number of the file. But many PDFs also have logical page numbering (for example, for an article that runs from page 33-56, the first absolute page is page 33).

    My apologies if this has already been covered.
  • ahmontgo: Zotero Actions (gear icon) -> ZotFile Preferences -> Advanced Settings -> check/uncheck "Use actual article/book chapter page for highlighted text snippets"
  • Gracile: Thanks! Seems to work for articles, but not books. Too bad we can't use "number of pages" in the same way.
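
The arithmetic behind the logical page number can be sketched as follows, assuming the item's first page is known (for example from a "33-56" pages field). This is an illustration in Python, not ZotFile's code; `logical_page` is a hypothetical name.

```python
def logical_page(absolute_page: int, first_page: int) -> int:
    """Map a 1-based page position in the PDF file to the printed page number."""
    if absolute_page < 1:
        raise ValueError("absolute_page is 1-based")
    return first_page + absolute_page - 1


# An article running from page 33 to 56: the first page of the file is page 33.
print(logical_page(1, 33))   # 33
print(logical_page(4, 33))   # 36
```

A book item has no comparable first-page value, which may be why the same mapping is not available there.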
  • I wanted to follow up on AlexHunt's questions on Jun. 4th and Jun. 10th about nested folders:

    With the escaped slashes, using the wildcard “%c”, the best that could be achieved at the moment is:

    Collection 1Subcollection1 > Collection 1Subcollection1 > PDFs
    Collection 1Subcollection2 > Collection 1Subcollection2 > PDFs
    Collection 1Subcollection3 > Collection 1Subcollection3 > PDFs

    Whereas I am looking to perform the following nesting:

    Collection 1 > Subcollection1 > PDFs
    Collection 1 > Subcollection2 > PDFs
    Collection 1 > Subcollection3 > PDFs

    Therefore I need wildcards for "Collection" and "Subcollection"; however, I'm not sure they exist.
    Since there haven't been any further replies, I'm guessing no one knows of a way to accomplish this? Joscha, would this be feasible as a feature to build into future versions?
  • that's actually just a bug in the Windows version. It works in Zotfile on Linux&Mac.
    Details here:
    patches, I'm assuming, welcome.
  • Got it, thank you! I wish I knew enough to work on a patch myself, but that's out of my league. I did have success with the Windows-specific workaround, though.
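
The nesting asked for above, one folder level per collection level, can be sketched like this. It is a Python illustration of the desired result, not of how the `%c` wildcard is implemented; `nested_folder` and the base folder are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Sequence


def nested_folder(base: str, collection_path: Sequence[str]) -> str:
    """Build base/Collection/Subcollection/... with one folder per collection level."""
    # Replace separators inside names so a name cannot create extra levels.
    parts = [name.replace("/", "_").replace("\\", "_") for name in collection_path]
    return str(PurePosixPath(base, *parts))


print(nested_folder("/home/user/zotfile", ["Collection 1", "Subcollection1"]))
```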
  • edited September 4, 2015
    Problem: Zotfile does not extract PDF annotations – Bug?

    Here is what I did.

    1. Scanned a chapter from a book with 600 dpi.
    2. Used Adobe Acrobat XI Pro to OCR the file with the following settings:
    - a. Primary OCR Language: English (USA)
    - b. PDF Output Style: ClearScan
    - c. Downsample To: 300 dpi
    3. I then highlighted and annotated the text in Adobe Acrobat XI Pro.
    4. When finished, I imported the file into Zotero.
    5. I then selected “Manage Attachments >> Extract Annotations”
    6. I get a notice “Zotfile: Extracting Annotations…” but no note with extractions is produced.

    Information on my system:

    • Windows 7 Professional
    • Zotfile 4.1.6
    • Zotero
    • Firefox 40.0.3

    I can copy text from the PDF. It works fine when I do it manually but it would be a lot of work to extract all my annotations. I have tinkered with Firefox about:config in the past but am not aware of any changes that might tamper with Zotfile’s operation.

    Please help me troubleshoot this problem. Thanks.

    UPDATE: I have been able to extract the annotation with Zotfile on another PC. I therefore think that it might have to do with my about:config configurations. It would be very cumbersome to change everything back to its original state. Do you have any ideas which Firefox configurations have to be in which state for Zotfile extractions to work?
