ZotFile - Advanced PDF management for Zotero

  • Star7, it might be the case that the tag wasn't removed properly. Someone else reported a similar problem a while ago but didn't follow up. You should try to reproduce it, starting with just one file, and check the Firefox error console for any problems. The link to the instructions is in one of my posts above. Let me know if anything shows up in the console and also whether the annotations get extracted (if that option is enabled).
  • Every so often (every other time I start Zotero?) Zotfile seems to disappear. The properties window does not come up, and in the contextual dialog I see "Warning" displayed several times in between the standard options. Firefox's error console shows this error: zotfile is not defined. Source file: javascript%20zotfile
    To fix it, I reinstall Zotfile and it seems to work (for a while at least...). I observed this on Mac OS X (10.6) with Firefox 10 and on Win XP with Firefox 3.x.
  • Are Zotfile and ZotReader compatible when copying PDFs outside Zotero (to a base dir)? That is, if the base dir is the same for both and one copies files out with Zotfile, could one get them back with ZotReader?

    Thanks - Jacek
  • jacekg, I think that happens when you close the FF window and reopen it (without quitting the application). If that is the case, I know about the problem but wasn't able to fix it. Other plugins have the same bug. I think it only occurs on Mac/Unix systems because there you can close windows without quitting the application, which is not possible on Windows.

    jacekg, ZotReader is pretty outdated. It was only a temporary addon, which never made it out of beta. All the ZotReader features and more are part of zotfile 2.x. You should install this version; maybe that already answers your question...
  • Hey Joscha, thank you so much for the help! It's all okay now, the get/send to tablet issue. Not sure why it was an issue back then. The annotation extraction just blew me away!

    Now that everything is so good, I'm worried that I might lose this. I have Zotero on my laptop with zotfile renaming and moving PDFs to a folder in SugarSync, so everything is backed up. I also back up the Zotero files (preference - advanced - show directory) regularly.

    However, if my laptop were to crash, I'm worried that the "links" from the Zotero items to the PDFs will no longer be there. How do you guys back up all of this?

    Meaning I can get all the zotero items back. I can still have my pdfs in sugar sync. But will the links be backed up somehow?
  • Star7, as long as you back up your zotero library, you are fine. Zotfile saves everything in the zotero database so that you don't need to back up any additional files. The only things that are not part of the zotero library are the preferences (including the user-defined folders), which are part of the FF profile and shouldn't be hard to reproduce.
  • I also noticed that multiple words get merged together when I extract annotations from a pdf file as follows:

    "Inalargeworld, as emphasized by both Savage and Simon, one can no longer assume that “rational” models automaticallyprovidethecorrectanswer." (Gigerenzer and Gaissmaier 2011:453)

    I don't know what the cause of this problem is.
  • Yes, I know about this problem but can't do anything about it right now. Devietti implemented most of the pdf.js extraction stuff and he won't be able to look into it right now. The poppler-based extraction is much better with spaces but is Mac only (and has problems with special characters)...
  • Would it be feasible to implement a 'path prefix' setting, so we could access attachments on cloud storage from multiple OS's?

    I would like to be able to view and save attachments from both Windows and Mac. However, since the paths are hard coded this is not possible.

    If Zotero doesn't force a full and valid path (it might, not sure), could the path have an optional prefix added by ZotFile, as set in the preferences for each instance of the plugin? Zotero would then store only the filename and not the path, and ZotFile would add the path when file access was requested. It might require a different menu option or a file open (double click) override in Zotero.

    If there is another or better way to get the same result please let me know.

    Thanks for a great plugin to Zotero!
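    To make the request concrete, here is a minimal sketch of the idea in plain JavaScript. It is not ZotFile or Zotero code: the `basePrefixes` object, the example directories, and the `resolveAttachment` function are all hypothetical, and the sketch only shows how a per-machine prefix could be joined with a stored relative file name.

```javascript
// Hypothetical sketch: none of these names exist in Zotero or ZotFile.
// Each machine would keep its own base directory in its local preferences.
const basePrefixes = {
  windows: "C:\\Users\\me\\Dropbox\\papers",
  mac: "/Users/me/Dropbox/papers",
};

// Join a per-machine prefix with a stored relative file name.
function resolveAttachment(prefix, relativeName) {
  // Guess the local separator from the prefix itself.
  const sep = prefix.includes("\\") ? "\\" : "/";
  // Split the stored name on either separator so it works on both systems.
  const parts = relativeName.split(/[\\/]+/).filter(Boolean);
  // Drop trailing separators from the prefix before joining.
  const trimmed = prefix.replace(/[\\/]+$/, "");
  return [trimmed, ...parts].join(sep);
}

console.log(resolveAttachment(basePrefixes.mac, "Smith 2011.pdf"));
console.log(resolveAttachment(basePrefixes.windows, "sub/Smith 2011.pdf"));
```

    The stored name stays the same on every machine; only the prefix differs, which is the whole point of the request.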
  • That is a very common request but I won't do it on the zotfile level. There is, however, some development on the zotero level, which goes in this direction:
    https://github.com/zotero/zotero/pull/51
    Not sure whether this will be integrated soon but you should express your support there...
  • Fair enough, thanks for the link. I will monitor it for changes. Hopefully this patch will be incorporated.
  • (Moved from "Troubleshooting")

    I've recently been using Zotero with the Zotfile extension, and have been very pleased with the overall system. However, I have been unable to extract annotations from PDF files.

    Setup:
    Macbook, OSX 10.4.11
    Firefox 3.6.28
    Running Zotero 3.0.3 (as Firefox Add-on) and ZotFile 2.1

    I have tried to extract Annotations added through Preview (ver 3.0.9), however I get an error message:
    - "...not supported with Firefox 3.6..." when attempting with pdf.js

    or
    - "Annotation extraction failed" when attempting with poppler (ver 1.1)

    I was hoping for some recommendations to help fix this. I'm afraid that it's going to be "Get a new computer with a current OS" - I'm at the most current version of Firefox I can run with 10.4.11.

    In the meantime, I'm using Skim, which works fine but, as I understand from browsing these forums, won't work with Zotfile for annotation extraction, period.

    Appreciate any help you can provide.

    Thanks,
    -J
  • Hi jratuszn, sorry to disappoint you but the problem simply is FF 3.x for the pdf.js based extraction and OSX 10.4 for the poppler extraction. Without upgrading either FF or OSX, the extraction is not going to work. Sorry that I can't be of more help.
  • Hello Joscha,
    Thanks for this great extension. I have a little trouble with it.
    I had all the attached files stored in a Dropbox folder on a D: volume on my hard disk. Unfortunately, my computer was stolen. I bought a new one, but I was unable to re-install the Dropbox folder on the same volume (Toshiba apparently reserves the D: volume for system backup files).
    Therefore, I have all (1400+) my references back, thanks to Zotero Sync, and all my files back, thanks to Zotfile and Dropbox. Nevertheless, I'm unable to relink the ones with the others, as the path is not valid anymore.
    I tried to replace it manually, going to my Zotero folder and replacing all the "D:\Dropbox\Bibli-Zotero" (the old location of my Dropbox folder) with "C:\Users\Florent\Dropbox\Bibli-Zotero" (the new location of my Dropbox folder) inside zotero.sqlite. But once I had done that, Zotero sent me a message indicating the DB was corrupted.
    Do you (or other users) have an idea of how I could proceed to get my links back?
    Best regards,
    Florent
  • Just to clarify, you are using linked attachments and when you double click on the link in Zotero you get this 'wrong location' message, right?
    I am not sure whether there is an easy solution to this problem. If you have some programming experience, it's not very hard to write a small script that relinks all files but I don't recommend taking this approach if you don't feel comfortable about it. I would rather bother a friend with this and misuse his/her computer. You could install zotero, sync your database, add dropbox as D (alternatively, you can just copy all the files to an external hard drive and connect it as D), and then use zotfile to relink your files to a location you can also use on your new computer.
    Maybe others have better solutions...
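    For anyone who does want to go the script route, the core of such a relinking script is just a prefix swap on each stored path. The sketch below shows only that step on plain strings, using the two folder locations Florent mentioned. It does not read or write zotero.sqlite, and the `relinkPath` function is a hypothetical helper, not part of Zotero or ZotFile; how the paths are read from and written back to the library is left out on purpose.

```javascript
// Sketch only: operates on plain strings, not on zotero.sqlite.
const OLD_PREFIX = "D:\\Dropbox\\Bibli-Zotero";
const NEW_PREFIX = "C:\\Users\\Florent\\Dropbox\\Bibli-Zotero";

function relinkPath(path, oldPrefix, newPrefix) {
  // Compare case-insensitively because Windows paths are case-insensitive.
  const matches = path.toLowerCase().startsWith(oldPrefix.toLowerCase());
  if (!matches) return path; // leave unrelated attachments untouched
  // Require a separator (or end of string) right after the prefix so that
  // a sibling folder such as "D:\Dropbox\Bibli-Zotero2" is not rewritten.
  const rest = path.slice(oldPrefix.length);
  if (rest !== "" && !/^[\\/]/.test(rest)) return path;
  return newPrefix + rest;
}

console.log(relinkPath("D:\\Dropbox\\Bibli-Zotero\\Smith 2011.pdf", OLD_PREFIX, NEW_PREFIX));
```

    Whatever does the actual update should work on a backup copy of the library first.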
  • Hello Joscha, and thanks for the quick answer. Indeed, I get a wrong location message (in French: "Le fichier joint n'a pu être trouvé. Il peut avoir été effacé ou déplacé hors de Zotero.", i.e. "The attached file could not be found. It may have been deleted or moved outside of Zotero.")
    I believe the solution you're proposing is indeed the simplest, and I hadn't thought about it. Thanks a lot!!!
  • I use Zotfile to manage my Zotero library and as part of my iPad workflow (the "tablet" function with my Dropbox account is wonderful!). However, I have been having a problem with the "Extract Annotations" function. There have been two types of problems.

    1) For some files it doesn't seem to work at all. The "Extract Annotations" box pops up briefly, then disappears, without any note being generated. This has happened both with articles I have scanned, OCR'd, and annotated in iAnnotate and with articles I have gotten from JSTOR, so I am not sure what the problem would be.

    2) In other cases, the extract function hangs. The status bar might get to just under the r in "Extract" or to the S in "press ESC to Cancel", but then advances no further. It just stays there until I hit ESC and unfreeze it. The file sizes here vary.

    I believe I am up to date on the latest releases for Zotfile and Zotero. I have Zotfile's preferences set to use pdf.js as the extraction tool.

    Any assistance would be greatly appreciated!
  • I have just (last night) installed Zotfile with standalone Zotero on OSX 10.7.3.

    Unfortunately, it does not extract PDF annotations as notes. I have checked the forum and the configuration as specified, but don't see anything obvious. After selecting 'Extract Annotations', a dialog pops up to say 'Extract PDF annotations (press ESC to cancel)', the progress bar moves to the end, then the dialog eventually closes without extracting anything.

    Thought it may be an issue with pdf.js, so also installed poppler and tried that. This approach doesn't present any dialog and nothing appears to happen.

    I have tried a linked PDF file and stored PDF file - no difference.

    It doesn't appear that I am doing anything wrong / strange.

    Could you please help.

    I can send you the PDF file I used if you think it would be helpful.
  • edited July 12, 2012
    pdf.js simply does not support all pdf standards, so some files won't work. You can test whether the extraction of annotations works in general by downloading this file and extracting the annotations. If it works, your files are probably the problem and there is little I can do about it right now. For my personal use, almost all relatively recent journal articles work, although pdf.js sometimes has problems with the correct spaces. Older pdfs, however, often cause problems. For some pdfs the extraction is simply impossible (if you are not able to select and copy/paste text from the file, the extraction probably won't work either).
    dazzur, you should check the file above to test the extraction. If the file does not work, try to provide more details following the instructions here. Otherwise, the problem is probably your files or the annotation type you are using.
  • The sample file you linked to worked just fine, so the problem must lie in my files somehow. The one I have the most trouble with is a more recently produced item where I can copy/paste text with no problem, so I don't know if that's it. It is a large file (60mb), so maybe that is part of the issue. In any case, I will see if anything changes in the future.

    Is it possible that poppler might work more successfully? I have not tried using that mode of extraction but would try it if it might help.
  • Yes, poppler might work better. In my experience, the poppler-based extraction supports more pdf standards, is faster, and has fewer problems with spaces. It only works on Mac though (until someone compiles it for other platforms), has problems with special characters in notes (not highlighted text, I think), and might fail with other kinds of pdf files. Size shouldn't matter, but maybe there is a certain page that makes pdf.js fail. You could try breaking the file into smaller chunks to pin down the problem.
  • Poppler solved the problem for the longest, and most problematic file. (It was 800 pages, so breaking it down into smaller chunks would have taken quite some time.) Quick, easy, and it looks like it recognized everything at least as well as pdf.js has. Thanks!
  • Hi Joscha & the others,

    if I have an attachment opened, e.g. a PDF in a PDF viewer, it is not possible to move the file to the custom location which I have chosen in the settings, because the file is locked (on Windows XP). The corresponding error message in the Javascript console is:

    uncaught exception: [Exception... "Component returned failure code: 0x8052000e (NS_ERROR_FILE_IS_LOCKED) [nsIFile.moveTo]" nsresult: "0x8052000e (NS_ERROR_FILE_IS_LOCKED)" location: "JS frame :: chrome://zotfile/content/zotfile.js :: <TOP_LEVEL> :: line 920" data: no]


    So, because it is a clear error message, I dug into the source and wrote a try-catch exception routine around the move (and also the rename) command to handle this error with a well-formed error message in the Zotero info window.

    Just have a look in the code (also in the comments): https://gist.github.com/2499686


    Thanks Joscha for writing this smart and helpful Zotero extension.


    Regards,

    Dominik
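    For readers who don't want to open the gist, the general shape of such a guard is sketched below. This is not the code from the gist and not zotfile's actual move routine: `tryMove` and the `moveFn` callback are hypothetical stand-ins, and the check on `e.result` / `e.name` assumes the exception carries the code and name shown in the console message above.

```javascript
// Sketch only: `moveFn` stands in for whatever call actually moves the file.
const NS_ERROR_FILE_IS_LOCKED = 0x8052000e; // code from the console message

function tryMove(moveFn, fileName) {
  try {
    moveFn();
    return { ok: true, message: "Moved " + fileName };
  } catch (e) {
    // Assumption: the exception exposes the code as `result` or the name as `name`.
    const locked = Boolean(e) &&
      (e.result === NS_ERROR_FILE_IS_LOCKED || e.name === "NS_ERROR_FILE_IS_LOCKED");
    if (locked) {
      return {
        ok: false,
        message: fileName + " is open in another program; close it and try again.",
      };
    }
    throw e; // unknown errors should still surface in the console
  }
}

// Simulate a locked file to show the friendly message.
const outcome = tryMove(() => { throw { result: NS_ERROR_FILE_IS_LOCKED }; }, "paper.pdf");
console.log(outcome.message);
```

    Rethrowing anything that is not the locked-file error keeps other bugs visible instead of hiding them behind the same message.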
  • Thanks for the bug report and the fix!! Those are the best bug reports... :)
  • Joscha,

    I've incorporated a page numbering ID parameter for renaming in the beta version of zotfile (jlegewie-zotfile-v2.1-0-ge2086ef)(it was actually pretty straightforward!). I have, however, discovered a problem when using customized subdirectories. I'm currently using "\%w PDFs", which should give me subdirectories with "Journal PDFs" or "Publisher PDFs". I've discovered that any characters which match those used in the renaming functions are not used (so that I am getting Journal PDF/Publisher PDF). I've tried a range of different characters, with both capital and lower cases, and these text characters are consistently omitted from the subdirectories. I'm able to recover the correct function by deleting the problematic letters, but this is obviously less than ideal. I'm assuming that characters should only be recognized as parameters if they are preceded by a % character - has anyone else reported a similar error?
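    To illustrate the behaviour I would expect, here is a small standalone sketch in which a letter is only treated as a parameter when it directly follows a % character. It is not taken from the zotfile source: `expandWildcards` and the `values` table (with made-up values) are hypothetical.

```javascript
// Hypothetical wildcard table; the values are made up for illustration.
const values = { w: "Journal", a: "Smith", y: "2011" };

function expandWildcards(rule, table) {
  // Only a letter directly after "%" is treated as a wildcard;
  // every other character in the rule is copied through unchanged.
  return rule.replace(/%([A-Za-z])/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(table, key) ? table[key] : match
  );
}

console.log(expandWildcards("%w PDFs", values)); // "Journal PDFs"
```

    With this approach the trailing "s" in "PDFs" survives, because it is not preceded by a % character.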
  • I am preparing for a trip but I will get back to your question in two weeks or so...
  • Hi everybody,

    I'm very interested in the "Extracting annotations" from PDF function, but it doesn't work for me at all, even with different files/computers/systems. Every time I choose "extract annotations", the progress bar reaches 100% and stays that way forever. If I press ESC, it cancels the process correctly, but no annotations or errors appear. I highlight text in PDFs and want to have the highlighted text summarized in Zotero. What am I doing wrong?
  • You should check some of my earlier replies about pdf extraction and a) test the extraction with the example pdf (which I link to in one of the replies), b) look for errors in the FF error console (also described in an earlier post), and c) try poppler if you are using Mac. If the test file works, the problem is connected with your file. My earlier posts also provide some more information about that.
  • Thank you for the reply. I tried all steps:

    a) If you meant this one: http://www.columbia.edu/~jpl2136/zotfile-test.pdf it doesn't work either.

    b) Error says: zitem.getType is not a function and is related to the code:
    // Function replaces wildcard both for filename and subfolder definition
    replaceWildcard: function(zitem, rule){
    // get item type
    var item_type = zitem.getType();
    var item_type_string = Zotero.ItemTypes.getLocalizedString(item_type);

    c) I am currently on Windows 7 x64, but also tried Windows 2003 Server x64 and Ubuntu 10.04... all tested with the CZ localization.
  • edited May 23, 2012
    Hi, I just downloaded Zotero standalone tonight and am new to it.

    I have installed Zotfile, but whenever I download a PDF to my specified download folder, neither Zotero nor Zotfile does anything. I wish I could get it to work because it sounds awesome.

    Is there any other setup that needs to be done besides identifying the source folder and the location folder? I'm just looking to get automatic renaming working.

    This is on Mac Lion.
This discussion has been closed.