ZotFile - Advanced PDF management for Zotero

fdenger · February 16, 2012

bentle: are you using it on a group library or on a local library? When I run it on a group library, I get the same behavior (it sits there until I hit ESC). If that happens to you, hit ESC, then go to the Firefox menu -> Tools -> Web Developer -> Error Console and paste the last error message it shows here.

When it returns very quickly with no note, it usually means that no annotations were found, which might mean that the type of annotation is unsupported. If you're on a Mac, you can try to install the alternate extractor plugin (go into the zotfile preferences under the zotero preferences).

Hope this helps

bentle · February 16, 2012

fdenger: Thanks very much. I read your comment after mine and realized that the never-ending Extract Annotations behavior was linked to trying to use a file in a Group Library.

The other behavior is more mysterious. They are PDF files marked up using Acrobat X, without any different properties that I can identify from PDF files that work fine. Any ideas would be greatly appreciated.

Thanks!
Tom

Joscha · February 17, 2012

Heckscher, I assume that you set the 'Location of Files' in the 'General Settings' tab to the location on your tablet. Is that correct? You have to set the tablet location 'Location of Files on Tablet' under the 'Tablet Settings' tab, NOT the other one (you can change that to 'Attach stored copy…').

Thierry.C, thanks but I still can't reproduce it. The complete filename is actually checked for invalid characters. My second thought was that it's the subfolder definition, but that worked in my quick test too. Can you reproduce the problem in the zotfile group and also upload a screenshot of your settings?
I have added a ticket for the missing fields problem and will address that.

fdenger & bentle, I can reproduce the problem in group libraries and will certainly address it.

bentle, you can upload one of the pdfs to the zotfile zotero group (see my comment above). Maybe they just use standards that are not supported by pdf.js yet but maybe something else is going on.

fdenger · February 17, 2012

Joscha: thanks for the great program and taking the time to look at my report.

I have found another issue with extraction, which might be more complex. When I highlight a PDF article with 2 columns per page, the text is added to the note in plain "top to bottom" order rather than in the order the article flows (it should be ordered left column top to bottom, then right column top to bottom).

Do you have any thoughts on a fix for this issue? In this case, I am highlighting the section headings, then a few quotes out of each section. The problem is that the note now contains quotes from multiple sections under the wrong headings. I hope this makes sense.

Thank you!

Joscha · February 18, 2012

Thierry.C, you are using Windows, right? Can you send me the subfolder setting you are using? The default is '/%w/%y' but that only works for Mac/Unix. I think I already fixed the problem with subfolders and missing fields but I just want to make sure for Windows.

fdenger & bentle, the group library bug is fixed and everything should work in the next version (probably next week).

fdenger, pdf.js sorts them by y-position (I think) and poppler in the order they were added to the pdf. I know that neither is perfect but it's probably not going to change in the near future. The correct way would be to determine the order based on x and y location, while taking into account that a highlight might not span a whole line (which makes things more complicated).
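The ordering by x and y location described above could look roughly like the sketch below. It is only an illustration, not what pdf.js or poppler do: the `readingOrder` function, the shape of the highlight boxes (`page`, `x`, `y`, `width`), and the simple split of the page at half its width are all assumptions. A highlight that spans both columns, or a layout with more than two columns, would need more care.

```javascript
// Hypothetical sketch: sort highlight boxes into two-column reading order.
// Assumes y grows downward from the top of the page (PDF user space
// usually has y growing upward, so flip it first if needed).
function readingOrder(highlights, pageWidth) {
  const middle = pageWidth / 2;
  // Assign a box to a column by its horizontal centre, so a highlight
  // that starts mid-line is still placed in the correct column.
  const column = (h) => (h.x + h.width / 2 < middle ? 0 : 1);
  return highlights.slice().sort((a, b) =>
    a.page - b.page ||        // earlier pages first
    column(a) - column(b) ||  // left column before right column
    a.y - b.y ||              // top to bottom within a column
    a.x - b.x                 // left to right on the same line
  );
}

const sample = [
  { text: "right column, top", page: 1, x: 320, y: 100, width: 200 },
  { text: "left column, bottom", page: 1, x: 60, y: 600, width: 180 },
  { text: "left column, top", page: 1, x: 50, y: 90, width: 220 },
];
const ordered = readingOrder(sample, 612).map((h) => h.text);
console.log(ordered);
```

With the sample boxes, both left-column highlights come before the right-column one, even though the right-column highlight sits higher on the page than "left column, bottom".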

pdfs in zotfile group, someone uploaded two pdfs to the zotfile group. Both of them work in poppler but not in pdf.js. I have forwarded them to Joe, who is working on the pdf.js extraction. No promises though...

realtime99 · February 19, 2012

Hi, I love Zotfile and have heavily relied on it in the past. However, I am currently having a problem with Zotfile. After installation and restart, I see the Zotfile preferences in the Zotero drop-down menu, and I can change those, but I don't see the Zotfile options in the right-click menu (other right-click items appear normally). It worked previously, but not with the newest version of both. My info:

I have disabled all add-ons except Zotero 3.0.1, Zotfile 2.0, and Zotero Word for Windows Integration 3.1.5 - tried uninstall and reinstall on all---no difference.
Firefox 10.0.2
Win 7 x64

Here are screenshots of the settings menu:

http://imageshack.us/photo/my-images/213/clipboard01bc.jpg/

And the right-click menu:

http://imageshack.us/photo/my-images/818/clipboard02bv.jpg/

One line of the error console says:

"Error: file.exists is not a function
Source File: chrome://zotfile/content/options.js
Line: 600"

I also get a bunch of X "is not defined"

One possible other factor is that my zotero documents are stored in a folder that is a symbolic link to a Dropbox folder. It hasn't been an issue before, however.

Thanks for reading and for any suggestions.

Joscha · February 20, 2012

quickfold11, first you should install the most recent version from here. You are still using a 2.0 beta, which had some bugs. Second, the menu is there. It's 'Manage Attachments'. When you install 2.0 you also see the 'Attach new file' again.

bentle · February 20, 2012

Joscha: I put the files in the Zotfile folder. Thanks for giving me access and looking into them. I'm running about 50% in terms of files from which Zotfile will extract annotations. (Windows 7, Firefox 10.02, Zotero 3.03, Zotfile 2.0, annotations made in Acrobat X).

Thanks!
Tom

Heckscher · February 20, 2012

If you can keep straight all these problems and answers!... Yes, your response has solved my problem. You were right, I had set the tablet location in the "General Settings" tab as well as the "Tablet" tab. I guess others have found the choices confusing, but I don't have a great idea about how to improve them; it's just conceptually tricky.

realtime99 · February 20, 2012

Thanks! So I guess the main issue was that I didn't realize that the Zotfile menu had changed and no longer included 'Zotfile' in the menu choices. Out of curiosity, why did you choose to do that? I may not be the only one who finds that confusing.

Also, didn't an older version of Zotfile have an option to delete the original file that was attached to the selected item? I found that very useful--does that option still exist, or will it be put into a future version?

schuessi · February 21, 2012

I also have trouble using Zotfile with Zotero Standalone (all in current versions).

I set Zotfile to move attachments to a directory and create subfolders. It is also set to use the Zotero renaming.

Whenever I try to attach a file from my download directory, the file is only attached to the entry, not renamed and not moved.

Interestingly, when I try this with the Zotero plugin in Firefox, everything works as it should.

joatmon · February 21, 2012

Similar to looplog above, I am unable to extract annotations from pdf.

The behaviour is that when I click on an attachment and select manage attachment > extract annotations, a small bar comes up briefly on screen and it seems to complete extraction. But nothing is actually added to the notes, or anywhere else.

I have tested this on 2 different pdf documents, both annotated using okular 0.13. I used highlight, pop-up notes, inline notes and bookmarks. Neither document shows anything extracted.

I am on Linux Mint 10, Firefox 10.02, Zotero 3.0.3 and zotfile 2.0, and I have also attempted extraction via Zotero standalone, again with no success.

I have the Poppler libraries installed, but as far as I can tell, this is only supported for Mac, so I suppose my system is using pdf.js.

Here are all the things I have tried so far, without any success in extraction:

1. Reinstalling extension
2. Deactivating and reactivating NoteFullCite preference
3. Restarting firefox, standalone etc

In the error console there is no error that shows up after attempting annotation extraction, but there are a couple of potentially connected errors:
"No chrome package registered for chrome://zotfile/locale/zotfile.dtd"

"Error: Zotero is not defined
Source File: chrome://zoteroautotranslating/content/include.js"
Line: 2
[Contents of the error link are]
// Only create main object once
if (!Zotero.AutoTranslator) {
const loader = Components.classes["@mozilla.org/moz/jssubscript-loader;1"].getService(Components.interfaces.mozIJSSubScriptLoader);
loader.loadSubScript("chrome://zoteroautotranslating/content/hello.js");
}

I'm dying to try out this new feature, so it would be wonderful if you could figure out what's going wrong.

Joscha · February 21, 2012

quickfold11, when you use the 'Attach new file' function, the old file from the source dir should be removed. If not, try to provide some debug information. But maybe you mean something different?

schuessi, I wasn't able to reproduce your problem. Can you check whether the options in Z FF and Z Standalone are really the same and if they are describe the exact steps to reproduce the bug?

joatmon, you should check out some of my earlier posts about pdf.js and the zotfile zotero group. Below are the steps you should follow, which I will add to the zotfile webpage. poppler only works for Mac right now. It would still be great if someone could compile it for Linux (which shouldn't be very hard) or even Windows (which is probably difficult). Let me know if you have compiling experience and want to give it a try.

1) Download this pdf and try to extract the annotations. If you are not able to extract the annotations, the problem does not seem to be your pdf file. Follow the normal debugging steps described above. If you are able to extract the annotations, your pdf files seem to be the problem.
2) Apply for membership in the zotfile zotero group, wait until I approve your account and upload the pdf that does not work. Put the pdf in the folder 'pdfs with extraction problems' and add it to a zotero reference item together with a note that includes your zotero forum name and a description of your problem. Please post in the zotfile thread on the zotero forum after uploading pdfs. Otherwise, I won't check the folder.

Note that I only have about 15 MB left on my 100MB free account. Should be enough for a couple of pdfs but no random or very large files. Usually the error can be pinned down to specific pages so that you can remove the other pages from the pdf.

joatmon · February 22, 2012

@Joscha I have uploaded the relevant files in the library. The test pdf that you gave me extracted annotations just fine.

Based on my research, the problem is Okular. It saves the annotation data locally; the data can be exported as an archive, but it is readable only through Okular. The rendering in Okular is done through the Poppler libraries, so maybe that has a connection too.

I tried annotating the same file using an online pdf editor at crocodoc.com and then downloading the annotated pdf. I can see the highlights in the browser and in the pdf viewer, but zotfile does not extract highlights, and can only extract a comment (file uploaded to the library).

Compiling poppler for linux is above my capabilities, so it's better if more experienced users can pitch in for that. The problem for linux is that there is no decent alternative to Okular for annotating pdf, and it saves the data outside the pdf. Xournal is the one program I can think of which saves annotation onto the pdf, and therefore might make it possible to extract annotations, but it's not really suited to the task.

So perhaps the problem is to see how zotfile and linux can work together. If anyone has suggestions on the best way to get pdf annotation on linux to work together with zotfile, then perhaps it could be added to the documentation. I think the zotero devs were talking about using libertexto, so maybe we should see if linux+libertexto+zotfile will do the job.

Heckscher · February 22, 2012

I notice a wonderful capability buried in zotfile which I would love to see made more useful: it can identify the collections to which an item belongs. Currently, I believe, this works only when you set the Tablet options to "Create subfolders from Zotero collections", then start to send the item to the tablet: it will list the existing zotero collections for the item in the dialogue. This is a great capability for those with complex collection structures, but it's not the most useful way to get this information. Could it be made a standalone menu item ("List zotero collections for this item")?

Heckscher · February 22, 2012

In regard to extraction of annotations: I have a number of times had the problem described above by joatmon -- the annotation extraction notice flashes for a few seconds but nothing actually gets extracted. I have opened those files in Acrobat, made some small change, and then re-saved them, and the extraction has then worked. I sometimes use Foxit for pdfs; I'm not sure whether it's annotations created in those files that cause the problem, but it may be that the extraction has problems with non-Adobe formats.

Joscha · February 23, 2012

In general, annotations have to be saved in the standard pdf format (and Adobe, of course, does). Otherwise nothing can be extracted...

joatmon, your pdf software does not seem to use the pdf standard for annotations or it saves the annotations in a separate file. I didn't see any of them in preview (mac) or Adobe Acrobat. In that case, nothing can be extracted because there are simply no annotations in the standard format. I have no idea about Linux software. Let me know if you find something that works. Maybe there is just an option in your software to save the annotation in the pdf file!? (Skim for Mac, for example, has such an option)
I created a copy of your file (ep4 copy.pdf) and added a normal annotation to the file. poppler extracts this annotation without any problem but pdf.js only gets garbage (you can see the notes with the extracted text). I forwarded the file to Joe who is working on the pdf.js extraction. Maybe he has a chance to look at it but no promises…

ps: crocodoc.com also does not seem to create proper highlights - Adobe does not list them as annotations (you can look at a list of all annotations in Adobe). They are visible, though, so I am not sure what is going on…

Heckscher, I am not sure what you are asking for with the zotero collections. Do you mean that this feature is not only available for sending pdfs to the tablet but also for linked attachments in zotero? In any case, I won't get into this right now but maybe at some point in the future.

Thierry.C · February 24, 2012

Joscha,
Sorry for not responding earlier. I didn't receive notifications.

Can you reproduce the problem in the zotfile group and also upload a screenshot of your settings?
>> I uploaded screenshots of my ZotFile settings.
However, I think I cannot reproduce the problem in the zotfile group since the renaming operation in the group doesn't make a link with the file (i.e., moving the file to a folder path). It only renames the file.

you are using windows, right? Can you send me the subfolder setting you are using? The default is '/%w/%y' but that only works for mac/unix. I think I already fixed the problem with subfolders and missing fields but I just want to make sure for Windows.
>> Yes, I'm using Windows (XP). My subfolder setting is '\%T\%w\%y'.

Joscha · February 25, 2012

2.1 is pretty much ready but I would like some people to test it before I release it. I changed the tablet tag from '_READ' to '_tablet' and particularly want to know whether the transition works flawlessly. A full list of changes is below.
The beta is available from the link below. I just want to hear from 2 or 3 people whether the transition works. So please post here when everything works!

http://www.columbia.edu/~jpl2136/zotfile_21b.xpi

Changes in 2.1
- Important: the tag for tablet files was changed from '_READ' to '_tablet'
- New saved search for modified files on tablet
(updates automatically, replaces 'Scan Tablet Files' function, which has been removed)
- Zotfile menu items only appear for bibliographic items and attachments (not for notes)
- Bug fix: allow the extraction of annotations in group libraries
- Other bug fixes

Thierry.C, I fixed the missing field bug for subfolders. Can you check whether it works on Windows?

Thierry.C · February 25, 2012

Joscha, thanks a lot for the update.
The missing field bug for subfolders seems to be fixed (I get folders named "undefined" instead of no folders at all). Moreover, now I can locate the file without the previous error in that particular case. Great!
However, the renaming process seems longer and more prone to instabilities (the Firefox window turns all white for 1-3 seconds). Did you notice a similar behavior?

Then, there is still the problem with unauthorized characters in fields other than the title (e.g., publication):

Error: uncaught exception: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIFile.moveTo]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: chrome://zotfile/content/zotfile.js :: :: line 919" data: no]

Is there anything else I could check?
(Unfortunately, I don't have a tablet yet.)

Joscha · February 26, 2012

Thierry.C, I think I fixed the problem with unauthorized characters in fields. I uploaded the version with the bug fix to the zotfile zotero group. Can you check whether the fix works?

I didn't notice any changes in the renaming process and I also don't know why that should have changed. Can you post again if the problem persists after restart? (In general, the renaming is slower with linked attachments)

Any other reports about the transition to 2.1 for people who use the tablet features would be good! (particularly the tag change from '_READ' to '_tablet')

http://www.columbia.edu/~jpl2136/zotfile_21b.xpi

Thierry.C · February 26, 2012

Joscha, unfortunately it doesn't work with version 2.1b2 either.
However, the error I get is a bit different:
Error: uncaught exception: [Exception... "Could not convert JavaScript argument arg 0 [nsIFile.moveTo]" nsresult: "0x80570009 (NS_ERROR_XPC_BAD_CONVERT_JS)" location: "JS frame :: chrome://zotfile/content/zotfile.js :: :: line 919" data: no]

stefanmeir · February 27, 2012

Joscha, thank you very much for your great plugin.

I'd like to comment on afabl's suggestion on Feb 9th 2012. Joscha, you're right that Zotero can't handle links to files within group libraries. But it can handle links to URIs even within group libraries!
ZotFile can already copy files to network shares (at least if they are mounted locally). If this network share is also web accessible, the files on it can obviously be linked to by a URI. Therefore I think it might be feasible to let ZotFile convert attached files into links to URIs. To do so, the user will probably have to let ZotFile know a prefix for the URI (e.g. the web server's IP address).

Afaik such functionality would be the only facile way to sync files within group libraries if you don't want to use Zotero Storage. Many research groups (mine included) can't use Zotero Storage for files and are desperately looking for file sync within groups.
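To make the URI prefix idea concrete, here is a minimal sketch. It is not an existing ZotFile feature: the function `fileToUri`, the mount point `/mnt/share`, and the prefix `http://192.0.2.10/zotero/` are made-up examples, and the sketch assumes the web server exposes the share's folder structure unchanged.

```javascript
// Hypothetical helper: map a file on a locally mounted share to a web URI.
function fileToUri(filePath, shareRoot, uriPrefix) {
  // Treat both kinds of slash as separators so Windows paths work too.
  const clean = (p) => p.split(/[\\/]+/).filter((s) => s.length > 0);
  const root = clean(shareRoot);
  const parts = clean(filePath);
  // The file must live under the share root, otherwise there is no mapping.
  const inside = root.every((segment, i) => parts[i] === segment);
  if (!inside || parts.length <= root.length) {
    throw new Error("File is not inside the shared folder: " + filePath);
  }
  // Percent-encode each remaining path segment (spaces, umlauts, and so on).
  const relative = parts.slice(root.length).map(encodeURIComponent);
  return uriPrefix.replace(/\/+$/, "") + "/" + relative.join("/");
}

console.log(fileToUri(
  "/mnt/share/papers/Smith 2011.pdf",
  "/mnt/share",
  "http://192.0.2.10/zotero/"
));
```

The resulting string could then be stored as a link attachment in the group library, while the file itself stays on the share.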

Joscha · February 27, 2012

Thierry.C, I am not sure what is going on. 2.1b2 didn't create any problems either, right? Can you pin it down to a specific character in one of the meta data fields? I might just release 2.1 and fix this later...

stefanmeir, maybe there is a suboptimal workaround but I won't get into that right now. You should raise the issue for Zotero in general, which I think would be a better solution than some hack.

schuessi · February 28, 2012

@Joscha

I found the "bug"...it was me :D

The mistake was that I was using "/" instead of "\" in the folder structure that zotfile should create.

Now it works as it should.
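For anyone hitting the same separator problem, the conversion can be sketched as below. This is not ZotFile's own code: `normalizeSubfolder` and its `isWindows` flag are hypothetical, and the patterns are simply the ones quoted in this thread.

```javascript
// Hypothetical helper: rewrite a subfolder pattern to use one separator.
function normalizeSubfolder(pattern, isWindows) {
  const sep = isWindows ? "\\" : "/";
  // Split on either kind of slash, drop empty parts, and rejoin.
  const parts = pattern.split(/[\\/]+/).filter((p) => p.length > 0);
  return sep + parts.join(sep);
}

console.log(normalizeSubfolder("/%w/%y", true));        // \%w\%y
console.log(normalizeSubfolder("\\%T\\%w\\%y", false)); // /%T/%w/%y
```

Splitting on both separators means a pattern written for Mac/Unix can be reused on Windows and the other way round.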

Thierry.C · February 28, 2012

Joscha,
In fact, yes, I've just noticed that 2.1b2 created a problem: now I cannot rename/link other files that were not problematic before (I tested with very simple entries, with "test" as fields).
The error in the console is:
Error: uncaught exception: [Exception... "Could not convert JavaScript argument arg 0 [nsIFile.moveTo]" nsresult: "0x80570009 (NS_ERROR_XPC_BAD_CONVERT_JS)" location: "JS frame :: chrome://zotfile/content/zotfile.js :: :: line 919" data: no]

For sure you can fix this bug later and release 2.1 as it was before 2.1b2 (2.1b I think).

To answer your second question: yes, when I was able to rename/link (2.1) "normal" entries, I pinned down the problem to the following:

In the article put in the zotfile's group, the publication name is Transportation Research Part D: Transport and Environment. When I removed the colon (:), everything seemed to work flawlessly (in 2.1).
Therefore, I thought that if the code processed such a field in the same way the title is processed (cutting the part after ':'), then it should work.
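A minimal sketch of that idea, assuming every metadata field is cleaned before it becomes part of a file or folder name. The helper `sanitizeField` and its `cutAtColon` option are hypothetical and not taken from zotfile.js; the character list is the set Windows rejects in file names.

```javascript
// Characters Windows does not allow in file or folder names.
const FORBIDDEN = /[\\\/:*?"<>|]/g;

// Hypothetical helper: make one metadata field safe to use in a path.
function sanitizeField(value, cutAtColon) {
  let text = String(value);
  if (cutAtColon) {
    // Keep only the part before the first colon, as is done for titles.
    text = text.split(":")[0];
  }
  // Remove forbidden characters and collapse leftover runs of whitespace.
  return text.replace(FORBIDDEN, "").replace(/\s+/g, " ").trim();
}

const journal = "Transportation Research Part D: Transport and Environment";
console.log(sanitizeField(journal, false));
console.log(sanitizeField(journal, true));
```

Either variant yields a folder name without the colon, so the later move operation no longer receives a path that Windows refuses.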

Hope this helps.

playagiron · February 29, 2012

Hi,

I just discovered zotfile and it is awesome!!
Now I have been trying out a dozen PDF tools that edit PDFs to make them more readable on an ebook reader. After long tryouts with scientific papers, I finally figured that simply looking for two-column layouts and otherwise just cropping the white margin was the best one could do given formulae etc.
So I ended up only using this tool, BRISS (http://sourceforge.net/projects/briss/), which is open source and exists in a command-line version.

So I was wondering if it would be feasible to integrate it with zotfile so that exported PDFs automatically get run through this tool using its standard automatic settings (worked perfectly on 99% of my docs!).

Given the 6-inch display, the whole margins are wasted, and for two-column papers this adds further to the readability.
Now I know for annotations this might be a problem but maybe one could make it optional.

Thoughts?

gm2240 · March 1, 2012

I am lost here. Even after reading comments and download page, when I add a file from zotero to transfer it to the tablet, nothing happens. I have setup the dropbox folder correctly. Kindly help.

Thank you.

Joscha · March 2, 2012

Thierry.C, you should revert to 2.1b. I will release 2.1 soon but I will fix this bug later.

playagiron, sounds interesting but I have no intention of adding this right now. Any potential new features have to wait for now...

gm2240, please be more specific. Did you follow these steps exactly? At which point are you stuck? Z FF or Z Standalone? Upload screenshots with your settings somewhere...

Thierry.C · March 2, 2012

Already reverted to 2.1b which otherwise works flawlessly. :)

Thanks.