ZotFile - Advanced PDF management for Zotero

Joscha · March 18, 2014

Using the color of the extracted annotation to format the extracted text in Zotero notes is currently not supported. It is possible and shouldn't be too hard, but a user-friendly implementation requires some effort, and I am not going to do it for now. If you are interested in implementing it, open an issue on GitHub and I am happy to help out.

auster · March 18, 2014

I have an error report that pops up every time I save an entry from the web (like amazon.com).

Just after the entry is saved, a small pop up window appears on the right, on the bottom. The window says:

"ZotFile error!
Unknown error!
(Click here to get details unknown
errors to the clipboard)"

I opened a notepad, clicked "paste" and got the following message:

"TypeError: fname is undefined
(chrome://zotfile/content/zotfile.js, 1458)"

Every time it is the same text in the notepad.

My ZotFile version is 3.1. I have it both on Zotero standalone and Zotero for Firefox.
I am on mac.
Firefox's version is 28.0.
I have not changed ZotFile's settings except for changing the folder in Zotero standalone, as recommended on the ZotFile website.

Joscha · March 19, 2014

Thanks for the detailed bug report. Can you try the 3.2 beta and describe the steps that lead to the error, including an example page from Amazon or wherever?

You can download the beta here.

thisisjoe · March 20, 2014

I like paulma's idea. I rarely have the same reference item in multiple collections, except that My Library shows everything across all collections. I am fine with having multiple copies of the same file. The option to create subfolders in the file system would be great. It could be disabled by default or even hidden inside about:config so that only people who know what they are doing change it.

auster · March 22, 2014

Joscha, hi,

Thanks a lot! I've tried it and got this error report:

"TypeError: fname is undefined
(chrome://zotfile/content/zotfile.js, 1522)"

Joscha · March 28, 2014

auster, sorry for the delay. Can you tell me exactly the steps that lead to this error (with example webpage and what you click)?

zotfile 3.2: Unfortunately, zotfile was rejected by the mozilla review process because it bundles a modified version of pdf.js. The current version 3.1 is still available and I hope to convince them to change their mind about future releases.

d.bobak · March 30, 2014

Hi Joscha,

I have a problem extracting annotations from the following pdf:
https://drive.google.com/file/d/0B_UDYcGh0SY5SnktcmcyZDNlSmM

I'm using Zotero version 4.0.19 on Windows 8.1 (in Polish); the ZotFile version is today's version from GitHub (it reports itself as version 3.3).

Joscha · March 30, 2014

Same here. Poppler-based extraction works. Do you mind if I report this to pdf.js with a link to the file? Thanks!

d.bobak · March 31, 2014

I don't mind, of course! Please report it - I hope they solve the problem...

Joscha · April 1, 2014

Here is the issue: https://github.com/mozilla/pdf.js/issues/4537
It would be great if you could leave the pdf where it is because I am using that link. Don't expect a quick result, but there is a good chance that they will fix it at some point (I will still have to port the fix to my fork, though).

d.bobak · April 1, 2014

Joscha, thank you for reporting the issue! The pdf will stay where it is.

Joscha · April 2, 2014

d.bobak, the problem is fixed in zotfile 3.3 (dev version on github). The version also includes some improvements for word spacing but it's still going to take some time until it comes out (for now, 3.2 is stuck in the review process). Note that the github version is under development and not always stable!

d.bobak · April 2, 2014

Great! Joscha, thank you again :)

Peter100 · April 7, 2014

My apologies - perhaps this should be a new thread.

I would like to have Devonthink index my Zotfile annotations. I have DT currently indexing Zotero's pdf storage folder. Does anyone know if Zotero reports could be generated / customized for each ref/pdf and saved to the corresponding Zotero pdf storage where DT could index them? Would the annotation links still work if accessed from DT?

Any ideas for this kind of workflow will be much appreciated!

kithairon · April 8, 2014

Gave the latest GitHub version (3.3) a spin. I appreciate the extraction of contents/bookmarks – a feature too little used in many pdfs. It works excellently, and faster than anticipated. I tried extracting in chunks of 50, 100, and 200 pdfs at a time (500 brought FF to its knees). My 1600+ pdfs were handled quickly and about a quarter of them now have live content-links in the notes of the pdf – a great feature. Thanks!
I currently make use of the hidden option to use both poppler and pdf.js, and have found poppler to be occasionally better with quirky spacing in pdfs (which I assume comes from OCR).
There is one case in which I annotated in Acrobat, extracted the notes with zotfile (they looked good), then opened the file from the note's link in Preview, added some notes and highlighting in Preview, saved the file and extracted again – and found the notes scrambled: the highlighting was all jumbled letters, and the text was doubled in italic and underlined (this was version 3.2). I suspected Preview, but the same procedure worked flawlessly with other pdfs, so it must be the specific pdf that is to blame.
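
For anyone who wants to script this kind of batching rather than select items by hand, here is a minimal sketch of splitting a large list into fixed-size chunks. It only illustrates the chunking idea; the file names are made up, and it does not call ZotFile or Zotero in any way.

```python
def chunks(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical library of 1600 pdfs, processed 200 at a time.
pdfs = [f"paper_{i}.pdf" for i in range(1600)]
batches = list(chunks(pdfs, 200))
print(len(batches), "batches; last batch has", len(batches[-1]), "files")
```

A smaller chunk size means more rounds but less work per round, which is the trade-off behind 200 working fine where 500 did not.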

Joscha · April 8, 2014

Peter100, I have no idea about Devonthink but I guess it would be possible if zotfile saved all extracted annotations as a text file in some location. Here is a github issue about this: https://github.com/jlegewie/zotfile/issues/139
As I mentioned there, I am happy to give some suggestions if someone wants to implement it.

kithairon, thanks for testing the new version. Let me know if there are any pdfs that made the toc feature hang. About the annotations: the most recent version on GitHub (last commit two days ago) also includes some improvements for handling spaces in the pdf.js-based extraction. Let me know if you have a file that is still problematic (or worked better in an earlier version). A second problem that comes up for some older pdfs is incorrect word order, but I haven't worked on that and it's pretty rare for my own pdfs. Your scrambled results sound different, though. In my experience, it sometimes helps to simply re-save a file.

Peter100 · April 8, 2014

Cheers Kithairon, Joscha,

Yes, I am interested in implementing the export of zotfile notes to an external file location. However, it will be important to extract a separate report for each pdf by default. A single combined report would make keyword searching and tracing ideas (trains of thought) less effective in DevonThink. I am also concerned that the pdf page links remain intact on export.

I'm eager for your reply! :)

Joscha · April 9, 2014

Peter100, let's move this to github. Just open a ticket or maybe hijack the other ticket with a description of what your goal is. I will let you know whether it fits into zotfile and point at the relevant code passages to get you started with the implementation.

Peter100 · April 10, 2014

Hi Joscha,

I was away from my computer yesterday and am only getting to your reply now. I will add what I have written above to the GitHub ticket. I should note, however, that I have no programming skills, so perhaps I misinterpreted the word "implementation." What I meant was that I would be interested in having Zotfile work as I described, but I am not able to do the programming to get it to work. Does that make sense?

adamsmith · April 10, 2014

yeah - "implement" in this context does mean "code" - so if that's not something you can do, don't worry about the github ticket.

swaldman · April 15, 2014

There is a paper that was once on my tablet, and is no longer on my tablet, that continues to show up in the Zotero search for "Tablet Files".
I've removed the "_tablet" tag from it, but it's still there... how might this be, please, and how might I fix the situation?!

Thanks :-)

adamsmith · April 15, 2014

that's pretty much impossible - the search is for "Tag --> contains --> _tablet". If such a tag doesn't exist, the item wouldn't be in the search.
Make sure you deleted the tag for the attachment, too, and maybe for good measure restart Zotero/Firefox.
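
A rough sketch of that search logic, to show why a leftover tag on the attachment keeps the parent in the results. The `Item` class and `matches_tablet_search` function are made up for illustration and are not Zotero's actual data model or search code; the assumption that a matching child item brings its parent into the results is mine.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical stand-in for a Zotero item; not Zotero's real data model."""
    title: str
    tags: set[str] = field(default_factory=set)
    children: list[Item] = field(default_factory=list)


def matches_tablet_search(item: Item, needle: str = "_tablet") -> bool:
    """Return True if the item or any child item has a tag containing `needle`."""
    if any(needle in tag for tag in item.tags):
        return True
    return any(matches_tablet_search(child, needle) for child in item.children)


# The parent has no _tablet tag, but its attachment still does.
paper = Item("Some paper", tags={"methods"})
paper.children.append(Item("paper.pdf", tags={"_tablet"}))
print(matches_tablet_search(paper))  # matched because of the attachment's tag

# After removing the tag from the attachment, nothing matches any more.
paper.children[0].tags.discard("_tablet")
print(matches_tablet_search(paper))
```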

swaldman · April 15, 2014

That was what I thought too! Here's a screenshot:
https://www.dropbox.com/s/80n6l109huij37b/zoteroscreenshot.png

The lilac colour is assigned to the _tablet tag; you can see that this item doesn't have the tag. I hadn't realised until you mentioned it that attachments, notes, etc., could have tags that didn't apply to the parent item, but I've been through those and each one reports having 0 tags.

I've tried restarting Zotero & Firefox, and I've tried deleting and recreating the searches... any ideas?!

Thanks :-)

adamsmith · April 15, 2014

odd. That doesn't look like a ZotFile issue then, since ZotFile just uses regular Zotero saved searches. Please start a new thread (include a link to here) and provide an error report ID. https://www.zotero.org/support/reporting_bugs#provide_a_report_id

MikeDacre · April 25, 2014

ZotFile: Move linked files on more than one device
===============================

Is there any way to move Zotfile files on more than one device?

I just moved my ZotFile folder from my Dropbox to Google Drive on one device, and I also changed the naming rules. I then used ZotFile to rename all of the papers.

I went to my second device, made exactly the same changes, but it cannot locate any of the papers now. I get the following error:

"File Not Found

The attached file could not be found.

It may have been moved or deleted outside of Zotero."

I can't figure out any way to batch rename linked files when the files have already been moved; ZotFile just skips all files that don't exist.

Any ideas?

adamsmith · April 26, 2014

You don't want to do that in ZotFile at all. The changed filenames & filepaths sync via Zotero data sync. The files, obviously, sync via Google Drive. You want to set the relative file base directory (Advanced tab of the Zotero prefs, under Files & Folders) to the Google Drive location - first on the computer where you moved the files using ZotFile. Then make sure to sync. Then do the same thing on the other computer.
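
To illustrate the idea behind a base directory: only the relative part of the path needs to be the same on every computer, and each computer joins it with its own local base. This is a conceptual sketch, not Zotero's code; the folder locations, the file name, and the `resolve_linked_file` helper are all hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def resolve_linked_file(base_dir, relative_path):
    """Join a per-machine base directory with the shared relative path."""
    return base_dir / relative_path


# The relative part is what both computers agree on (hypothetical file name).
relative = "Smith_2014_Example.pdf"

# Each computer points its base directory at its own Google Drive folder
# (hypothetical locations).
mac_base = PurePosixPath("/Users/me/Google Drive/ZotFile")
windows_base = PureWindowsPath("C:/Users/me/Google Drive/ZotFile")

print(resolve_linked_file(mac_base, relative))
print(resolve_linked_file(windows_base, relative))
```

Because the stored link is relative, the Google Drive folder can live in a different place on each device without breaking the attachments.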

qsolis · April 27, 2014

The extraction of highlighted text does not include the authors when there are two authors. I tried with 3.1 and 3.3. It only shows year:page. With one author it works fine, as it does with more than two authors. (Linux, FF 28, German language, APA style)
The LibreOffice plug-in has the correct format (A & B) and in Zotero the creator field shows A und B (= A and B).

Anything I'm missing?

adamsmith · April 27, 2014

works for me with two authors. Have you tried different examples?
Also, note that this is completely independent of the citation style selected in Zotero.

qsolis · April 28, 2014

Hm, I tried it with another entry and that showed both authors.
Do you know which .js has the citing function?

bwiernik · April 28, 2014

zotfile.js