ZotFile - Advanced PDF management for Zotero

  • @darcyparks @jadoff I just pushed a new version with the fix (5.0.15). Let me know if that fixes the problem. Thanks @dstillman.
  • @Joscha Auto-update seems broken for 5.0.14. The Error Console shows these messages:

    [JavaScript Error: "XML Parsing Error: prefix not bound to a namespace
    Location: moz-nullprincipal:{...}
    Line Number 4, Column 5:" {file: "moz-nullprincipal:{...}" line: 4 column: 5
    source: " <RDF:Description about="urn:mozilla:extension:zotfile@columbia.edu">"}]

    onUpdateCheckComplete failed to determine manifest type
  • @Joscha @jadoff The new version fixes it for me. Thanks!
  • @Joscha @dstillman The fix works for me as well! Thanks so much for the quick fix!
  • Dear Zotfile experts,
    I am trying to migrate from the Zotero file structure to a custom file structure that mirrors my Zotero library structure, using Zotfile. I often have multiple files per Zotero entry, e.g. multiple pdfs, xls, jpgs, etc. When renaming/moving them to the new location, all files get renamed, and the file names on disk are differentiated by adding a number. In Zotero, however, the multiple files of different types all appear with the same name, with no way to tell apart foo1.pdf, foo2.pdf, foo.jpg, foo.xls, etc.
    Three related questions:

    1. Is it possible to add the file extension to the name (both in Zotero and in the file name) and to differentiate multiple pdfs, at least as foo1.pdf, foo2.pdf, within Zotero?

    2. Renaming does not (always) work for some files (generally jpgs, although I also had some scanned pdfs for which it does not work). The dialog "zotfile: Unnamed attachments" freezes/remains gray. This happens more often with the Zotfile renaming format, but also with the standard Zotero renaming format.

    3. How can I automatically list or exclude entries with multiple attachments for manual treatment?

    Thank you all for your help.
  • Hi all, I recently tried to start using the send to tablet feature again, but I am having issues because I used it a long time ago and deleted the zotfile tablet folder in the meantime. When I reactivate it, all the old papers I have in zotfile pop up in the saved searches, and deleting the '_tablet' tag *does not remove them*. Every time I click on those saved searches, I get a string of errors, one for each of 343 papers, because the files can't be found in the new, empty folder.

    Is there any way to clean old ones out and start from scratch? Deleting all the tags is apparently not enough. Thanks!
  • @MikeDacre: You could try to suppress the warning messages by toggling the extensions.zotfile.tablet.showwarning preference in the Config Editor.
    See here: https://github.com/jlegewie/zotfile/issues/417.
  • @qqbb Thanks, that looks like exactly the same issue as mine. Unfortunately, the fix did not work for me, but I can move my comments over to that github issue since they probably make more sense there.

    It would be great if there were a way to reset ZotFile so that it clears all saved memory of files that were previously on the tablet. I always thought it just used the tags to track them, but that is clearly not the case.

    Thanks again.
  • You could check the discussion here. Some similar issues might have been fixed recently, so make sure you're running the current Zotfile version (5.0.16).
  • Hi,

    I'm new to ZotFile and I have an issue with the extracted annotations. I believe the problem arises because the annotations are written in a language that ZotFile may not recognise, although the characters are the same as in English or Spanish, and I have no problems with those languages.

    Here's what it looks like:

    "Es diu que un consumidor es troba en situació�de�mercat�informat quan considera que disposa de criteris autònoms d'avaluació del producte, i en situació�de�mercat�no�informat en el cas contrari."

    Any suggestions?
  • @GabrielRobles: It seems that some characters with accents are properly extracted. Could you select the text in your pdf viewer, then copy and paste it to a text editor? If the pasted text remains of poor quality, it might be that the (hidden) text layer of your pdf is not correct. You might get better results by running OCR software on the pdf.

    Maybe the Zotero OCR add-on can do this. I would test it first on a copy of the pdf. If you are using Windows, you could check if the PDF-XChange Editor and its OCR Language Extensions could be of help. A free version for academic use seems to be offered here, but I don't know if the language files are compatible with this version.
  • @qqbb: Thanks for your help. I tried copying and pasting into a text editor, and the question marks are still there. But in the process I realized that the problem isn't the characters. The words with question marks are the words in bold.
    In the same document, however, the titles are also in bold and they are not a problem. It only happens in the body text.

    I ran OCR software on the pdf, but it didn't solve the issue.

    Any ideas?

  • You can find some background on the "text layer" that I mentioned above here and here. For scanned documents, this is often a text that is made invisible and shown above the scanned picture. Even if your pdf is not a scanned document, it might contain invisible characters that are problematic here. It seems that you are getting the correct words and punctuation marks, but that there are unwanted characters in between words that are printed in bold font. As you already noted, bold text is normally not a problem for Zotfile. For an illustration of non-printable unicode characters see here.

    Zotfile has a feature that allows replacing unicode characters, which might help with your issue. In the Config Editor, find the preference extensions.zotfile.pdfExtraction.replacements. If you right-click it, you can modify the value to set a replacement rule. For example, a single character replacement could be [{"regex":"ò", "replacement": "o"}] or equivalently [{"regex":"\\u00F2", "replacement": "o"}], see here. So if you could find the unicode value for the unwanted character, you could replace it with a space character.

    Various ways of identifying the unicode characters on the clipboard are given here. This online tool might be useful:
    (Paste your text and click the "Identify" button.)
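
As a rough sketch of the steps above (not part of ZotFile itself), a few lines of Python can list the unusual characters in a pasted excerpt and print a rule in the format quoted for extensions.zotfile.pdfExtraction.replacements. The helper names are made up for this illustration, and the sample string uses U+00A0 (no-break space) purely as a stand-in, since the actual unwanted character in the pdf is not known. The sketch only handles characters up to U+FFFF.

```python
import json
import re
import unicodedata


def suspicious_characters(text):
    """Return (char, code point, Unicode name) for each distinct character
    that is not a letter, a digit, a plain space, or printable ASCII."""
    found = []
    for ch in dict.fromkeys(text):  # distinct characters, first-seen order
        if ch.isalnum() or ch == " " or (ch.isascii() and ch.isprintable()):
            continue
        found.append((ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED")))
    return found


def replacement_rule(ch, replacement=" "):
    """Build a JSON list with one {"regex", "replacement"} object, writing
    the character as a \\uXXXX escape (hypothetical helper, BMP only)."""
    return json.dumps([{"regex": f"\\u{ord(ch):04X}", "replacement": replacement}])


# Hypothetical sample: U+00A0 stands in for the unknown unwanted character.
sample = "situaci\u00f3\u00a0de\u00a0mercat\u00a0informat"

hits = suspicious_characters(sample)
for ch, code_point, name in hits:
    print(code_point, name)

# Text to paste as the preference value in the Config Editor.
rule = replacement_rule(hits[0][0])
print(rule)

# Preview the effect locally with Python's own regex engine.
parsed = json.loads(rule)[0]
cleaned = re.sub(parsed["regex"], parsed["replacement"], sample)
print(cleaned)
```

The preview uses Python's regex engine, whereas ZotFile applies the rule itself, so treat the local result only as a sanity check before changing the preference.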
  • Hi,
    I ran into a problem with Zotfile. The "tablet files" and "tablet files(modified)" are created automatically when I use the "send to tablet" function, but I deleted both of them by mistake. I can still use the "send to tablet" and "get from tablet" functions to view my modified PDFs. However, if I forget which PDF has been modified, I may miss it, since I no longer have "tablet files(modified)" to see what I have sent to the tablet.
    So how can I get my "tablet files" and "tablet files(modified)" back?
  • You can recreate the collections manually or in the Zotfile settings window.
  • Wow, it worked, thank you, @bwiernik!
  • I am having the same problem as @sdknij on page 60. I can reproduce the bug with the steps below:
    1. I click X.pdf -> send to tablet (X.pdf is linked in Google Drive).
    2. I annotate X.pdf on my iPad (via PDFViewer, if that matters).
    3. I click get from tablet, so I get X_annotated.pdf in Zotero (annotations correctly imported).
    4. When I click X_annotated.pdf -> send to tablet (to keep annotating it), the complete file path becomes some_folder/false (originally some_folder/X_annotated.pdf) and no file ends up in the tablet folder. The link to X_annotated.pdf in Zotero also becomes corrupt, as mentioned.

    Many thanks to the developer for helping so many researchers!

    version => 5.0.85, platform => MacIntel, oscpu => Intel Mac OS X 10.15, locale => en-US, appName => Zotero, appVersion => 5.0.85, extensions => ZotFile (5.0.16, extension), Zotero Storage Scanner (5.0.8, extension), Zotero LibreOffice Integration (5.0.22.SA.5.0.85, extension), Zotero Word for Mac Integration (5.0.26.SA.5.0.85, extension), Zotero Scholar Citations (2.0.4, extension, disabled)