Is it possible to export Zotero library or collection as files/folders?

edited September 28, 2018
Is there a way I can export my Zotero library or collection as files/folders such that the hierarchical organization of the selected library/collection being exported is re-created as files in folders?

For example, assume the following structure of my library, where I want to export Project A. Is it possible to export all the attachments of the items in Project A as PDF files, with the PDF files organized in folders/sub-folders that mirror the original hierarchy in the library? Currently Zotero keeps all the files in \Zotero\storage\, which is managed internally for technical reasons related to the database, syncing, etc. Therefore it is not as simple as navigating to \Zotero\storage\ and finding the desired files already organized.

My Library
---- Folder A
------ Item n1
------ Item n2
------ Sub-Folder A1
---------- Item n4
---------- Item n5
---- Folder B
------ Item n1
------ Item n2
------ Item n7
....

Possible scenarios where this would be helpful:

1a) Personal local backup of all organized files, without having to rely on software to navigate through the files, and with the ability to quickly read the desired PDF files by navigating through Windows Explorer. Yes, I understand any changes to these exported files will NOT be synced with Zotero. The idea of this export method is to completely unlink from Zotero or any other software.

1b) Similar to a reason provided by another user, https://forums.zotero.org/discussion/67690/export-library-as-files-and-folders-maintaining-hierarchy. Until now I have been managing all my PDFs manually and importing them into Zotero as "Link to file" because I am always worried I won't be able to recover my files if I use "Store Copy of file". Being able to export as files/folders gives me the ability to recover all my attachments in an organized manner in case I plan to stop using any sort of reference management software in the future. The best way to recover the files would be to export the hierarchical structure of my library into files/folders.

2) Ability to share an organized PDF file collection with another team member if the team member doesn't use any reference manager software (e.g., Zotero).

After reading a similar question, https://forums.zotero.org/discussion/67690/export-library-as-files-and-folders-maintaining-hierarchy, I came across ZotFile as a possible solution. However, I tested ZotFile, and it doesn't properly export all the PDF files while maintaining the folder hierarchy.

  • Zotfile doesn’t do anything with export, but it can be used to organize the files in your library based on a hierarchical collection/folder scheme.
  • @bwiernik Just out of curiosity, how difficult would it be to write a Zotero plugin that would export a selected Library or Collection into the format of files in folders while maintaining the hierarchy?

    I think I can easily do it with Python by querying the zotero.sqlite database. I looked into the Zotero RDF file but I couldn't really understand the structure. If anyone else wants to collaborate, I would be interested in contributing (in Python or JavaScript).
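A rough Python sketch of the zotero.sqlite idea mentioned above. The table and column names used here (`collections`, `collectionItems`, `itemAttachments`, `items`, and a `storage:` prefix on stored attachment paths) are assumptions about the database layout and should be checked against your own zotero.sqlite. The helpers `collection_paths` and `planned_copies` are hypothetical names, and the sketch only plans the copies; it does not touch any files. Run it against a copy of the database, not the live file.

```python
import sqlite3
from pathlib import PurePosixPath


def collection_paths(conn):
    """Map collectionID -> folder path such as 'Folder A/Sub-Folder A1'."""
    rows = conn.execute(
        "SELECT collectionID, collectionName, parentCollectionID FROM collections"
    ).fetchall()
    by_id = {cid: (name, parent) for cid, name, parent in rows}

    def path_of(cid):
        name, parent = by_id[cid]
        if parent is None:  # top-level collection
            return PurePosixPath(name)
        return path_of(parent) / name

    return {cid: path_of(cid) for cid in by_id}


def planned_copies(conn):
    """Return (source under storage/, target under the export root) pairs."""
    paths = collection_paths(conn)
    # Assumed schema: an attachment is either a child of an item in the
    # collection, or a standalone attachment that is itself in the collection.
    query = """
        SELECT ci.collectionID, att.key, ia.path
        FROM collectionItems ci
        JOIN itemAttachments ia
          ON ia.parentItemID = ci.itemID OR ia.itemID = ci.itemID
        JOIN items att ON att.itemID = ia.itemID
        WHERE ia.path LIKE 'storage:%'
    """
    plan = []
    for cid, key, path in conn.execute(query):
        filename = path[len("storage:"):]
        plan.append((PurePosixPath(key) / filename, paths[cid] / filename))
    return sorted(plan)
```

Each pair could then be copied with `shutil.copy2` from the Zotero storage folder to the export folder. Items that live in several collections would show up once per collection, so their files would be duplicated.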
  • It would probably be pretty straightforward. The Zotero RDF format includes 'collection' as a field, so you could probably even write a post-processing script to allocate PDF files to folders.

    If you just want a way to send a batch of PDFs to folders, e.g., for emailing, look at the Zotfile "Send to Tablet" feature, which will copy PDFs to a specific folder, potentially with subfolders based on item metadata/collections.
  • @bwiernik I looked into ZotFile's "Send to Tablet" feature. It doesn't work properly. To maintain the hierarchy, I have to manually go through each collection (folder) and sub-collection (sub-folder). You can read more here: https://forums.zotero.org/discussion/comment/316985/#Comment_316985

    I looked into Zotero RDF, but I couldn't really understand the mapping between "collections" and "attachment". If you're interested in collaborating on the plugin, we can take the discussion private. I can help write the logic using Python or JavaScript.
  • The 'link' nodes for an item list the id numbers for the attachments to that item, listed later in the RDF.

    Each 'attachment' node starts with the attachment id and the attachment file. The 'resource' node under each attachment gives the path to the attachment file, relative to the Exported Items base folder.

    When you export multiple collections, each 'collection' node has subnodes for each of the individual items, attachments, and sub collections belonging to it.

    Sorry, don’t have time to work on a plugin.
  • https://gist.github.com/retorquere/6bad138046bc0c7b0420166ebcb90028 would be a starter. I'm unlikely to do maintenance on this though as I think it's a pretty niche use-case. Keep in mind that this will export file duplicates if items live in multiple collections, and the handling of snapshots is a little naive.
  • @emilianoeheyns This seems to be exactly what I'm looking for! Awesome!

    How do we use it, though? It's a .js file, not a plugin.

    Also, I think it's fine to export duplicates if an item is in multiple collections.

    My scenario is almost identical to situation #2 described by the OP. I am part of a research group, and we have a large number of attachments saved in our Zotero group. However, we need to export everything to a shared Google Drive location in order to share the documents with the larger community. Thank you for your help with this.
  • It's a translator, you just drop it into the zotero translators directory.
  • If you want something that runs automatically on a schedule, something built using pyzotero is going to be more convenient. Or even something that uses the public api. But that's a lot more work than this quicky translator.
  • Also the translator above blithely overwrites existing files, so if you have two files in one collection that have the same filename, you get only one. This could be mitigated by instead saving to a folder named after the item, but even then if you have two same filenames under one item, one would overwrite the other. It also does not (cannot) clean up attachments that no longer exist, so you may have to delete the whole export dir before doing a new export (not sure zotero even allows exporting to a non-empty directory). This will lead to a lot of extra traffic on your Google drive.

    If you want a sort-of-sync, the best option is either to go with pyzotero or give people zotero access to the group. Export of attachments is good for occasional shares, less than ideal for sync.

    You could get around that by exporting to a non-gdrive place and then rsyncing it into gdrive with the appropriate options to only update changed files and delete removed attachments, but at that stage you're stacking hacks on hacks.
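The "export elsewhere, then sync only the changes" idea can also be sketched in Python instead of rsync. This is a minimal one-way mirror under stated assumptions: `mirror` is a hypothetical helper, files count as unchanged when size and whole-second modification time match, and the Google Drive folder is assumed to be a locally mounted directory.

```python
import shutil
from pathlib import Path


def mirror(src, dst):
    """Copy new or changed files from src to dst and delete extras in dst.

    Returns (copied, deleted) as sorted lists of relative paths.
    """
    src, dst = Path(src), Path(dst)
    wanted, copied, deleted = set(), [], []
    for f in sorted(src.rglob("*")):
        if not f.is_file():
            continue
        rel = f.relative_to(src)
        wanted.add(rel)
        target = dst / rel
        if target.is_file():
            s, t = f.stat(), target.stat()
            if s.st_size == t.st_size and int(s.st_mtime) == int(t.st_mtime):
                continue  # looks unchanged, skip the copy
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(f, target)  # copy2 also carries over the modification time
        copied.append(rel)
    if dst.exists():
        for f in sorted(dst.rglob("*")):
            if f.is_file() and f.relative_to(dst) not in wanted:
                f.unlink()  # attachment no longer present in the fresh export
                deleted.append(f.relative_to(dst))
    return copied, deleted
```

Empty folders left behind in the destination are not removed in this sketch, and as noted above this is still a hack stacked on a hack rather than real sync.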

    Note that pyzotero solutions should only be run when zotero is fully closed (that's cmd-Q for macOS people). The best sync solution would be built on the api.
  • Wow. This is awesome!

    For those who may be unfamiliar with how this works, here are the steps to follow:

    1. I downloaded the file from GitHub as a Zip and extracted it. Then, I moved the .js file to my [User home directory]/Zotero/Translators folder.

    2. I restarted Zotero.

    3. To run the export, I

    a. right-clicked our group library and
    b. selected "Export Library..." > and
    c. "File Hierarchy" & "Export Files" > OK > and
    d. selected the target location for the export under "Where" > Save.

    It exported all my attachments and maintained my entire Zotero library structure! Awesome!

    I did run into two issues:

    1. I had several items in the library which had links to Google Books pages. Because of this, the export process aborted when it encountered the links and an "Error" window with an OK button popped up, saying "An error occurred while trying to export the selected file." If I deleted the HTML links from the items, the export would process successfully. Any ideas?

    2. If there are multiple attachments for an item, it only keeps one of the attachments. It would be great if the export script would append "_1", "_2", etc. to the filenames if there are multiple attachments.

    I really appreciate your creation of this script, and I think it will work well for publishing our document collection to the larger team. I think it's fine for the prior export to be overwritten--nice way to keep the exported files fresh. Ideally, we'd like to be able to use a Linked Base directory with our group library's Google Drive folder, in which case the file export approach wouldn't be necessary, as the files would just sort of...be there. But, in the absence of that functionality, this will fill the gap beautifully! Plus, since it's a completely separate export, if someone accidentally moves or renames a file, it doesn't affect our Zotero library at all.

    Thank you again!
  • I don't think Zotero actually allows you to overwrite the export. You need to make sure the target does not exist or Zotero will not allow you to pick it. At least that's how it works for me under Linux.
  • With Mac, it asks if you want to overwrite, and you can tell it to do so.

    With the links, then, you just skip over those, right, since they're not actual files? That's perfect, as far as I'm concerned.

    I see that now it creates folders, for some reason, and puts the files in the folders rather than displaying a list of PDF documents, as it did before. I actually prefer your first approach of sending all the PDFs straight to the same folder, without an enclosing folder. That (1) reduces the time it takes to get to your document and (2) allows them to be sorted by Zotero's "Author - Year - Title" file naming convention. Author-year is the way our team looks for documents (not by the title). If you would like to stick with the folders, can the folders follow the Zotero file naming convention?

    Also, for whatever reason, it still doesn't solve the two-file problem. (Sorry!) I ended up with just one copy of the file. I don't think I saw any prefixes...what would those look like? For files with the same name, could they just have "_1" appended at the end, before the file extension, to distinguish them?

    Thanks again!
  • edited September 28, 2018
    @emilianoeheyns Wow, thank you very much for taking the effort to create this translator plugin. This is exactly what I was looking for. Although this translator is a pretty niche use case, I think many other users would find it beneficial. I have updated my first post with possible scenarios where the translator you created would be helpful.

    @internationaled I agree with you also. The first approach was better, where the files would be dropped into a folder instead of enclosing each file in its own folder. But I think @emilianoeheyns created a folder for each item to account for multiple files in a single item. Maybe a check box option during Export to choose which approach the user prefers would be helpful?

    Also, yes I think if there are multiple attachments to a single item, maybe just appending a number to end of the file name would be helpful.
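The "_1", "_2" suffix idea discussed above can be sketched as follows. This is only an illustration of the naming scheme, not the translator's actual code; `unique_name` is a hypothetical helper, and names are compared case-insensitively on the assumption that the export may land on a case-insensitive file system.

```python
from pathlib import PurePath


def unique_name(filename, taken):
    """Return filename, adding _1, _2, ... before the extension if already taken.

    `taken` is a set of lower-cased names already used in the target folder;
    it is updated in place.
    """
    stem, suffix = PurePath(filename).stem, PurePath(filename).suffix
    candidate, n = filename, 0
    while candidate.lower() in taken:
        n += 1
        candidate = f"{stem}_{n}{suffix}"
    taken.add(candidate.lower())
    return candidate
```

For example, calling it three times with "Smith - 2018 - Title.pdf" and the same `taken` set yields the original name, then "Smith - 2018 - Title_1.pdf", then "Smith - 2018 - Title_2.pdf".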
  • Current translator should do these things.
  • "If you would like to stick with the folders, can the folders follow the Zotero file naming convention?"
    The folder-per-item is now gone, but the translator just outputs the files as they are currently named on disk. This may differ from the name you see in Zotero, where attachments can have a title and a filename -- in the overview, you see the title.
  • @emilianoeheyns thank you for the translator! i was almost ready to abandon zotero without it.
  • @meiselman I've been dealing with this issue, and discovered that the BetterBibTex JSON translator exports the natural hierarchy of collections and sub collections without duplicated records.
  • @emilianoeheyns Hi, I do not seem able to get Zotero to load the File Hierarchy.js translator. I download the file (from https://raw.githubusercontent.com/retorquere/zotero-file-hierarchy/master/File Hierarchy.js), I re-started Zotero, but still it does not appear in the Export list. I am using Big Sur and Zotero 5.0.96-beta.4+cd63f96ee. Any hints?
  • For me it just works when I drop it into zotero/translators with the rest of the translators. Does need a restart, but nothing else. I'm on big sur too, but not on the beta, but that would be a strange issue to have on beta. And my other translators are tested nightly on beta and those have passed so far.
  • If you're on Windows, make sure that the file is actually saved as a .js file and not as a .js.txt file -- I've not been able to convince Windows to save .js files from github as such, so always have to go in and manually edit the file extension of manually downloaded translators.
  • @adamsmith Do you have Windows set to always show file extensions?
  • Yes -- surely that can't be the problem?
  • If they’re hidden I’ve found Windows tends to add extensions to more “programming” file types sometimes
  • Yeah, so I just double-checked and I can reproduce this reliably on Windows (mentioning this here since it may help with troubleshooting -- I obviously know how to fix it and Zotero plays no role in producing it):

    1. On Windows 10, using Firefox, I go to any translator page on github such as https://github.com/zotero/translators/blob/master/A Contra Corriente.js
    2. right-click --> save link as on "Raw"
    3. Save the file with file type "Text file (*.js)" (but using "All types (*.*)" has the same effect)
    4. The file appears as "A Contra Corriente.js.txt" in Windows and if saving into Zotero's translator folder, I have to rename it & remove the .txt part of the extension before it's recognized by Zotero.
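The manual rename in step 4 can be scripted. A minimal sketch, assuming the only problem is a stray ".txt" appended after ".js"; `fix_js_txt` is a hypothetical helper, and the folder you pass is whatever your Zotero translators directory is.

```python
from pathlib import Path


def fix_js_txt(folder):
    """Rename '*.js.txt' files in folder to '*.js'; return the new file names."""
    renamed = []
    for f in sorted(Path(folder).glob("*.js.txt")):
        target = f.with_suffix("")  # drops only the trailing '.txt'
        if target.exists():
            continue  # leave it alone rather than overwrite an existing translator
        f.rename(target)
        renamed.append(target.name)
    return renamed
```

Zotero would still need a restart afterwards before the translator shows up in the export list.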
  • @adamsmith: I can reproduce what you described about downloading raw files from GitHub. This should be an issue with Firefox.

    Using Firefox on Windows, I get a .txt file when right-clicking "Raw", then selecting "save link as". Chrome doesn't have this problem.

    Using both Firefox and Chrome, the following works without problems. Left-click "Raw", which opens the raw file in the browser, then Ctrl+S to save it. This keeps the original extension.
  • Thanks a million @emilianoeheyns for creating this translator and also thanks to @internationaled for explaining how to implement it.
    Just one tiny thing: I find different versions in the links above. That is,
    https://gist.github.com/retorquere/6bad138046bc0c7b0420166ebcb90028 points to a 3k file last edited 2018-09-27.
    and https://raw.githubusercontent.com/retorquere/zotero-file-hierarchy/master/File Hierarchy.js
    to a 4k file last edited 2018-09-28.
    Testing it on a small group library I see no difference in the result. Judging from the code though, the latter one creates suffixes if files have identical names? Should I use the latter one?
  • Probably. I haven't looked at that in, wow, 3 years, but if I took the effort to create a repo for it then that means I intend to do maintenance there.