Meaningful file names in Zotero storage

Andy Glew · November 3, 2009

Zotero stores stuff in conventional UNIX files. This is good. It allows me to use conventional tools, like Google desktop search, to find stuff.

Unfortunately, the names of the files are not human readable. Apparently they are just numbers. Although I sympathize - I've done this myself in the past - it is an obstacle to the acceptance of Zotero in my group.

I wish that Zotero could store stuff in files with human-readable names suggested by the user. Possibly uniquified by numbers, but still human readable.

E.g. the snapshot of the document 252046.pdf
Intel® 64 and IA-32 Architectures
Software Developer’s Manual
from http://www.intel.com/Assets/PDF/manual/252046.pdf
linked to from http://www.intel.com/products/processor/manuals/

is in the filesystem at
C:\Documents and Settings\aglew\Application Data\Mozilla\Firefox\Profiles\tee0f0ar.default\zotero\storage\3784\252046.pdf

If you are using Zotero, you can find this. (Well, not always, but that is a different issue.)

If you are using desktop search, you can find it.

But if you are trying to browse, using conventional UNIX tools, good luck.

Feature Request: let the user provide the name, e.g. for the directory that the Zotero snapshot and metadata are placed in.

If necessary, optionally uniquify by adding a number (or date, or...)

If the user doesn't want to provide the name, generate it as you do now.

But, give the user the option.
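A minimal sketch of the naming scheme requested above, in Python. `unique_dir_name` is a hypothetical helper, not anything Zotero provides. It assumes the user's suggestion is optional, that awkward characters are replaced with underscores, and that a counter is appended only when the name is already taken.

```python
import re
from pathlib import Path
from typing import Optional


def unique_dir_name(storage: Path, suggested: Optional[str], fallback_id: int) -> str:
    """Pick a folder name under `storage` (hypothetical helper, not Zotero code)."""
    if not suggested:
        # No user-supplied name: keep the numeric name, as happens today.
        return str(fallback_id)
    # Replace runs of characters that are awkward in file names with "_".
    base = re.sub(r"[^\w.-]+", "_", suggested).strip("_") or str(fallback_id)
    candidate = base
    counter = 2
    # Uniquify by appending a number only when the name is already in use.
    while (storage / candidate).exists():
        candidate = f"{base}-{counter}"
        counter += 1
    return candidate
```

For example, a suggestion of "Intel 64 and IA-32 SDM" would give a folder named `Intel_64_and_IA-32_SDM`, and a second item with the same suggestion would get `Intel_64_and_IA-32_SDM-2`.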

dstillman · November 3, 2009

Files saved via translators are given meaningful names based on the parent item metadata, and in Zotero 2.0 you can also rename attachment files based on the parent metadata via a context menu option.

You can also rename attachments manually by clicking on the attachment title in the right-hand pane.

The storage folder names are integers in Zotero 1.0 and random strings in 2.0, but in any modern OS you should be able to use a smart folder/saved search to view all PDFs within the storage directory. As for conventional UNIX tools, find /path/to/storage -name '*.pdf' should work just fine.
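For anyone without `find` at hand, roughly the same listing can be sketched in Python. `list_pdfs` is a made-up name, and the storage path is whatever your own profile uses. Unlike `-name '*.pdf'`, this sketch also matches upper-case extensions such as `.PDF`.

```python
from pathlib import Path
from typing import List, Tuple


def list_pdfs(storage: Path) -> List[Tuple[str, str]]:
    """Return (folder name, file name) pairs for every PDF below `storage`."""
    return sorted(
        (pdf.parent.name, pdf.name)
        for pdf in storage.rglob("*")
        # Compare the extension case-insensitively.
        if pdf.is_file() and pdf.suffix.lower() == ".pdf"
    )
```

Printing each pair as `folder/file` shows which opaque folder holds which PDF.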

An improved folder naming scheme might happen eventually, as it's a common request, but it doesn't seem like a very high priority given the fairly trivial workaround of using a smart folder.

There are many threads on this, so do a search if you'd like more details.

davidcoll · April 17, 2010

+1 on human-readable folder names

adamsmith · April 17, 2010

david - could you explain for what purpose and why e.g. a saved search/smart folder wouldn't work for you?

davidcoll · April 21, 2010

ok, hmm. Let's take an experience I had recently: I needed to export a sub-library of my Zotero library so that I could print it on another computer. To do this I looked for the PDFs in the directory "MyUserName\Documents\Zotero bibliographie\storage\...", but it was really messy.

Check this image to understand what I'm talking about : http://funkybudha.net/wp-content/uploads/zotero_storage.jpg

In blue: I'm not inside an AppData\... directory, no, I'm inside a User\Documents\... directory. I'm expecting to have a human-readable document system here. You could see that as a usability request.

In yellow : You have the actual random integer directories. On the right, you see the name of the pdf [renamed with zotero] showing meta-data (author, year) as expected for this type of document.

Now, I'd need to see the same meta-data reflected on the directories to help me pinpoint rapidly the documents I need.

This is one.

(The number two is that I lacked the possibility to export only the pdfs out of my library. Yes, I did it, then searched for the pdfs inside the export, etc. But it was not an "elegant" method to do so.)

dstillman · April 21, 2010

Did you answer adamsmith's question? I don't think so.

davidcoll · April 21, 2010

Believe I did half the answer.

The problem with smart search is that one can't separate the PDFs of one sub-library from those of another. One gets all of them at once, and even in multiple copies... So no, adamsmith, a smart search is actually not workable for retrieving a sub-library's group of PDFs, as the storage directory structure does not reflect the Zotero library structure.

dstillman · April 21, 2010

Any reason you can't drag straight from Zotero? Multi-file drag is currently broken in Windows, but we may be able to fix that in 2.1. Single-file drag works.

adamsmith · April 21, 2010

no one denies that this would be nice in general - it's a question of priority and I remain unconvinced that this is an urgent/high priority issue.

If you want to export a part of your library, just export as RDF with files - which is what, I think, you actually ended up doing (if you want just the files from that, a search for pdf in the folder will do) - while the downside of this is that you have one additional step, the upside is that it allows you to export using Zotero's flexible collection-tag structure rather than a fixed directory structure:
Note that, crucially, Zotero collections are (purposefully) not equivalent to folders on your harddisk - one item can be in several collections and thus a directory structure cannot reflect the collection structure in Zotero - simply because they're two different things. And forcing one into the other is not elegant, either.
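The "search for pdf in the folder" step after an export with files could be sketched as below. This is an illustration only: `collect_pdfs` is a hypothetical helper, and it assumes the destination folder lies outside the export folder and that a name clash is resolved by prefixing the name of the containing folder.

```python
import shutil
from pathlib import Path
from typing import List


def collect_pdfs(export_dir: Path, dest: Path) -> List[str]:
    """Copy every PDF below `export_dir` into the flat folder `dest`."""
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for pdf in sorted(export_dir.rglob("*.pdf")):
        if not pdf.is_file():
            continue
        target = dest / pdf.name
        if target.exists():
            # Same file name in two export folders: prefix the folder name.
            target = dest / f"{pdf.parent.name}_{pdf.name}"
        shutil.copy2(pdf, target)
        copied.append(target.name)
    return copied
```

The result is a single folder of PDFs that can be handed to someone who does not use Zotero, at the cost of losing the collection structure.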

And the whole point about saved searches is that you _can_ have human readable data in a folder in your documents directory - you just need to use a built in function of your operating system - thus saving Zotero devs the time to re-program something that's already possible.

Dan/Trevor - this would be a good topic for the new Zotero Tips/Tricks series on the blog.

Once again - the point is not that this wouldn't be _nice_ in Zotero, but let me give you a small list of things that need yet to be implemented in Zotero and see if you really believe this is more important:

- Duplicate detection
- implementation of new CSL (allows for original date of publication and much more)
- Better note management (allow shuffling of notes, hierarchical notes and more)
- Hierarchical data structure - allow for chapters to be grouped as sub-items to e.g. an edited volume
- Improve tag functionality - esp. introduce colored tags for better visibility
- Improve functionality for other word processors through improved RTF-scan or similar feature
- Improve the style repository for searchability
- improve online display - allow for selective publication of libraries, allow sorting of online libraries etc.
- improve data input and output from/to other software especially Endnote - e.g. it's not possible to export files/file links along with an Endnote/RIS export.

and that's just a small selection of things that aren't possible to do without a workaround.

davidcoll · April 21, 2010

Single-file work is just fine. The difficulty was in dealing with a large number of articles/PDFs: "click 1" to open the article (+ sign), "click 2" to drag it into Windows, then repeat this x times. Batch work is just not top of the shelf yet.

But hey, that's trivial stuff, no issue on that (no high priority) ! Take it as a request for the future, not as a bug. I was just pulling the thread so it wasn't forgotten.

davidcoll · April 21, 2010

I think you may need a pat on the back... Let me give you one: Zotero is really a great tool!! Wish you the best ;)

thinkdata · April 21, 2010

adam and dan: I love the fact that you not only answer questions, but push back. This speaks to a level of passion that never makes it in a job description, and shows that you actually care about things.

For those of you not in software development, there is a school of thought that mistrusts a programmer that has an opinion, and considers such a person a "risk" from an HR perspective. The corporate drill is that you do what you are told to do, and little value is placed on "systems" thinking. (I of course, don't have any baggage in this regard)
doug

adamsmith · April 22, 2010

just for clarification: I have no job with Zotero.
I come here to do something useful while I'm procrastinating ;-).

Also, I don't see what I'm doing here (or elsewhere on the forum) as "pushing back" in any sense.

I view this - and open source software more generally - as a community effort that benefits from people getting a sense of what's going on, how the software works, some of the underlying concepts, and what type of considerations drive decision making etc. In my mind that creates a sense of "ownership" of software that encourages people to contribute in one of the multiple possible ways and help make Zotero better. For that reason I also agree very much with miggug that it'd be helpful - and worth some dev time - if we could get some insight into the plans of the dev team at least for the medium run.

naught101 · April 3, 2012

One thing that meaningful foldernames might be useful for is figuring out which items in zotero own the folders. For example, I have trouble with storage space, so I want to see which large files I can get rid of. I can easily do this on the command line, but unfortunately I haven't smart-renamed all my documents, and the original sources weren't very smartly named to begin with. That means I basically have to guess what item "F7SJKFMP/frogs.pdf" belongs to. Some files are even less well named.

I guess there are probably other possibilities for solutions to this, but I don't really understand why a name like "[author][year]" or "[author][year][hash]" wouldn't be just as simple as the current system.
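The command-line step described above, finding the large files, might look like this sketch. It assumes the `storage/<folder>/<file>` layout seen in "F7SJKFMP/frogs.pdf", and `largest_attachments` is an invented name. It only ranks files by size; it does not say which Zotero item owns a folder.

```python
from pathlib import Path
from typing import List, Tuple


def largest_attachments(storage: Path, top: int = 10) -> List[Tuple[int, str]]:
    """Return up to `top` (size in bytes, 'FOLDER/file') pairs, largest first."""
    sizes = [
        (f.stat().st_size, f"{f.parent.name}/{f.name}")
        # One level of item folders, with the files directly inside them.
        for f in storage.glob("*/*")
        if f.is_file()
    ]
    return sorted(sizes, reverse=True)[:top]
```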

mronkko · April 3, 2012

I guess there are probably other possibilities for solutions to this, but I don't really understand why a name like "[author][year]" or "[author][year][hash]" wouldn't be just as simple as the current system.

It is not hash, but a key that is permanent for an item. One of the problems in this is that author and year might change during the life cycle of an item and might not be available when the item is created.

But if you want to play with the command line, you can probably solve your problem by modifying these scripts

http://forums.zotero.org/discussion/16148/create-a-humanreadable-folderstructure-from-zoterostorage/

https://github.com/mronkko/ZoteroCleanOrphans

(They are separate scripts, I did not put the first one on GitHub for some reason.)

odoyle81 · November 4, 2012

ummm... I want to export a bunch of pdfs so I can share them with others. When I do this via highlighting, then right click, export RIS (with files), the folder names are meaningless numbers. How do the people that I'm sharing this with know which folder holds which reference? Drag and drop out of zotero only works if I expand each reference and highlight the actual pdf file to drag and drop.
Any easy solution that I'm missing?
Zotero was great until I needed to share with others who don't have zotero.. now it is a nightmare..
And no I can't use zotero groups because I don't have enough storage space for the pdfs and those people don't want to signup for an account at zotero and have to login etc etc... I can't help that.. they are my bosses and they just want the pdfs.
thanks for ideas

dstillman · November 4, 2012

Drag and drop out of zotero only works if I expand each reference and highlight the actual pdf file to drag and drop.

Press the "+" key to expand all items and select the attachments you want or create a saved search (say, by Attachment File Type) and use Ctrl/Cmd-A to highlight just the search matches. Then use drag and drop.

odoyle81 · November 4, 2012

ok I found a workaround... highlight all refs, then use this trick to expand them all, then you can drag and drop and the pdfs will all land in one folder!

Press "+" (plus) on the keyboard within the collections list or items list to expand all nodes and "-" (minus) to collapse them.

http://www.zotero.org/support/tips_and_tricks