Way(s) to clean up zotero data folder?
My Zotero library is starting to get near the limits of my webdav server. One thing I've noticed is that sometimes there are "extra" PDFs in a zotero folder that aren't in the zotero database. For example, in the zotero interface, Paper A has one PDF attachment, but if you choose "Show file", there will be 2 or 3 in the folder, mostly repeats with different filenames, but of the same file. I have a feeling that there are quite a few of these "dead" files running around my zotero data directory and that they are eating space.
Another thing is that I have 746 records in Zotero, but my zotero data folder has 1486 folders. Are these records I have deleted in the zotero interface (perhaps in previous, pre-trashcan versions)? If so, are these deleted files still there, eating space?
Is there a way to poke around for any files in the Zotero directory that are not "known" by Zotero and remove them?
Thanks :)
CB
Only one file is linked to Zotero for each attachment, so there's no harm in removing non-primary files. (This also means Zotero won't detect any deletions, so you'd have to clear the files from the WebDAV server and use Reset File Sync History in the Sync->Reset pane of the Zotero prefs.)
Only attachment items have 'storage' folders. You can use an Advanced Search for [Item Type] [is] [Attachment] and Select All to get a count (though that would include linked files and web links as well).
As for finding files in the data directory that Zotero doesn't know about: not really, at the moment. We can probably write a little utility function for this at some point, or somebody could write a plugin to do it.
Obviously, if you can reproduce the orphaning of a directory in the 'storage' folder, that's a bug. We're not aware of any current bugs of this sort, but it's possible they've existed.
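Until such a utility exists, a rough way to spot candidate orphans is to compare the folder names in 'storage' with the attachment keys recorded in the database. The following is only a sketch under stated assumptions: it assumes each 'storage' subfolder is named after its attachment item's key, and that the database has an `items` table with `itemID` and `key` columns and an `itemAttachments` table with an `itemID` column. Check both against your own zotero.sqlite first. The function names are made up for illustration. It only reports folder names and deletes nothing.

```python
import sqlite3
from pathlib import Path


def attachment_keys_from_db(db_path):
    """Return the keys of all attachment items recorded in the database.

    ASSUMPTION: tables items(itemID, key) and itemAttachments(itemID) exist;
    verify this against your own zotero.sqlite before trusting the result.
    """
    # Open read-only so this script cannot modify the library.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT items.key FROM items "
            "JOIN itemAttachments ON items.itemID = itemAttachments.itemID"
        ).fetchall()
    finally:
        conn.close()
    return {row[0] for row in rows}


def find_orphan_folders(storage_dir, known_keys):
    """Return names of subfolders of storage_dir that are not in known_keys."""
    return sorted(
        p.name
        for p in Path(storage_dir).iterdir()
        if p.is_dir() and p.name not in known_keys
    )
```

It is probably safest to run this against a backup copy of the data directory with Zotero closed, and to inspect any reported folder by hand before removing it.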
As for the multiple PDF issue, I've had zotero since an early version, across multiple computers, multiple platforms, multiple webdavs, several newbie screwups, and several system failures, so I'm quite certain it was some wacky thing I did when I was still figuring out how to get it all running smoothly (which it seems to be now). I blame at least some of the duplicate PDF issue on me trying to manually fix duplicate records and update filenames in the days before the "rename file from metadata" option was there, but I really doubt I did that for 700 files.
Without a way to identify the orphaned files, it's hard to know if it is a zotero bug or just a Dumb Newbie User Error.
+1 vote for a "clean out storage folder" utility
+1 vote for someone to write a plug-in!
What data would I lose by doing this (folders? tags? notes?)
Thanks,
CB
For the dupes, one thing you could do is create a smart folder via your OS for PDFs within your 'storage' directory and just go down the list, deleting the obvious dupes. Then do what I said above to trigger a re-upload.
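If going down the list by eye gets tedious, the obvious dupes (same folder, identical bytes, different filename) can be found by hashing. This is a minimal sketch, not a Zotero feature: the function names are hypothetical, and it assumes the attachments end in a lowercase `.pdf`. It cannot tell which copy is the one Zotero links to, so check in the Zotero interface before deleting anything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def duplicate_pdfs(storage_dir):
    """Group PDFs that sit in the same folder and have identical content.

    Returns a list of groups; each group is a sorted list of two or more paths.
    """
    groups = defaultdict(list)
    for pdf in Path(storage_dir).rglob("*.pdf"):
        if pdf.is_file():
            # Key on (folder, digest) so copies in different folders,
            # which may be legitimate separate attachments, are not grouped.
            groups[(pdf.parent, file_digest(pdf))].append(pdf)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Each returned group holds files that are byte-for-byte identical, so at most one of them needs to stay.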
What if I reset the WebDAV sync (i.e., overwrote the WebDAV copy from my local Zotero), then did the new Firefox profile thing and downloaded everything from the WebDAV server (making backup copies first, of course)?
Would that keep the references in my Word docs happy? The neatnik in me would really like to clean up everything, everywhere.
First, make a backup copy of your Zotero data directory.
Then do the smart folder thing above (or find and remove the dupes manually), since that's the only way to take care of unwanted files in valid directories.
Next, delete all files off your WebDAV server, Reset File Sync History, and do a sync. Zotero will sync only the folders it knows about.
Then delete the 'storage' folder within the data directory. Leave the database and all other files intact. Sync, and Zotero should pull all the files back down.
(Alternatively, to avoid the extra download, you could just compare the files on the WebDAV server with the local folders and delete the extra local ones.)
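That comparison can be sketched as a set difference, once the server has been wiped and re-synced so that it holds only the folders Zotero knows about. This is an illustration under an assumption you should verify on your own server: that each synced attachment is stored there as a file named `<folder name>.zip`. How you obtain the server's file listing is up to you; the function names below are hypothetical.

```python
from pathlib import PurePosixPath


def server_keys(server_file_names, suffix=".zip"):
    """Return the base names of server files that end with the given suffix.

    ASSUMPTION: one '<folder name>.zip' per synced attachment on the server.
    """
    return {
        PurePosixPath(name).stem
        for name in server_file_names
        if name.endswith(suffix)
    }


def extra_local_folders(local_folder_names, server_file_names):
    """Return local 'storage' folder names with no matching file on the server."""
    return sorted(set(local_folder_names) - server_keys(server_file_names))
```

For example, `extra_local_folders(["AAAA1111", "CCCC3333"], ["AAAA1111.zip", "AAAA1111.prop"])` returns `["CCCC3333"]`, which would be a candidate for local deletion after a manual check.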