Delete all snapshots in one go
Hi, I found that backing up my Zotero library takes a long time. I guess the snapshots contain a large number of files that I seldom use. How can I delete all these snapshots at once? I need to keep the PDFs.
Thanks!
Thanks for the suggestions. That is indeed a good approach if you want to view the text later, especially for items without a full-text PDF and when you are offline.
I have deleted all the snapshots using the search function suggested by Dan. My Zotero file folder now occupies ~3.5 GB, down from 10+ GB. It also saves a lot of time when backing up the library.
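For anyone who wants to check how much space is at stake before deleting anything, here is a minimal sketch that totals file sizes by extension under a folder. It assumes your attachments live in a folder you can point it at (on a default install that is usually the `storage` folder inside the Zotero data directory, but check your own setup); the function name `bytes_by_extension` is just something I made up for illustration.

```python
from collections import Counter
from pathlib import Path


def bytes_by_extension(root: Path) -> Counter:
    """Return total size in bytes per (lowercased) file extension under root."""
    totals = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)".
            key = path.suffix.lower() or "(none)"
            totals[key] += path.stat().st_size
    return totals


def report(root: Path, top: int = 10) -> None:
    """Print the largest extensions first, in megabytes."""
    for ext, size in bytes_by_extension(root).most_common(top):
        print(f"{ext:10s} {size / 1_000_000:10.1f} MB")
```

Calling `report(Path("/path/to/your/Zotero/storage"))` (path is a placeholder) lists the biggest file types. If `.pdf` is a small share and images, scripts, and `.html` files dominate, snapshots are probably where the space is going. It only reads file sizes and changes nothing.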
NOTE: This only works if you have NOT renamed the snapshots. I found that I did rename one of my snapshots.
attachment is PDF
attachment is Webpage
and include parents & children, create a saved search. Tag all the top-level items in the saved search with an otherwise unused tag (most easily a colored one so you can batch assign by number key), then filter by that tag and search for Snapshot as described above.
At some deep level, Zotero knows the difference between these files because they have different icons associated with them. (Since I'm trying to convert all old snapshots to single-file HTMLs, it would be nice if there were some search method that could tap into that deeper metadata.)
[NB: This is for anyone interested in converting multi-file webpages (or otherwise isolating only snapshots for deletion even if you've changed their names and so on).
It will only be meaningful to you if, like me, you have a substantial mix of newer content-rich single-file HTMLs (that you want to keep) along with older snapshots or multi-file webpages (that you want to convert or discard). The main reason for attempting this at all is freeing up disk space and cutting down on clutter. If none of this applies, these suggestions are not for you. If you have better suggestions, please offer them. And finally: all the usual caveats... back up your data; attempt at your own risk; proceed with caution.]
1. Using a script or text-editing app of your choice, hunt down a meta tag in your single-file HTMLs that's unique to them. It will depend on which browser add-on you used to make them; you'll have to open a few and have a look. Find-and-replace all instances of that tag in all .html files so that a hidden div is inserted next to it in every file. This div should have some identifying text inside it, e.g., "single_file_HTML." This will trick Zotero into indexing that text as real content, although in 99 percent of cases it will not affect how the page displays, because a div outside a body will generally be ignored by web browsers. Like I said, this is a hack.
2. Rebuild your full-text index in Zotero. If you don't index, turn it on.
3. Create and save an advanced search for all Attachment Content containing your hidden text, in all Attachment Types that are webpages. For simplicity, create a new tag such as #single_file_HTMLs and tag all items in your search results with it. Now you can create an advanced search for all webpages that are not tagged in this way: i.e., your snapshots. Delete them, or do whatever you want with them.
When you're done, you can undo the find-and-replace routine in step 1. Probably a good idea.
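In case it helps, here is a rough sketch of how the find-and-replace in step 1 and its undo could be scripted. It is not the exact routine I used, just one way to do it. The `ANCHOR` value is a made-up placeholder: replace it with whatever tag you actually find in your own single-file HTMLs. The `hidden` attribute on the div and the function names are likewise my own choices for illustration. It works on raw bytes so it doesn't have to guess each file's encoding. Run it on a copy of your storage folder first.

```python
from pathlib import Path

# HYPOTHETICAL placeholder: replace with the tag unique to your single-file HTMLs.
ANCHOR = b'<meta name="example-single-file-marker"'
# The identifying text that Zotero's full-text index should pick up.
MARKER = b"<div hidden>single_file_HTML</div>"


def add_marker(data: bytes) -> bytes:
    """Insert MARKER just before the first ANCHOR, unless already marked."""
    if ANCHOR not in data or MARKER in data:
        return data
    return data.replace(ANCHOR, MARKER + ANCHOR, 1)


def remove_marker(data: bytes) -> bytes:
    """Undo add_marker by deleting every occurrence of MARKER."""
    return data.replace(MARKER, b"")


def process_tree(root: Path, transform) -> int:
    """Apply transform to every .html file under root; return how many changed."""
    changed = 0
    for path in root.rglob("*.html"):
        if not path.is_file():
            continue
        original = path.read_bytes()
        updated = transform(original)
        if updated != original:
            path.write_bytes(updated)
            changed += 1
    return changed
```

You would call `process_tree(folder, add_marker)` before rebuilding the index in step 2, and `process_tree(folder, remove_marker)` once the tagging in step 3 is done. Files that don't contain the anchor tag (i.e., the old snapshots) are left untouched, and running the first call twice does not insert the marker twice.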
Using these steps, I was able to find all snapshots and multi-file htmls.
Converting them in one batch, as expected, was a bit more complicated. But it reduced my storage footprint by about 40 percent, so worth it. One day I may post on that process. But it will become largely irrelevant (I hope) once Zotero goes to single-file HTMLs for snapshots by default, which I believe is planned.
I find it more convenient to use relative links for my attachments and store my library locally, then sync it across machines with a platform that offers more space. For instance, my institution offers virtually unlimited space via Google Drive. Using Google Drive sync, my attachments are available and linked to the Zotero data library on each machine I use.
I believe elsewhere in the forum you can find further instructions on how to implement that if you want to go that route. Also, incidentally, I haven't looked into this issue recently so I don't know if the latest version of Zotero has moved yet to single-file snapshots, in part because the process I'm using now works so well. Good luck!