Delete all snapshots in one go

smoother · May 8, 2016

Hi, I've found that backing up my Zotero library takes a long time. I suspect the snapshots contain a large number of files that I seldom use. How can I delete all of these snapshots at once? I need to keep the PDFs.
Thanks!

dstillman · May 9, 2016

If you can match them with a search (e.g., for "snapshot" in "All Fields & Tags" mode), you can press Ctrl-A/Cmd-A to select all the matches (the black, as opposed to gray, matches) and then press Delete/Backspace to delete them. You can also match them with an advanced search (e.g., "Attachment File Type" "is" "Web Page"), save it as a saved search, and do the same from there.

smoother · May 9, 2016

Dan, using the search function helps a lot. Thanks!!

bob.builder77 · May 11, 2016

Snapshots often download tens or even hundreds of the little files that go into modern websites - javascript files, svg files, gifs by the million - as well as the html and css files that form the 'core' of the webpage. If you want to keep the html and css without all the rest (which will usually leave you with a crippled web page that pretty much just has the text and not much else - but that's usually all you need!) you can navigate to Zotero's storage folder, search through ALL the folders for the unnecessary file types (.js, .svg, .gif, .jpg etc) and delete them all in one go. That streamlines your library without *completely* losing the snapshots if you ever need the text for later.
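For anyone who would rather script that cleanup than do it by hand in a file manager, here is a minimal Python sketch. It rests on assumptions that are not confirmed in this thread: that each attachment lives in its own subfolder of the storage folder, and that a subfolder with an .html file at its top level is a snapshot. The function name, the extension list, and the dry-run default are all illustrative. Back up first, and keep in mind that deleted files do not go to the Zotero trash.

```python
from pathlib import Path

# Extensions treated as disposable snapshot assets (taken from the post above;
# extend or trim this set to taste).
ASSET_EXTENSIONS = {".js", ".svg", ".gif", ".jpg"}
HTML_EXTENSIONS = {".html", ".htm"}


def prune_snapshot_assets(storage_dir, extensions=ASSET_EXTENSIONS, dry_run=True):
    """Return asset files found in snapshot folders; delete them unless dry_run."""
    matched = []
    for item_dir in sorted(Path(storage_dir).iterdir()):
        if not item_dir.is_dir():
            continue
        # Assumption: a folder is a snapshot only if it holds an HTML file at
        # its top level. This leaves standalone image attachments alone.
        has_html = any(
            p.is_file() and p.suffix.lower() in HTML_EXTENSIONS
            for p in item_dir.iterdir()
        )
        if not has_html:
            continue
        for path in sorted(item_dir.rglob("*")):
            if path.is_file() and path.suffix.lower() in extensions:
                matched.append(path)
                if not dry_run:
                    path.unlink()
    return matched
```

Calling `prune_snapshot_assets("/path/to/Zotero/storage")` with the default `dry_run=True` only lists what would be removed; pass `dry_run=False` once the list looks right. The path here is a placeholder for your own storage folder.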

smoother · May 11, 2016

Hi bob.builder77,
Thanks for the suggestion. It is indeed a good approach if you want to view the text later, especially for items without PDF full text and when working offline.

I have deleted all the snapshots using the search function suggested by Dan. My Zotero file folder now occupies ~3.5 GB, down from 10+ GB. It also saves a lot of time when backing up the library.

bulrush · May 23, 2016

> If you can match them with a search (e.g., for "snapshot" in "All Fields & Tags" mode),

NOTE: This only works if you have NOT renamed the snapshots. I found that I did rename one of my snapshots.

vooras · January 21, 2019

Anyone know how to efficiently delete all snapshots for items that have PDFs? That is, I would like to keep snapshots if I don't have a PDF.

adamsmith · January 21, 2019

Nothing super simple, but you can create a search for
attachment is PDF
attachment is Webpage
and include parents & children, create a saved search. Tag all the top-level items in the saved search with an otherwise unused tag (most easily a colored one so you can batch assign by number key), then filter by that tag and search for Snapshot as described above.

ria3k · January 27, 2019

Dan's advanced search also captures single-file HTMLs. What if I only wanted to capture Zotero-produced snapshots in my search, including those I have renamed such that the word "snapshot" is not in their title?

At some deep level, Zotero knows the difference between these files because they have different icons associated with them. (Since I'm trying to convert all old snapshots to single-file HTMLs, it would be nice if there were some search method that could tap into that deeper metadata.)

ria3k · February 1, 2019

Well, in the meantime, I'll offer a hack.

[NB: This is for anyone interested in converting multi-file webpages (or otherwise isolating only snapshots for deletion even if you've changed their names and so on).
It will only be meaningful to you if, like me, you have a substantial mix of newer content-rich single-file HTMLs (that you want to keep) along with older snapshots or multi-file webpages (that you want to convert or discard). The main reason for attempting this at all is freeing up disk space and cutting down on clutter. If none of this applies, these suggestions are not for you. If you have better suggestions, please offer them. And finally: all the usual caveats... back up your data; attempt at your own risk; proceed with caution.]

1. Using a script or text-editing app of your choice, hunt down a meta tag in your single-file HTMLs that's unique to them. It will depend on which browser add-on you used to make them; you'll have to open a few and have a look. Find-and-replace all instances of that tag in all .html files so that a hidden div is inserted alongside it in every file. This div should have some identifying text inside of it, e.g., "single_file_HTML." This will trick Zotero into indexing that text as real content, although in 99 percent of cases it will not affect how the page displays, because a div outside a body will generally be ignored by web browsers. Like I said, this is a hack.

2. Rebuild your full-text index in Zotero. If you don't index, turn it on.

3. Create and save an advanced search for all Attachment Content containing your hidden text, in all Attachment Types that are webpages. For simplicity, create a new tag such as #single_file_HTMLs and tag all items in your search results with it. Now you can create an advanced search for all webpages that are not tagged in this way: i.e., your snapshots. Delete them, or do whatever you want with them.

When you're done, you can undo the find-and-replace routine in step 1. Probably a good idea.
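A rough Python sketch of the find-and-replace in step 1 and its undo might look like the following. Everything named here is an assumption for illustration: `SIGNATURE` is a made-up placeholder that you would replace with the tag your own add-on actually writes, `MARKER` is one possible hidden div, and the function names are hypothetical. Files are rewritten byte-for-byte apart from the inserted marker, but run it on a backup copy first.

```python
from pathlib import Path

# Hypothetical placeholder: replace with the tag your add-on really writes.
SIGNATURE = '<meta name="example-generator" content="example-single-file-addon">'
# Hidden div carrying the identifying text for the full-text index.
MARKER = '<div style="display:none">single_file_HTML</div>'


def add_marker(html, signature=SIGNATURE, marker=MARKER):
    """Insert the marker right after the first signature tag, at most once."""
    if signature not in html or marker in html:
        return html
    return html.replace(signature, signature + marker, 1)


def remove_marker(html, marker=MARKER):
    """Undo add_marker by stripping the marker div."""
    return html.replace(marker, "")


def rewrite_html_files(root_dir, transform):
    """Apply transform to every .html file under root_dir; return changed paths."""
    changed = []
    for path in sorted(Path(root_dir).rglob("*.html")):
        raw = path.read_bytes()
        # surrogateescape keeps any non-UTF-8 bytes intact through the round trip.
        text = raw.decode("utf-8", errors="surrogateescape")
        updated = transform(text).encode("utf-8", errors="surrogateescape")
        if updated != raw:
            path.write_bytes(updated)
            changed.append(path)
    return changed
```

Under those assumptions, `rewrite_html_files(storage_dir, add_marker)` would be run before rebuilding the index in step 2, and `rewrite_html_files(storage_dir, remove_marker)` afterwards to restore the files.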

Using these steps, I was able to find all snapshots and multi-file htmls.

Converting them in one batch, as expected, was a bit more complicated. But it reduced my storage footprint by about 40 percent, so worth it. One day I may post on that process. But it will become largely irrelevant (I hope) once Zotero goes to single-file HTMLs for snapshots by default, which I believe is planned.

RoiAd · March 20, 2022

@bob.builder77 does that method also delete them in the online library?

ria3k · March 21, 2022

I don't use the online library for storage, because of the space limits on Zotero's free storage. But I assume that if you are syncing then, yes, any changes you make locally will affect your online library.

I find it more convenient to use relative links for my attachments and store my library locally, then sync it across machines with a platform that offers more space. For instance, my institution offers virtually unlimited space via Google Drive. Using Google Drive sync, my attachments are available and linked to the Zotero data library on each machine I use.

I believe elsewhere in the forum you can find further instructions on how to implement that if you want to go that route. Also, incidentally, I haven't looked into this issue recently so I don't know if the latest version of Zotero has moved yet to single-file snapshots, in part because the process I'm using now works so well. Good luck!