Saving web site snapshots in "reading format"

j_xella · October 18, 2024

I appreciate the idea of saving snapshots of a web site as close to the original as possible.

However, some web sites have too much unnecessary noise that is irrelevant to the purposes I use Zotero for (ads, for example). Saving all that noise makes the snapshot more distracting to read later and takes up a lot of storage space in the long run.

When saving a web site, it would be nice to have the option to save the "reading version" of the page, similar to what Pocket or wallabag do.

Is this currently possible with Zotero and, if not, could it be considered as a feature for the future?

pcasl · October 22, 2024

Agree! I hope snapshots can work like this one: https://cn.clearlyreader.com/en/intro

suzannah · January 22, 2025

Yes, please! I agree.

With all the junk on a webpage these days, search is basically trash. Half the time I search for a term, the results are filled with text from an ad in a snapshot.

Also, all of us neurodiverse people, the core of academia honestly, can't handle all that distraction. We need a good clean copy.

There are a ton of people finding markdown workarounds: taking the snapshot, running it through some sort of readability cleaner, and trying to save the result as markdown. I believe it would be fairly easy to use the Mercury Parser that all the other services are using and build this in. Anyone using Capacities or Obsidian has to do this step manually.
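To give a rough idea of what the "readability cleaner" step in that workaround involves, here is a minimal sketch using only Python's standard library. It is not the Mercury Parser and not anything Zotero does; the tag lists and the names `NOISE_TAGS`, `BLOCK_TAGS`, `ReadingTextExtractor` and `to_reading_text` are made up for illustration. It simply drops the contents of tags that usually hold noise and keeps the remaining text as paragraphs, whereas real cleaners use much smarter scoring to find the main article.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as noise. This list is an assumption
# made for this sketch, not a standard or anything a real parser uses.
NOISE_TAGS = {"script", "style", "nav", "aside", "footer", "header", "form", "iframe"}

# Tags that start or end a paragraph-like block in the output.
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "li", "br"}


class ReadingTextExtractor(HTMLParser):
    """Collects text outside of noise tags, with line breaks at block tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0  # greater than 0 while inside a noise tag
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS and self.skip_depth == 0:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS and self.skip_depth == 0:
            self.parts.append("\n")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def to_reading_text(html):
    """Return the non-noise text of an HTML page as blank-line-separated paragraphs."""
    parser = ReadingTextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = [" ".join(line.split()) for line in "".join(parser.parts).splitlines()]
    return "\n\n".join(line for line in lines if line)


if __name__ == "__main__":
    page = (
        "<html><body><nav>Menu</nav><h1>Title</h1>"
        "<p>Hello   world</p><aside>Buy now</aside></body></html>"
    )
    print(to_reading_text(page))
```

The output is plain paragraphs, which is close enough to markdown to paste into a note. A page with an unclosed noise tag would lose everything after it, which is one of the reasons a real readability library is a better choice than something this naive.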

Please consider this.

tim820 · January 24, 2025

I use ad blockers, so I don't really see any ads in my snapshots. Mainly uBlock Origin, Privacy Badger, and NoScript.

If you can get the page looking the way you want it in a 'reading format' view, you should also be able to use the SingleFile browser extension to save the page, and then attach it to the item in Zotero.