Reducing the size of saved pages

Many web pages have elements I don’t wish to save or print.

I am mainly interested in doing this to save disk space under the storage directory.

I saw many postings referring to a Firefox add-on called Scrapbook, which lets you edit a web page before saving. It does a good job of editing the page, but it does not save much space.

I did an experiment using the following web page: http://online.wsj.com/article/SB10001424052970204136404577211391192172770.html?mod=WSJ_hpp_MIDDLE_Video_Third

I saved the web page using different methods, then right-clicked -> Properties on each result to find its size and file count.

Directly from Firefox (Save As, complete webpage): 3.74 MB, 96 files, 2 folders

Directly from Firefox (Save As, HTML only): 231 KB, 1 file
This looks horrible (it only looks good if you have not cleared the cache).

I next tested using Scrapbook after cutting unwanted elements using two different methods.

Saving the edited webpage directly from the browser, not into the Scrapbook repository: 3.61 MB, 92 files, 2 folders
This is how I think it would behave if using Scrapbook before saving to Zotero.

Saving the edited webpage through Scrapbook into the Scrapbook repository: 912 KB, 156 files, 2 folders
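
For anyone who wants to repeat these measurements without right-clicking each save, here is a minimal Python sketch. It assumes the layout Firefox uses for "complete webpage" saves (an .html file plus a sibling folder with the same name and a _files suffix); the function name saved_page_footprint is just something made up for illustration. Scrapbook and Zotero lay out their storage differently, so the path handling would need adjusting there.

```python
import os


def saved_page_footprint(html_path):
    """Return (total_bytes, file_count, folder_count) for one saved page.

    Assumption: the page was saved as "complete webpage", so assets live in
    a sibling folder named <page>_files next to <page>.html.
    """
    total_bytes = os.path.getsize(html_path)
    file_count = 1          # the .html file itself
    folder_count = 0

    base, _ext = os.path.splitext(html_path)
    assets_dir = base + "_files"

    if os.path.isdir(assets_dir):
        folder_count += 1   # the _files folder itself
        for root, dirs, names in os.walk(assets_dir):
            folder_count += len(dirs)
            for name in names:
                total_bytes += os.path.getsize(os.path.join(root, name))
                file_count += 1

    return total_bytes, file_count, folder_count
```

Calling saved_page_footprint("article.html") on a saved page returns the byte total, the number of files and the number of folders, which are the same three figures the Properties dialog shows.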

I have also used another Firefox add-on called “Print Edit”. I use it to clean up web pages before printing, but unfortunately it is not perfect. The Zotero toolbar icon and status bar icon disappear when using this tool (the page reappears as a pop-up). There is also a feature to save the edited web page, but it does not work, so I could not test whether it would save any space.


Does anyone know of any other Firefox tools to edit and reduce the size of web pages?
I bet that if I could eliminate the unwanted elements on a web page before saving to Zotero, my repository storage directory would be at least 50% smaller.
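
To put numbers on the figures from the experiment, here is a small Python sketch of the arithmetic. It assumes the Properties dialog counts 1 MB as 1024 KB; percent_saved is a made-up helper name. The 50% figure is only my guess and is not something this calculation confirms.

```python
def percent_saved(baseline_kb, edited_kb):
    """Percentage by which edited_kb is smaller than baseline_kb."""
    return 100.0 * (baseline_kb - edited_kb) / baseline_kb


# Sizes from the experiment, converted to KB (assuming 1 MB = 1024 KB).
complete_save_kb = 3.74 * 1024      # Firefox, Save As, complete webpage
edited_direct_kb = 3.61 * 1024      # edited in Scrapbook, saved directly
edited_repo_kb = 912                # edited, saved into Scrapbook repository

direct_saving = percent_saved(complete_save_kb, edited_direct_kb)
repo_saving = percent_saved(complete_save_kb, edited_repo_kb)

print(f"Edited, saved directly:   {direct_saving:.1f}% smaller")
print(f"Edited, saved into repo:  {repo_saving:.1f}% smaller")
```

With these inputs the direct save comes out only about 3.5% smaller than the unedited complete save, while the save into the Scrapbook repository is about 76% smaller.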


Thanks
Michael
  • Zotero saves snapshots using WebPageDump, which is itself based on Scrapbook (though at this point a very old version).

    In some cases, at least, Zotero translators automatically save the print-friendly page, and more can probably be changed to do that. You can obviously do the same when you save manually.