Workflow for saving and annotating web pages as HTML files

Dries_B · July 15, 2022

Hi all,

In this discussion, I would like to describe a workflow for saving and annotating web pages as HTML files.

The first step in the workflow is saving the source material. This can be conveniently done using the Zotero Connector’s ‘Web Page with Snapshot’ option. Alternatively, I sometimes use the Print Edit WE tool (https://addons.mozilla.org/en-US/firefox/addon/print-edit-we/), which allows cropping a web page before saving it.

Unfortunately, there do not seem to be any straightforward tools for HTML annotation that suit my requirements. Hypothes.is does not fully support Firefox, which is my browser of choice. Also, it stores annotations online, whereas I want to keep my work local.

Instead, I have been testing my own approach using a ‘what you see is what you get’ (WYSIWYG) editor, which Wikipedia describes as “a system in which editing software allows content to be edited in a form that resembles its appearance when printed or displayed as a finished product”. Such an editor can be used to highlight text of interest and to add notes directly in the text. For me, BlueGriffon (http://bluegriffon.org/) works well.

One can add extra value by saving the edited file as a separate version. Highlights and notes can then be extracted by computationally checking differences compared to the original. (Please ask if you need suggestions on how to do this.)
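As a rough starting point, the comparison could be sketched in Python with the standard difflib module. This is only one possible approach, not a tested part of my workflow: the function name added_fragments and the tokenisation rule (tags, words, whitespace, and single punctuation characters) are assumptions made for illustration.

```python
import difflib
import re

# Split HTML into tags, words, whitespace runs, and single other characters.
TOKEN = re.compile(r"<[^>]+>|\w+|\s+|[^\w\s]")


def added_fragments(original_html, edited_html):
    """Return the pieces of edited_html that are absent from original_html."""
    original_tokens = TOKEN.findall(original_html)
    edited_tokens = TOKEN.findall(edited_html)
    matcher = difflib.SequenceMatcher(
        None, original_tokens, edited_tokens, autojunk=False
    )
    fragments = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" both carry material that is new in the edit.
        if tag in ("insert", "replace"):
            fragments.append("".join(edited_tokens[j1:j2]))
    return fragments


if __name__ == "__main__":
    original = "<p>Hello world, this is text.</p>"
    edited = (
        '<p>Hello <span style="background: yellow">world</span>, '
        "this is text. <i>my note</i></p>"
    )
    for fragment in added_fragments(original, edited):
        print(repr(fragment))
```

The output lists the added highlight markup and the inserted note as separate fragments. An editor that reformats the whole file on saving would produce many spurious differences, so the result probably needs manual review.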

What do you think about this workflow? I would be interested to receive your suggestions for developing it and documenting it more clearly!

Dries_B · July 18, 2022

Some digging in the Zotero forums showed that adouwa was already using an almost identical workflow in 2016.

They used LibreOffice Writer as a WYSIWYG editor, although I think that it tends to mangle one's HTML code.

If you have any comments or suggestions, please get in touch @adouwa!

Dries_B · July 20, 2022

After some more searching and testing, I found that the Firefox annotation add-on TextMarker allows for easier highlighting than BlueGriffon. Conveniently, highlighted regions can then be found (and potentially extracted) in the HTML code by searching for "textmarker-highlight".

TextMarker cannot remove highlights from local files, however, and saving notes in local files looks quite messy. So it is probably best to use it in combination with a WYSIWYG editor.
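The search for "textmarker-highlight" could be automated along the following lines. This is a minimal sketch that assumes the string appears in the class attribute of an element wrapping each highlighted passage; I have not checked exactly what markup TextMarker writes, so the matching rule may need adjusting. The names HighlightExtractor and extract_highlights are made up for this example.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class HighlightExtractor(HTMLParser):
    """Collect the text inside elements whose class contains a marker string."""

    def __init__(self, marker="textmarker-highlight"):
        super().__init__(convert_charrefs=True)
        self.marker = marker  # assumed to occur in the class attribute
        self.depth = 0        # nesting depth inside the current highlight
        self.highlights = []
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            return
        if self.depth:
            self.depth += 1
        elif self.marker in (dict(attrs).get("class") or ""):
            self.depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.highlights.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self.depth:
            self._buffer.append(data)


def extract_highlights(html):
    parser = HighlightExtractor()
    parser.feed(html)
    parser.close()
    return parser.highlights
```

Calling extract_highlights on the text of a saved HTML file returns the highlighted passages in document order. Unclosed tags inside a highlight would confuse the depth counter, so treat this as a starting point only.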

luispuerto · October 11, 2022

Hey!

I've started using Zotero again (I used it a while ago), and my workflow for websites is to convert them to Markdown. In the end, what I'm interested in is the text, and Markdown is quite a simple tool to use; you can even use CriticMarkup to highlight or comment.
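For anyone who wants to pull the annotations back out of such a Markdown file, here is a small sketch. It relies on the CriticMarkup conventions {==highlight==} and {>>comment<<}; the function name extract_critic_markup is just something I made up for the example, and it ignores the other CriticMarkup constructs (additions, deletions, substitutions).

```python
import re

# CriticMarkup highlight: {==text==}   comment: {>>text<<}
HIGHLIGHT = re.compile(r"\{==(.+?)==\}", re.DOTALL)
COMMENT = re.compile(r"\{>>(.+?)<<\}", re.DOTALL)


def extract_critic_markup(text):
    """Return the highlights and comments found in a Markdown string."""
    return {
        "highlights": [match.strip() for match in HIGHLIGHT.findall(text)],
        "comments": [match.strip() for match in COMMENT.findall(text)],
    }


if __name__ == "__main__":
    sample = "Plain {==important claim==}{>>check the source<<} more text."
    print(extract_critic_markup(sample))
```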

divercl · June 3, 2023

The easiest way I found is to 1) use a browser extension to save the webpage to Zotero with Embedded Metadata; 2) save the webpage to your desktop as a PDF; 3) attach that PDF to the webpage item in Zotero.

smidm · June 10, 2023

It would be a huge leap for Zotero to have an html viewer / annotator on par with pdf!

I mostly read papers uploaded to arxiv.org. There is an experimental arXiv project offering HTML versions of papers generated from LaTeX sources (https://ar5iv.labs.arxiv.org/, or just swap arxiv for ar5iv in the arXiv URL, e.g. https://ar5iv.org/abs/2004.03686). Similar work is already in progress or finished on biorxiv.org and other sites. The HTML format is the future of scientific publishing. Its accessibility and device independence are far better than with PDFs. I'd like to read and annotate papers on a smartphone or tablet without fiddling with PDFs.
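The URL swap is easy to script if you have a list of links. A minimal sketch, assuming the link points at arxiv.org or www.arxiv.org and that the path can be kept unchanged, as in the example above (to_ar5iv is a made-up helper name):

```python
from urllib.parse import urlsplit


def to_ar5iv(url):
    """Rewrite an arxiv.org URL to the corresponding ar5iv.org URL."""
    parts = urlsplit(url)
    if parts.netloc.lower() not in ("arxiv.org", "www.arxiv.org"):
        raise ValueError(f"not an arxiv.org URL: {url}")
    # Only the host changes; scheme, path, and query are kept as they are.
    return parts._replace(netloc="ar5iv.org").geturl()


if __name__ == "__main__":
    print(to_ar5iv("https://arxiv.org/abs/2004.03686"))
```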

Sayuj7 · October 24, 2023

+1 to smidm. An HTML viewer is required!

adamsmith · October 24, 2023

Zotero 7 has web (and EPUB) annotations, available in beta.

various-pectin.0u · February 13, 2024

Annotations are great - but you can't do it in the iOS version, just desktop. Back goes my iPad into the tech drawer :-|

ulahcherubim · February 13, 2024

I think it is planned to have epub and html views in mobile apps.