Automatically add webpages to archive.org
Zotero can take a snapshot when you save a webpage. The problem with these snapshots is that they can be quite heavy and are not 100% reliable. Also, most of the time the website is still up, so taking the snapshot is useless.
Worse, snapshots are useless for the reader.
It would be cool to be sure that any page you save as a reference will be accessible in the following years.
I think the best approach would be to use https://archive.org to save the page and keep the archive.org link in the Zotero reference. It might be even better to cite both the original link and the archive.org link (a bit like what Wikipedia does now).
That would also be good for archive.org, as it would provide some "curated" content to archive.
It can definitely be done in a plugin, but I think it would be a great addition to Zotero core.
To save a page on archive.org you simply have to visit https://web.archive.org/save/URL
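For anyone who wants to script this, here is a minimal sketch of the "visit the save URL" idea. The helper names (`wayback_save_url`, `request_archive`) and the User-Agent string are made up for illustration, and it assumes the endpoint answers a plain GET as described above; the service may rate-limit or refuse requests, so treat it as a starting point, not a tested integration.

```python
from urllib.parse import urlsplit
from urllib.request import Request, urlopen

# Endpoint prefix as quoted in the post above.
SAVE_PREFIX = "https://web.archive.org/save/"


def wayback_save_url(page_url: str) -> str:
    """Build the 'save this page' URL for an absolute http(s) URL."""
    parts = urlsplit(page_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {page_url!r}")
    return SAVE_PREFIX + page_url


def request_archive(page_url: str, timeout: float = 60.0) -> str:
    """Visit the save URL and return the final URL after redirects.

    Assumption: a plain GET is enough to trigger a capture.
    """
    req = Request(
        wayback_save_url(page_url),
        headers={"User-Agent": "zotero-archive-sketch"},  # hypothetical name
    )
    with urlopen(req, timeout=timeout) as resp:
        return resp.geturl()
```

`wayback_save_url` is pure string handling and can be used on its own; `request_archive` is the part that actually touches the network.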
1.) The full-text searchability of snapshots from within Zotero. I know this is crucial for a number of users.
2.) The fact that the Internet Archive follows robots.txt instructions not to archive, and these cover substantial parts of the research-relevant internet, including e.g. the New York Times and Quora.
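To make the robots.txt point concrete, here is a small sketch of how a plugin could check whether a site's robots.txt would block archiving before relying on archive.org. The function name `archiving_allowed` is made up, and the `ia_archiver` user-agent is an assumption about which rule the Internet Archive honors; its actual crawler behavior may differ.

```python
from urllib.robotparser import RobotFileParser


def archiving_allowed(robots_txt: str, page_url: str,
                      agent: str = "ia_archiver") -> bool:
    """Return True if robots_txt lets `agent` fetch page_url.

    `agent` defaults to "ia_archiver", which is an assumption here.
    """
    parser = RobotFileParser()
    # Parse the robots.txt text directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, page_url)


# Example: a site that blocks the archive crawler entirely.
blocked = "User-agent: ia_archiver\nDisallow: /\n"
print(archiving_allowed(blocked, "https://example.com/article"))
```

If the check says archiving is blocked, a local snapshot would still be the only copy, which is one reason the two approaches are not interchangeable.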
I think the biggest value of using archive.org is for citations and long-term references.
If we have a feature that uses archive.org, then we could, for example, add an option to the Zotero snapshot to save only the rendered HTML of the page (no images, no JS, no CSS). A bit like when you use the "read mode" in Firefox. The two features can be complementary.
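As a rough illustration of what an "HTML only" snapshot could mean, here is a sketch that strips scripts, styles, images and stylesheet links from a page using only Python's standard library. This is one possible reading of the idea, not how Zotero works; the names `LightSnapshot` and `lighten` are made up, and a real reader mode does much more (content extraction, layout cleanup).

```python
from html import escape
from html.parser import HTMLParser

DROP_WITH_CONTENT = {"script", "style"}  # drop the element and its text
DROP_TAG_ONLY = {"img", "link"}          # drop just the tag


class LightSnapshot(HTMLParser):
    """Re-emit HTML without JS, CSS, images, or inline style attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def _render(self, tag, attrs, closing=">"):
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v)}"'
            for k, v in attrs if k != "style"
        )
        return f"<{tag}{rendered}{closing}"

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag not in DROP_TAG_ONLY:
            self.out.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        if self.skip_depth == 0 and tag not in DROP_TAG_ONLY | DROP_WITH_CONTENT:
            self.out.append(self._render(tag, attrs, closing=" />"))

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag not in DROP_TAG_ONLY:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Text arrives unescaped, so escape it again on the way out.
            self.out.append(escape(data, quote=False))

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def lighten(html_text: str) -> str:
    parser = LightSnapshot()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)
```

Comments are dropped as a side effect, and the text content stays in place, so full-text search inside Zotero would still have something to index.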
How can we make this happen?
Haven't tested, but looks pretty good.
I don't understand why Zotero doesn't prioritize this feature. Link rot continues to be a very real problem, as does the way the content of websites changes over time.
https://github.com/lanl/Zotero-Robust-Links-Extension
and
https://github.com/leonkt/zotero-memento/
But neither of them seems to have been updated to Zotero 7, nor does there seem to be any initiative to do so.
It's sad this project between the Internet Archive and Zotero (that was announced in 2007) didn't pan out as expected:
https://dancohen.org/2007/12/12/zotero-and-the-internet-archive-join-forces/