webpage snapshot timing

Sometime in mid-October I discovered that Zotero hasn't been saving snapshots on some sites I view regularly. It had never been an issue before, so I hadn't been verifying.

Here's what's happening... I'm using current Chrome, current Zotero, and the current Chrome-Zotero connector.

If I click on the connector in Chrome, the Save to Zotero ... dialog shows up, and the snapshot icon is gray. After a wait of up to 30 seconds, it is highlighted. Watching this in Zotero, I see the main entry show up in the view pane right away and, after the wait, the attachment appear with the ">" marker.

If I don't wait and move to another page, double-clicking on the main entry opens the URL in Chrome. If I do wait, the snapshot is opened in a new Zotero tab.

Discussing this with ChatGPT, I added
user_pref("snapshot.disable_javascript", true);
user_pref("extensions.zotero.snapshots.readabilityBased", true);

following ChatGPT's assumption that this was related to the time taken to download the scripts, CSS, etc. (although I'm on a 300MBs connection).

ChatGPT then suggested that the problem was the complexity of the web pages and that it could not find the main content.

If it isn't obvious by now, I'm a newbie to this level of conversation, although I've been using Zotero as a repository for my web research.

Any suggestions would be helpful.

Thx
Marc
  • I didn't provide the following...

    URLs
    ~20 seconds to get snapshot for https://www.timesofisrael.com/liveblog_entry/herzog-says-decision-on-pms-pardon-request-will-weigh-only-the-good-of-the-state/

    ~15 seconds to get snapshot for https://www.timesofisrael.com/liveblog_entry/herzog-says-decision-on-pms-pardon-request-will-weigh-only-the-good-of-the-state/

    Other sites took as long or longer but are behind paywalls

    Zotero 7.0.30 (64 bit)
    Chrome 142.0.7444.176 (Official Build) (64-bit)
    Connector 5.0.190
  • (LLMs are extremely likely to make things up about Zotero. Please don't try random things they suggest — just ask us. You should clear those prefs, which are totally made up.)

    That page saves extremely slowly via SingleFile, which Zotero uses for snapshots, possibly because of the extreme number of iframes (143) on the site. We'll see if there's anything we can do, but I'm seeing similar times with a Connector build from earlier this year, so I don't think there was any change on our end here.
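
A rough way to check how iframe-heavy a saved page is, for anyone who wants to compare sites: the sketch below counts start tags in an HTML string using only the Python standard library. The names `TagCounter` and `count_tags` are made up for this example, and it only sees iframes present in the HTML text it is given, so frames injected later by JavaScript will not show up unless the HTML was saved from the rendered page.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts every start tag seen in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased from HTMLParser.
        self.counts[tag] += 1


def count_tags(html_text):
    """Return a Counter mapping tag name -> number of start tags."""
    parser = TagCounter()
    parser.feed(html_text)
    parser.close()
    return parser.counts


if __name__ == "__main__":
    sample = (
        '<html><body><iframe src="a.html"></iframe>'
        '<p>text</p><iframe src="b.html"></iframe></body></html>'
    )
    print(count_tags(sample)["iframe"])
```

To run it on a real page, save the page's HTML to a file, read it into a string, and pass that string to `count_tags`.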
  • I think that this kind of document complexity is going to become more the rule than the exception. The connector was never able to deal with pages from NationalPost.com. I have a chunk of Python that cleans them up and produces an HTML document that I suspect is like what Zotero does for readability mode. My routine is not polished and does not work for other pages that I thought were similar to the National Post ones. For National Post, I generate the HTML files, serve them from localhost, point Chrome at them, and save from there. I put enough metadata into the document to satisfy the Zotero tags I want.
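
For readers wondering what such a cleanup step might look like, here is a minimal standard-library sketch. It is not the routine described above and not what Zotero or SingleFile do internally. It drops `script`, `iframe`, and `noscript` elements and inserts `<meta>` tags after `<head>`. The names `Stripper` and `clean_page` are hypothetical, and the `citation_title` meta name in the usage line is only an assumption about what a metadata reader might pick up.

```python
from html import escape
from html.parser import HTMLParser

# Elements removed together with everything inside them (illustrative choice).
DROP = {"script", "iframe", "noscript"}


class Stripper(HTMLParser):
    """Re-emits HTML while skipping DROP elements and HTML comments."""

    def __init__(self, meta):
        super().__init__(convert_charrefs=False)
        self.meta = dict(meta)  # name -> content, inserted after <head>
        self.out = []
        self.skip_depth = 0     # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self.out.append(self.get_starttag_text())
            if tag == "head" and self.meta:
                for name, content in self.meta.items():
                    self.out.append(
                        f'<meta name="{escape(name)}" content="{escape(content)}">'
                    )
                self.meta = {}  # insert only once

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> have no matching end tag.
        if tag not in DROP and self.skip_depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in DROP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&#{name};")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def clean_page(html_text, meta=None):
    """Return html_text without DROP elements, with meta tags added to <head>."""
    parser = Stripper(meta or {})
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    page = "<html><head><title>T</title></head><body><p>Hi</p></body></html>"
    print(clean_page(page, {"citation_title": "T"}))
```

A dropped element that is never closed will swallow the rest of the page, so real-world pages may need a more forgiving parser than this.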

    I get the point about LLMs. It is a crapshoot, but when it works it solves problems that I can't solve myself.

    Thanks for paying attention to this.