Bug in Zotero beta: video iframe image not saved in snapshot.

Webpage https://lionbridge.ai/articles/deep-learning-is-dead-towards-artificial-life-with-olaf-witkowski/.
Note the third image down.
Saving snapshot then viewing it shows the image has disappeared.
Works fine in SingleFile.

Firefox 81.0.1 (64-bit); Zotero Connector 5.0.75beta2; Zotero 5.0.91-beta.9+479f01fa9; Windows 10 Home, v2004.
  • edited October 12, 2020
    We remove frames, because they're mostly ad content or dynamic content that wouldn't work without JavaScript anyway (as is the case here). Saving ad content in particular can drastically increase page size and saving time.
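
For readers unfamiliar with what "removing frames" means in practice, here is a minimal sketch in Python. It is not Zotero's or SingleFile's code; the `FrameStripper` class and `strip_frames` helper are hypothetical names invented for illustration, and the sketch assumes frames appear as paired `<iframe>…</iframe>` elements. It shows why text such as "the video below" can end up pointing at nothing: the element is dropped without a placeholder.

```python
from html.parser import HTMLParser


class FrameStripper(HTMLParser):
    """Re-emit an HTML document while dropping <iframe> elements (illustrative only)."""

    def __init__(self):
        # Keep character references as written so they can be re-emitted unchanged.
        super().__init__(convert_charrefs=False)
        self.out = []      # pieces of the rewritten document
        self.depth = 0     # > 0 while inside an <iframe>
        self.removed = 0   # number of iframes dropped

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            self.depth += 1
            self.removed += 1
        elif self.depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/>: emit once, no separate end tag.
        if tag != "iframe" and self.depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "iframe":
            if self.depth > 0:
                self.depth -= 1
        elif self.depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth == 0:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        if self.depth == 0:
            self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        if self.depth == 0:
            self.out.append(f"<!{decl}>")


def strip_frames(html_text):
    """Return (html_without_iframes, number_of_iframes_removed)."""
    parser = FrameStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out), parser.removed


page = (
    '<p>Intro &amp; context</p>'
    '<iframe src="https://example.com/player"></iframe>'
    '<p>See the video above.</p>'
)
cleaned, removed = strip_frames(page)
```

In this sketch the paragraph that mentions the video survives, but the frame it refers to does not, which matches the behaviour reported at the top of this thread.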
  • @dstillman: I disagree with the philosophy above. Making objects disappear unexpectedly violates the Principle of Least Astonishment (Wikipedia article). I quote:

    > The principle of least astonishment applies to user interface and software design. The following is a typical statement of the principle: "If a necessary feature has a high astonishment factor, it may be necessary to redesign the feature."

    If objects go missing from snapshots, then a user's internal dialogue might go like this: "Did I do something wrong? Is my software environment broken? Have the programmers stuffed up? Has dstillman stuffed up? Do I have to check my snapshots each time, to see how else they've gone wrong? What other parts of Zotero don't work as expected? Is Zotero riddled with bugs? Should I file a bug report? How do I do that? Won't they ask me tricky questions that I can't answer? I don't want to appear foolish; software always makes me feel foolish. Where's my life heading, anyway? What's it all mean? Who can I trust?" Etc.

    I make these further arguments:

    > We remove frames, because they're mostly ad content...

    1) I might really want the snapshot to include those items that have mysteriously gone missing.

    2) I can remove ads easily enough with an ad blocker, e.g., uBlock Origin.

    > ...or dynamic content that wouldn't work without JavaScript anyway

    3) I can remove ads and other junk, pre-snapshot, easily enough with, say, Nuke Anything, for *my* choice of what's junk and what isn't. (Yes, removing superfluous images does dramatically reduce snapshot size; I do it before every snapshot.)

    4) People surely understand that items no longer work in snapshots? Just like steam engines no longer work in photographs of steam engines. :)

    And finally, in my experience, SingleFile *doesn't* mysteriously delete stuff, and SingleFile is surely the gold standard in how to do webpage snapshots?

    Hope the above doesn't sound snarky; not my intent.
  • > And finally, in my experience, SingleFile *doesn't* mysteriously delete stuff, and SingleFile is surely the gold standard in how to do webpage snapshots?

    "Remove Frames" is just a SingleFile setting, one of 50 or so. They default it off. We default it on, because we tested with it off and the results were unacceptable. Saving a NYT article with a video player takes over 15 seconds with frames. Without them, it takes 7. And even 7 is a significant performance regression from our pre-SingleFile snapshots, both on its own and because we now save snapshots within the browser rather than in Zotero when saving the current page. That lets us capture the rendered page without a refetch, but it means that you can't leave the page before the snapshot has saved.

    The version with frames is also 40% larger, in this case because of a fairly pointless 1.4 MB Base64-encoded PNG poster frame — determined randomly by the current playback position of the auto-playing video player — on a nonfunctional video tag.
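
To put the Base64 figure in perspective, here is a small Python sketch of the encoding overhead. It assumes the 1.4 MB above refers to the encoded size of the image as embedded in the snapshot, which the post does not state explicitly; `base64_size` is a helper name made up for this example. Standard Base64 turns every 3 input bytes into 4 output characters, so embedded images are roughly a third larger than the original file.

```python
import base64


def base64_size(raw_bytes: int) -> int:
    """Length of standard (padded) Base64 output for an input of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


# Sanity check against the standard library with a 30,000-byte dummy payload.
raw = bytes(30_000)
encoded = base64.b64encode(raw)
matches_formula = len(encoded) == base64_size(len(raw))

# Assumption: 1.4 MB is the encoded size; the raw PNG would then be about 3/4 of that.
encoded_size = 1_400_000
approx_raw_size = encoded_size * 3 // 4
```

Under that assumption, roughly a quarter of the embedded size is encoding overhead rather than image data.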

    It looks like there's much less of a difference in save time from SingleFile itself, so we need to do some profiling to determine why frames make things so much slower for us — there's some natural overhead getting snapshots over to Zotero and to disk from the browser, but I'm not sure why the difference between the two modes is quite so large in our case. But as long as a snapshot with frames takes 15 seconds, there's just no way we can leave that on.

    We've discussed adding an option to save frames, either for individual site translators or as a user-facing setting, but we're not going to do that as long as snapshots are taking 15 seconds.

    In general, bear in mind that Zotero snapshots are somewhat auxiliary — they're primarily a way to make sure that non-PDF full-text content is searchable within Zotero and that resources that may go behind paywalls or otherwise change or disappear remain available. If your main goal is to save pixel-perfect copies of webpages, you may want to continue using SingleFile with your desired configuration. You can always attach those files to Zotero items after the fact.
  • Thanks for the detailed explanation.

    > If your main goal is to save pixel-perfect copies of webpages,

    Not worried about pixel-perfect copies. My concern was for, say, text in the snapshot which refers to "the video below", when there is no longer any trace of a video below.
    But not a major problem :)

    > In general, bear in mind that Zotero snapshots are somewhat auxiliary — they're primarily a way to make sure that non-PDF full-text content is searchable within Zotero

    Not auxiliary to the way I use Zotero. That functionality is magic, and the major reason I use Zotero over JabRef.
  • edited November 2, 2020
    Since the above discussion, we've diagnosed and fixed the slower performance of Zotero snapshots vs. SingleFile, and we're currently evaluating what to do about "Remove Frames". We may be able to at least make it a hidden setting, even if it defaults off.

    > My concern was for, say, text in the snapshot which refers to "the video below", when there is no longer any trace of a video below.

    Well, but again, the "video below" would be, at best, a randomly chosen still from the video, determined entirely by how far along the player happened to be. Maybe there's some value in that, but not a lot.