Zotero 7 beta: Download format selection?
Now that there is an epub reader and annotatable HTML, how are you thinking about importing full text?
Some sites do offer epub alongside PDF (e.g. Frontiers https://www.frontiersin.org/articles/10.3389/fsoc.2023.1190872/full ). Many pages offer full text HTML, of course.
I think it's fairly clear that most people will continue to prefer PDFs for now, but both HTML and epub have distinct advantages, not least with respect to accessibility (PDFs can be accessible-ish, but are much less likely to be, and even accessible PDFs are a mess, e.g., for people who want larger fonts).
Getting this right for the maximal number of users seems tricky -- I guess conceptually you'd want something like a preference order (such as: ideally PDF; if that's not available, HTML; never epub), but that seems impossible UX-wise. Any thoughts on this?
I think we could come up with something reasonable for the preference — we basically would need a list with "PDF", "EPUB", "HTML" where the options could be both moved up/down and individually disabled. But, of course, we already save snapshots if snapshots are enabled, and often not full text. So then if someone puts "HTML" first, what does that mean?
It's a bit hacky, but a potential solution here is suggested by #3078, where we're likely going to just look for, e.g.,
full[ -]?text snapshot
in the title from the translators and use a localized string. Maybe that counts as a PDF alternative and anything else doesn't.
Alternatively, we could just drastically scale back what we save snapshots for, so that we're really only saving them as alternatives to the PDF. I assume there are some exceptions where we don't want to do that, but otherwise we have a situation where "HTML" is a full-text option along with "PDF"/"EPUB" while we also have the existing snapshot preference. Explaining that does seem fairly close to impossible.
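For what it's worth, the title check described above could look roughly like this. This is only a sketch in Python for illustration (the real check would live in Zotero's JavaScript); the function name `is_full_text_snapshot`, the case-insensitive matching, and the sample titles are all assumptions, not anything from the actual code.

```python
import re

# The pattern is the one quoted in the thread. Matching it
# case-insensitively is an assumption made for this sketch.
FULL_TEXT_SNAPSHOT = re.compile(r"full[ -]?text snapshot", re.IGNORECASE)


def is_full_text_snapshot(title: str) -> bool:
    """Return True if a translator-supplied attachment title looks like
    a full-text snapshot (hypothetical helper, not a Zotero API)."""
    return FULL_TEXT_SNAPSHOT.search(title) is not None


# Hypothetical titles a translator might return.
for title in ["Full Text Snapshot", "Fulltext Snapshot", "Full-text snapshot", "Snapshot"]:
    print(title, "->", is_full_text_snapshot(title))
```

Under this rule, only the attachments whose titles match would count as a PDF alternative; a plain "Snapshot" would not.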
(And then we have the problem that the updated translators will need to return all of these for Z7 to pick from, but they can't be served to Z6, because Z6 would just happily save them all. I'm not sure we ever came up with a great solution for this sort of situation, but I'll think about what we can do.)
Preference PDF > HTML
Cambridge UP: Save PDF
NY Times: Save HTML
Preference HTML > PDF
Cambridge UP: Save HTML
NY Times: Save HTML
So far that seems doable and simple enough. The main issue would then seem to be accommodating the Snapshot preference of people who don't want snapshots?
If we're thinking of the preferences above, would it be possible to also omit options, which would then *never* get saved? That gives two more options:
Preference PDF
Cambridge UP: Save PDF
NY Times: Nothing
Preference HTML
Cambridge UP: Save HTML
NY Times: Save HTML
JSTOR (i.e. pages w/o useful HTML full-text): Nothing
Going that way, you'd be able to add epub into the logic easily, but it seems fairly complicated to convey?
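To make the logic of the cases above concrete, here is a minimal sketch in Python. The function `pick_format` and the `SITES` table are hypothetical, and which formats each site offers is assumed from the examples in this thread (in particular, that JSTOR offers a PDF but no useful HTML full text).

```python
from typing import Optional, Sequence, Set


def pick_format(preference: Sequence[str], available: Set[str]) -> Optional[str]:
    """Return the first format in the user's ordered preference that the
    site offers. Formats omitted from the preference are never saved, so
    the result is None when nothing in the list is available."""
    for fmt in preference:
        if fmt in available:
            return fmt
    return None


# Assumed site capabilities, taken from the examples in the thread.
SITES = {
    "Cambridge UP": {"PDF", "HTML"},
    "NY Times": {"HTML"},
    "JSTOR": {"PDF"},
}

for pref in (["PDF", "HTML"], ["HTML", "PDF"], ["PDF"], ["HTML"]):
    for site, formats in SITES.items():
        print(" > ".join(pref), "|", site, "->", pick_format(pref, formats) or "Nothing")
```

Adding epub is then just one more entry in the list, which is why the logic itself is easy even if conveying it in the UI is not.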
The remaining question is whether we ever want both HTML and PDF, and I think my answer there would be no, although we're currently doing that in a number of places. (We may want PDF and an attached link, but that seems different enough to accommodate.)
(I don't have anything on Z6/Z7 but yes, that's... tricky)
If we were sure snapshots were always full-text snapshots, we could remove the separate snapshot preference and just have the prioritization pref. It's always been clumsy that you had to configure those separately when some sites offered both.
Sure. Just imagine a reorderable listbox with checkboxes and grippies for dragging:
==============
| ✓ PDF == |
| ✓ EPUB == |
| ✓ HTML == |
==============
==============
| ✓ EPUB == |
| ✓ HTML == |
| ✓ PDF == |
==============
==============
| ✓ PDF == |
| ✓ EPUB == |
|   HTML == |
==============
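The state behind a listbox like that is small. A rough sketch in Python, where `FormatOption`, `move`, and `effective_order` are made-up names for illustration and not anything in Zotero:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FormatOption:
    name: str
    enabled: bool = True  # the checkbox


def move(options: List[FormatOption], name: str, offset: int) -> None:
    """Move the named option up (negative offset) or down (positive
    offset), clamped to the ends of the list. Mutates the list in place."""
    index = next(i for i, opt in enumerate(options) if opt.name == name)
    target = max(0, min(len(options) - 1, index + offset))
    options.insert(target, options.pop(index))


def effective_order(options: List[FormatOption]) -> List[str]:
    """Return the enabled formats, in priority order."""
    return [opt.name for opt in options if opt.enabled]


# Default state, as in the first mockup.
options = [FormatOption("PDF"), FormatOption("EPUB"), FormatOption("HTML")]

# Dragging PDF to the bottom gives the second mockup.
move(options, "PDF", 2)
print(effective_order(options))
```

The third mockup is the default order with the HTML checkbox cleared, so only PDF and EPUB remain in the effective order and HTML is never saved.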