[Read Aloud] Feedback and Suggestions
First and foremost, many thanks for this feature. I cannot express how excited I am to see this supported natively.
A few remarks and suggestions that I hope may be useful:
1. The reading sometimes contains pauses or hiccups, especially with certain names, for example E. J. Lowe. (It's read as E *full stop* J *full stop* Lowe *full stop*.)
2. Words split across line breaks are sometimes read incorrectly, for example metametaphy-sics.
3. The feature also sometimes reads watermarks in the margins or at the side of the page.
4. When playback is stopped and restarted, it begins again from the beginning. It would be helpful if it could resume from the previous position, or at least offer this as an optional default behaviour.
5. It is not currently possible to start Read Aloud directly from a highlighted line. In heavily annotated documents, this makes the feature somewhat cumbersome to access, since one first has to select text to reach the option through the context menu.
6. An autoscroll function would also improve usability, ideally with the text remaining centred on screen while playback continues. A toggle for this would be useful, similar to ElevenReader, where one can also scroll manually and recentre when desired.
7. It would be useful to have a toggle for whether footnotes are read aloud. The default should probably be not to read them.
8. The highlighting during playback does not always cover the full sentence; sometimes only part of it is highlighted.
I realise some of these may already be known issues. Apologies for any repetition, but I thought it would still be worth flagging them just in case.
If I recall, there were some technical limitations with restarting from the previous position, but we'll see what we can do.
See below.
You mean because the Read Aloud option isn't shown in the context-menu option when you right-click on an annotation? It already should, but it looks like that may not currently be working for all PDFs. We'll investigate, but examples would be helpful.
Do you mean footnotes themselves, or footnote numbers within the text? (But both are covered by 3 above.)
Again, we'd want specific examples, since we may be able to improve them, but sentence detection is always going to be a little dicey, and this is what the expand option for is in the popup detection.
Thanks for testing!
Or do you mean restarting after closing the Read Aloud popup (or closing the document entirely)?
> For the reading issues, you should say what voices you're trying with, and ideally provide specific examples. In general, the Standard voices will have more problems like these than the Premium voices. (The issue with name initials is known, though, and we should be able to improve that.)
I was working with Premium Voice 1. (I prefer UK voices.)
For example, in the sentence: ‘The present Element is then primarily an exercise in metametaphy-sics, that is, the field of philosophy studying the nature of metaphysics: its subject matter, branches, method, concepts, epistemology, and semantics.’ (See screenshots below.)
Here, the hyphen reflects a line break in the document. The model seems to get confused by this and pronounces it as ‘metametafee…sics’.
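To make the suggestion concrete, here is a rough sketch of the kind of clean-up I have in mind. It is purely illustrative and not how Zotero's parser works: the function name `rejoin_hyphenated` is made up, and it assumes the extracted text still contains the line break right after the hyphen.

```python
import re

# Hypothetical helper, not part of Zotero: matches a hyphen that sits at the
# end of a line, between two word characters.
_LINE_BREAK_HYPHEN = re.compile(r"(\w)-[ \t]*\n[ \t]*(\w)")


def rejoin_hyphenated(text: str) -> str:
    """Rejoin words that were split by a hyphen at a line break.

    Hyphens inside a line (e.g. 'sentence-level') are left untouched,
    because the pattern requires a newline after the hyphen.
    """
    return _LINE_BREAK_HYPHEN.sub(r"\1\2", text)


if __name__ == "__main__":
    sample = "primarily an exercise in metametaphy-\nsics, that is, the field"
    print(rejoin_hyphenated(sample))
```

One obvious limitation: a genuine compound that happens to break at its hyphen would lose the hyphen, so a real implementation would presumably need a dictionary check or something similar.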
> You mean because the Read Aloud option isn't shown in the context-menu option when you right-click on an annotation?
Yes, there is a workaround: clicking either on the side of the document or somewhere in the white space between annotations. It is not a major issue, but it might still be worth considering adding the option ‘Read Aloud’ to the context menu for a highlight, as that would eliminate the problem entirely.
> Do you mean footnotes themselves, or footnote numbers within the text? (But both are covered by 3 above.)
I was thinking primarily about footnotes. Most of the time, I do not want them to be read aloud, but sometimes I do want to hear them when I am reading a text carefully. If they are skipped automatically, that means I would have to pause the model to read the footnote myself and then restart it afterwards. That said, when footnotes are not read aloud, it is actually useful to have the footnote numbers spoken, since that gives me a cue to pause and read the footnote myself.
> Again, we'd want specific examples, since we may be able to improve them, but sentence detection is always going to be a little dicey, and this is what the expand option for is in the popup detection.
An example:
https://s3.amazonaws.com/zotero.org/images/forums/u5387224/7obwh0kun7rb1408qvds.png
https://s3.amazonaws.com/zotero.org/images/forums/u5387224/fxtp3hqg6w5pc4yf74wu.png
In the first screenshot, a fairly large chunk of text, spanning multiple sentences, is selected as the current reading segment. (I must admit this can be inconvenient if you want to start from a particular sentence.) That being said, sentence-level highlighting still works reasonably well until the final sentence, after which the next highlighted segment becomes very small. Moreover, because the final sentence is split into three reading segments, highlighting it requires using the highlight function three times.
I also noticed another just now:
9. When you change the playback speed, the output starts reading again from the beginning of the current reading segment.
It is not much of an issue when the current reading segment is small, but when it spans an entire paragraph, as in the screenshot, it becomes quite noticeable.
Ah, I realise that was ambiguous. Sorry! Yes, I meant that when you close and reopen the pop-up, playback starts again from the top of the page. Personally, it would feel more natural if it picked up from where it left off.
In any case, we have a new text parser coming that should improve a bunch of things in PDFs like this.
Even in that case, I would personally prefer it to continue from where I left off.
If I had to explain it, to me it is like returning to an audiobook after a short interruption. Closing the pop-up, or even the tab, perhaps because I want to skim the bibliography or look at another document for a while, does not mean that I want playback to restart elsewhere. I would prefer it simply to resume from where I left off, unless I decide to move to a different point myself.
> In any case, we have a new text parser coming that should improve a bunch of things in PDFs like this.
Got it. Thanks for the explanation, and I’m looking forward to the new parser. :-)
First of all, thank you again for all the work you have put into the Read Aloud feature. It is great to see how it is developing.
Now that I have had the chance to use the feature a bit more, I wanted to add a few follow-up suggestions.
10. Autoscroll and zoom-to-page behaviour
The main issue for my workflow is that, when I use spread view with zoom-to-page, autoscroll keeps moving the view while the text is being read.
I normally read in odd spreads with zoom to page height, where the relevant unit is the two-page spread. For that way of reading, I would like the full spread to remain visible while Zotero reads it, and then move to the next spread/page once the relevant text has finished.
Horizontal scrolling avoids some of the vertical jumping, but the sideways movement is also too strong for this workflow.
I can imagine a few possible solutions:
i) add an option to turn off autoscroll during Read Aloud;
ii) change Read Aloud behaviour when zoom-to-page is enabled, so that Zotero keeps the current page or spread fixed until the relevant text has finished;
iii) or perhaps tie that same page- or spread-level behaviour to wrapped scrolling, even if that is not what wrapped scrolling currently does.
With odd/even spreads turned on, switching between vertical scrolling and wrapped scrolling does not seem to make much difference for this issue. But wrapped scrolling could perhaps be used as the cue for this more stable page- or spread-level movement.
11. Commenting while highlighting in Read Aloud mode
I would like to request a quicker way to add a comment to the current highlight, ideally through a keyboard shortcut.
At the moment, I have to pause playback, click the highlight, open the comment bubble, type the comment, exit the comment bubble, and restart Read Aloud.
A shortcut for opening the comment field on the current highlight, or on the passage currently being read, would be fantastic.
12. Sentence-level rewind
I would also like to be able to rewind by one sentence. I use the play/pause and forward/backward keys on my keyboard quite a lot. It would be preferable if the backward button could go back one sentence, while the arrow keys could move by whole paragraphs.
13. Deleting highlights issue
When I delete a highlight through the Read Aloud menu, Zotero sometimes jumps back to the page where Read Aloud originally started. If I pause while deleting the highlight, it can also start reading the paragraph from the beginning again.
14. Position of the Read Aloud bar
This is a minor workflow preference, but I would prefer the Read Aloud bar to sit slightly more to the left by default. I normally read using spreads and zoom-to-page, and the bar currently seems to align with the sidebar rather than with the page. For me, it would feel more natural if it aligned with the side of the page and moved right only when the sidebar is open.
There could be a way to add a comment from the Read Aloud annotation popup, but I don't think we would have it stop playback automatically. Option/Alt + left/right (buttons or arrow keys) moves by sentence instead of paragraph.
Oh, sorry. I realise now that I was unclear. I just meant that the steps I described are my current workflow.
What would be ideal would be something like: hit H to highlight, then hit, for example, C to open the comment bubble, type the comment, and then press Esc to close the bubble.
I would personally probably pause playback while typing the comment, but that is separate from the main point, which is having a shortcut that opens the comment bubble on the highlighted line. I agree that it seems more logical for playback not to stop by default, since different users may want different behaviour here.
> Option/Alt + left/right (buttons or arrow keys) moves by sentence instead of paragraph.
Oh, that's embarrassing. I'm not sure how I missed that. Thanks!!