Available for beta testing: Read Aloud

dstillman Zotero Team
edited 11 days ago
In the latest Zotero beta, we've added a major new feature to the reader: Read Aloud.

Read Aloud reads your documents to you in high-quality, natural-sounding voices. It works on PDFs, EPUBs, and webpage snapshots.

To start it reading, just click the headphones button in the reader toolbar.

As you're listening, you can skip forward or backward by paragraph or sentence (Option/Alt-click or Option/Alt-left/right for the latter), and you can start reading from a particular point by right-clicking and choosing Read Aloud from the menu.

An "Annotate Sentence" button — or H or U on your keyboard — will automatically highlight or underline the last sentence you heard (or the current sentence if you're more than a few seconds into it). After you create an annotation, a popup will show you the annotated sentence, and there are shortcut keys to quickly move, expand, or delete the new annotation.

Read Aloud requires an internet connection and a Zotero account for high-quality voices, which we're calling Zotero Voices. If you'd like to use Read Aloud offline, you can still use the text-to-speech voices available on your system, but the quality will be much worse.

We're offering two tiers of Zotero Voices: Standard and Premium.

Standard voices are generated on Zotero servers, and we're offering unlimited Standard minutes to Zotero Storage subscribers (including institutional subscribers), as well as 2 hours/month to free accounts. Standard voices are currently available for 8 major languages, but they don't support multilingual text — e.g., they can't read text in one language when set to another.

Premium voices are the highest-quality voices, processed by external text-to-speech providers. They'll make fewer mistakes and generally sound more realistic than Standard voices. They also support many more languages and can handle multilingual text (or just read your documents in a foreign accent, if you prefer!). We'll be offering a certain number of monthly Premium minutes (varying by the specific voice you choose) to individual Zotero Storage subscribers, as well as a small number to free and institutional accounts in order to try them out.

During the beta, you'll be able to request additional Premium minutes for free in order to test the voices and provide feedback. After we've had a chance to see some real-world usage, we'll provide more details on the monthly allocations and the options for adding more minutes going forward.

A few known issues we'll be improving:
  • We're currently including a large number of Premium voices, particularly for non-English locales. We'll be narrowing the list as we see which voices people prefer in which languages, so please don't become too attached to Zotero Premium Voice 32!
  • There can be a delay of a few seconds before it starts reading large PDFs. (This actually isn't text-to-speech time, just a local processing delay that we need to fix.)
  • It'll get much better at skipping headers, footers, footnote/endnote superscripts, etc.
Currently, Read Aloud is only available in the desktop app, but it'll be coming to the iOS app soon (and Android after that).

Please start new threads to report any problems you encounter.

We're really looking forward to seeing how people use this feature. Thanks for testing!
  • Congrats on making this a proper internal feature!!

    One feature from my TTS plugin, ZoTTS, that you may want to consider adding is a "speak from here" function. It's useful for those who don't want to start from the beginning every time, and it proved pretty popular.

    Thanks for all your continued hard work!
  • dstillman Zotero Team
    @Imperial_Squid: I mention that in my post — you can right-click anywhere and choose Read Aloud. But we may look to make that easier.
  • Ah nice, you're well ahead of the curve then!

    Congrats again! :D
  • I developed a TTS plugin that lets you run Kokoro TTS locally to generate voices. Have you considered adding that option as well, so people can fully self-host?
  • Can't wait to test this on Android, ideally with HTML/EPUB support. It works well on desktop. Thanks. I've been wanting TTS since I first started using Zotero many years ago. Fantastic feature.
  • Thank you very much for introducing this feature.

    > Read Aloud requires an internet connection and a Zotero account for high-quality voices, which we're calling Zotero Voices. If you'd like to use Read Aloud offline, you can still use the text-to-speech voices available on your system, but the quality will be much worse.

    On, for example, a Google Pixel, I imagine the voices will be of sufficient quality, but on Linux I don't know. Will it be possible to use a local API to select text-to-speech voices on a Linux system?
  • Will this feature be opt-in or opt-out?
  • dstillman Zotero Team
    @FredH281: You can use local voices on Linux, but in our testing we were seeing literally hundreds of exposed "voices" on some systems, and they were mostly unusable. If you can configure a decent local voice, you could use it.

    @asmlibre: Not sure what you mean by that. It's just a button in the toolbar. If you don't click it, you won't use it.
  • edited 5 days ago
    Should we expect the unlimited storage tier to include unlimited Premium voice time (or a limit high enough that a single human user can consider it unlimited)?

    Hopefully using your own TTS server on the LAN (e.g., https://github.com/hexgrad/kokoro) will also be supported. Maybe our option there is to configure it to show up as a 'local voice'.

    Being able to identify each voice by at least gender and accent/dialect (e.g., British vs USA English) would be nicer than just numbers when users initially pick a voice from the text list.
  • AbeJellinek Zotero Team
    @ryanwwest, re dialect/accent: voices are supposed to be organized by region, but that behavior was accidentally removed right before the beta release. Really sorry about that! We'll have a fix in the next beta.
  • dstillman Zotero Team
    @ryanwwest: Standard minutes will be unlimited. We don't yet know what the Premium limits will be, but these have real per-second costs from external providers, so we're not able to offer unlimited Premium minutes.

    But again, you can request additional Premium minutes during the beta, so we encourage people to try them out and provide feedback.
  • dstillman Zotero Team
    Voice grouping by region (accent) is fixed now in the latest beta.
  • edited 4 days ago
    @dstillman I worry about the implications of this feature on our privacy, because it depends on third-party servers.

    I think it would be better if it was disabled by default and we could enable it optionally.
  • dstillman Zotero Team
    edited 4 days ago
    @asmlibre: You have to enable it. You literally cannot use the feature unless you click the Read Aloud button in the toolbar and choose a voice in the first-run dialog. It won't use a Premium voice unless you specifically choose one. There is no way for you to do this accidentally.

    (There's also no identifying information sent with Premium voice requests to begin with — just individual sentences from the document — so the privacy implications are minimal.)