HTML5 conversion for PDF (e.g., ar5iv)

Zotero 6 for iOS is amazing, and I am grateful. This is a feature request for automated HTML5 conversion.

I noticed that the iOS and iPadOS app has a "plain text" button in the PDF viewer.
(For some reason, the desktop app doesn't have this; I'm curious as to why not.) The button is in the upper-left-hand corner with a capital 'A' and horizontal lines representing text. This looks very nice for getting a simple snappy version of the plain text of the paper.

However, this understandably doesn't work well for papers with math: equations appear to be converted crudely into Unicode. My suggestion is that Zotero offer an option to convert the PDF to HTML5, which supports rich math, using an engine like the one that powers ar5iv.org. This lovely service recently partnered with the arXiv to make available an HTML5 version of everything in the arXiv repository.

https://twitter.com/dginev/status/1495779696105119745

ar5iv's conversion process is not 100% perfect, but it's 99% perfect and highly useful.

I do not know if it would be reasonable to perform this conversion locally on the iOS device. If not, a more limited option would be to simply download the HTML5 version for each item in the local Zotero database that happens to be from the arXiv.
  • Oy, silly mistake by me. ar5iv is LaTeX-to-HTML, not PDF-to-HTML. So the only realistic option would be to fetch HTML versions from repositories like arXiv whose papers are known to have pre-made HTML versions. That would still be very useful for me, and it's probably easier to implement on Zotero's end, but the audience would certainly be smaller.

    This could take two possible forms: a Zotero plugin or a modification of the arXiv translator in the Zotero Connector.
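
    For what it's worth, here is a rough sketch of the lookup step either form would need: given an item's URL, work out whether it is an arXiv paper and, if so, where its ar5iv HTML version should live. This is only an illustration in Python, not Zotero code. The function name `ar5iv_url` is made up, the URL pattern assumes ar5iv's "change the x to a 5" convention, and dropping the version suffix is my own assumption about what ar5iv serves.

```python
import re
from typing import Optional

# Matches new-style IDs (e.g. 2105.01234) and old-style IDs (e.g. hep-th/9901001)
# in arXiv abstract or PDF URLs. Any trailing version suffix such as "v2" is
# deliberately left out of the captured group (assumption: ar5iv serves one
# snapshot per paper).
_ARXIV_ID = re.compile(
    r"arxiv\.org/(?:abs|pdf)/"
    r"(\d{4}\.\d{4,5}|[a-z\-]+(?:\.[a-z]{2})?/\d{7})",
    re.IGNORECASE,
)


def ar5iv_url(item_url: str) -> Optional[str]:
    """Hypothetical helper: map an arXiv item URL to its presumed ar5iv HTML URL.

    Returns None when the URL does not look like an arXiv abstract or PDF link.
    """
    match = _ARXIV_ID.search(item_url)
    if match is None:
        return None
    return f"https://ar5iv.org/abs/{match.group(1)}"


if __name__ == "__main__":
    for url in (
        "https://arxiv.org/abs/2105.01234",
        "https://arxiv.org/pdf/2105.01234v2.pdf",
        "https://www.nature.com/articles/example",
    ):
        print(url, "->", ar5iv_url(url))
```

    A plugin or translator would then try to download that page and attach it as a snapshot, falling back to the PDF when no HTML version exists.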