Zotero and OCR
Hello, I use Zotero as my daily companion to manage and store information of all kinds. It works perfectly for almost everything I do, but there is one feature I am missing.
Many of my lecturers at the university provide parts of their lecture notes as PDF files. However, these are only scanned images. To make these PDF files useful in Zotero, it would be very helpful to have text recognition (OCR) running in the background that processes the PDF files automatically. I have projects like OCRmyPDF (https://github.com/jbarlow83/OCRmyPDF) in mind.
My question is whether there are any plans, or at least interest, to integrate such functionality into Zotero. Since I need this functionality myself, I would also be willing to put work into the implementation, as far as I am able to.
There could be overlapping interest in such a feature or plugin for Tropy as well.
@adamsmith Can the Acrobat OCR also be scripted?
Some facts about OCRmyPDF: it is not only an OCR tool. It takes a PDF file that contains only images of the text and generates a searchable PDF file from it. In the new PDF file, the recognized text sits at the same position as the text in the image.
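For anyone who wants to try this outside Zotero first, here is a minimal sketch of batch-processing a folder of scanned PDFs with the OCRmyPDF command-line tool. It assumes `ocrmypdf` is installed and on the PATH; the helper names `build_ocr_command` and `ocr_folder` and the `.ocr.pdf` output suffix are my own choices for illustration, not part of OCRmyPDF or Zotero.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(src: Path, dst: Path, language: str = "eng") -> list[str]:
    """Build the OCRmyPDF command line for one file (hypothetical helper)."""
    # --skip-text leaves pages that already contain a text layer untouched;
    # -l selects the Tesseract language pack used for recognition.
    return ["ocrmypdf", "--skip-text", "-l", language, str(src), str(dst)]


def ocr_folder(folder: Path, language: str = "eng") -> list[Path]:
    """Run OCRmyPDF on every PDF in `folder` and return the files written."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf was not found on PATH")
    written = []
    for src in sorted(folder.glob("*.pdf")):
        # Skip files produced by an earlier run of this script.
        if src.name.endswith(".ocr.pdf"):
            continue
        dst = src.with_name(src.stem + ".ocr.pdf")
        subprocess.run(build_ocr_command(src, dst, language), check=True)
        written.append(dst)
    return written
```

Calling `ocr_folder(Path("scans"), language="deu")` would then write a searchable copy next to each scan, which could be attached in Zotero in place of the image-only file. Writing a separate output file rather than overwriting the original is only a cautious default here.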
I work almost exclusively on Linux and really miss a tool that produces good-quality OCR documents that are searchable and have text that is easy to annotate and copy, the way proprietary tools such as Adobe's do.
My current workflow uses a Windows application running under Wine to achieve this.