OCR Parameters on Linux

Hi, I wanted to check with you what the correct parameters are for the OCR plugin on Linux.
I went to the GitHub page and found that the link to the configuration instructions for Mac and Linux does not work.
So here are my config parameters:

First, I installed tesseract-ocr from my distribution's repositories.

For the OCR engine: /usr/bin/tesseract
For pdftoppm: /usr/bin/pdftoppm
For the language script: script/Latin

If you have a working param list, please let me know.
  • I'd recommend posting this as an issue on that project's GitHub -- I don't think the maintainers are super active here.
  • For reference, I want to give some information here, although the issue has hopefully long since been solved:

    The easiest option is to leave the configuration blank. The Zotero-OCR plugin will then try some default locations and possibilities for the Poppler tools and Tesseract. This includes calling the tools by name, e.g. tesseract, which should work as long as the tool is on your PATH.

    Only if the default (empty) path configuration does not work should you specify the path on your local machine. This should be the complete path including the name of the tool, e.g. tesseract or tesseract.exe. The debug log shows exactly which calls are tried in the end.

    The default language/script parameter is English (eng), but it can be changed. However, it is then crucial that you have installed the corresponding language/script model in Tesseract. For example, you can change it to script/Latin if you have installed that model. The English language model (eng) is always installed.
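
    If you want to check both points yourself before touching the plugin settings, here is a rough Python sketch (not part of Zotero-OCR, just a convenience). It looks up tesseract and pdftoppm on your PATH and asks Tesseract which models it has via `tesseract --list-langs`. The helper names (find_tool, parse_langs, installed_langs) are made up for this example, and the parsing assumes the usual output layout: one header line, then one model name per line.

```python
import shutil
import subprocess


def find_tool(name):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


def parse_langs(list_langs_output):
    """Parse the text printed by `tesseract --list-langs`.

    Assumption: the first line is a header and every following
    non-empty line names one installed language/script model.
    """
    lines = [line.strip() for line in list_langs_output.splitlines()]
    return [line for line in lines[1:] if line]


def installed_langs(tesseract_path):
    """Ask the given Tesseract binary which models it can see."""
    result = subprocess.run(
        [tesseract_path, "--list-langs"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Depending on the Tesseract version the list may arrive on stdout
    # or stderr, so fall back to stderr when stdout is empty.
    return parse_langs(result.stdout or result.stderr)


if __name__ == "__main__":
    for tool in ("tesseract", "pdftoppm"):
        print(tool, "->", find_tool(tool))

    tesseract = find_tool("tesseract")
    if tesseract:
        langs = installed_langs(tesseract)
        print("script/Latin installed:", "script/Latin" in langs)
```

    If both tools print a path, leaving the plugin's path fields empty will most likely work. If one prints None, that is the one that needs a full path in the settings (or needs to be installed first).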