OCR parameters on Linux
Hi, I wanted to check with you what the correct parameters are for the OCR plugin on Linux.
I went to the GitHub page and found that the link to the configuration for Mac and Linux does not work.
So here are my config parameters:
First, I installed tesseract-ocr from my distribution's repositories.
For the OCR engine: /usr/bin/tesseract
For pdftoppm: /usr/bin/pdftoppm
For the language script: script/Latin
If you have a working parameter list, please let me know.
The easiest approach is to leave the configuration blank. The Zotero-OCR plugin will then look in some default locations for the poppler tools and tesseract. This includes calling the tools by name, e.g. tesseract, which should work as long as you have added them to your PATH variable.
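If you want to check for yourself whether the tools can be found by name, here is a minimal Python sketch. It is not how the plugin does its lookup, just a quick way to see what your PATH resolves to; the function name locate_tools is made up for this example.

```python
import shutil


def locate_tools(names=("tesseract", "pdftoppm")):
    """Return a dict mapping each tool name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Print where each tool resolves to, if anywhere.
    for name, path in locate_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If both tools resolve to a path here, leaving the plugin configuration empty should be worth trying first. If one is missing, the printed result tells you which full path you would need to enter by hand.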
Only if the default (empty) path configuration does not work should you specify the path on your local machine. This should be the complete path including the name of the tool, e.g. tesseract or tesseract.exe. The debug log shows exactly which calls are tried in the end.
The default language/script parameter is English, but it can be changed. However, it is then crucial that you have installed the corresponding language/script model in tesseract. E.g. you can change it to script/Latin if you have installed that model in tesseract. The English language model (eng) is always installed.
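To see which language/script models your tesseract actually has, you can ask it with its --list-langs option. The following is a rough Python sketch, assuming tesseract is callable by name and prints a "List of available languages ..." header followed by one model per line; the helper names parse_langs and installed_langs are invented for this example.

```python
import subprocess


def parse_langs(output):
    """Extract model names from `tesseract --list-langs` output, skipping the header."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header.
        if not line or line.startswith("List of available"):
            continue
        langs.append(line)
    return langs


def installed_langs(tesseract="tesseract"):
    """Run tesseract and return the list of installed language/script models."""
    proc = subprocess.run(
        [tesseract, "--list-langs"], capture_output=True, text=True
    )
    # Older tesseract versions print the list to stderr, so read both streams.
    return parse_langs(proc.stdout + "\n" + proc.stderr)


if __name__ == "__main__":
    try:
        langs = installed_langs()
    except FileNotFoundError:
        print("tesseract was not found on PATH")
    else:
        print("script/Latin installed:", "script/Latin" in langs)
        print("all models:", ", ".join(langs))
```

If script/Latin does not show up in that list, keep the language parameter at its default or install the model before changing it.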