Zotero OCR/Tesseract tutorial

erazlogo · July 23, 2022

Hey, I wrote up a Zotero OCR/Tesseract tutorial for my students--maybe people here will find it useful.
https://publish.obsidian.md/history-notes/04+OCR+in+Zotero

dduckie · February 28, 2023

Thank you so much, this is wonderful!

morettifc · February 28, 2023

Thank you so much! I've been able to get it almost right. The plugin apparently works, but instead of producing a new .pdf file that I can read, it produces a separate .txt file that is saved on the computer. Has anyone else encountered the same problem?

morettifc · February 28, 2023

Never mind. I've gotten the link to the PDF, but it doesn't save in the same collection. Weird.

Eseila · June 12, 2023

Hey there!

I followed the steps of that tutorial, but when I try to OCR the PDF nothing happens. I can sometimes see in the task manager that pdftoppm is actually using CPU and doing something, but other than that nothing happens. Any idea why that is and how to fix it?

Thx

migugg · June 12, 2023

Thank you, I followed it and it is brilliant, but please add a note that until this is repaired, every PDF will expand to about 10 times its original size. For me, this makes the whole process useless. It may not for others, but they should know before they install it.

erazlogo · June 12, 2023

@eseila It works for me so far. The process can take a long time. You can click on "Show file" to see what's happening.

AndrewRRM · August 22, 2023

I've also followed the tutorial but nothing happens when I OCR the selected PDF.

AndrewRRM · August 22, 2023

I can see in the containing folder that all the intermediate PNG files have been created.

erazlogo · August 22, 2023

@AndrewRRM This happened to me once, but I was never able to reproduce it. How big is your document?

AndrewRRM · August 22, 2023

Ah well, 200 pages. Too big? I'll try it on something smaller.

erazlogo · August 22, 2023

No, that shouldn't be too big--I've OCR'ed 300-page documents without a problem. Something else must be stalling it. Can you try another document?

antonio-mv · September 24, 2024

@Eseila and @AndrewRRM
It happens to me too. Have you solved the problem?

aborel · September 24, 2024

The problem might be totally different; many things have changed since August 2023.
In order to help you, we need:
1) the Zotero version
2) the exact Zotero-OCR plugin version
3) a screenshot of your Zotero-OCR preferences

If the PDF you want to process is freely available online, a link to it can also be very useful.
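
For anyone gathering details for a report like this, here is a rough Python sketch that checks whether the two command-line tools mentioned in this thread, pdftoppm and tesseract, can be found and which versions they report. It is only a sketch under assumptions: it looks the tools up on the system PATH, whereas the plugin may instead use paths set in its own preferences, and the helper names `tool_report` and `version_text` are made up for this example. It does not replace the Zotero and plugin version numbers asked for above.

```python
import shutil
import subprocess


def tool_report(names=("pdftoppm", "tesseract")):
    """Return {tool name: full path, or None if the tool is not on PATH}."""
    return {name: shutil.which(name) for name in names}


def version_text(path, flag):
    """Run `<path> <flag>` and return everything it printed.

    stdout and stderr are combined because some tools print their
    version information on stderr rather than stdout.
    """
    result = subprocess.run([path, flag], capture_output=True, text=True, timeout=30)
    return (result.stdout + result.stderr).strip()


if __name__ == "__main__":
    # Version flags: `pdftoppm -v` and `tesseract --version`.
    flags = {"pdftoppm": "-v", "tesseract": "--version"}
    for name, path in tool_report().items():
        if path is None:
            # Assumption: a tool missing from PATH may still be configured
            # by full path in the plugin preferences.
            print(f"{name}: not found on PATH")
        else:
            print(f"{name}: {path}")
            print(version_text(path, flags[name]))
```

If a tool shows up as "not found on PATH", that is not necessarily the cause of the stall, but it is worth mentioning alongside the screenshot of the preferences.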