Zotero OCR file size
Hi all Zotero OCR users,
Thanks to some great tutorials, I managed to install Zotero OCR. However, its output files are so large that using it is pointless, since my hard disk would quickly fill up (a book that was 14 MB became 295 MB after OCR). I understand that OCR might increase file size, but other OCR programs manage file size much better.
Is there a way to set the output file size somewhere in Zotero OCR?
Thanks for your help.
Will wait then for some kind of solution.
This is really a consequence of the underlying OCR engine; a real improvement would need very fundamental changes. We're considering it, but the roadmap hasn't been decided yet.
https://s3.amazonaws.com/zotero.org/images/forums/u4655716/2af6alaftfedlt408irw.png
Also: I had a bit of trial and error when I first started using this tool, so there are a few files that I OCR'd more than once. Although I deleted them from my Zotero library (and emptied the trash), these files appear multiple times when I search my computer's "storage" folder. Is there any way to delete the extra copies? Will removing them from my storage folder cause problems?
The overall size of data in your library can also be reduced if you unselect the "save intermediate PNGs" and "save output as a HTML/ocr" preferences; these aren't really useful once you are sure the plugin is working properly. You can also decide to "overwrite the initial PDF with the output", but I don't recommend it: in case something goes wrong, it is safer to keep the original, non-OCRed file.
The other parameters don't have any impact on the file size.
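To see how much space the intermediate files actually take before changing these preferences, a small read-only script along these lines could total the storage folder by file type. This is only a sketch: the function name `size_by_extension` is made up for illustration, and the `~/Zotero/storage` path is an assumption about where your data directory lives, so adjust it to your setup. It does not delete or modify anything.

```python
from collections import defaultdict
from pathlib import Path


def size_by_extension(root):
    """Return {extension: total bytes} for every file under root (read-only)."""
    totals = defaultdict(int)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Lower-case the suffix so ".PNG" and ".png" are counted together.
            totals[path.suffix.lower() or "(none)"] += path.stat().st_size
    return dict(totals)


if __name__ == "__main__":
    # Assumed location of the Zotero storage folder; change if yours differs.
    storage = Path.home() / "Zotero" / "storage"
    if storage.is_dir():
        report = sorted(size_by_extension(storage).items(), key=lambda kv: -kv[1])
        for ext, total in report:
            print(f"{ext:10s} {total / 1e6:10.1f} MB")
```

If the `.png` or `.html` totals are large, that would suggest the two "save ..." preferences above are responsible for a good part of the growth.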
---
Multiple copies of OCR'd files? I don't think I've heard of this before. If you can report the exact steps to reproduce this behavior, I'll be happy to take a look.
I hope this helps. Don't hesitate to ask for more details or clarifications if necessary.
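For the report, it might help to list exactly which PDFs show up more than once in the storage folder. A minimal sketch, assuming the copies share the same file name across different subfolders; the helper name `pdfs_by_name` and the `~/Zotero/storage` path are assumptions for illustration, not part of the plugin. It only reads and prints, and it does not say whether any copy is safe to delete.

```python
from collections import defaultdict
from pathlib import Path


def pdfs_by_name(root):
    """Return {file name: [paths]} for PDF names found more than once under root."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() == ".pdf":
            groups[path.name].append(path)
    # Keep only names that occur in more than one place.
    return {name: paths for name, paths in groups.items() if len(paths) > 1}


if __name__ == "__main__":
    # Assumed location of the Zotero storage folder; change if yours differs.
    storage = Path.home() / "Zotero" / "storage"
    if storage.is_dir():
        for name, paths in pdfs_by_name(storage).items():
            print(name)
            for p in paths:
                print(f"    {p.parent.name}  {p.stat().st_size / 1e6:.1f} MB")
```

Pasting that output into the thread would show which storage subfolders hold the repeated files and how large each one is.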
Does the "import the resulting PDF as a copy instead of as a file link" option make a difference as to whether the new PDF is stored in the Zotero storage?