A PDF attachment fails to be indexed
Hello,
I'm trying to index an OCR'd PDF but nothing happens. I logged the output, which is submitted under the bug ID below. The strange thing is that when I changed the name of the PDF, cleared the log, and tried again, I got the same error message, but the new error still identified the file by its old file name (although the directory was correct). I changed the name back, and it tried to index the file with the correct file name and directory name, but gave the same error. No idea what's happening.
Thanks,
Joe
Bug ID: D1955072479
), using the old file name. But after a number of lines with JavaScript Errors there's a separator of =====, under which more details are given, which use the correct/current file name (which has an extra hyphen before the .pdf). This is not part of the log from before I changed the name, because in each case I've cleared the log between changing names.
But that's not the main problem--the problem is that this PDF doesn't index.
2nd bug ID: D174204464
In any case, if you go into that directory, is there a .zotero-ft-cache file, and if so, do you see text from the PDF in it?
I opened up the cache file and there's no PDF text there. Here is all the text there is (repeated for every time I hit the index button):
Title: Ryso0470 1..47
Creator: 3B2 Total Publishing 6.03d/W
Producer: Acrobat Distiller 3.01 voor Windows
CreationDate: Tue Feb 29 09:41:02 2000
ModDate: Mon Jan 22 14:48:57 2007
Tagged: no
Pages: 47
Encrypted: yes (print:yes copy:no change:yes addNotes:yes)
Page size: 595 x 841 pts
File size: 275838 bytes
Optimized: yes
PDF version: 1.4
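One thing that stands out in the output above is the `Encrypted: yes (... copy:no ...)` line: the PDF's permissions disallow copying text, which may be why no extracted text ends up in the cache file. Here is a minimal Python sketch for checking that flag, assuming the cache contains pdfinfo-style `Key: value` lines like the ones pasted above. The function names `parse_pdfinfo` and `copy_restricted` are made up for illustration and are not part of Zotero or pdfinfo.

```python
import re

# Excerpt of the pdfinfo-style output pasted in this thread.
SAMPLE = """\
Title: Ryso0470 1..47
Pages: 47
Encrypted: yes (print:yes copy:no change:yes addNotes:yes)
PDF version: 1.4
"""

def parse_pdfinfo(text):
    """Parse 'Key: value' lines into a dict (a later duplicate key overwrites an earlier one)."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing colons stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            info[key.strip()] = value.strip()
    return info

def copy_restricted(info):
    """Return True if the 'Encrypted' field says text copying is not allowed."""
    encrypted = info.get("Encrypted", "no")
    if not encrypted.startswith("yes"):
        return False
    # Permissions look like 'print:yes copy:no change:yes addNotes:yes'.
    permissions = dict(re.findall(r"(\w+):(yes|no)", encrypted))
    return permissions.get("copy") == "no"

if __name__ == "__main__":
    print(copy_restricted(parse_pdfinfo(SAMPLE)))
```

To try it on a real file, read the contents of the `.zotero-ft-cache` file into a string and pass it to `parse_pdfinfo` instead of `SAMPLE`. Whether the copy restriction is actually what blocks indexing here is an assumption; it is only a hint about where to look.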
But if 3.03 works, I think on Linux you can safely swap in the 3.03 pdftotext binary in place of the existing Zotero one. (We use a custom pdfinfo build, since the standard pdfinfo build doesn't support text file output, and custom versions of both on Windows to prevent console windows from popping up.)