Could not read text from pdf error - with all pdfs
Hi folks,
I am trying to add references for a number of downloaded PDFs. In each case, I add the file to the library with "Link to File..." and then try "Retrieve Meta-Data for PDF." The pop-up dialogue then tells me "Could not read text from pdf". The info window for the reference indicates that it is not indexed. My preferences tell me that I have pdftotext 3.02 installed (Mac Snow Leopard). Rebuilding the index does not help. If I open the pdf in Preview, I can select, copy, and paste the text, suggesting that the full text is in the pdf. There aren't any odd characters in the pdf file name (suggested elsewhere in the forum).
Any advice? Here's a downloadable copy of the pdf from my dropbox, to help debug/diagnose this.
http://dl.dropbox.com/u/245354/Goff%202008.pdf
Any help would be appreciated tremendously!
Henry
Goff, D. C., Lamberti, J. S., Leon, A. C., Green, M. F., Miller, A. L., Patel, J., Manschreck, T., et al. (2007). A placebo-controlled add-on trial of the ampakine, CX516, for cognitive deficits in schizophrenia. Neuropsychopharmacology, 33(3), 465–472.
Not sure if I have any idea of what is up, but a little more info might help.
On the linked .pdf file does Zotero tell you it is indexed in the right column? Does it have a page count?
What do the index statistics on the search pref pane look like (#indexed, #partial, #unindexed, and #words?)
On my side, the pdf is definitely not getting indexed - "Indexed: No" on the right. Clicking the reindex button doesn't cause it to be indexed; and in preferences if I clear the index and force a re-index, it still isn't indexed.
Zotero error report notes:
[JavaScript Error: "Goff 2008.pdf was not indexed" {file: "chrome://zotero/content/xpcom/fulltext.js" line: 476}]
Report ID: 890701973
So perhaps this is a pdftotext problem; however I installed this directly from zotero. I'm on Snow Leopard - I wonder if there is another pdftotext that is causing a conflict?
(3)(+0000003): pdftotext version 3.02 registered at /Users/henry/Library/Application Support/Firefox/Profiles/ct405b3k.default/zotero/pdftotext-MacIntel
(3)(+0000000): pdfinfo version 3.02 registered at /Users/henry/Library/Application Support/Firefox/Profiles/ct405b3k.default/zotero/pdfinfo-MacIntel
However, after I click to index that pdf:
(3)(+0000001): Running pdfinfo "/Users/henry/Library/Application Support/Firefox/Profiles/ct405b3k.default/zotero/storage/24572ZW5/Goff 2008.pdf" "/Users/henry/Library/Application Support/Firefox/Profiles/ct405b3k.default/zotero/storage/24572ZW5/.zotero-ft-info"
(3)(+0000024): Running pdftotext -enc UTF-8 -nopgbrk "/Users/henry/Library/Application Support/Firefox/Profiles/ct405b3k.default/zotero/storage/24572ZW5/Goff 2008.pdf" "/Users/henry/Library/Application Support/Firefox/Profiles/ct405b3k.default/zotero/storage/24572ZW5/.zotero-ft-cache"
(2)(+0000007): Goff 2008.pdf was not indexed
Thoughts??
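One way to narrow this down is to run the same pdftotext command outside Zotero and see whether it writes any text. The following is a minimal Python sketch, not Zotero's own code: the flags `-enc UTF-8 -nopgbrk` are copied from the debug output above, while the function names and whatever paths you pass in are hypothetical placeholders.

```python
import subprocess
from pathlib import Path


def build_pdftotext_command(binary, pdf_path, out_path):
    """Build the same argument list that the debug log shows for pdftotext."""
    # Passing a list (not a shell string) keeps paths with spaces intact,
    # such as ".../Application Support/.../Goff 2008.pdf".
    return [str(binary), "-enc", "UTF-8", "-nopgbrk", str(pdf_path), str(out_path)]


def extract_text(binary, pdf_path, out_path):
    """Run pdftotext and return (exit_code, stderr_text, output_size_in_bytes)."""
    cmd = build_pdftotext_command(binary, pdf_path, out_path)
    result = subprocess.run(cmd, capture_output=True, text=True)
    out = Path(out_path)
    size = out.stat().st_size if out.exists() else 0
    return result.returncode, result.stderr, size
```

If `extract_text` reports a non-zero exit code or an output size of 0 for a pdf whose text can be copied in Preview, that would point at the pdftotext binary itself rather than at Zotero's index. That is only an inference from the symptoms described here, not a confirmed diagnosis.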
Now everything works as expected - I can add a pdf; index it; and retrieve the meta-data.
Very odd - however, I'm happy now.
The problem is resolved by re-installing as I mention above - if there's anything helpful I can do to document the problem let me know - I'd be happy to do so. However, since it's working now I probably can't replicate the error.
The origin of it is puzzling though, as I installed the pdf* through Zotero in the first place.
I am running Zotero on a Linux machine. I tried to follow your path to the solution, but the steps @hmahncke gave were not clear to me. What does 'reinstalled from zotero' mean?
Would be really grateful if you could spell it out more precisely so I can try and solve the same problem, thanks.
Zym
On Mac OS X, I can click on the zotero icon in firefox, which brings up the zotero window; then open preferences from the gear menu; then go to the Search pane; then install PDF indexing from there. My understanding is that this installs the zotero specific versions of pdftotext and pdfinfo. I assume it works this way in linux as well, but I don't have the linux version.
Best regards,
Henry
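To confirm that the install from the Search pane actually put the tools in place, one could list the pdftotext and pdfinfo files in the zotero data directory. This is a rough Python sketch under assumptions: the `pdftotext-*` / `pdfinfo-*` naming is taken from the file names quoted in this thread, the function name is hypothetical, and the executable-bit check is just a guess at one thing worth looking at, not a known cause of the error.

```python
import os
from pathlib import Path


def find_pdf_tools(zotero_dir):
    """Return {tool: [(path, is_executable), ...]} for pdftotext and pdfinfo."""
    found = {"pdftotext": [], "pdfinfo": []}
    for tool in found:
        # Binaries are named like pdftotext-MacIntel or pdfinfo-Linux-i686;
        # the accompanying *.version files are skipped here.
        for path in sorted(Path(zotero_dir).glob(tool + "-*")):
            if path.suffix == ".version":
                continue
            found[tool].append((path, os.access(path, os.X_OK)))
    return found
```

An empty list for either tool would suggest that the install did not complete in that directory; more than one entry per tool (for example a Windows and a Linux binary side by side) would show leftovers from another platform.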
I tried to uninstall zotero but to no avail. It seems that nothing changed after I installed it again.
Thanks,
Zym
But did you delete the four pdf* files from the zotero directory; then reinstall the pdf tools from zotero; then clear the index and rebuild it as I described above? Your comment says you "don't know how and whether reinstalling pdftotext will do the trick." Have you tried it?
Henry
I did follow your instructions to a T and have to say that this did not yield any satisfactory results.
I even tried to reinstall my whole firefox and zotero installations, to no avail.
When I enabled debugging mode in zotero it gave me something like 18000 lines of output.
Everything else works fine, but this is the one zotero feature I have been looking forward to for a long time.
Hope someone here can help me.
I can provide the output file if anyone knows how to make any sense of it.
Thanks
Zym
Sorry to bother but I have a similar problem and am at a loss trying to solve it. I can't index files nor retrieve meta-data.
I tried to migrate from Windows (XP) to Linux (Mint KDE), so I copied my entire data directory, the same way I've successfully done it in the past from one Windows version to another. I used FEBE and OPIE to restore my add-ons and preferences, so I thought they were the culprit. I deleted everything, including what was related to mozilla in usr/lib and /home. But to no avail.
After that, I suspected that it may be caused by duplicate copies of pdftotext and pdfinfo, since the win32.exe files were already included in the Zotero directory and Zotero asked me to reinstall the Linux versions. To be sure, I deleted everything once again and started from scratch, deleting the windows files before reinstalling both Zotero and the pdf tools from the preferences pane. That didn't work either.
But... If I import files (save copy of file) in the default folder (I forgot to mention that I had set up a personalized one in "documents") with the pdf tools installed there, everything works fine. So I figured there could be some problem with the path used. I deleted everything in the default folder and replaced it with my old database, except for pdfinfo and pdftotext, which I reinstalled once again. That didn't work. In fact, only the default document (the start up guide) shows up, meaning that it doesn't read the database (which it did when I was using the personalized folder).
Of course, I could copy all my files again with the "save copy" command, but that would take quite a while and I'd love to keep all the meta-data already retrieved.
If somebody could give me a hint, it would be really appreciated. Thanks in advance!
I probably could sync it but my bandwidth wouldn't really allow such a big upload and subsequent download...
I encountered the identical problem and finally found this thread.
I followed @hmahncke's instructions (thank you, hmahncke!) and everything just worked like a charm!!
First you need to delete the 4 pdf* files (pdfinfo-Linux-i686, pdftotext-Linux-i686, pdfinfo-Linux-i686.version, pdftotext-Linux-i686.version) in the zotero directory; for me it is /home/luzerno/.mozilla/firefox/fgovp8a2.default/zotero
Restart Firefox, ta-da!
Thank you again, hmahncke!
Hope this will help.
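For anyone who would rather script the deletion step, here is a small Python sketch. It is only an illustration under assumptions: the `pdfinfo-` and `pdftotext-` prefixes come from the four file names listed above, the function name and the `dry_run` switch are hypothetical, and the zotero directory has to be supplied by you since it differs per profile. By default it only lists the files and deletes nothing.

```python
from pathlib import Path

# Prefixes of the four tool files named in this thread
# (the binaries and their matching *.version files).
TOOL_PREFIXES = ("pdfinfo-", "pdftotext-")


def remove_pdf_tool_files(zotero_dir, dry_run=True):
    """List the pdf* tool files and delete them only when dry_run is False."""
    targets = sorted(
        p for p in Path(zotero_dir).iterdir()
        if p.is_file() and p.name.startswith(TOOL_PREFIXES)
    )
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets
```

Calling it once with the default `dry_run=True` and reading the returned list before running it with `dry_run=False` is a cheap way to make sure nothing else in the directory gets removed.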
Thanks!