Problem with indexing PDFs
Hello,
I cannot seem to index all my text PDFs.
Is there a way to find out what the problem is?
The search engine stops working and I need to restart Firefox to make Zotero work.
thanks.
Because it uses Google Scholar, Google locks you out after a while because you look like a robot ;-).
- I imported my database with NO docs attached from Endnote;
- I used OCR on my 400 pdf files to then be able to index them with the Zotero function;
- I attached each file to an entry through a *link* using zotfile: this was the only way, to my knowledge, to rename my files in a standardized fashion and move them to a folder of my choice, as opposed to Zotero's numbered directories. I also did this to avoid conflicts with Dropbox. It took me some time, but I was very pleased with the result.
- I then tried to maximize the number of indexed files in a variety of ways: I tried to rebuild the index from zero; I tried to "index unindexed items"; I tried to clear the index and do it again... No matter what I do, Zotero does not seem capable of indexing more than 150 files. Even allowing that a portion of my 400 files may not have converted to text (10-20% tops), I still don't understand why the indexing process stops at around 150 items instead of continuing until at least 300 files have been indexed.
Thanks.
I have a similar problem, but from the beginning, not after an import.
Probably no PDFs are indexed at all; I fear the 82 indexed entries are all websites.
I installed the pdftotext files from the search options dialog and tried to index single files containing text.
Suspecting an XP problem, I copied my data to a Linux machine and tried it there: same effect :-(((
It doesn't matter whether I created the PDFs myself or downloaded them from somewhere.
Using pdftotext on Linux I can get the text, so it seems my PDFs are OK.
This is becoming quite frustrating actually, I hope someone will help.
If you're on Linux or Mac, could you try to run pdftotext on one of the files that crashes Zotero? (I have no idea how to do that, or if it's even possible, on Windows, but if it is, do the same.)
@reh - what happens if you manually try to index one of the files in question (select the file and click on the green arrow-circle next to "Indexed: No")?
See if you get an error message that you could post.
In the search tab of your preferences - do you have both pdftotext and pdfinfo shown as installed?
Also, for both of you which Zotero version are you using?
> if you're on Linux or Mac, could you try to run pdftotext on one of the files that crashes zotero?
Sorry, I have never used a Mac or Linux in my life. By the way, I am running the latest version of Zotero and I have both PDF tools installed.
By crash I mean that my zotero database stops responding and all the entries disappear.
I tried to manually index a pdf linked to one of my entries and this is what happened:
- the two green "recycle"-like arrows disappeared and the "indexed" category still showed "No".
- I tried to go to another entry to see what would happen, and the central window displayed the message "An error has occurred. Please restart Firefox....."
This time the "report error" option was greyed out and I could not report the error. I closed Zotero without closing Firefox, and when I tried to reopen Zotero I received an error message.
[removed non-Zotero error — D.S.]
[JavaScript Error: "uncaught exception: [Exception... "Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIFile.moveTo]" nsresult: "0x80520012 (NS_ERROR_FILE_NOT_FOUND)" location: "JS frame :: chrome://zotero/content/xpcom/attachments.js :: _moveOrphanedDirectory :: line 1230" data: no]"]
The Debug ID is D904080953.
The arrow briefly disappears; nothing else happens.
> do you have both pdftotext and pdfinfo shown as installed?
yes, in both OS
> which Zotero version are you using?
The latest beta; today I installed 2.0rc2, same problem.
Firefox is 3.0.3 on XP and 3.6 on Linux
> see if you get an error message that you could post.
(XP) seems like this is the problem: Cache file doesn't exist!
The Debug ID is D1719467896.
I'm not sure what the permissions were on the virtual Ubuntu, but here on XP there are no write restrictions.
This obviously shouldn't take down Zotero in any case, though, so we'll take a look.
reh: I'm not sure why you're getting the indexing failure, assuming that PDF does indeed have embedded text, but the non-C: drive would be my best guess. You might be able to learn more by running pdftotext from the command line using the same arguments that are shown in the debug output.
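For anyone who would rather script this check than type the command by hand, here is a minimal Python sketch. It only assumes that a `pdftotext` binary is reachable on the PATH (or that you pass the full path to the one Zotero downloaded), and it uses the standard `pdftotext <file> -` form, which writes the extracted text to stdout. It does not reproduce the exact arguments from Zotero's debug output, which are not shown in this thread; the function name `try_pdftotext` is made up for illustration.

```python
import shutil
import subprocess


def try_pdftotext(pdf_path, binary="pdftotext", timeout=60):
    """Run pdftotext on one PDF and summarise what happened.

    `binary` may be a bare name looked up on PATH or a full path to an
    executable. This is a diagnostic sketch, not Zotero's own invocation.
    """
    exe = shutil.which(binary)
    if exe is None:
        return {"ok": False, "reason": f"{binary} not found or not executable"}
    try:
        # "-" as the output file makes pdftotext write the text to stdout.
        proc = subprocess.run(
            [exe, str(pdf_path), "-"],
            capture_output=True,
            timeout=timeout,
        )
    except OSError as exc:
        # Raised when the file exists but the OS refuses to execute it,
        # for example a truncated or wrong-architecture binary.
        return {"ok": False, "reason": f"could not execute: {exc}"}
    except subprocess.TimeoutExpired:
        return {"ok": False, "reason": "timed out"}

    text = proc.stdout.decode("utf-8", errors="replace").strip()
    return {
        "ok": proc.returncode == 0 and bool(text),
        "returncode": proc.returncode,
        "chars": len(text),
        "stderr": proc.stderr.decode("utf-8", errors="replace").strip(),
    }


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        print(path, try_pdftotext(path))
```

A result with `chars` equal to 0 and a zero return code would suggest the PDF has no embedded text layer, while a "could not execute" reason points at the binary rather than the PDF.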
I set the storage preferences to the default and installed the PDF tools:
Error running pdftotext
The Debug ID is D69316096.
On Linux my home is a mounted NFS device (XFS).
I set the storage preferences to the default and the zotero directory to 777.
Same effect as yesterday: no cache file.
I did a cd to the zotero dir and tried to run pdftotext-Linux-i686 [string from log] on the command line (hopefully correctly): command not found.
The same with pdftotext works and creates the cache file.
Running pdftotext-Win32.exe [string from log]: the program could not be executed (German text).
On XP I get: program too big for RAM (working memory).
But normally I work with XP.
Unfortunately I can't find these PDF tools for manual download.
(but i found another thread with the same problem: http://forums.zotero.org/discussion/7681/pdfinfo-pdftotext-crash-program-too-big-to-fit-in-memory/)
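A "program too big to fit in memory" message on Windows often means the .exe is truncated or is not really an executable at all (for example, a saved error page). One quick way to check a downloaded tool is to look at its first bytes: Windows executables start with `MZ` and Linux ELF binaries start with `\x7fELF`. The sketch below is only a rough sanity check under that assumption; the function name `sniff_executable` is hypothetical, and a file can pass this test and still be truncated further in.

```python
def sniff_executable(path):
    """Classify a file by its leading magic bytes.

    Returns "windows-pe", "elf", "empty", or "unknown". This only inspects
    the first four bytes, so it cannot prove the rest of the file is intact.
    """
    with open(path, "rb") as fh:
        head = fh.read(4)
    if not head:
        return "empty"
    if head[:2] == b"MZ":
        return "windows-pe"
    if head == b"\x7fELF":
        return "elf"
    return "unknown"


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        print(path, sniff_executable(path))
```

If pdftotext-Win32.exe comes back as "unknown" or "empty", the download itself is the likely culprit rather than the PDFs.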
In another post I found a link to http://www.zotero.org/download/xpdf/pdfinfo-Linux-i686-3.02
but access to http://www.zotero.org/download/xpdf/ itself is not allowed.
The auto-install is a good thing, but I think it should be possible to download the proper version manually if needed. Please link it somewhere.
For the developer:
Our setup here is a PC with 64-bit Linux running several virtual 32-bit machines (VMware),
and a separate PC (Athlon dual core 4850e) with 32-bit XP.
I have already installed it several times, also with the cache disabled, but it doesn't work. It seems the problem is not on my PC but with the auto-download.
Maybe Zotero should have some form of check (a checksum?) to verify that the download was correct.
Checksumming is planned, but, of course, that would only indicate a failure in your case, not fix it. The auto-download works for most people, so it's likely an issue either with your computer or a network glitch (or Firefox still had the corrupted version cached).
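Until something like that exists in Zotero, a checksum can be compared by hand. The sketch below is a generic SHA-256 check in Python, not Zotero's planned mechanism. It assumes you have a known-good hash to compare against, for instance one computed on a machine where the tools work; no official checksums are mentioned in this thread, so `expected_hex` is a placeholder you would have to supply yourself.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_is_intact(path, expected_hex):
    """Compare a file's SHA-256 with a known-good value (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    import sys

    # Usage: python check.py <downloaded file> <expected sha256>
    print(download_is_intact(sys.argv[1], sys.argv[2]))
```

As noted above, a mismatch only tells you the download is bad; the fix is still to clear the cached copy and fetch the file again.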