rebuilding pdf index

I am trying to rebuild my index, but of approximately 500 PDFs, Zotero fails to index 60 and only partially indexes another 35. I have tried increasing the maximum character count and the maximum page count, without success.

I have received a chrome script error and followed the advice here:

http://www.zotero.org/support/kb/unresponsive_script_warning

and discussion here:

http://forums.zotero.org/discussion/11783/another-script-error-and-instability-of-firefox-with-zotero-202/

... without success.

My Report ID is 1866499065.

I am running the latest Zotero and Firefox updates on a fast Win XP machine.
  • From the error report:
    PDFs with filenames containing extended characters cannot currently be indexed due to a Firefox limitation
  • edited May 4, 2010
    Hi Dan,

    Will you please suggest the easiest way to find and replace these file names?

    Also, will you please offer some examples of 'extended characters'?
  • Look at the error report in Report Errors for the filenames.

    Or wait. The underlying bug will be fixed in Firefox 3.7, I believe.
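For anyone who prefers to scan the storage folder rather than read the error report, here is a minimal Python sketch. It assumes that "extended characters" means any non-ASCII character (such as "ø"), which the thread does not confirm, and the function names are hypothetical, not part of Zotero.

```python
import os


def has_extended_chars(name):
    """Return True if the filename contains any non-ASCII character.

    Assumption: "extended characters" means code points above 127.
    """
    return any(ord(ch) > 127 for ch in name)


def find_extended_pdfs(root):
    """Walk `root` and return sorted paths of PDFs with non-ASCII filenames."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".pdf") and has_extended_chars(name):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Calling `find_extended_pdfs()` with the path to your Zotero storage folder should list the candidates to rename. The sketch only reports files; it does not rename anything.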
  • edited May 4, 2010
    Would it make sense to try installing the Firefox 3.7 alpha?

    I generated another Report ID, 1933833608, and had a look at the files. Only one shows the extended characters error (the Danish "ø").

    Meanwhile all show this: {file: "chrome://zotero/content/xpcom/fulltext.js" line: 476}.

    Another curiosity: 60 files remain unindexed, but only 5 appear in the error report.

    Here are the five files showing up in the error report:

    [JavaScript Error: "Green 2008 Capturing User Requirements (thesis).pdf was not indexed" {file: "chrome://zotero/content/xpcom/fulltext.js" line: 476}]

    [JavaScript Error: "Høybye, Johansen, T-Thomsen 2005 Online Interaction.pdf was not indexed -- PDFs with filenames containing extended characters cannot currently be indexed due to a Firefox limitation" {file: "chrome://zotero/content/xpcom/fulltext.js" line: 476}]

    [JavaScript Error: "Tegeland 2007 Information om kundval.pdf was not indexed" {file: "chrome://zotero/content/xpcom/fulltext.js" line: 476}]

    [JavaScript Error: "99024537.pdf was not indexed" {file: "chrome://zotero/content/xpcom/fulltext.js" line: 476}]

    [JavaScript Error: "Johansson 2008 Older people's home modification process (thesis).pdf was not indexed" {file: "chrome://zotero/content/xpcom/fulltext.js" line: 476}]
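If the error report grows longer, the filenames can be pulled out automatically. The following Python sketch is based only on the five lines quoted above; the pattern and the function name are assumptions and may not match every message Zotero emits.

```python
import re

# Pattern inferred from the quoted report lines (an assumption, not a
# documented format): filename, "was not indexed", optional " -- reason".
NOT_INDEXED = re.compile(
    r'\[JavaScript Error: "(.+?\.pdf) was not indexed( -- (.+?))?" \{file:'
)


def parse_not_indexed(report_text):
    """Return a list of (filename, reason) tuples; reason is None if absent."""
    return [(m.group(1), m.group(3)) for m in NOT_INDEXED.finditer(report_text)]
```

Passing the pasted report text to `parse_not_indexed()` should yield one tuple per failed file, with the reason filled in only for entries such as the "extended characters" one.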
  • Do those files have embedded text?
  • edited May 4, 2010
    I'm not entirely clear what you mean by "embedded text", but I checked two of the five PDFs and they appear to be in secured PDF format: the text does not copy when selected, which in effect makes them behave like non-OCR image PDFs. This would explain why they couldn't be indexed. (I've tried to find ways to "unsecure" the PDFs, but short of printing and rescanning I have not found a good solution.) It is possible that the other 55 are also "secured", or are scanned image PDFs.

    Question: Any ideas for how to locate the other 55 PDFs that are not yet indexed, since they don't show up in the error report?
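One rough way to narrow down candidates outside Zotero is to look for the `/Encrypt` entry that secured PDFs carry in their trailer. The Python sketch below is only a heuristic under that assumption: it does not query Zotero's index, it will not detect scanned image PDFs, and it can give false positives if the byte string happens to appear elsewhere in a file. The function names are hypothetical.

```python
import os


def has_encrypt_marker(data):
    """Heuristic: True if the raw PDF bytes contain an /Encrypt entry."""
    return b"/Encrypt" in data


def list_possibly_secured_pdfs(root):
    """Walk `root` and return sorted paths of PDFs that look secured."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(".pdf"):
                continue
            path = os.path.join(dirpath, name)
            # Reads the whole file; acceptable for a few hundred PDFs.
            with open(path, "rb") as f:
                if has_encrypt_marker(f.read()):
                    hits.append(path)
    return sorted(hits)
```

Running `list_possibly_secured_pdfs()` on the storage folder gives a list to compare by hand against the files Zotero reports as unindexed.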