Extracting metadata from PDF not working

I am on Zotero 4.0.8 Standalone in Windows 7 (and I am new to Zotero). A few basic tools don't work for me. Here is one: I dragged a recent Env. Sci. Technol. article as a PDF from my literature directory into the middle panel of Zotero. Then I right-clicked it and chose "Retrieve Metadata from PDF". A small window opens that never finishes its job (it keeps showing the busy circle).

Since the automatic installation of pdftotext and pdfinfo did not work, I downloaded the Win32 versions from Xpdf per the instructions and put the two files, with the proper names, into the data directory. Zotero Preferences recognizes that they are there (albeit with unknown version). When I run either command from the command line on this same PDF file, results are returned instantly.
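
One way to tell a hang apart from a tool failure is to run the same tools under a time limit. The following is a minimal Python sketch of that check, not anything Zotero does itself. The helper name run_with_timeout and the paths in the usage comments are hypothetical placeholders.

```python
import subprocess


def run_with_timeout(cmd, timeout_s=10):
    """Run cmd (a list of arguments) and wait at most timeout_s seconds.

    Returns (exit_code, output_text) if the command finishes in time,
    or None if it had to be killed because it exceeded the time limit.
    """
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge error messages into the output
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None
    return done.returncode, done.stdout.decode("utf-8", errors="replace")


# Example calls (hypothetical paths; substitute your own):
#   run_with_timeout([r"C:\path\to\pdfinfo-Win32.exe", r"C:\path\to\article.pdf"])
#   run_with_timeout([r"C:\path\to\pdftotext-Win32.exe", r"C:\path\to\article.pdf", "-"])
```

If both calls return quickly with exit code 0, the tools themselves are probably fine and the hang lies elsewhere.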

Why does Zotero hang?

P.S.: I do see the command prompt window briefly pop up when I use either the "Reindex" or the "Retrieve Metadata from PDF" command after right-clicking on the PDF item.
  • My best guess is that all of your problems are related -
    https://forums.zotero.org/discussion/29370/pdf-downloaded-with-citation-but-not-found/#Item_0 being the other one.
    This sounds like some type of issue - maybe a permission issue? - with writing files to your Zotero folder. The automated installation of the pdf tools basically never fails.

    So let's start with that. Move your custom installed pdf tools out of the Zotero folder. Restart Zotero. Then provide a debug ID for the process of trying to (automatically) install them:
    http://www.zotero.org/support/debug_output
  • Adam,

    I appreciate your efforts. I agree, all my troubles seem related to the same problem. At this point (11:30pm PDT) I seem to be unable to even "Submit to Zotero Server". But here is the debug report (very short):

    [JavaScript Error: "The character encoding of the plain text document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the file needs to be declared in the transfer protocol or file needs to use a byte order mark as an encoding signature." {file: "zotero://debug/" line: 0}]

    version => 4.0.8, platform => Win32, oscpu => Windows NT 6.1; WOW64, locale => en-US, appName => Zotero, appVersion => 4.0.8

    =========================================================

    (5)(+0000000): SELECT COUNT(*) FROM fulltextItems WHERE (indexedPages IS NOT NULL AND indexedPages=totalPages) OR (indexedChars IS NOT NULL AND indexedChars=totalChars)

    (5)(+0000000): SELECT COUNT(*) FROM fulltextItems WHERE (indexedPages IS NOT NULL AND indexedPages<totalPages) OR (indexedChars IS NOT NULL AND indexedChars<totalChars)

    (5)(+0000001): SELECT COUNT(*) FROM itemAttachments WHERE itemID NOT IN (SELECT itemID FROM fulltextItems WHERE indexedPages IS NOT NULL OR indexedChars IS NOT NULL)

    (5)(+0000001): SELECT COUNT(*) FROM fulltextWords

    (3)(+0003352): HTTP GET http://www.zotero.org/download/xpdf/Win32.latest

    (3)(+0001448): HTTP GET http://www.zotero.org/download/xpdf/Win32.latest
  • I'm guessing you're connecting via a system proxy? That isn't working for some users in recent versions of Zotero Standalone, though we haven't been able to reproduce it ourselves. I believe lack of a network connection from Standalone would explain all the problems you're seeing.
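
A rough way to check this from outside Zotero is to fetch the same URL once with the system proxy settings and once with proxies disabled. This is a minimal Python sketch using only the standard library; the helper name can_fetch is made up for illustration. As far as I know, Python's urllib reads proxy settings from environment variables and (on Windows) the registry but does not evaluate automatic configuration scripts, so this only approximates what Zotero Standalone does.

```python
import http.client
import urllib.error
import urllib.request


def can_fetch(url, use_system_proxy=True, timeout_s=10):
    """Return True if an HTTP GET of url succeeds, False otherwise."""
    if use_system_proxy:
        # No argument: urllib looks up the proxy settings it can detect.
        handler = urllib.request.ProxyHandler()
    else:
        # Empty mapping: bypass all proxies and connect directly.
        handler = urllib.request.ProxyHandler({})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(url, timeout=timeout_s) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


# Example, using the URL from the debug output above:
#   url = "http://www.zotero.org/download/xpdf/Win32.latest"
#   print(can_fetch(url, use_system_proxy=True))
#   print(can_fetch(url, use_system_proxy=False))
```

If the direct request succeeds while the proxied one fails, that points toward the proxy configuration rather than the network itself.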
  • Another example of not being able to connect: when I check for Zotero updates under "Help", it also doesn't connect. I am not aware of using a proxy, except when I am using the library (I have not used it today and have not logged into that proxy today). Do you have a suggestion where to look (Windows 7)? Thanks.
  • Here are instructions for setting system proxies in Windows 7:
    http://answers.oreilly.com/topic/675-how-to-configure-proxy-settings-in-windows-7/
    Obviously you don't want to enable proxy settings, but that would be the place to check.
  • edited May 12, 2013
    Adam,

    Thanks for sending that tool. I am getting a few things to work now.

    Here is what I am finding out:

    The system-wide settings under "internet options" (oreilly link above) are the same that I see in the Chrome settings. In both, my "LAN Settings" are:

    ------------------------------
    (checked) Automatically detect settings
    (checked) Use automatic configuration script
    Address: http://www.lib.ucdavis.edu/proxy/[...]
    (NOT checked) Use a proxy server for your LAN
    ------------------------------

    The automatic script (the second checked box above) automatically offers to log me in to the library proxy (for database access), but only once I arrive on the library webpage. Of course, that is how I get to my literature (and how I would like to operate with Zotero), but at the moment, the connection doesn't work even before I log in to the proxy (or with Chrome turned off).

    Here is what I tried as a fix:

    I unchecked the two checked boxes in my LAN settings (from Windows 7 Internet Options, per the link you provided, http://answers.oreilly.com/topic/675-how-to-configure-proxy-settings-in-windows-7/). I did not close the Chrome browser, but I did have Zotero closed. I then reopened Zotero. The result: the connection issue seems to be resolved (I will add to this list as I go):
    * Zotero properly checked for software updates
    * it installed the PDF Indexing version ok (under "Tools - Options")
    * it properly created a parent item from a pdf by reading its metadata
    * it properly renamed the pdf
    * zotfile now works properly

    FYI: the connection also works if I check the first box ("Automatically detect settings") but not the second box ("Use automatic configuration script"). The connection is broken again if I check only the second box. (A simple "Help -> Check for Updates" reveals the connection issue.)
  • OK, glad that seems to have been the problem. Dan may have further questions on the configuration script; they're still trying to figure out when and how that affects Zotero (ideally it shouldn't, obviously).
  • Here is how I solved that problem in Windows 7 with xpdfbin-win-3.03:

    1. Download the latest Xpdf from http://www.foolabs.com/xpdf/download.html
    2. Extract the file xpdfbin-win-3.03.zip (the latest as of today).
    3. Copy pdfinfo.exe and pdftotext.exe (check whether your OS is 32-bit or 64-bit) into the Zotero data directory.
    4. Delete pdftotext-Win32.exe and pdfinfo-Win32.exe in the Zotero data directory.
    5. Rename pdfinfo.exe to pdfinfo-Win32.exe and pdftotext.exe to pdftotext-Win32.exe.
    6. Open Zotero.
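
The copy, delete, and rename steps above could be scripted roughly as follows. This is a minimal Python sketch; the helper name install_pdf_tools and both directory arguments are hypothetical. It copies each file straight to its -Win32 name, which overwrites any existing file and so folds the delete and rename steps into the copy. Run it with Zotero closed, then open Zotero as in the last step.

```python
import shutil
from pathlib import Path

# Name in the extracted Xpdf folder -> name expected in the Zotero data directory
TOOLS = {
    "pdfinfo.exe": "pdfinfo-Win32.exe",
    "pdftotext.exe": "pdftotext-Win32.exe",
}


def install_pdf_tools(xpdf_bin_dir, zotero_data_dir):
    """Copy the two Xpdf tools into the data directory under their -Win32 names.

    Returns the list of files written. Raises FileNotFoundError if a tool
    is missing from xpdf_bin_dir; nothing is copied in that case.
    """
    source_dir = Path(xpdf_bin_dir)
    target_dir = Path(zotero_data_dir)

    # Check both sources first so a missing file does not leave a half install.
    for source_name in TOOLS:
        if not (source_dir / source_name).is_file():
            raise FileNotFoundError(str(source_dir / source_name))

    written = []
    for source_name, target_name in TOOLS.items():
        target = target_dir / target_name
        shutil.copy2(source_dir / source_name, target)  # replaces an existing file
        written.append(target)
    return written


# Example (hypothetical paths; substitute your own):
#   install_pdf_tools(r"C:\path\to\xpdfbin-win-3.03\bin32", r"C:\path\to\zotero-data")
```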
  • Zotero uses a modified version of pdftotext (or maybe it was pdfinfo), so you're probably going to run into problems at some point.

    I'm not really clear on what issue specifically you were trying to solve, but if you're not able to install pdf tools via Zotero, you're bound to run into other problems as well. Feel free to start a new thread to troubleshoot this.
  • The 3.03 version of pdftotext works with copy-protected PDFs; the 3.02 version shipped with Zotero doesn't, so this may make sense under some circumstances.