PDF metadata retrieval error

hi,

I dragged an existing PDF file whose name contains Chinese characters into Zotero. Then I right-clicked it and chose to retrieve metadata for the PDF.

At this point an error appeared: No matching references found.

However, if the file name is entirely in English, metadata retrieval works fine.

So I guess it is due to the Chinese characters.

I searched the web and found that pdftotext and pdfinfo should be installed.

So I downloaded and installed them (unzipped them and set an environment variable pointing to the folder). However, the same error still happens.

I also read that pdfinfo and pdftotext should be shown as found under Edit->Preferences->Search. But in my Zotero they are not found.

I also noticed that pdftotext.exe and pdfinfo.exe already exist in the Zotero installation folder.

I have no idea how to resolve this problem.

Can anybody help me?
  • edited July 14, 2020
    I searched the web and found that pdftotext and pdfinfo should be installed.

    So I downloaded and installed them (unzipped them and set an environment variable pointing to the folder).
    No, don't do that — I'm not sure what you found online, but it's out of date and not something you should be following. The necessary PDF tools come bundled with Zotero, and have for years. Undo whatever you did.

    What OS is this? Can you provide a Debug ID for a retrieve attempt that fails?
  • Thanks for your reply.
    My OS is Windows 10.
    And the debug ID is:
    D217365727

    Hopefully I am doing this correctly.
  • edited July 15, 2020
    This actually isn't about the filename at all. Zotero is extracting text properly — it just can't find any metadata for that file. It's retrieving metadata for files with English filenames most likely just because they're publications in English that exist in the databases Zotero checks.

    You don't say whether this is an academic paper. There are Chinese databases that Zotero isn't able to check, because they don't provide reliable ways to do so, but remember also that anything can be distributed as a PDF, and you shouldn't expect Zotero to be able to retrieve metadata for random documents. See Retrieve PDF Metadata for more info.
  • Thanks. I am trying to understand the flow:
    1. Zotero uses pdfinfo and pdftotext to get the PDF information.
    2. Zotero sends that information to https://services.zotero.org/recognizer/recognize to find the metadata.
    Am I right?

    I tried to retrieve metadata for another PDF file, but it failed. I logged the Debug ID:
    D1124319285. You can find the log there.

    I reviewed the log and found this error:
    (3)(+0000008): Running C:\Program Files (x86)\Zotero\pdfinfo.exe 'E:\backup\zotero\storage\MA5FQN2G\The Quantization Effects of the CORDIC Algorithm.pdf' 'E:\backup\zotero\storage\MA5FQN2G\.zotero-ft-info'

    (1)(+0000221): Error running C:\Program Files (x86)\Zotero\pdfinfo.exe

    (1)(+0000019): Error: C:\Program Files (x86)\Zotero\pdfinfo.exe returned exit status 1 Error: C:\Program Files (x86)\Zotero\pdfinfo.exe returned exit status 1 observe@chrome://zotero/content/xpcom/utilities_internal.js:551:27 From previous event: Zotero.FullText</this.indexItems@chrome://zotero/content/xpcom/fulltext.js:555:15

    This says that pdfinfo returned an error while reading the PDF's information.

    However, when I use a third-party build of pdfinfo to get the info, it succeeds.

    The third-party pdfinfo is version 4.02, which I downloaded from here:
    https://www.xpdfreader.com/download.html

    Where you can see:
    Download the Xpdf command line tools:

    Linux 32/64-bit: download (GPG signature)
    Windows 32/64-bit: download (GPG signature)
    Mac 64-bit: download (GPG signature)

    It seems that Zotero's bundled pdfinfo is not robust enough. I also do not know which version Zotero's bundled pdfinfo is.

    Would it be possible to update Zotero's pdfinfo in the next version, so that this problem can be resolved?
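
One way to check the pdfinfo comparison described above outside Zotero is a small Python script that runs an executable and reports its exit status and output. This is only a sketch: `run_tool` is a hypothetical helper name, not part of Zotero, and the arguments each pdfinfo build accepts are an assumption (the log above shows Zotero's bundled build being called with a PDF path and an output path, while a standalone Xpdf build is normally given just the PDF path).

```python
import subprocess
import sys
from typing import List, Tuple


def run_tool(command: List[str], timeout: float = 30.0) -> Tuple[int, str, str]:
    """Run a command-line tool and return (exit status, stdout, stderr).

    `command` is the executable path followed by its arguments,
    for example [path_to_pdfinfo, path_to_pdf].
    """
    completed = subprocess.run(
        command,
        capture_output=True,   # collect stdout and stderr instead of printing them
        text=True,             # decode the output to str
        errors="replace",      # do not fail on bytes that cannot be decoded
        timeout=timeout,
    )
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == "__main__":
    # Expected call: python run_tool.py <path-to-pdfinfo> <pdf-file> [more args]
    if len(sys.argv) < 3:
        print("usage: python run_tool.py <executable> <pdf-file> [more args]")
    else:
        status, out, err = run_tool(sys.argv[1:])
        print("exit status:", status)
        print("stdout:", out)
        print("stderr:", err)
```

Running it once with each pdfinfo.exe and the same PDF shows whether both builds return a nonzero exit status, and any text on stderr may hint at why one of them fails.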