Retrieve metadata fails with "PDF does not contain OCRd text" with

I downloaded a document from here

http://www.oecd-ilibrary.org/industry-and-services/entrepreneurship-at-a-glance-2010_9789264097711-en

and attempted to retrieve metadata for the item. The document is a normal document that contains text and has a DOI on the third page. The error message that I get is "PDF does not contain OCRd text". Since the report clearly has text content, at least the error message is wrong.

Also, the Zotero translator (DOI) fails on that page.
  • Also, a related question about this report. The item type that I would use for this is "report", and the report says that it should be cited as

    OECD (2011), Entrepreneurship at a Glance 2011, OECD Publishing. http://dx.doi.org/10.1787/9789264097711-en

    However, the report item type does not include DOI. Why is this?
  • The document is a normal document that contains text and has a DOI on the third page.
    The DOI is on the fourth page. Zotero's metadata retrieval currently only checks for (sufficient) OCRed text as far as the third.
  • edited March 1, 2012
    On the other hand, if you remove the blank second page Zotero identifies it as a completely different item, because it's using the boilerplate copy on the (then) third page instead of the DOI.
  • The DOI is on the fourth page. Zotero's metadata retrieval currently only checks for (sufficient) OCRed text as far as the third.
    It makes sense not to scan the entire document. However, the reason for failing is not that there is no OCRd text, but that the first three pages do not contain sufficient information to identify the document. The error message could be changed to reflect that.

    Also for some reason Zotero cannot add this item by the DOI.
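
To illustrate the distinction drawn above between "no text at all" and "text, but no identifier in the scanned pages", here is a minimal sketch in Python. It is not Zotero's actual implementation: the function name `find_doi`, the `max_pages` parameter, the status strings, and the loose DOI pattern are all assumptions made for illustration, and the page texts are placeholders standing in for this report.

```python
import re
from typing import List, Optional, Tuple

# Loose, commonly used DOI pattern (an assumption, not Zotero's own regex).
DOI_RE = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def find_doi(pages: List[str], max_pages: int = 3) -> Tuple[Optional[str], str]:
    """Scan only the first `max_pages` page texts for a DOI.

    Returns (doi, status). The status separates two failure cases that
    the thread argues deserve different error messages.
    """
    scanned = pages[:max_pages]

    # Case 1: the scanned pages really have no extractable text.
    if not any(page.strip() for page in scanned):
        return None, "no text in scanned pages"

    for text in scanned:
        match = DOI_RE.search(text)
        if match:
            # Drop punctuation that often trails a DOI in running text.
            return match.group(0).rstrip(".,;)"), "ok"

    # Case 2: text exists, but no DOI within the page limit.
    return None, "text found, but no DOI in first %d pages" % max_pages


# Hypothetical page texts: title, blank page, boilerplate, then the DOI page.
pages = [
    "Entrepreneurship at a Glance 2010",
    "",
    "OECD boilerplate text",
    "http://dx.doi.org/10.1787/9789264097711-en",
]

print(find_doi(pages))               # DOI sits on page 4, outside the limit
print(find_doi(pages, max_pages=4))  # widening the limit reaches the DOI
```

With a three-page limit the sketch reports that text was found but no DOI was, which is closer to what actually happened here than a message about missing OCR text.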