Issues with PDF metadata retrieval options

Note: I am aware of a few ongoing discussions related to this, but they do not solve my problem.
(Notably https://forums.zotero.org/discussion/comment/340128#Comment_340128 and https://forums.zotero.org/discussion/78638/unable-to-bulk-import-a-list-of-urls are the adjacent ones).

Issue:
I am using the API to manage libraries programmatically (I ported a number of libraries from competitors in order to do this, since everyone else has rubbish APIs).
However, it is a serious problem that I cannot trigger metadata extraction programmatically. I know that metadata retrieval used to be rate limited, but now that it is not, surely this is possible? Please consider this a feature request, and also a request for a workaround in the interim.
As it stands, there is little point in my writing objects to the API, since once I upload files I have to click on each one manually to extract its data. This is a legitimate API use case: uploading files received from a digest and automatically extracting information about them.
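To make the use case concrete, here is a minimal sketch of the upload half of such a job. It assumes a pyzotero client object and uses only its `attachment_simple` method; the function name `upload_digest_pdfs`, the folder name, and the credentials in the usage comment are placeholders of mine, not anything official. The step that is missing afterwards is the metadata retrieval itself.

```python
from pathlib import Path


def upload_digest_pdfs(zot, folder):
    """Upload every PDF in `folder` as a standalone attachment.

    `zot` is assumed to be a pyzotero `Zotero` client; only its
    `attachment_simple(files)` method is used here.
    Returns the list of file paths that were submitted.
    """
    # Sort so that repeated runs process files in a stable order.
    pdfs = sorted(str(p) for p in Path(folder).glob("*.pdf"))
    if not pdfs:
        return []
    # This creates attachment items without parents. No metadata is
    # retrieved at this point, which is the gap described above.
    zot.attachment_simple(pdfs)
    return pdfs


# Hypothetical usage (placeholder credentials and folder):
# from pyzotero import zotero
# zot = zotero.Zotero("1234567", "user", "MY_API_KEY")
# upload_digest_pdfs(zot, "digest_inbox")
```

After this runs, the PDFs are in the library but still have no parent items, so each one has to be selected by hand in the client to retrieve its metadata.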

As a workaround I tried using https://github.com/zotero/recognizer-server, which I would be happy to do, but I'm uncertain whether it was intended for external use. If so, could the README be fleshed out a bit?
I can't even get past "npm start". I think there are some dependencies or version incompatibilities that need to be documented (though I don't speak Node.js, so my understanding of its error messages is limited and a product of much googling).
Note that the JavaScript option in the discussion won't work, since all of this has to run as part of offline jobs that manage the libraries. That said, if for some reason it were preferable to use URLs instead of PDF files, that would be fine; I could start with a URL instead.

Thanks very much! If API support does exist, I'll port it to pyzotero so it can reach a wider audience. In the interim, though, some other solution would be really helpful.