Pushing URLs to Zotero API to create entry?

vishalbelsare · May 9, 2017

I have a collection of PDF and Postscript files on a disk, which have been saved incrementally since 2005. Needless to say, I don't have a .bib file capturing the bibliographic information / citations. If the number of articles were small enough to work through manually, I'd have done that, but that is not the case.

What I have done to make the task a little more manageable is to use pdfgrep [ https://pdfgrep.org/index.html ] to extract potentially relevant information: for articles from SSRN, the SSRN link contained in the PDF; for files from arXiv, the arXiv identifier; and for the others, the DOI.

Now the pertinent question is this: Is there a way to push the SSRN and arXiv links through the Zotero API and have entries created, AS IF I had browsed those URLs in Firefox/Chrome and used the Zotero add-on/plugin?
For instance, can I push "https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1509466" through the Zotero API and hope to create an entry with the information captured from the SSRN or Arxiv page?

adamsmith · May 9, 2017

DOIs you can simply import into Zotero using the add by identifier function (which takes newline separated lists) and that'll work best.

Zotero can't do natively what you want with URLs. You could maybe use the Wikipedia citation API (https://en.wikipedia.org/api/rest_v1/#/Citation), which relies on Zotero on the backend: get BibTeX back, then import that into Zotero.
There may be other ideas. Unfortunately all of this would likely still require linking the PDF manually.

vishalbelsare · May 9, 2017

Your suggestion to use the Wikipedia citation API is very helpful. I did not know about it, and from a quick read about its backend, it uses Zotero translators behind the scenes. Thanks much.

So, wrapping:
curl -X GET --header 'Accept: application/x-bibtex; charset=utf-8' 'https://en.wikipedia.org/api/rest_v1/data/citation/bibtex/https://arxiv.org/abs/1612.03350'
into a script would potentially work nicely for my present purpose.
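A minimal sketch of such a wrapper in Python, using only the standard library. The function names and the User-Agent string are made up for illustration. Percent-encoding the query before placing it in the path is an assumption about how the REST endpoint expects URLs to be passed, so it should be checked against the API documentation linked above.

```python
import urllib.parse
import urllib.request

# Base path taken from the curl command above.
CITATION_BASE = "https://en.wikipedia.org/api/rest_v1/data/citation"


def citation_url(query, fmt="bibtex"):
    """Build the request URL; the query (a URL or DOI) is percent-encoded."""
    return f"{CITATION_BASE}/{fmt}/{urllib.parse.quote(query, safe='')}"


def fetch_bibtex(query):
    """Fetch BibTeX for one URL or DOI. Performs a network request."""
    request = urllib.request.Request(
        citation_url(query),
        headers={
            "Accept": "application/x-bibtex; charset=utf-8",
            # Hypothetical identifier; replace with your own contact details.
            "User-Agent": "pdf-citation-script/0.1 (you@example.org)",
        },
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling fetch_bibtex("https://arxiv.org/abs/1612.03350") in a loop over the extracted links, with a pause between requests, would give one BibTeX record per paper to append to a .bib file.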

I was specifically looking to do something only with the API, i.e. something I can write a script around. Essentially, from my set of PDF and Postscript files (more than 1,000), I use pdfgrep to extract something along the lines of one of the following:

(1) ~/researchPapers/SSRN-id950500.pdf | http://ssrn.com/abstract=950500

(2) ~/researchPapers/Arxiv//1606.00229v2.pdf : arXiv:1606.00229v2
From field #2 of (2) I can construct the arXiv URL 'https://arxiv.org/abs/1606.00229v2'

(3) ~/Downloads/adf0066-zhangA.pdf | DOI: http://dx.doi.org/10.1145/2939672.2939673
From field #2 of (3) I get the DOI URL

(4) ~/Downloads/daniel2016.pdf | doi: 10.1016/j.eswa.2016.11.022
Field #2 of (4) just gives me the DOI
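
The four cases above could be normalised with something like the following Python sketch. The regular expressions and the parse_line name are assumptions fitted to these sample lines only; old-style arXiv identifiers and unusual DOIs would need extra handling.

```python
import re

# Patterns fitted to the sample lines above (assumptions, not exhaustive).
SSRN_RE = re.compile(r"ssrn\.com/(?:abstract=|sol3/papers\.cfm\?abstract_id=)(\d+)", re.I)
ARXIV_RE = re.compile(r"arXiv:(\d{4}\.\d{4,5}(?:v\d+)?)", re.I)
DOI_RE = re.compile(r"\b(10\.\d{4,9}/\S+)")


def parse_line(line):
    """Return (path, kind, value) for one extracted line, or None if unrecognised."""
    # Field 1 is the file path, field 2 the matched text; " | " or " : " separates them.
    parts = re.split(r"\s[|:]\s", line.strip(), maxsplit=1)
    if len(parts) != 2:
        return None
    path, field = parts
    match = SSRN_RE.search(field)
    if match:
        url = f"https://papers.ssrn.com/sol3/papers.cfm?abstract_id={match.group(1)}"
        return path, "ssrn", url
    match = ARXIV_RE.search(field)
    if match:
        return path, "arxiv", f"https://arxiv.org/abs/{match.group(1)}"
    match = DOI_RE.search(field)
    if match:
        # Keep the bare DOI; drop punctuation that may trail it in running text.
        return path, "doi", match.group(1).rstrip(".,;")
    return None
```

Keeping the file path alongside each result makes it possible to attach the right PDF later. The bare DOIs can be joined with newlines for pasting into Zotero's add-by-identifier dialog.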

While the Wikipedia citation API is what I will go with for now, for the sake of informing myself better: is there no way to use the SSRN URLs, arXiv URLs, or DOIs through the Zotero API to fetch the bibliographic information and create an entry in a scripted way (which would allow one to process, say, more than 1,000 PDF files)?

adamsmith · May 9, 2017

Not that I'm aware of, at least. I think Dan is interested in providing better API access locally (as opposed to via the server API), which could include something like this.

dstillman · May 9, 2017

> For instance, can I push "https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1509466" through the Zotero API and hope to create an entry with the information captured from the SSRN or Arxiv page?

You can use translation-server to do this, though you'll then need to upload the data to the API separately.

(There's actually support for doing this in one step under-the-hood in the API, but it's not exposed publicly yet.)
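
A rough Python sketch of the two-step route (translate, then upload), under several assumptions: a translation-server instance running locally on port 1969 that accepts a plain-text URL at /web and returns Zotero item JSON, as its README describes at the time of writing; a Zotero API key with write access; and that dropping the "key" and "version" fields lets the server assign fresh ones. All function names are illustrative.

```python
import json
import urllib.request

TRANSLATION_SERVER = "http://127.0.0.1:1969"  # assumption: local instance, default port
ZOTERO_API = "https://api.zotero.org"


def build_translate_request(page_url, endpoint="web"):
    """Request asking translation-server to translate one page URL."""
    return urllib.request.Request(
        f"{TRANSLATION_SERVER}/{endpoint}",
        data=page_url.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def build_upload_request(items, user_id, api_key):
    """Request creating the translated items in a user library."""
    # Assumption: strip key/version so the API assigns them on creation.
    cleaned = [
        {k: v for k, v in item.items() if k not in ("key", "version")}
        for item in items
    ]
    return urllib.request.Request(
        f"{ZOTERO_API}/users/{user_id}/items",
        data=json.dumps(cleaned).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Zotero-API-Key": api_key,
            "Zotero-API-Version": "3",
        },
        method="POST",
    )


def translate_and_upload(page_url, user_id, api_key):
    """Translate one URL, then upload the result. Performs network requests."""
    with urllib.request.urlopen(build_translate_request(page_url)) as response:
        items = json.load(response)
    with urllib.request.urlopen(build_upload_request(items, user_id, api_key)) as response:
        return json.load(response)
```

The upload response should be inspected for failed items rather than assumed to have succeeded, and the PDFs would still need to be attached in a separate step.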