Is there a plugin that scrapes citations/references from PDFs?
Dear all,
First, I'd like to thank the developers and contributors who have made Zotero such a useful bibliographic tool. It has been fundamental to a bibliometric project here at my organization. Using Zotero together with Zotero Scholar Citations by Anton Beloglazov, I have been able to fetch citation data from Google Scholar for roughly 4,000 documents. There were a few problems along the way, but I managed to get past GS's anti-spam mechanism (I can explain how I did this if anyone is interested... the solution was fairly simple, although it still required some manual input).
Now we are hoping to move forward and scrape all the citations within these documents. Are there any plugins for Zotero, EndNote, etc. that could help achieve this goal? I have found a discussion that indicates demand for this (see DWL-SDCA's first post on Mar 19th 2012), but it seems that Zotero isn't able to do this admittedly problematic task.
The only tool that I have found is PDF Extract by CrossRef Labs. It is still in development and produces results that are less than perfect. I was hoping that others on this forum might know of a tool that can do this job.
Thanks,
bix
We have a list of tools that extract citation information from formatted bibliographies here: http://www.zotero.org/support/kb/importing_formatted_bibliographies. AFAIK, none of them also finds the bibliography in a PDF (like PDF Extract does), and none of them has terribly good results across the board.
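To give an idea of what the missing "find the bibliography" step involves, here is a rough Python sketch. It is not how PDF Extract or any of the listed tools work. It assumes you have already dumped the PDF to plain text with some other tool, that the reference list sits under a heading such as "References" or "Bibliography", and that entries are numbered like `[1]` or `1.`. The function names are made up for illustration.

```python
import re

# Assumed heading names; real documents vary a lot.
HEADING = re.compile(
    r"^\s*(references|bibliography|works cited|literature cited)\s*:?\s*$",
    re.IGNORECASE,
)
# Assumed entry markers: "[12] ..." or "12. ..."
ENTRY_START = re.compile(r"^\s*(\[\d+\]|\d+\.)\s+")


def find_reference_section(text):
    """Return the lines after the LAST bibliography-style heading, or []."""
    lines = text.splitlines()
    last = None
    for i, line in enumerate(lines):
        if HEADING.match(line):
            last = i
    if last is None:
        return []
    return lines[last + 1:]


def split_entries(lines):
    """Group lines into entries; unnumbered lines continue the previous entry."""
    entries = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        if ENTRY_START.match(line) or not entries:
            # Start a new entry and drop the leading number marker.
            entries.append(ENTRY_START.sub("", line).strip())
        else:
            entries[-1] += " " + stripped
    return entries
```

The resulting strings could then be fed to one of the bibliography parsers on the page linked above. Expect this to break on unnumbered (author-year) reference lists, on multi-column layouts, and on continuation lines that happen to start with something like "2010. ", which is a big part of why results are poor across the board.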
I will also see if there are other ways to get around this problem... if I come across an ad-hoc solution I will let you guys know.