That by itself is a non-trivial task, as bibliography styles vary greatly. There is a fairly good machine-learning-based tool that does this: https://github.com/kermitt2/grobid
It may also be possible to get the list from Crossref directly, if the publisher has deposited it there along with the article's main metadata (millions of papers apparently have reference lists deposited, but not all do). In theory, Zotero could also extract that list when it retrieves that main metadata.
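For anyone who wants to check whether Crossref has a reference list for a given paper, here is a rough Python sketch. It assumes the public Crossref REST API at https://api.crossref.org/works/ followed by the DOI, and that deposited references show up under `message.reference` with optional fields such as `DOI`, `unstructured`, and `article-title`. The function names `fetch_work` and `references_from_work` are just made up for illustration, and this is not how Zotero or any plugin actually does it.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the public Crossref REST API.
CROSSREF_WORKS = "https://api.crossref.org/works/"


def fetch_work(doi: str) -> dict:
    """Download the Crossref record for one DOI and return it as a dict."""
    url = CROSSREF_WORKS + urllib.parse.quote(doi)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def references_from_work(work: dict) -> list:
    """Pull the deposited reference list out of a Crossref record.

    Returns an empty list when the publisher has not deposited references.
    """
    refs = work.get("message", {}).get("reference", [])
    result = []
    for ref in refs:
        result.append({
            # Not every reference carries a DOI, so this may be None.
            "doi": ref.get("DOI"),
            # Prefer the raw citation string, fall back to the title if present.
            "text": ref.get("unstructured") or ref.get("article-title"),
        })
    return result
```

To try it, call `references_from_work(fetch_work(doi))` with a DOI of your choice. An empty result most likely means the publisher did not deposit the reference list, in which case PDF extraction is the fallback.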
Hi, @tim820. Thanks for mentioning it! I had seen it before but haven't had the chance to try it myself yet. Based on your description, it looks really neat and complete! Rest assured we will mention it in our presentation.
https://www.wikidata.org/wiki/Wikidata:Zotero/Cita/Documentation#How_can_I_extract_citations_from_a_PDF_attachment?
https://github.com/diegodlh/zotero-cita
You can then copy that list as you wish.
@dominic-d and I plan to present on how to integrate Cita with reference extraction workflows there: https://mpilhlt.github.io/reference-extraction/workshop-2023/programme/
I am not sure if any of the workshop organizers are active here, but the other recent player in this space that they may wish to cover is:
https://github.com/MuiseDestiny/zotero-reference
It is still under development, but it offers both extraction of citation text from PDFs and retrieval from citation databases via their APIs, and both work quite well. It can also show which cited references are already in one's library, find new ones via DOI, etc. Some discussion here (it has advanced somewhat since then):
https://forums.zotero.org/discussion/103205/is-it-possible-to-extract-references-from-article-pdfs-webpage