Fetching paper references on import
I'm a huge fan of Zotero and have been using it for quite a while, which is why I would love to contribute to its features. As has been discussed before, Zotero doesn't store any citation graph data (see adamsmith's comment here: https://forums.zotero.org/discussion/comment/332403/#Comment_332403).
I'm wondering how much effort it would take to add this as a feature, and what the best approach would be. I've noticed that most journals (at least in the biomedical sector) openly share the reference lists for their papers alongside the abstract, with no subscription or institutional login needed (e.g. https://www.nature.com/articles/jhg201415). Now, wouldn't it be cool if these were imported alongside the other metadata? I realize this would require changes to all of the site-specific translators, something I could definitely help with. Or are there alternative ways of implementing this, e.g. with a Zotero plugin?
The main use case for this would be graphical network representations of the graph data, which paper cites which, and the identification of seminal papers that are cited a lot by your bibliography but are still missing therein.
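To make the "missing seminal papers" idea concrete, here is a minimal sketch. It assumes each library item already has a list of the DOIs it cites, which Zotero does not store today; the `library` mapping and the function name are made up for illustration.

```python
from collections import Counter

def missing_but_often_cited(library, min_citations=2):
    """Return (doi, count) pairs for works cited by library items but absent from the library.

    `library` is a hypothetical mapping: DOI of a library item -> list of DOIs it cites.
    """
    counts = Counter()
    for cited in library.values():
        # set() counts each citing paper at most once per reference;
        # difference(library) drops references that are already library items.
        counts.update(set(cited).difference(library))
    return [(doi, n) for doi, n in counts.most_common() if n >= min_citations]

# Toy data with placeholder DOIs
library = {
    "10.1000/a": ["10.1000/x", "10.1000/b"],
    "10.1000/b": ["10.1000/x", "10.1000/y"],
    "10.1000/c": ["10.1000/x", "10.1000/y", "10.1000/a"],
}
print(missing_but_often_cited(library))
```

Here "10.1000/x" and "10.1000/y" are reported because several library items cite them, while "10.1000/a" and "10.1000/b" are skipped because they are already in the library.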
Apparently the Better BibTeX for Zotero plugin already has a "graph translator", but it still relies on the graph information being added manually to the extra field (https://forums.zotero.org/discussion/comment/332434/#Comment_332434) in the form "cites: citationkey", which limits citations to works already present in the library. Personally I would prefer to import ALL references (what form would be most suitable? DOI?) during import (which field should they be stored in?), not just the references already present in the library.
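As a rough illustration of what reading such entries back out could look like, here is a sketch that pulls "cites:" lines from the text of an extra field. The one-entry-per-line layout and the comma-separated variant are my assumptions, not a confirmed description of what Better BibTeX accepts, and `parse_cites` is a made-up helper name.

```python
import re

# Assumed layout: one "cites: <identifier>" entry per line of the extra field,
# optionally with several comma-separated identifiers on the same line.
CITES_LINE = re.compile(r"^[ \t]*cites:[ \t]*(.+?)[ \t]*$", re.IGNORECASE | re.MULTILINE)

def parse_cites(extra):
    """Return every identifier (citation key or DOI) listed on a 'cites:' line."""
    refs = []
    for match in CITES_LINE.finditer(extra):
        for part in match.group(1).split(","):
            part = part.strip()
            if part:
                refs.append(part)
    return refs

# Placeholder keys and DOI for illustration
extra = "PMID: 1\ncites: smith2010\nCites: 10.1000/x, doe2015\n"
print(parse_cites(extra))
```

Storing DOIs instead of citation keys in the same place would let the list point at works that are not in the library yet, at the cost of cluttering the extra field.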
Personally, I do have some coding skills and would feel comfortable creating a WebExtension-like plugin, but I would prefer an open discussion about how this feature could best be implemented.
https://forums.zotero.org/discussion/77659/forward-and-reverse-citations-lookups?new=1
As I mention there, I think a basic approach could be to use services like Google Scholar and CiteSeerX, which already parse and extract the data for a paper's local citation graph. For my use case I think it's also important to be able to limit the imported papers according to research questions that I pose during active review and annotation of the paper.
Citation Star Graphs
The citation graph structure I'm dealing with is just a single central node connected to a handful of "forward" and "reverse" nodes which I find interesting, without any other structure necessarily inferred between these nodes. This is like an attention-focused microscope view of the citation graph, something that we might call a citation star. Multiple stars can be assembled into a network, but they don't have to be (they're a useful research tool by themselves).
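A citation star could be represented roughly like this. It's only a sketch: the class and field names are made up, and I'm assuming "forward" means papers that cite the central paper and "reverse" means papers the central paper cites.

```python
from dataclasses import dataclass, field

@dataclass
class CitationStar:
    center: str                                # identifier of the central paper
    forward: set = field(default_factory=set)  # assumed: papers that cite the center
    reverse: set = field(default_factory=set)  # assumed: papers the center cites

    def edges(self):
        """Directed (citing, cited) pairs; no edges are inferred between the outer nodes."""
        return ({(f, self.center) for f in self.forward}
                | {(self.center, r) for r in self.reverse})

def merge_stars(stars):
    """Assemble several stars into one edge set; shared edges collapse automatically."""
    merged = set()
    for star in stars:
        merged |= star.edges()
    return merged

star_a = CitationStar("a", forward={"c"}, reverse={"b"})
star_b = CitationStar("b", forward={"a"}, reverse={"d"})
print(sorted(merge_stars([star_a, star_b])))
```

Each star stays usable on its own, and merging is just a set union, so building the larger network is optional.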
Citation networks
What you suggest is a bit more ambitious, and could lend itself to PageRank and other citation graph mining techniques: we import a large citation network, look for key papers (possibly in a very automated way), and focus our attention on the papers that seem relevant to the structure of that graph.
I think this could be done, with a few caveats:
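For a sense of scale, a basic PageRank over a citation graph is not much code. The sketch below is a plain power iteration in pure Python, assuming the graph fits in memory as a dict of paper -> list of cited papers; the damping factor of 0.85 is the conventional default, and the toy graph is made up.

```python
def pagerank(cites, damping=0.85, iterations=50):
    """Power-iteration PageRank. `cites` maps each paper to the papers it cites."""
    nodes = set(cites) | {ref for refs in cites.values() for ref in refs}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Papers with no outgoing references spread their rank evenly over all nodes.
        dangling = sum(rank[p] for p in nodes if not cites.get(p))
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for paper, refs in cites.items():
            for ref in refs:
                # Each paper passes an equal share of its rank to every work it cites.
                new_rank[ref] += damping * rank[paper] / len(refs)
        rank = new_rank
    return rank

# Toy graph: "x" is cited by everything else and cites nothing itself
cites = {"a": ["x", "b"], "b": ["x"], "c": ["x", "a"]}
ranks = pagerank(cites)
print(max(ranks, key=ranks.get))
```

On a multi-gigabyte dataset a sparse-matrix or graph-library implementation would be needed instead, but the ranking idea is the same.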
1. We'd have to do some complex querying and data fusion to build a larger scale network
2. We'd need to add a custom table or database for this structure
3. It might be a challenge to implement the data mining algorithms
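On caveat 2, the custom table could be as small as one edge list. This is a sketch using Python's built-in sqlite3 module against a separate database; the table and column names are invented here and are not part of Zotero's own schema.

```python
import sqlite3

def open_graph(path=":memory:"):
    """Open (or create) a standalone database holding one citation edge per row."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS citation_edges (
               citing TEXT NOT NULL,
               cited  TEXT NOT NULL,
               PRIMARY KEY (citing, cited)
           )"""
    )
    return conn

def add_edges(conn, edges):
    # INSERT OR IGNORE makes re-importing the same reference list harmless.
    conn.executemany("INSERT OR IGNORE INTO citation_edges VALUES (?, ?)", edges)
    conn.commit()

def most_cited(conn, limit=10):
    """Return (cited, count) rows, most-cited first, ties broken alphabetically."""
    return conn.execute(
        "SELECT cited, COUNT(*) AS n FROM citation_edges "
        "GROUP BY cited ORDER BY n DESC, cited LIMIT ?",
        (limit,),
    ).fetchall()

conn = open_graph()
add_edges(conn, [("a", "x"), ("b", "x"), ("a", "x"), ("b", "y")])
print(most_cited(conn))
```

Keeping the edges in their own file would avoid touching Zotero's database at all, which seems safer for a plugin-style experiment.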
Two Approaches for problem 1: Find one (or more) large bibliographic dataset(s) - or - build a Google Scholar crawler
I would be very much in favor of this kind of turbocharged citation network analysis, and will take some time to review the data sources we might use here. Ideally we would find entire multi-gigabyte datasets representing large fields with many subfields, and work directly with those (intimidating as that may initially seem). Barring that, it may be feasible to "crawl" the citation graph, using the above-mentioned services, and build a local graph around a collection of "seed" papers.
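The crawl around seed papers could be sketched as a depth-limited breadth-first search. Everything service-specific is left out: `fetch_references` is a hypothetical callable that would wrap whichever service we end up using, and here it is faked with a dict so the sketch runs offline.

```python
from collections import deque

def crawl(seeds, fetch_references, max_depth=2):
    """Collect (citing, cited) edges reachable from the seeds within max_depth hops.

    `fetch_references(paper_id)` is a hypothetical lookup returning the ids a paper cites.
    """
    edges = set()
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        paper, depth = queue.popleft()
        if depth >= max_depth:
            continue  # keep the node, but do not expand it any further
        for ref in fetch_references(paper):
            edges.add((paper, ref))
            if ref not in seen:  # each paper is queried at most once
                seen.add(ref)
                queue.append((ref, depth + 1))
    return edges

# Stand-in for a real service: a fixed, made-up reference table
fake_service = {"s": ["a", "b"], "a": ["c"], "c": ["d"]}
local_graph = crawl(["s"], lambda paper: fake_service.get(paper, []), max_depth=2)
print(sorted(local_graph))
```

The depth limit and the `seen` set are what keep the number of requests bounded, which matters a lot if the backing service rate-limits or blocks crawlers.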
Dev Mentorship
It's important not to get stuck, and it would be great if we could find a dev who will help sponsor this project with ideas and an occasional solution. If any devs reading this feel like this is not an ill-fated disaster waiting to happen, and would like to spend some of their precious time guiding this project, that would drastically increase the project's chances of success!
Conclusion
I wanna see Zotero be able to do space age citation network analysis, and that might mean pulling out the stops and doing some kind of heavy engineering. However, I'm not /entirely/ sure what the scope should be, and keeping it simple will mean a higher chance of success. Let's brainstorm more about objectives, data services, and algorithms/techniques that can help us meet those objectives. And let's recruit a dev who can give us their blessing and guidance!