Idea? Using CSL Citation styles to output biblio in LaTeX
I am new to Zotero. The program looks great, although it will need some more development to be really useful to me. I am a LaTeX/BibTeX user, but also a keen programmer, so I could help develop some stuff.
Not quite sure if I am in the right discussion here. That might fit better in the plugin discussions.
Anyway, here is my idea: Currently, if you want to use Zotero with LaTeX, you need to export your bibliography to a BibTeX file, and then you need to use BibTeX and one of the BibTeX citation styles (the so-called .BST files). The sad thing about that approach is that you don't take advantage of all the nice CSL files developed for Zotero.
Could we instead do something like this: write a program, let's call it "bibzot", that would replace BibTeX and use the Zotero database and CSL files instead.
Bibzot would start exactly like BibTeX, i.e., read the .AUX file generated from the LaTeX document, and from there extract the name of the citation style (the name of a CSL style instead of a BST file) and the citations we want (all the \cite{..} labels). We will need to find a way to map the citation keys used in LaTeX to the corresponding Zotero entries somehow. From there, bibzot would depart from BibTeX: instead of reading a BibTeX file and running it through a BibTeX style file, it would get the bibliographic entries from the Zotero database and format them through one of the standard CSL files.
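For what it's worth, the .AUX-reading step is easy to sketch. BibTeX itself picks up \citation{...} and \bibstyle{...} lines from the .AUX file, so a bibzot prototype could do the same. The name parse_aux is made up for illustration, and treating the \bibstyle argument as the name of a CSL style is the assumption of this proposal, not something an existing tool does.

```python
import re

# Matches the two .aux commands this sketch cares about.
AUX_COMMAND = re.compile(r"\\(citation|bibstyle)\{([^}]*)\}")


def parse_aux(aux_text):
    """Return (style_name, citation_keys) found in the text of an .aux file.

    Keys are returned in order of first appearance, without duplicates.
    The style name is None if no \\bibstyle line is present.
    """
    style = None
    keys = []
    for name, arg in AUX_COMMAND.findall(aux_text):
        if name == "bibstyle":
            # Assumption: bibzot reads this as a CSL style name, not a .bst name.
            style = arg.strip()
        else:
            # One \citation line may carry several comma-separated keys.
            for key in arg.split(","):
                key = key.strip()
                if key and key not in keys:
                    keys.append(key)
    return style, keys
```

In practice one would call it as parse_aux(open("paper.aux").read()), where paper.aux is just a placeholder file name.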
From there, how do we go about converting that output to LaTeX code? I don't quite know what CSL output actually looks like, I haven't looked at that yet, but if it is XML or HTML, then converting to LaTeX should be relatively straightforward. We then write the result into the .BBL file that LaTeX wants. Et voilà!
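To make the conversion step concrete, here is a rough sketch under a big assumption: that the CSL processor hands back each formatted entry as an HTML fragment using only simple inline tags such as <i>, <b> and <sup>. I have not checked that this is what it really produces. The names html_to_latex and make_bbl are hypothetical, and the sketch only covers numeric-style \bibitem entries, not the author-year labels mentioned further down.

```python
from html.parser import HTMLParser

# Inline HTML tags mapped to the LaTeX command that opens the same markup.
TAG_TO_LATEX = {
    "i": r"\textit{",
    "em": r"\emph{",
    "b": r"\textbf{",
    "strong": r"\textbf{",
    "sup": r"\textsuperscript{",
}

# Characters that LaTeX treats specially and that must be escaped in text.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#", "_": r"\_",
    "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def escape_latex(text):
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


class HtmlToLatex(HTMLParser):
    """Collects LaTeX output while walking a small HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in TAG_TO_LATEX:
            self.parts.append(TAG_TO_LATEX[tag])

    def handle_endtag(self, tag):
        if tag in TAG_TO_LATEX:
            self.parts.append("}")

    def handle_data(self, data):
        self.parts.append(escape_latex(data))


def html_to_latex(fragment):
    parser = HtmlToLatex()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts)


def make_bbl(entries):
    """Build .bbl text from (citation_key, html_fragment) pairs."""
    # thebibliography takes a "widest label" argument; a run of 9s with as
    # many digits as the entry count is a common stand-in for numeric labels.
    widest = "9" * len(str(len(entries)))
    lines = [r"\begin{thebibliography}{%s}" % widest, ""]
    for key, fragment in entries:
        lines.append(r"\bibitem{%s}" % key)
        lines.append(html_to_latex(fragment))
        lines.append("")
    lines.append(r"\end{thebibliography}")
    return "\n".join(lines) + "\n"
```

Unknown tags are dropped and their text is kept, which seems like the safer default for a first attempt.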
What I am wondering is, first, is that a silly idea? Am I trying to make something simple in a very complicated way? Are other people interested in that? In my view, it would have some advantages:
- We take advantage of all the CSL files that have been developed
- We don't have to re-export the Zotero database to a BibTeX file each time we add something
- Different CSL files might use, say, full journal titles rather than abbreviated titles. We would not need to export the Zotero database in two different ways to get BibTeX doing that right (although we could get that running using macros for the journal names in the BibTeX export).
- BibTeX essentially hasn't evolved in the last 10 or 15 years. It will need some more powerful replacement one day (although it has proven remarkably robust in handling many complicated cases). Zotero with bibzot could be that one.
There are some issues obviously:
- How to map the LaTeX citation keys to the Zotero entries?
- Can we handle every case that way? I know that some Author-Year citation styles lead to quite cumbersome LaTeX code. Is that do-able?
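On the first issue, one possible scheme (only a sketch, certainly not the only option) is to derive each key from the first author's surname and the year, and add a numeric suffix on collisions. The item layout below (an "id", a "creators" list with "lastName", and a free-text "date") is an assumption made for illustration; I have not checked it against the actual Zotero database schema.

```python
import re
import unicodedata


def base_key(item):
    """Build a key like 'knuth1984' from a dict describing one item."""
    creators = item.get("creators") or []
    surname = creators[0]["lastName"] if creators else "anon"
    # Strip accents so the key stays plain ASCII (Müller -> muller).
    ascii_surname = (
        unicodedata.normalize("NFKD", surname)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    surname_part = re.sub(r"[^a-z]", "", ascii_surname.lower()) or "anon"
    year = re.search(r"\d{4}", item.get("date", ""))
    return surname_part + (year.group(0) if year else "nd")


def build_key_map(items):
    """Map citation keys to item ids, adding '-2', '-3', ... on collisions."""
    key_map = {}
    # Process in id order so that, if ids only grow, adding a new item
    # never changes the key of an older one.
    for item in sorted(items, key=lambda it: it["id"]):
        key = base_key(item)
        candidate, n = key, 2
        while candidate in key_map:
            candidate = "%s-%d" % (key, n)
            n += 1
        key_map[candidate] = item["id"]
    return key_map
```

The weak point is that keys change if someone edits the author or date of an item, so storing the key explicitly with each entry might turn out to be more robust.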
As I say, I am not familiar at all with the Zotero code or the CSL file syntax. Could the above proposal actually be implemented?
Thanks for any comments, thoughts, encouragements, or whatever help I may need to start implementing that idea.
There are certainly "messier" styles that are not yet capable of being written in CSL that would require you to either have clever LaTeX macros (as BibLaTeX uses) or would otherwise require multiple iterations of LaTeXing/citation writing. But there's no point in worrying about that now. Given enough monkeys at typewriters, why not? The Zotero & citeproc-js codebases are fairly nice. But, as above, you don't necessarily have to start with them to get what you want.
If you do start with them, they are nicely modularized & you can more or less ignore large swaths of code (XML parsing, etc.) and focus mostly on the output classes.
http://groups.google.com/group/comp.text.tex/browse_thread/thread/62d240a7ee7b7cb8/fe0cf6b42b5299d9?#
Zotero/CSL output is getting more accurate all the time, but there are still plenty of items that (quite reasonably) are handled by touching up the document after Zotero has done its thing with the citations. To do that kind of touch-up work on references with traditional BibTeX processing, you would be digging into (if memory serves me correctly) the *.bbl file, which is not really meant to be read or edited by humans. Biblatex (which seems to represent the favored approach among the LaTeX maintainers) seems designed to write directly from the reference database to *.dvi, so post-processing edits are completely out of the question.
Because of this need for total accuracy, I think you are unlikely to see much enthusiasm for CSL integration among the LaTeX maintainers. Certainly it would not be an attractive or rewarding task to undertake without a good advance base of support in the LaTeX community.
(EDIT: see my retraction of these reservations down thread)
PS: I don't want to get into a flame war about this, it's not that big a deal. If someone (another monkey at the typewriter, to borrow noksagt's phrase) undertakes tighter integration between CSL and LaTeX, that would certainly be a great thing. My point is only that CSL is at its LaTeX 2.09 stage of development, if you know what I mean.
To anyone interested in pursuing such a project, the citeproc-js test suite and sources are on BitBucket, the processor manual is available online, and specific questions can be directed to the integrators' discussion list.