Closing the circle: scraping CSL-generated documents

fbennett · May 25, 2010

With the appearance of standalone implementations of CSL, it's likely that we'll eventually start seeing CSL-generated documents online, produced by authoring plugins for WordPress, Google Docs, Wave or the like.

Currently available CSL processors do not support it yet, but they will eventually be able to embed citation data for referenced works (probably as RDFa), as well as metadata for the document itself. This will blur the distinction that Zotero translators currently draw between "single" and "multiple" targets.
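
To make the idea concrete, here is a rough sketch of what scraping such a document might involve. Everything in it is an assumption: no processor emits this markup today, the Dublin Core property names and the `about`/`property` layout are just one plausible RDFa encoding, the sample entry is made up, and `RDFaScraper` and `scrape` are hypothetical names. It only handles flat entries and does not reset the subject when an element closes.

```python
from html.parser import HTMLParser

# Hypothetical processor output: one bibliography entry carrying RDFa.
# The vocabulary and the entry itself are invented for illustration.
SAMPLE = """
<div class="csl-entry" about="#ref1">
  <span property="dc:creator">Doe, Jane</span>.
  <span property="dc:title">An Example Book</span>.
  <span property="dc:date">2009</span>.
</div>
"""


class RDFaScraper(HTMLParser):
    """Collect property/value pairs, grouped by the enclosing `about` subject."""

    def __init__(self):
        super().__init__()
        self.items = {}        # subject -> {property: text}
        self._subject = None   # most recent `about` value seen
        self._property = None  # property whose text we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self._subject = attrs["about"]
            self.items.setdefault(self._subject, {})
        if "property" in attrs and self._subject is not None:
            self._property = attrs["property"]

    def handle_data(self, data):
        # Only the text directly inside a property-bearing element is kept.
        if self._property is not None:
            self.items[self._subject][self._property] = data.strip()
            self._property = None

    def handle_endtag(self, tag):
        # An empty property element leaves nothing pending.
        self._property = None


def scrape(html):
    parser = RDFaScraper()
    parser.feed(html)
    parser.close()
    return parser.items
```

Calling `scrape(SAMPLE)` yields one record keyed by `#ref1`. A page with many such entries plus its own metadata would return several records at once, which is exactly the single-versus-multiple ambiguity described above.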

This might be a good time to start considering how rich documents of this kind should be handled in the Zotero translator UI.

ajlyon · May 25, 2010

It would be nice to add RDFa support as a first step.

As for the rest of this, I have floated the idea of unifying translators several times, so that all embedded-data translators combine their results into a single list of importable items. I will likely do this myself in mid-to-late June if no one gets to it before me. I imagine including both the document itself and the works it cites in the selection dialog box; it would be nice for Zotero to provide a means of signifying the distinction, but I could use [Citation] as a prefix for the latter in the meantime.
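
A minimal sketch of the combined list, in Python rather than translator JavaScript and not tied to any real Zotero API. The `Candidate` type, the `build_selection_list` function and the translator names are all hypothetical, and two choices go beyond what is proposed above and are only assumptions: duplicates are dropped by case-insensitive title, and the document itself is listed before its citations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    title: str
    is_citation: bool  # True for a cited work, False for the document itself
    source: str        # name of the translator that found it (hypothetical)


def build_selection_list(results_per_translator):
    """Merge the results of several translators into one list of labels."""
    seen = set()
    merged = []
    for results in results_per_translator:
        for cand in results:
            # Assumption: same title and same kind means the same item.
            key = (cand.title.casefold(), cand.is_citation)
            if key in seen:
                continue
            seen.add(key)
            merged.append(cand)
    # Stable sort: the document first, then citations in discovery order.
    merged.sort(key=lambda cand: cand.is_citation)
    return [
        ("[Citation] " if cand.is_citation else "") + cand.title
        for cand in merged
    ]
```

For example, if one translator reports the page and one cited book, and a second translator reports the same book plus another paper, the dialog would show the page first, followed by two entries prefixed with [Citation].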