Integration with Word Processors via RTF Scan

iuri · July 23, 2008

Hi,

If someone doesn't want to use Word or OO, s/he doesn't have a way to _really_ integrate Zotero and his/her word processor, right?

Implementing such integrations for all remaining word processors (to name a few: WordPerfect, iWork, Google Docs, etc.) would be impossible.

So, wouldn't it be easier to implement some form of RTF scan, such as the one found in Endnote, that scans for temporary marks (like the { }'s in Endnote), changes them to the citation style of the user's choice, and generates the reference list at the end?

Best,

Iuri.

tylerbickford · October 2, 2008

I'd like to second this request.

Even using Word I'd prefer to use temporary citations like Endnote rather than clutter up my working documents with field codes. Seems like scanning an RTF document for temporary citation marks would be relatively trivial, and would open Zotero up to users who need some in-text citation capabilities and want to use software besides Word or OO.

arggem · October 2, 2008

It appears to be not quite as trivial as one might like. But it is on the radar. (Last part of the post addresses this).

bdarcus · October 2, 2008

@arggem: I'm not sure you're talking about the same thing. This should be reasonably easy, actually. You're just effectively doing a glorified search-and-replace operation on a file. It's just that you do it after you're done editing the document; kind of like how citeproc-hs works with markdown documents.
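To make the "glorified search-and-replace" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `{Family, Year}` marker shape, the `LIBRARY` dictionary standing in for a reference database, and the `scan` function name are hypothetical, not anything Zotero or Endnote actually provides. It also works on plain text only; in real RTF, braces delimit groups and literal braces are escaped as `\{` and `\}`, so a real scanner would have to account for that.

```python
import re

# Hypothetical in-memory library; a real tool would query the reference
# manager's database instead.
LIBRARY = {
    ("Smith", "2008"): "Smith, J. (2008). An Example Title. Example Press.",
    ("Jones", "2005"): "Jones, A. (2005). Another Example. Sample Journal.",
}

# Assumed temporary marker shape: {Family, Year}
MARKER = re.compile(r"\{([^{},]+),\s*(\d{4})\}")


def scan(text, library):
    """Replace temporary markers with formatted citations and collect
    a reference list of the items that were actually cited."""
    cited = []

    def replace(match):
        key = (match.group(1).strip(), match.group(2))
        if key not in library:
            # Leave unresolved markers untouched so the author can fix them.
            return match.group(0)
        if key not in cited:
            cited.append(key)
        return f"({key[0]} {key[1]})"

    body = MARKER.sub(replace, text)
    bibliography = sorted(library[key] for key in cited)
    return body, bibliography


if __name__ == "__main__":
    body, bibliography = scan("As argued before {Smith, 2008}.", LIBRARY)
    print(body)
    print("\n".join(bibliography))
```

Because the scan runs only after editing is finished, the working document never contains anything but the plain-text markers.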

arggem · October 2, 2008

Oh. OK. Ooops.

tylerbickford · October 6, 2008

I'm going to keep banging away at this, because I think zotero has a lot to offer, and I'm really surprised at how it doesn't integrate with my workflow.

In-text citations are the central functionality I look for in a reference manager, but it seems to be Zotero's least adaptable feature. Browsing, importing, tagging, and organizing are icing on the cake -- a reference manager is there to make sure that every in-text citation is included in the works-cited list, and that they are all properly formatted (things like repeated years for a single author within one document are automatically dealt with as 2000a/2000b, for instance). As an editor of a journal, over and over again I would find that authors had included an in-text citation as they composed, but missed it when they (manually) compiled their bibliographies. Or after compiling a bibliography, they edited a section including a couple citations out, so that now there were uncited references in the bibliography. It calls for a database to keep track of everything.

The Zotero site uses the word "citation" everywhere, and I've been searching the site and documentation desperately to find out where all these wonderful citation capabilities are. (It's semantics, and maybe I've got it wrong, but I always distinguished between a "citation", which is the moment in a text when you refer to something, and a reference, which is the thing being referred to.) But it seems like Zotero has all but ignored the process of actually citing references while composing, limiting its capacity there entirely to Word and OO integration (which all other reference management software recognizes is too limiting).

Zotero seems to be geared toward a generation of writers who are not dependent on clunky word processors. I compose using Scrivener, and I keep notes in VoodooPad. Both work in RTF, and I wouldn't use them otherwise, because I need good cross-operability. There's no reason, when I'm taking notes in a wiki, that I shouldn't be able to place a citation that will stick around through the life of the text as it moves from one editor to another. That's what I do in Endnote, and it works fine.

I suppose I could export a bibliography from Zotero to Endnote each time I need to cite something new, but that's clunky and difficult, and what I really need is a reference manager that can integrate easily into my writing process. Back when I was working with LaTeX, I fell in love with BibDesk, because it integrated with text editors to auto-complete citations as you typed them. In my current process I have to actually open Endnote and mouse to a reference to copy its citation, which I hate, but I'm willing to live with (it's at least faster, in terms of the computer's responsiveness, than using the Word plugin).

Anyway, it seems to me that the solution to this problem is for Zotero to use some sort of plain-text codes for adding to text or RTF documents that can be scanned at a later stage to compile a bibliography. Does anyone else, besides iuri, desire something like this? For me it's a deal breaker, right from the start. How do other people incorporate Zotero into their composing process? Am I missing something?

Zotero development seems to be geared toward Web 2.0 capabilities, which is great. But it's moving too fast. I need the program to do basic reference manager tasks before I can start using it to organize my references.

bdarcus · October 6, 2008

Pandoc, a markdown-based document conversion tool, has recently added support for citation markup and processing (using the same CSL language as Zotero). It might be worthwhile to consider supporting that same markup in Zotero.

mark · November 17, 2008

I just want to add another vote to this. While in my workflow, reference management is more than just 'icing on the cake', I fully agree that having a couple of simple plain-text codes for in-text citation is an essential feature for cross-operability, and that is what Zotero should champion.

I'm not a programmer, but this sounds relatively easy to implement: essentially just a full-text scan looking up entries from the zotero db (the Word/OO plugins already do that) and an extra interface for resolving ambiguous in-text refs. The Pandoc citation markup mentioned by Bruce looks perfectly OK for this use.

This kind of stuff would also be of great help to users working in Google Docs and other environments missing a scripting layer. What is needed for such environments would be a simple Zotero extension that can scan any piece of text you feed it, look up the refs in the database, and insert references and a bibliography subject to some customization. Such plugins can only be developed, however, if a simple standard for in-text citation markup is implemented sooner rather than later.

So I'll join tylerbickford in banging away at this point.
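The lookup step described in this post, including the "extra interface for resolving ambiguous in-text refs", could start from something like the sketch below. It is only an illustration under stated assumptions: `ITEMS` is a made-up list standing in for the Zotero database, and `resolve` is a hypothetical helper, not an existing Zotero function. It classifies each in-text reference as resolved, missing, or ambiguous, so that only the ambiguous ones would need to be shown to the user.

```python
# Hypothetical records; a real scanner would read these from the database.
ITEMS = [
    {"family": "Smith", "year": "2008", "title": "First Example"},
    {"family": "Smith", "year": "2008", "title": "Second Example"},
    {"family": "Jones", "year": "2005", "title": "Another Example"},
]


def resolve(family, year, items):
    """Return a (status, matches) pair for one in-text reference.

    status is "resolved" for exactly one match, "missing" for none,
    and "ambiguous" when several items share the same author and year.
    """
    matches = [
        item
        for item in items
        if item["family"].lower() == family.lower() and item["year"] == year
    ]
    if len(matches) == 1:
        return "resolved", matches
    if not matches:
        return "missing", matches
    return "ambiguous", matches


if __name__ == "__main__":
    for family, year in [("Jones", "2005"), ("Smith", "2008"), ("Lee", "1999")]:
        status, matches = resolve(family, year, ITEMS)
        print(family, year, status, [m["title"] for m in matches])
```

Ambiguous matches are also where a style would assign 2008a/2008b suffixes, which is one reason a plain author-year marker is not enough on its own as an identifier.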

bdarcus · November 17, 2008

"This kind of stuff would also be of great help to users working in Google Docs and other environments missing a scripting layer. What is needed for such environments would be a simple Zotero extension that can scan any piece of text you feed it, looks up the refs in the database, and inserts references and a bibliography subject to some customization."

The in-text markup doesn't really help make this any easier; the hard part is what you ask for here.

It's actually easier than this. For examples like Google Docs, you would just need to be able to scan and update an exported file. So no plugins.

mark · November 17, 2008

I would think having an in-text citation markup standard would make the processing part a bit easier, because then the scanner at least knows what kind of stuff to look for.

bdarcus · November 17, 2008

Well, you need the markup, but you don't need the integrated plugin. That adds another order of complexity.

mark · November 17, 2008

Agreed. So the residue of this discussion is that it would be good for a variety of reasons if Zotero supported some form of simple in-text citation markup.

tylerbickford · February 15, 2009

I'm not a programmer either. What could I do to help get this ball rolling?

fbennett · February 16, 2009

This isn't really one for Zotero specifically is it? That is, the actual tool that does the inserting would not need to be embedded in Firefox as a plugin or extension. It would only need access to and be able to marshal the contents of the database. The tools mentioned by Bruce can probably be adapted for that purpose. What you would want to lobby Zotero for, I think, is just exposure of human-friendly unique keys for each item somewhere in the Zotero interface. There have been requests from LaTeX users for unique keys, so it's in the hopper for eventually, at least. The rest, though, can be run as a (nearly) completely separate project. Someone would just need to do the programming -- so step two would probably be to show kindness toward a nearby geek.

bdarcus · February 16, 2009

Yeah, FWIW, the author of citeproc-hs (which is what provides CSL support in pandoc) has expressed interest in adding this sort of thing to the OpenDocument format. So it's certainly possible. But it would require changes in Zotero.

As I've said many times, though, I'm skeptical BibTeX keys make much sense today when we're talking about the web, multi-user social-networking, etc. We might need a better way to associate a citation with its source.

migugg · March 11, 2009

I have a suggestion that would solve the problems, though it may be impossible to implement. It would combine a BibTeX key with a database on Zotero's servers:
After each entry into Zotero, Zotero would connect to a database stored on Zotero's servers. This database contains all the entries from every Zotero database that is connected to Zotero's servers. It does not contain the full entries, only those fields that are needed for bibliographic entries (title, authors, year, journal, place, publisher). The datasets also contain a BibTeX key (or something similar) for each entry; to be unique, such a key would possibly contain some number after name and date (e.g. smith.2008.346289).
If Zotero recognized the entry (that is, if it matched an entry in the server database), the local Zotero would just import the BibTeX key from the server database. If the local Zotero entry differed from the server entries, it would automatically be added to the server database and given a unique BibTeX key.
This way, each Zotero entry would have its unique BibTeX key AND could be exchanged from one database to another. This system would furthermore let me import new items from this database, instead of from Google Scholar or WorldCat or whatever.
However, the main problem would possibly be the immense size of the database on Zotero's servers. The advantage for Zotero would be that it would possibly produce, in a very short time span, a kind of WorldCat that is entirely user-generated and that could contain a much wider array of sources than, for example, Google Scholar.
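The match-or-assign step of this proposal can be sketched as follows. This is a rough illustration only: `KeyRegistry` is a hypothetical class, an in-memory dictionary stands in for the central server database, and the matching rule (normalized family name, year, and title) and the sequential six-digit suffix are assumptions, since the post does not specify either.

```python
import itertools


class KeyRegistry:
    """In-memory stand-in for the proposed central key database."""

    def __init__(self):
        self._keys = {}
        self._counter = itertools.count(1)

    def key_for(self, family, year, title):
        """Return the existing key for a matching entry, or assign a new one."""
        # Assumed matching rule: case-insensitive family name, year as text,
        # and title with case and runs of whitespace normalized.
        fingerprint = (
            family.strip().lower(),
            str(year),
            " ".join(title.lower().split()),
        )
        if fingerprint not in self._keys:
            number = next(self._counter)
            self._keys[fingerprint] = f"{fingerprint[0]}.{year}.{number:06d}"
        return self._keys[fingerprint]


if __name__ == "__main__":
    registry = KeyRegistry()
    print(registry.key_for("Smith", 2008, "An Example Title"))
    print(registry.key_for("smith", "2008", "an  example title"))
    print(registry.key_for("Smith", 2008, "A Different Title"))
```

The sketch makes the weak point visible: two users who enter the same work with slightly different titles would receive different keys unless the matching rule is much more forgiving than this one.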

urlwolf · March 11, 2009

+1. This is a key feature that would improve zotero's interoperability.
If there's a concerted effort to make this happen, I may be able to help. I need this badly. Zotero is mostly javascript, right?

bdarcus · March 11, 2009

This problem is a big one, and I really dislike any solution that relies on a central hub, whether it's zotero, or anything else (and bibtex keys aren't going to work).

I blogged about some of this recently, BTW.

migugg · March 11, 2009

Well, I dislike central hub solutions too, but in your blog post I do not see another solution either. Maybe the idea would be to solve it somehow like BitTorrent trackers do. But how to do this in detail I cannot imagine at the moment.

bdarcus · March 11, 2009

The solution for XML document formats like ODF and OOXML is to a) use standard fields, b) embed the citation source metadata in the document package, and c) use standard global identifiers (URIs) to link them. You then process the document, with no connection to the bib app.

That doesn't really solve this problem for other formats (RTF and such) though. For that, some variant of the above approach would work, but I think without an obvious URI, it'd have to be an automated key so that the data can be automatically extracted and used to match records; maybe [first author family name:year:title slug], with some defined algorithm for the slug.

It's an ugly problem though.
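One possible reading of the [first author family name:year:title slug] key, sketched in Python. The slug algorithm here is an assumption made up for illustration (strip diacritics, lowercase, keep the first four alphanumeric words, join with hyphens); the post only says that some defined algorithm would be needed, and the function names `slugify` and `citation_key` are hypothetical.

```python
import re
import unicodedata


def slugify(text, max_words=4):
    """Lowercase ASCII slug built from the first few alphanumeric words."""
    # Decompose accented characters, then drop anything outside ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    words = re.findall(r"[a-z0-9]+", ascii_text.lower())
    return "-".join(words[:max_words])


def citation_key(family, year, title):
    """Build a family:year:title-slug key that is derived only from the data."""
    return f"{slugify(family)}:{year}:{slugify(title)}"


if __name__ == "__main__":
    print(citation_key("Müller", 2008, "The Ugly Problem of Citation Keys"))
```

Because the key is computed from the record itself, two independent libraries would generate the same key for the same record without a central hub, provided they agree on the algorithm and on the underlying metadata.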