General Principles

Hi,

Rather than ask for a specific feature, I'd like to discuss general principles in software design that might make Zotero more useful. Zotero is a terrific tool, and the fact that its developers participate in forums like this is a major plus. That's why I'm posting this here instead of one of the forums for some of the other tools mentioned below.

Over the last year or so I've come to rely heavily on several tools. They all do similar or related things, but no two of them have identical functionality. The big problem I see is that, because their functionality overlaps, my research is being spread across different tools and their repositories. Tools that are designed to help keep track of research thus have the dangerous potential of doing the opposite.

The specific tools I use are Zotero (for citations, bookmarking, and note taking), EndNote (for citations and bibliographies), OpenOffice.org Writer (for note taking), and diigo (for bookmarking and annotations). I find I have to use both Zotero and EndNote because the latter handles lots of references (I have about 20,000) very well and because it integrates much better with my university's reference databases like CSA, EBSCO, or ScienceDirect, as well as with publishers' online databases and notification services like Blackwell Synergy. In general, I find EndNote much more "lightweight" (i.e., quicker to use and more responsive) than Zotero, but its integration with ordinary web documents and its note-taking features are far weaker. I say "ordinary web documents" because most of the time with PDF files and other even slightly out-of-the-ordinary web material (e.g., any of the U.S. Census Bureau's surveys), I have to enter the citation information manually. I use OpenOffice.org Writer to take notes involving mathematics, since it's much better for this than Zotero (or Word or EndNote). I use diigo to annotate and comment on web pages, including online PDF documents. None of the other tools comes close on this function.

One thing none of these packages does well is deal with web pages being displayed under a parent page. For example, if I use LexisNexis to retrieve newspaper articles, they appear as PDF files in a frame (I think it's a frame). Neither Zotero nor diigo gets the correct citation information, and neither EndNote nor OpenOffice.org Writer is designed even to try. I have to enter the bibliographic information manually, and I'm not even sure the URL will be reliable in Zotero or diigo since it points to the LexisNexis interface rather than to the individual newspaper article.

In any case, because no one application has all the functionality I need, I have some items' reference information in EndNote, Zotero, or diigo, and my notes in Zotero, in EndNote, or in separate OO.o files. In addition, before using this set of tools I used Scribe and WikidPad. I've also used Vue and Mindmap to draw the logic of some things I'm researching or writing. So my notes and references are diffuse, to say the least.

I am a strong believer in the Software Tools philosophy found in Unix and the software design principles its developers advocate. In particular, the idea of having one tool do one thing well is a good one. The power of Unix is that one can then use the shell to cobble together a customized set of tools to do a more sophisticated task. The problem with the applications mentioned above is that while they all do an excellent job with slightly different tasks, their integration is very poor. Here then are some thoughts about future directions for Zotero that might help address this issue.

1) Always have command line versions of common tasks. This will make automating tasks and creating complex tasks easier, particularly with things like AutoHotkey or shell scripts.

2) Try not to reinvent the wheel. Use interfaces with common, open-source tools (e.g., using OO.o as one's note editor or linking to or importing OO.o files).

3) Where possible, use standards and document the internals of Zotero's operations (data structures, etc.).

4) Provide auxiliary tools to automate integration tasks.

5) Always consider cross-application synchronization problems to minimize their impact.

6) Generate a log, well documented and readily available, to keep track of imports/exports, etc.

7) I do not think it's possible or desirable to have an all-encompassing, multifunction tool. The pace of innovation is too fast. Instead, establish and join standards bodies for such things as citation and notes formats, database searches, graphics integration, etc.
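To make points 1 and 6 a little more concrete, here is a minimal sketch in Python of what a scriptable export command with a log might look like. Everything in it is an assumption made for illustration: the JSON input layout, the function names `to_ris`, `export`, and `main`, the `--log` option, and the default log file name are hypothetical and are not part of Zotero, EndNote, or any other tool mentioned above. The output is only a simplified subset of the RIS tag format.

```python
import argparse
import json
import logging
import sys


def to_ris(record):
    """Render one reference (a plain dict) as a simplified RIS entry."""
    lines = ["TY  - " + record.get("type", "GEN")]
    for author in record.get("authors", []):
        lines.append("AU  - " + author)
    if "title" in record:
        lines.append("TI  - " + record["title"])
    if "year" in record:
        lines.append("PY  - " + str(record["year"]))
    lines.append("ER  - ")
    return "\n".join(lines)


def export(records, out, log):
    """Write every record to `out` and note the export in `log`."""
    for record in records:
        out.write(to_ris(record) + "\n\n")
    log.info("exported %d record(s) as RIS", len(records))
    return len(records)


def main(argv=None):
    """Hypothetical command-line entry point: JSON in, RIS on stdout."""
    parser = argparse.ArgumentParser(description="Export references as RIS.")
    parser.add_argument("infile", help="JSON file holding a list of references")
    parser.add_argument("--log", default="refexport.log",
                        help="file that records each export (hypothetical name)")
    args = parser.parse_args(argv)

    # Append a timestamped line to the log for every export that is run.
    logging.basicConfig(filename=args.log, level=logging.INFO,
                        format="%(asctime)s %(message)s")
    with open(args.infile, encoding="utf-8") as fh:
        records = json.load(fh)
    export(records, sys.stdout, logging.getLogger("refexport"))
    return 0
```

Because the work is done by a plain function that reads a file and writes to standard output, a shell script or an AutoHotkey macro could chain it with other tools, and the log gives a record of what was exported and when.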

These are just some thoughts. As we look ahead to Web 2.0 for academic research, this issue is likely to increase in importance. The software we use for our research will benefit if developers explicitly add this issue to their checklist of design considerations.

Marsh Feldman
  • Interesting points, Marsh, and I generally agree with them.

    But on the other side, let me just point out a couple of things.

    First, some of the limitations you note in Zotero will get resolved. The performance problems are in part due to limitations in how Mozilla is integrating SQLite at the moment. Once they support Unicode collation at the database level, Zotero won't have to resort to JavaScript to sort the table. I do think Firefox itself is a little sluggish, so I guess that part won't get resolved (except by Mozilla).

    Second, and related, the Zotero team have been borrowing from and collaborating with other efforts. For example, I wrote the CSL language they use entirely independently of this project. They have contributed to it rather than inventing their own. The same is true of the current work on the RDF ontology. I really think that work is crucial to opening up Zotero, and scholarly workflows in general, in the way that I think you want.
  • Yeah, I didn't mean my post as a criticism, and I was sure that behind the scenes the Zotero developers were already addressing many of the issues. I just wanted to raise the issue in a more open forum, not only to voice it but also to encourage discussion that might lead to some advances in addressing the nexus of interoperability problems.

    Marsh