Primary source materials in Zotero?

I have a lot of complicated questions, but for starters I'll ask two basic ones:

Can Zotero organize and generate properly formatted citations for primary source materials like letters, interviews, and archival documents? If not, is such a capability on the collective to-do list? (Before telling me to simply create new entry templates, I suggest that you consult the Chicago manual's guidelines for citing such sources - and in particular for shortening subsequent citations).

If Zotero can handle such documents, can it import the relevant bibliographic data from online collections like the LOC's American Memory?

My underlying concern here is that historians rely primarily on unpublished sources (most of which use non-standard citation formats) and on sources not available online. I'm interested in whether a "next-generation research tool" will better enable scholars to organize off-line and unpublished sources - something that current-generation bibliographic and notetaking software simply doesn't do.
  • If you click the green button with the plus sign to add a new item, you'll find that Zotero supports letters, interviews, and archival manuscripts, among many other document types.

    Rest assured that we're well aware of the needs of historians. Zotero is developed by the Center for History and New Media, after all. We are also familiar with the Chicago Manual of Style (it's what we use in our own written research), but it's easier said than done shoehorning various new and old media formats into what is a fairly idiosyncratic and antiquated system.

    If after trying Zotero's various item types and export formats you have specific suggestions, we would welcome your comments here or on our dev list.
  • edited February 2, 2007
    I'll use a specific example:

    Let's say I cite a letter from a collection (published, archival, microfilm, whatever) and subsequently cite a different letter from the same collection. The second citation should include the full biblio information for the letter, and a shortened reference to the collection as a whole - even though it's the first citation of this particular document. Also, the bibliography should include a single reference to the collection, and not list each document separately.
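    To make the rule concrete, here is a minimal sketch of the behaviour I have in mind. It is only an illustration: the class names, the sample collection, and the output punctuation are all hypothetical, and it is neither Zotero code nor a faithful rendering of Chicago style.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Collection:
    title: str        # full collection title
    short_title: str  # shortened form used after the first mention
    repository: str


@dataclass(frozen=True)
class Letter:
    author: str
    recipient: str
    date: str
    collection: Collection


def format_notes(letters):
    """Return one note per letter, shortening the collection after its first use."""
    seen = set()
    notes = []
    for letter in letters:
        col = letter.collection
        if col in seen:
            where = col.short_title
        else:
            where = f"{col.title}, {col.repository}"
            seen.add(col)
        notes.append(f"{letter.author} to {letter.recipient}, {letter.date}, {where}.")
    return notes


def format_bibliography(letters):
    """List each cited collection once, however many letters come from it."""
    collections = {letter.collection for letter in letters}
    return sorted(f"{c.title}. {c.repository}." for c in collections)


# Hypothetical sample data, invented for illustration only.
papers = Collection("Smith Family Papers", "Smith Papers", "Example State Archives")
first = Letter("John Smith", "Mary Smith", "3 May 1861", papers)
second = Letter("Mary Smith", "John Smith", "10 May 1861", papers)

notes = format_notes([first, second])
bibliography = format_bibliography([first, second])
```

    The second note keeps the full details of the letter but shortens the collection, and the bibliography ends up with a single entry for the collection rather than one per letter.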

    This essentially requires formatting one part of the citation as a separate citation. No bibliographic software that I'm aware of can do this, so I won't be disappointed if Zotero can't do it either. But I hope that there are at least plans for addressing this issue in future versions.

    These rules may well seem arcane, but there are good reasons for most of them. I'm currently preparing a ms. for publication that cites a few dozen different documents from about four or five big collections. Including the full collection information for each document would probably add at least a page of unnecessary ink. So this is actually a good rule to have. Needless to say, editing and proofing all the citations by hand is not fun - in fact, it's exactly the sort of task that bibliographic software should do for us.

    I'm curious about Zotero both because historians are behind it and because of its stated goal of offering workable software solutions for scholars who still do a lot of their research by hand. I know plenty of historians who don't use any bibliographic software, or who use it only for the secondary literature - precisely because it wasn't designed to handle our sources.

    I've browsed your website pretty thoroughly, and have yet to find any explanation of how Zotero addresses - or will someday address - these problems. Which gives me the impression that you're well aware of the needs of those historians who do most of their research "in the browser," but you aren't really offering much to those of us who still work primarily in archives. I very much hope that I'm wrong, and that by posting this query I'll find out about how Zotero's developers plan on addressing some of the thornier primary source citation problems.

    If such plans do exist, by the way, I strongly encourage the designers of this website to start trumpeting them - or else a lot of other historians with not much time to test drive new software will similarly conclude that this software is great for online research, and more of the same for everything else.
  • Ah, now I see what you mean. The problem you describe actually has nothing to do with archival sources per se. It is precisely the same citation problem posed by citing multiple chapters from an edited volume, where one would likewise not want to reproduce the entire citation of the volume itself.

    We already have plans to solve this problem in a post-1.0 release (in the not-too-distant future).
  • Interesting discussion. This is among the reasons I keep saying that Zotero needs to have a policy about reference types and how they intersect with CSL (which I maintain), and that this policy needs to be derived from a public discussion. [And despite me saying this repeatedly for the past six months, it still has not happened.]

    Related, it's also why I called for the inclusion of a generic "Document" type (I still don't really know why we have archival manuscript reference types, when what those of us who deal with archival materials care about is the characteristics of the document). As Sean notes, the rules about shortening collection information are indeed quite similar regardless of the specific reference type. The trick in understanding Chicago (and indeed other style manuals) is to read beyond the reference types into the general rules.

    BTW, while it's true that some of the "arcane rules" have sensible, space-saving intentions, others of them are simply ridiculous, focused more on making authoring easier before we had computers. A perfect example of this is "op. cit." Every time I read a book or article that uses this convention, I get annoyed. Note, though, that Chicago recommends against it these days.
  • Actually, I don't think that the same rule applies to multiple chapters from an edited volume. At least, I haven't found anything to this effect in my 14th ed. of CMS. The distinction makes sense to me because authors rarely cite more than two or three chapters from a single anthology, but are more likely to cite lots of documents from a single collection.

    I agree with bdarcus about the archival manuscript reference type. In my experience, most biblio programs include these to give the impression that they can handle archival documents, when in fact all they're offering is a generic type that still treats each document citation independently from others in the same collection.

    As I see it, there are different types of items (articles, chapters, interviews, letters, diaries, etc.) that occupy different types of "containers" (journals, books, archives, microfilm, etc.) Each item type can be found in any of the various container types, and the rules for citation formatting will vary accordingly. So biblio software needs to be able to customize how it formats containers (and in some cases, containers of containers) according to the item type _and_ the context (i.e., whether either item or container has been cited already). Also, in some cases containers will be cited on their own, not as part of item-specific citations, and this will require a different citation format. So containers need to exist as separate records within the bibliographic database, producing different formats depending on whether they're cited independently or as part of an item-specific citation. In other words, the software needs to be flexible enough to treat these records as either "items" or "containers," depending on whether you're citing the whole or just a part of the whole.
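    For the programmers reading, the idea might be sketched roughly as follows. This is an assumption-laden illustration, not a description of how Zotero works or plans to work: the `Record` class, the `cite` function, and the sample records are all hypothetical, and the comma-joined output is a stand-in for real style rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    kind: str                             # e.g. "letter", "collection", "archive"
    title: str
    short_title: str
    container: Optional["Record"] = None  # a container may itself have a container


def cite(record, already_cited):
    """Format a record followed by its chain of containers, outermost last.

    A record found in `already_cited` is shortened and its own containers are
    dropped. Every record written out in full is added to the set.
    """
    parts = []
    current = record
    while current is not None:
        if current in already_cited:
            parts.append(current.short_title)
            break
        parts.append(current.title)
        already_cited.add(current)
        current = current.container
    return ", ".join(parts) + "."


# Hypothetical records, invented for illustration only.
archive = Record("archive", "Example State Archives", "ESA")
papers = Record("collection", "Smith Family Papers", "Smith Papers", archive)
letter1 = Record("letter", "John Smith to Mary Smith, 3 May 1861",
                 "Smith to Smith, 3 May 1861", papers)
letter2 = Record("letter", "Mary Smith to John Smith, 10 May 1861",
                 "Smith to Smith, 10 May 1861", papers)

cited = set()
first = cite(letter1, cited)       # item, collection, and archive all in full
second = cite(letter2, cited)      # new item in full, collection shortened
standalone = cite(papers, set())   # the collection cited on its own, as an item
```

    The same `papers` record serves as a container in the first two citations and as an item in the third, which is the flexibility described above.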

    I hope that the above makes sense - I figure if there is anyone on earth who can understand it, they're probably involved in this project.

    Again, I strongly encourage Zotero's boosters to talk more about plans to address these issues - I find it ironic that historians are behind this effort, yet most of the "buzz" surrounding it focuses on its integration with the browser, translation from web sites, etc., all of which is great for the secondary lit but has little to do with most primary source research, which is after all our bread & butter.
  • I've been playing around with the "letter" reference type that you mentioned. I appreciate that this software is still in beta stage, but I fail to see any difference between Zotero's "letter" reference type and any other bibliographic program's. There is no field for "recipient," but there is a title field for who knows what reason. In short, it's a generic reference type that fails to include even the basic elements that distinguish letters from other documents.

    More importantly, Zotero seems to assume that letters are only found in archives. What about letters from published collections, or microfilm, or even journals? I've used all of these variations at one time or another, and I'm sure there are more possibilities out there - and I haven't even mentioned other types of documents.

    The key is to create compound citations that combine data from two or more records - in this case the letter and the collection. This solution would address the subsequent shortening problem, and it would also cut down on the number of record types (instead of separate types for manuscript letters, microfilmed letters, and anthology letters, you'd have a single "letter" record type that linked to whatever other record type was needed to complete the citation).

    Whoever figures out how to do this will create the true "next generation" of bibliographic software. With its open-source strategy, this could very well be Zotero - but from what I've seen, you're not there yet.
  • Thanks for your comment. We are well aware of this problem, and we already have plans to solve it in a post-1.0 release. If you're interested in participating in the development process, please see our developer documentation.
  • It doesn't address your other concerns, but just a small note: there's a "Recipient" field for letters in the creator types drop-down—click on "Author" to view the different creator types for an item type. Similar to the other communication-based item types (e.g. interview), it doesn't currently show up in citations (and therefore isn't all that useful), but as Sean said, we're planning to address these issues after 1.0. We'll be discussing much of this on the dev list once we have a bit more time after the next beta release, which will be feature-frozen for 1.0. Thanks for your comments.
  • Thank you for your responses. Would any of you care to elaborate on the plans to address these issues in a post-1.0 release? I've been through your developer documentation site twice now, and it sounds like you're doing some very interesting stuff, but I'm not a software developer myself - just an interested user - so much of the techspeak is over my head. More to the point, I couldn't find anything related to the specific issues I raised above. So I'd appreciate any details you'd care to provide.
  • You might want to look at Trac.
  • edited February 4, 2007
    CloudofDust -- just another "yes, what he/she said" on your point that "Zotero seems to assume that letters are only found in archives. What about letters from published collections, or microfilm, or even journals? I've used all of these variations at one time or another ..."

    I have too. My dissertation (and subsequent book) included letters, reports, memos, press releases, etc., etc., some of which I sourced from physical archives, others from microfilmed collections, and still others from the internet. It's why I've been particularly sensitive to this issue of data modeling and reference types.

    But notwithstanding that you are "just a user," the way you outline the conceptual issues here shows a better understanding of the problem than the vast majority of developers, so keep prodding on this. And don't be afraid to jump into discussions about this on, say, the dev list. I was once "just a user" too.

    FWIW, the way I see it there are a few different aspects to this problem. The most low-level—but crucial—aspect is the data model and formatting logic. But sitting above that, you need a GUI that is easy to map onto those. It's not necessarily that straightforward to do that, I think, because if you leave everything configurable, you by definition are giving users more choices, and hence more room for confusion.

    Perhaps an ideal compromise between ease-of-use and flexibility is to have configurable "types" which are assembled from common components.
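    As a rough sketch of what I mean by assembling types from components (the component names and fields below are made up for illustration and are not Zotero's schema or anything in CSL):

```python
# Reusable groups of fields. All names here are hypothetical.
COMPONENTS = {
    "correspondence": ["author", "recipient", "date"],
    "archival": ["collection", "box", "folder", "repository"],
    "microfilm": ["collection", "reel", "frame"],
    "published": ["book_title", "editor", "publisher", "year", "pages"],
}


def build_type(*component_names):
    """Return the ordered, de-duplicated field list for a type built from components."""
    fields = []
    for name in component_names:
        for field in COMPONENTS[name]:
            if field not in fields:
                fields.append(field)
    return fields


# The same "correspondence" component combines with different source components.
archival_letter = build_type("correspondence", "archival")
microfilmed_letter = build_type("correspondence", "microfilm")
```

    The user would pick from a handful of components rather than from a long list of fixed types, which keeps the choices manageable while still covering a letter found in an archive, on microfilm, or in a published volume.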
  • Friends,

    I've looked through a good deal of your documentation, and various developer-oriented forums, and found much that is of interest, but I haven't found anything that addresses my specific question: how are you planning to resolve the many complicated issues surrounding non-standard citation forms? I understand that you have such plans, and I'm happy to hear it, but I haven't come across even a vague statement of what those plans are. Are there actual plans, or simply plans to make plans? Or is my technical illiteracy keeping me from seeing what seems obvious to you?

    Part of my frustration here is that discussions surrounding Zotero (and no doubt countless other open source projects) seem to be taking place at two rather disconnected levels. First, there are the conversations between end users and developers about what features are working, what ones aren't, and how the existing product could be improved. These conversations, however, don't seem to touch on the kind of conceptual issues I'm interested in, and when I ask questions about them in this forum, you suggest that I check out the documentation and the dev list, etc. So I browse my way around the documentation site, and some of the boards devoted to development problems, and find some very interesting conversations (they don't address my concerns directly, but they're still interesting). But those conversations are happening in programming jargon that makes them largely unintelligible to the uninitiated masses, including myself. It sounds like you folks are doing some neat stuff, but I wouldn't want to be quizzed on what I read.

    I'm not saying that you shouldn't use programming jargon, the purpose of which is, after all, to facilitate communication among programmers. My problem is that there doesn't seem to be any room for end user-developer conversations that rise above the most basic level of implementation problems and requests for bells and whistles. Is there an implicit assumption here that only programmers (or those with the time and inclination to learn programming) can contribute much to the conceptual conversations? Or are the conceptual conversations just not happening at all? (bdarcus's earlier comments suggest that they're not).

    I would like to find out more about your plans to address what I see as the greatest limitations of existing biblio programs, and since I've given these problems considerable thought I might even be able to offer some constructive suggestions. That said, I, and I suspect most academic humanists, don't have the time or inclination to learn programming, or for that matter to try and make sense of conversations among developers, most of which don't address my concerns anyway. So while I appreciate the invitation to jump into dev list discussions, I'm reluctant to do so both because of the linguistic barrier and because from what I've seen, Zotero developers aren't actually discussing the issues I care about. I see lots of discussion of how to properly write and validate Zotero code, etc., but not much about the conceptual framework underlying your assumptions about what your code should do. I'm sure that you developed such a framework long ago and haven't felt a need to spell it out explicitly, but I'd really like to see you do so, in reasonably accessible language, at least as regards the more complex citation issues.

    The bottom line is, a solution to those complex citation issues would be far far more interesting to most historians I know than Zotero's existing features - however great they are for other purposes. So if you have plans to solve those issues, please share your ideas and solicit comments, so that those of us who care about that aspect of the project actually know what you're up to and can enter into a constructive conversation about it.
  • edited February 5, 2007
    I'm not a Zotero developer, but I think WRT these questions:

    "Is there an implicit assumption here that only programmers (or those with the time and inclination to learn programming) can contribute much to the conceptual conversations? Or are the conceptual conversations just not happening at all? (bdarcus's earlier comments suggest that they're not)."

    ... the answer is "no". There has been no discussion. But Dan mentioned they intend to try to open those up once they've released the next beta. So I'd say, be a little patient.

    I mentioned the dev list because I think it *should* be the place for these conversations. It hasn't yet been, but hopefully that will change soon-ish.

    On your first question, I personally don't think higher-level discussions should be limited to developers. Indeed, that they are higher-level conceptual discussions seems to me to provide a common language of discussion between users and developers.

    BTW, you can see some background on related issues in a blog post of mine, and a response from one of the Zotero guys (Josh):

    http://netapps.muohio.edu/blogs/darcusb/darcusb/archives/2007/01/27/zotero-and-the-bazaar-what-zotero-should-learn-from-successful-open-source-projects
    http://www.epistemographer.com/2007/01/30/cathedrals-and-bazaars/
  • There is an important point emerging here and elsewhere, that the broader philosophical discussions about the utility and affordances of research software are entangled with technical discussions of how those principles are actually implemented in code; as a scholar, CloudofDust, you are very much an expert on research practices, regardless of your ability to implement those ideas in working code.

    Part of the problem, to me, is that there isn't really a good model in the open source world for where this sort of discussion should take place - because open source projects are traditionally meritocracies where status is measured in lines of code, the dev list is naturally dominated by that sort of focus on implementation. What you're talking about is more of a "strategy" or "vision" discussion, and it's rare to find a place that empowers users (as opposed to developers, though the two categories overlap) to engage in this sort of talk...right now, I'd actually argue that the best place for that discussion is the "Feature Requests" category on this board, where both developers and users can weigh in on broader issues without non-coders feeling excluded (leaving the dev list for the more technical issues, which can intimidate non-coders and stifle free discussion).

    How about if we start a new thread on the "Feature Request" board explaining our current vision for the future of item types, post a note to the dev list inviting everyone to join in, and we forge forward there?
  • edited February 5, 2007
    I think that's a good plan Josh.

    The one suggestion I would offer is that you be careful to frame the discussion such that people understand the issues going in. I tried to do that on the wiki earlier (though it is more developer-oriented):

    http://dev.zotero.org/docs/reference_types

    I don't think it's productive to get into a laundry-list discussion of every type that someone might want. We could easily end up with an incoherent list of 100 or more types. For that reason, better to focus on the conceptual principles and goals.

    BTW, on this:

    "Part of the problem, to me, is that there isn't really a good model in the open source world for where this sort of discussion should take place - because open source projects are traditionally meritocracies where status is measured in lines of code, the dev list is naturally dominated by that sort of focus on implementation."

    ... I think it might be true that status as a measure of ultimate decision-making authority is measured by demonstrated contribution. But that doesn't mean a good open source community doesn't encourage and appreciate contributions from anyone. I certainly do. Good open source projects blur the line between user and developer after all, and one of the things you encourage of users is to get involved in any way they can. Often that leads them into development when they realize nobody else is scratching their own itch. That was certainly my experience.
  • Bruce, I totally agree. My point wasn't to say that only technical contributions matter, but rather about the historical development of open source culture - because the people who've been most involved have been fluent programmers, the two functions of strategy and implementation have been conflated into one discussion. The people who were planning and the people who were building were one and the same.

    Only recently, I'd argue, have more than a few outlier users begun to want to engage in the discussion of how their tools are designed and built (because of the very blurring of lines b/w user and developer that you point to). Thus, my point is that we should disentangle these two conversations, so that those who *don't* have technical skills or inclinations can contribute without feeling intimidated or crowded out of the discussion.
  • Okay, let's take this over to http://forums.zotero.org/discussion/391/ and start having a discussion over hierarchical types...