Separate fields for Title and Subtitle

Currently, Zotero provides only one field for the title of the work. But often, a title consists of a main title and a subtitle. I think Zotero should provide separate fields for both.

Why is this important? This is mainly important for consistent output regarding punctuation. In German, it is common to use a period between the main title and the subtitle. In English, I think a colon or a semicolon are used. This is also brought up in this thread. There, it is recommended to do more parsing on certain fields. Anyway, one would come up with two separate fields in CSL. I think having two separate fields in the data entry mask is easier than parsing title strings.

For some bibliography styles, it is important to have control over the formatting of the title/subtitle. MODS already has separate fields for main title and subtitle, as does biblatex.

Cheers,
Frederik
  • At this point, you have a very large burden to prove that there's a need for this that justifies the rather significant UI, data layer, and styling changes that would be required to support it.
    In German, it is common to use a period between the main title and the subtitle.
    Can you be more specific. How "common" is it? Is it uncommon, or broadly understood as incorrect, to use colons as the delimiter?
    Anyway, one would come up with two separate fields in CSL... For some bibliography styles, it is important to have control over the formatting of the title/subtitle.
    Haven't we been through this sort of thing before, where you ask for new features without any concrete evidence they're needed? Examples please.
    MODS already has separate fields for main title and subtitle, as does biblatex.
    But the vast bulk of existing practice is in favor of a single field (RIS, Refer, DC, BibTeX, CSL). It also makes it harder to handle some other important features (say different kinds of titles) when you split them.

    Also, both CSL, Zotero, and BIBO (Zotero's preferred import/export format) all have support for a notion of a "short title", which is typically the title sans subtitle.
  • In German, it is common to use a period between the main title and the subtitle.

    Can you be more specific. How "common" is it? Is it uncommon, or broadly understood as incorrect, to use colons as the delimiter?
    Yes, every German bibliographic style I know of (mainly social sciences and humanities) uses the period as a delimiter. One prominent example may be the KZfSS, the most important German journal for Sociology, see this style guide.

    Personally, I always replaced the colons from imports with periods. But this only works as long as you don’t publish in English journals, where the colon is common. I stumbled upon this issue because my boss just told me that this was the main obstacle preventing him from switching to Zotero. So I thought I’d bring it up here. And that other thread shows that at least one other person has already raised it.
    MODS already has separate fields for main title and subtitle, as does biblatex.

    But the vast bulk of existing practice is in favor of a single field (RIS, Refer, DC, BibTeX, CSL). It also makes it harder to handle some other important features (say different kinds of titles) when you split them.
    Most of them are rather old formats. They are well established, but don’t cover all use cases, especially from the humanities. MODS is quite complex, but also very feature-complete. And biblatex (and other BibTeX-derivatives) was invented to overcome some limits of plain BibTeX which is rather a least common denominator.
    Also, both CSL, Zotero, and BIBO (Zotero's preferred import/export format) all have support for a notion of a "short title", which is typically the title sans subtitle.
    But it doesn’t help you to format the delimiter. And it doesn’t always hold true: In history and related disciplines, in subsequent citations, a short title is used which is typically shorter than the main title (Example). You could, for example, have »Religionsgeschichte Deutschlands im 19. und 20. Jahrhundert« as the main title, »Religionswissenschaftliche Überlegungen« as the subtitle, and »Religionsgeschichte Deutschlands« as the short title.
  • But Frederik, at least KZfSS doesn't care if you use period or colon.
    Here are a couple of citations from a recent article (Höpner, M.; Jackson, G. Das deutsche System der Corporate Governance zwischen Persistenz und Konvergenz. KZfSS Kölner Zeitschrift für Soziologie und Sozialpsychologie 2002, 54, 362-368).

    There is no pattern here. Seems like they are OK with whatever the original title is. Which seems eminently reasonable to me.

    Jackson, Gregory, 2001: Organizing the Firm: Corporate Governance in Germany and Japan, 1870–2000. Dissertation, New York, NY: Columbia University.

    Jong, Henk Wouter de, 1997: The Governance Structure and Performance of Large European Corporations, The Journal of Management and Governance 1: 5–27.

    Kurdelbusch, Antje, 2002: The Rise of Variable Pay in Germany. Evidence and Explanations. In: Anthony Ferner (Hg.): Special Issue of „European Journal of Industrial Relations“ on Multinational Companies and Globalisation (im Erscheinen).

    Thelen, Kathleen, 2002: How Institutions Evolve: Insights from Comparative-Historical Analysis. In: James Mahoney und Dietrich Rueschemeyer (Hg.): Comparative-Historical Analysis: Innovations in Theory and Method. New York: Cambridge University Press (im Erscheinen).

    Thelen, Kathleen, und Sven Steinmo, 1992: Historical Institutionalism in Comparative Politics. S. 1–32 in: Sven Steinmo, Kathleen Thelen und Frank Longstreth (Hg.): Structuring Politics. Historical Institutionalism in Comparative Analysis. Cambridge, MA: Cambridge University Press,.
    Wójcik, Dariusz, 2001: Change in the German Model of Corporate Governance: Evidence from Blockholdings, 1997–2001. Oxford: School of Geography and Environment, University of Oxford (unveröffentlicht).
  • It is quite hard to come up with strict rules: in the humanities there's often a strong focus on common sense and tradition, and only a few normative guidelines exist.

    Maybe this example helps: http://www.hendrik.maekeler.eu/zitierrichtlinien.de.htm
    Nach dem Namen des Verfassers steht ein Doppelpunkt. Darauf folgt der Titel, der vollständig inklusive Untertitel anzuführen ist, unabhängig von dessen Länge. Zwischen Haupttitel und Untertitel wird ein Punkt gesetzt.
    The last sentence states: »A period is put between main title and subtitle.«
    Das Kurzzitat setzt sich aus dem Nachnamen des Verfassers, dem ersten sinntragenden Substantiv des Titels und dem Erscheinungsjahr (in Klammern) zusammen.
    Short cites contain only the first meaningful noun, which I guess would count as a short title.
  • felwert,

    We're all keen to see title/subtitle correctly supported for German publications; the question is just how best to go about it. As Bruce points out, some pretty sweeping changes would be necessary to split titles into two fields, and then correctly handle the result. Titles from most repositories would need to be correctly split (i.e. when the translator is run to capture the cite data), and during manual entry, users would need to use the two fields correctly. The change might actually make things less reliable overall, rather than more. It would certainly mean more typing for everyone.

    If a single title field can be split reliably into the correct two parts by the CSL processor, we can accomplish the same result (i.e. printing a period between title and subtitle in German styles, and a colon in English styles), without any changes to the database or to Zotero -- and fewer changes to existing kit means an earlier solution.

    At the moment, Zotero doesn't do this, so to cope, you have curated your database with periods between the two elements. Many other German-speaking users will have done the same. It might be that a colon provides a more reliable hint that the field should be split, but let's start by looking at whether there are titles that _cannot_ be split correctly by breaking on the first period. Looking at your database, are there any such titles?

    If you can identify "difficult" titles, let's see if we can work out a way to handle the split correctly. If that can be settled, the next step is easy -- we'll add the title delimiter issue to the todo list for CSL development, and once it's been added to the CSL specification, I'll add the necessary support to the upcoming CSL processor for Zotero.

    Meanwhile, please tell your boss that we're interested in resolving this issue for German users, but that splitting the title across two fields in the database mask might not be part of the solution.
  • I'm generally in favor of greater control, but where is this a problem, even in German? When would one ever want to cite the subtitle by itself? Can anybody give an example? I wouldn't even want to sort by subtitle. It's extra information, for clarity. Many of my books have long titles and subtitles, but I've never felt the need to break it down into parts for any reason. The combination of full format control via "show editor" and a short title field covers every contingency I've ever encountered. There are more pressing concerns, surely? Endnote has no subtitle field, by the way.
  • I would add that the Chicago style (at least) stipulates that title citations of foreign works follow the conventions of the work's native language: in other words, for publication in English, not only wouldn't you have to change a German title's punctuation, you shouldn't do so. I just looked this up yesterday--and by the way, this is one reason being able to turn off auto-capitalization is a great option. I wish it weren't a hidden option, so more people could take advantage of it.
  • MG6,

    Thanks for this additional info. There seem to be journals out there that require the title/subtitle delimiter to be localized. One of the devs wrote in 2007:
    (And the examples and guides I've seen keep the title-separating punctuation consistent, no matter what language the original was written in. That is German bibliographies cite English works using the period to set off the subtitle, English works cite German works using the semicolon).
    So there is at least anecdotal evidence that this is a real issue.
  • fbennett,

    thanks for your reply. I fully understand that introducing a new field has quite a large impact, and so it might not be worth the trouble. I thought that having separate fields would be the easier solution in the end, and that it could also be optional (if one puts everything into the title field, it doesn’t matter as long as you don’t need the features discussed here). But I might be completely wrong; it was just my view on the issue.

    And I think that it is more valuable to have a well-thought-out solution in the end than to make quick changes that might turn out to be inappropriate. So everything I write is pure brainstorming; everything can be dismissed in the end.

    What I think should be considered when trying to solve this issue:

    • It might not only be about punctuation. I have seen cases where, e.g., the main title is printed in italics, while the subtitle is printed normal. This is probably only an issue for strange German traditions, but one might consider this.
    • A title might have more than main title and subtitle, but also a title addon. This is the case for Festschriften. (see this example). There you have the main title (Das Auge Des Betrachters), a subtitle (Beitrage Zum Konstruktivismus) and a title addon (Festschrift für Heinz von Foerster). This is not so uncommon in German publications.
    I skimmed through my library, and it seems that splitting at the first period is indeed problematic in some cases, e.g., abbreviations, like this: »Immigrant Religion in the U.S. and Western Europe: Bridge or Barrier to Inclusion?«. Splitting at the first period would split after U. Considering period-space as a delimiter, one would split after U.S. In this case, one could split after a period only if the title contains no colon, but what about German titles with abbreviations and periods as delimiters?

    Also, question marks can be used as delimiters, as in »Ghettos oder ethnische Kolonien? Entwicklungschancen von Stadtteilen mit hohem Zuwandereranteil«, but I think this is also true for English resources and should be less problematic.

    But I think it would be better to migrate the library once to using only a colon as a delimiter than having an unreliable solution. But this means that it must be known that the colon has a special meaning, and that one should not use a period in titles. That relates to the issue of how one could teach people about special »micro-syntax« used in Zotero’s fields for parsing, which was raised in the topic I linked in my first post.

    I hope I didn’t come up with too much to think about. Just ignore whatever you consider irrelevant with regard to the original issue.

    Cheers,
    Frederik
  • But the issue raised by MG6 is considerable: If some styles require titles to be rendered as they are originally printed (or as customary in the language of the title) - i.e. sometimes periods, sometimes colons - that would make both the two field solution and the substitute solution unworkable. I would certainly want to be able to do that for every title.
    I would also be strongly against a feature that requires users to know that colons have some special meaning - that sounds like a road to user unfriendliness.

    If we need to inconvenience many users to accommodate very few that would seem like a bad idea. If anything, those very few users who really rely on that feature should bear the 'burden' - e.g. use some _very_ uncommon string (#:# or so) to delimit Title and Subtitle - this would also currently be possible and then just require a find&replace at the end...
  • adamsmith, MG6:

    If we have the situation that some styles require the delimiters to be normalized, while some require the original delimiters to be used, then I think we end up at the very edge of Zotero’s localization capabilities. This has to my knowledge already been discussed for quotes: The very idea of quoting in CSL was that quotes should be localized to the current locale (I’m not sure if this is already implemented).

    But there are styles which require German titles to use German quotes, and English titles to use English quotes. The same seems to hold true for some styles and title delimiters. This would require language-awareness on a per-title basis. That was discussed for quotes but rejected as being too complicated to implement.

    So if one wants serious localization capabilities that handle even many of the complicated corner cases, one should probably stick to LaTeX/biblatex. But it would be nice if Zotero could be used to manage one’s collection, with BibTeX files exported only when actually writing papers. This, in turn, requires that Zotero be capable of exporting various BibTeX dialects, including biblatex, which requires separating the parts of the title. So there we are again.
  • edited July 29, 2009
    But there are styles which require German titles to use German quotes, and English titles to use English quotes.
    There are a lot of odd style rules out there; doesn't mean we have to support all of them.

    My basic criteria for adding features to CSL are:

    1. is the feature really needed?

    2. is it easy enough to implement in both CSL syntax and in CSL-based processors?

    This isn't exactly the classic "80/20" rule that developers often cite, but it's similar.

    Multiple-localization per document (not just of titles, but also of quotes and dates), to my mind, breaks the second rule, and is marginal enough that it's not clear that it satisfies the first.
  • But Frederik - separate fields - while nice for select biblatex users - doesn't work for CMOS - perhaps the most widely used style overall.
    So if there is a trade-off between the two, I think we'd want to go with the one that keeps CMOS correct and intact.
  • I don't suppose there's any chance that the Short Title field will always be a left-bound substring of the Title field? If so, we'd have both the full, unaltered original title and a pretty good hint as to where to look for the delimiter, and it could be manually adjusted if the initial (currently unsophisticated) logic to generate the Short Title failed.
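    A minimal sketch of that idea, assuming the Short Title really is a left-bound substring of the Title and is immediately followed by a colon or period. The function name `split_by_short_title` is hypothetical and not part of Zotero or any CSL processor; question-mark delimiters and short titles that stop in the middle of the main title are deliberately left unhandled and return `None`.

```python
from typing import Optional, Tuple

# Delimiters that may directly follow the short title (an assumption for this sketch).
DELIMITERS = ":."


def split_by_short_title(title: str, short_title: str) -> Optional[Tuple[str, str]]:
    """Return (main, subtitle) if short_title is a prefix of title that is
    immediately followed by a delimiter; otherwise return None."""
    if not short_title or not title.startswith(short_title):
        return None  # short title gives no usable hint
    rest = title[len(short_title):]
    if not rest or rest[0] not in DELIMITERS:
        return None  # short title ends mid-title, or there is no subtitle
    subtitle = rest[1:].strip()
    if not subtitle:
        return None
    return short_title, subtitle
```

    When the function returns `None`, the caller would fall back to some other splitting logic or to manual adjustment.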
  • Adam, you're right. To have the least impact, it will probably be best to use some smart kind of title parsing that requires no changes to existing titles. That should keep everything working that works now.

    So probably one could live with something along these lines:

    1. If the title contains a question mark, split at the (first) question mark.
    2. If the title contains a colon, split at the (first) colon.
    3. If the title contains a period, split at the (first) period.
    The only corner case would be titles which contain abbreviations and use periods as delimiters. But I think it is ok to change these to using a colon instead.
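    The three rules above can be sketched roughly as follows. This is only an illustration: the function name `split_title` is hypothetical, and the sketch adds one assumption not stated in the rules, namely that a delimiter only counts when it is followed by a space, so that a question mark or period at the very end of a title does not produce an empty subtitle.

```python
from typing import Tuple


def split_title(title: str) -> Tuple[str, str, str]:
    """Split a title into (main, delimiter, subtitle).

    Rules are tried in order: question mark, colon, period.
    Returns (title, "", "") if no delimiter is found.
    """
    for delim in ("?", ":", "."):
        # Assumption: the delimiter must be followed by a space.
        idx = title.find(delim + " ")
        if idx != -1:
            main = title[:idx]
            subtitle = title[idx + 2:].strip()
            if delim == "?":
                main += "?"  # the question mark stays with the main title
            return main, delim, subtitle
    return title, "", ""
```

    With this sketch, »Immigrant Religion in the U.S. and Western Europe: Bridge or Barrier to Inclusion?« splits at the colon, because the colon rule is tried before the period rule. A title such as »Religionsgeschichte Deutschlands im 19. und 20. Jahrhundert. Religionswissenschaftliche Überlegungen« is still split wrongly after »19«, which is the corner case mentioned above.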
  • edited July 29, 2009
    @Dan: That's a neat heuristic, and should greatly reduce the number of failures. I don't think we can put all of our weight on it, though. Two works might have the same main title, and be distinguished by using the subtitle, or some of its buzzwords, as short title.

    For the very small number of remaining failures, would it be permissible to use backslash escapes on individual characters (Frank ducks)? In felwert's example above, changing U.S. to U\.S\. (would be ugly in the UI but) would provide a workaround for exceptional cases, without breaking the data for CSL purposes. Backslash escapes could be stripped (or a stripped copy of the field could be stored separately) for data exchange or record comparison. If that's too ghastly, we could just leave the edge cases to be touched up by hand.
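    As a rough sketch of how such backslash escapes might be handled: the helper names `split_escaped` and `strip_escapes` are hypothetical, and the sketch assumes that a delimiter is a period, colon, or question mark followed by whitespace, and that it splits at the first unescaped delimiter in reading order.

```python
import re
from typing import Tuple

# A delimiter character NOT preceded by a backslash, followed by whitespace.
_UNESCAPED_DELIM = re.compile(r"(?<!\\)([.:?])\s+")
# A backslash-escaped delimiter character, e.g. "\." in "U\.S\."
_ESCAPED = re.compile(r"\\([.:?])")


def strip_escapes(text: str) -> str:
    """Remove backslash escapes, e.g. for data exchange or record comparison."""
    return _ESCAPED.sub(r"\1", text)


def split_escaped(title: str) -> Tuple[str, str]:
    """Split at the first unescaped delimiter; return (main, subtitle) without escapes."""
    match = _UNESCAPED_DELIM.search(title)
    if match is None:
        return strip_escapes(title), ""
    main = title[:match.start()]
    if match.group(1) == "?":
        main += "?"  # keep the question mark with the main title
    return strip_escapes(main), strip_escapes(title[match.end():])
```

    The stored field keeps the escapes; only the split parts and any exported copy would have them stripped.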

    @felwert: Exactly, that should work. Then in the CSL markup we can treat the two (or three) sides of the split as separate virtual fields, where discrete formatting or selective presentation is required.

    @Bruce: What's the "classic '80/20' rule"? My only CS training was a course in Symbolic Language for the IBM 1401 that I took three decades ago. Our instructor must have skipped over that one.
  • edited July 29, 2009
    I just mean the notion there's a sweet spot of functionality vs. effort, and that going beyond often involves unreasonable effort for minimal gains. It's related to other ideas such as the evils of premature optimization I suppose.

    For CSL, every feature we add, and every corner case we decide to cover, means additional work for implementers, and so less likelihood we'll see libraries finished (which is already a problem), and in turn applications built on top of them.
  • Possibly another way to approach this is via support for field data in multiple languages. As described in this thread, a crucial requirement for many users:

    http://forums.zotero.org/discussion/1798/

    In terms of the title/subtitle problem, more data entry but less processing logic. However I don't think it would handle all the issues with localizing quotes etc., unless you entered every sort of variant (which implies you're not necessarily entering multiple languages but just multiple ways of rendering the title, author, etc.).
  • Multilingual layering will be another exciting can of worms to tackle, but I don't think we need to worry about it here. This one just boils down to three possible choices for a delimiter: force-colon, force-period, or use-original.
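    A minimal sketch of those three choices, assuming the title has already been split into main title, subtitle, and the delimiter found in the original. The function name `render_title` and the idea of passing the policy as a string are assumptions made for illustration only; this is not existing CSL syntax.

```python
def render_title(main: str, subtitle: str, original_delim: str,
                 policy: str = "use-original") -> str:
    """Join main title and subtitle according to a delimiter policy."""
    if not subtitle:
        return main
    # A main title that already ends in "?" or "!" needs no extra delimiter.
    if main.endswith(("?", "!")):
        return f"{main} {subtitle}"
    delimiters = {
        "force-colon": ":",
        "force-period": ".",
        "use-original": original_delim,
    }
    if policy not in delimiters:
        raise ValueError(f"unknown delimiter policy: {policy!r}")
    return f"{main}{delimiters[policy]} {subtitle}"
```

    A German style would then ask for `force-period`, an English one for `force-colon`, and a style that follows the source would ask for `use-original`.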
  • "Also, both CSL, Zotero, and BIBO (Zotero's preferred import/export format) all have support for a notion of a "short title", which is typically the title sans subtitle."

    I've always thought "short title" would be the place to put abbreviation of the work common in the field you work in. E.g. TESZ.
    I know there is a "journal abbreviation" field for journals, but where would I put a similar abbreviation for standalone work, if not "short title"?
  • Is that ever cited formally? If not - in the notes.
  • Well, I believe I have seen styles where works which are referred to by their customary abbreviation in the body text show up with the abbreviation instead of the (author name + year) in the bibliography.

    Something like this:

    ...
    Robertson, A. 1997: "Xxxxx xxx", ...
    TESz: "T.... E.... Sz... " (XY edt.), ...
    Uhlbeck, P. 2001: "Yyyyy yyy", ...
    ...
  • Really? It certainly shouldn't be in the short title field, if anything it seems like a short version of the authors - I think this is probably sufficiently rare and idiosyncratic to remain in the "let the user fix this at the end" category of things, but if you can find a style guide that actually requires that practice that may change people's mind.
  • edited February 10, 2010
    I see at least two issues that haven't really been addressed in this thread. The most significant to me is that, unless the title is hand-entered, being specific about the punctuation is only as good as the punctuation of the title in the database from which the article is imported. I have recently noticed (and commented upon) titles in the Web of Science that do not match the title of the actual printed article. See: forums.zotero.org/discussion/10506/woswok-and-added-hyphens-in-article-titles/ While I always go to the original article to be sure that it says what I think it says, I could easily overlook the precise punctuation of the title if there is a difference between the imported database title and the true title. Second, I don't expect to ever be able to completely depend upon automated formatting. I always feel a need to do a little hand editing of the reference list format. I've used bibliographic management programs since the CP/M (pre-DOS) days, so maybe I don't expect enough of my bibliographic tools.
  • edited February 10, 2010
    Not sure why these issues should be addressed in this thread, though?
    Zotero should work in a way that if you have the correct data you get the correct citation output. That's really all it can do - but I do think it should try to come as closely as possible to that ideal.
    For an end user that may mean occasional fixing of items in the database that have, e.g. been poorly imported - but that is a one-time operation - much preferable to doing the same thing every time I cite the same source.

    And sure, people should proofread their stuff - but I think the less manual editing necessary the better, no?
  • I came across this discussion, searching for a way to circumvent the fact that I didn't find a means to get the "short title" field.

    For full note citation, in the subsequent cites, it is very annoying to have a very long title.

    It often happens when books have subtitles, but not only.

    The notes field cannot be a good place for the short title. Notes are used for supplementary information.
    They appear in the bibliography, but not in the citations.

    It's precisely for citations, mainly subsequent ones, that there is a need for short titles.

    So I would very much appreciate to be able to use the short title field.

    Besides, I wondered whether it would be possible to have a parameter (like "et-al") to set a maximum number of characters, followed by a localized string ("...")?

    If anyone has a solution for writing shortened titles, thanks for letting me know.
  • I came across this discussion, searching for a way to circumvent the fact that I didn't find a means to get the "short title" field.
    You mean in CSL styles? You can simply use <text variable="title" form="short"/> That should print the short title.

    But this is not really related to the issue of having main and subtitle separated.
  • Thank you very much.

    The question concerned CSL styles.

    I'd coded it this way, but it didn't seem to work. I must have made mistakes somewhere at that time.

    After reading your message, I filled some short-titles fields and verified it at once, and could see that my style worked all right!

    Since I'd seen in Zotero Metadata Field that there was no CSL fieldname for short title, I was convinced that it was not possible to use it for the moment. I left the code lines in (for the future), but didn't go on filling in short-title fields!

    I posted my question in this discussion, for it was related to long titles.

    I support the idea of having more fields, such as separate main title and subtitle, but automation is dangerous, for there are too many "standards", and they are changing all the time.

    Even if I have nothing to change in my style, your answer helps me a lot.

    Thanks again. Excuse me for the disturbance.
  • I enter this conversation as an ally of Felwert. I also call for future versions of Zotero to distinguish titles from subtitles by separating the fields.

    1. Contra MG6's claim above, Chicago stipulates that English-language publications need not follow European conventions for subtitles. See CMS 14.97-98. Indeed, many presses insist that their own national conventions be observed on this point, even for citations of material published in a different linguistic tradition. In my experience this line is followed by most presses, nationality aside. And those presses that don't are eclectic in other places where they clearly conflict with Zotero's policy. E.g., they publish bibliographies with a mix of punctuation devices: chevrons around French article titles; American quotes around American-published titles; single quotes around UK-published titles; upside-down opening quotes for German titles, etc.

    2. TEI recognizes the standard. The element <title> includes as recognized values for the attribute @type both "main" and "subtitle".

    3. Zotero has always pursued a policy of trying to get out of the fields anything pertaining to metainformation. The punctuation and spacing separating title from subtitle are elements of metapunctuation; they are not intrinsic to the titles themselves. This is demonstrated by the great number of books that omit on their cover or title page any punctuation delimiting title from subtitle. If the colon or period were really a part of the title, it would appear everywhere all the time.

    4. Capitalization of a subtitle or subsubtitle is oftentimes dictated by style concerns. Zotero users should not be forced to declare whether a (sub)subtitle should or should not begin with a capital letter. That's a style call.

    5. Short titles cannot be used for this purpose, since that field is frequently being used for other purposes (such as taming the length of an overly long main title).

    6. I work at a press that has a regularizing policy, and a number of my authors have resisted or fought our press's editorial policy on this point. Those same critics have usually not considered the important distinction between metapunctuation and intrinsic punctuation. This should be a non-issue. Publishers, like Zotero, are leaning toward publication models that allow readers to view bibliographic data as they wish. The only way to effectively employ this strategy, however, is if main and subtitle fields can be distinguished. Then those who dislike colons can be as happy as those who prefer them.
    I'm willing to go further, and say that we need not only a subtitle field, but a sub-subtitle and a sub-sub-subtitle field. Yes, I have seen books with four levels of titles. They are usually horribly edited and dreadful to read. But there they are.

    Someone may counter that sometimes one needs to record in a Zotero record what national convention has been followed. That is easily accommodated by faithfully keeping the language field up to date.

    To those who suggest that such a fix would be a heck of a lot of work, I would encourage a long view of things. It doesn't need to happen now. But it needs to happen eventually. The compelling need is already starting to build up a head of steam as more and more publications in the humanities (where people are passionate about punctuation) take to the Web. Developers should plan accordingly.
  • Well-- the discussion of new fields is currently happening, so I suppose it's now or never.

    See http://forums.zotero.org/discussion/15636, and the concrete proposals and their status at https://github.com/ajlyon/zotero-bits/issues .

    I will add that many site translators currently concatenate titles and subtitles that are distinct in the source site to get the present content of the title field. So this would match the structure of much of our source data (including MARC and MODS, as has been noted).

    I'm pretty sure, however, that multiple levels of subtitles are not going to happen any time soon.