Non-numeric editions

stevebush13 · August 29, 2011

I deal on a regular basis with editions of books that involve more than a numeral. For example, 1st Vintage ed. or 1st Westminster/John Knox Press ed.

When I enter these into the edition field, they output in my footnote as just 1st ed.

I looked through the forums and thought the problem had been addressed:
http://forums.zotero.org/discussion/4956/difficulty-formatting-complex-book-editions-in-bibliography/

However, when I put the edition text string inside double quotes, it still disregards the text and outputs as: 1st ed.

Microsoft Word for Mac 2011, v. 14.1.2
Zotero 2.1.8

Rintze · August 29, 2011

Surrounding 1st Vintage ed. in double quotes doesn't help? (see the thread you linked to yourself)

Another quick workaround would be to replace the digits with words in these cases, e.g. by using "First Vintage ed.". That should eliminate the problem with most styles.

stevebush13 · August 29, 2011

Correct: surrounding it in double quotes has the same result as no quotes: It outputs in my footnote as 1st ed.

I'm using the Chicago Manual of Style (Note without bibliography)

fbennett · August 29, 2011

Thanks for reporting; I'll look into this.

Rintze · August 29, 2011

I'm still wondering if we can't handle this more gracefully. Maybe we should make "is-numeric" test true only for completely numerical content.

While number extraction is potentially useful, it is only a solution for messy metadata (e.g. by allowing "Edition 12" to be rendered as "12th edition" through cs:number), which should be more a problem of translators & (manual) database curation than of CSL.
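
For readers following along, here is a minimal Python sketch of the two behaviors being contrasted: a strict test that is true only for completely numerical content, and number extraction that turns "Edition 12" into "12th". The function names are hypothetical and the ordinal rule is English-only; this is not citeproc-js code, just an illustration of the distinction.

```python
import re


def is_strictly_numeric(value: str) -> bool:
    """Hypothetical strict test: true only if the whole field is ASCII digits."""
    return re.fullmatch(r"[0-9]+", value.strip()) is not None


def extract_first_number(value: str):
    """Hypothetical extraction: first run of digits in the field, or None."""
    match = re.search(r"[0-9]+", value)
    return int(match.group()) if match else None


def ordinal(n: int) -> str:
    """English ordinal suffix (assumption: locale handling is ignored here)."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


# The strict test rejects anything that is not a bare number...
print(is_strictly_numeric("12"))               # True
print(is_strictly_numeric("Edition 12"))       # False
print(is_strictly_numeric("1st Vintage ed."))  # False

# ...while extraction would still recover a number from messy metadata.
print(ordinal(extract_first_number("Edition 12")))  # 12th
```

Under the strict test, "1st Vintage ed." would simply be rendered as entered, with no quoting needed.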

adamsmith · August 29, 2011

I'm inclined to agree with Rintze.
Something like "3rd and revised edition", e.g., is quite common and using quotation marks seems to me like a workaround and not a solution.

dstillman · August 29, 2011

Does "is-numeric" parse "Edition 12", or only a field beginning with "12"? If the former, the behavior has changed since my post in that other thread, when it matched only if the value began with a number.

Certainly that workaround I suggested and implemented in the other thread was never a good solution, and it should probably be removed. Having "is-numeric" fail on anything other than a number is probably fine for most fields, but I suspect there are an awful lot of cases of "12th ed." and the like out there.

It might be too clever, but we could treat an edition string as numeric if it begins with a number, there's only one space, and the second word begins with one of the localized terms ("ed", "éd", "aufl", "utg"). Then more complicated strings such as "1st Vintage ed." or "1st Westminster/John Knox Press ed." would go through without relying on a field-corrupting hack.
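
The heuristic above could be sketched roughly as follows in Python. The function name and the term tuple are assumptions for illustration (the tuple just repeats the four terms mentioned, where a real implementation would presumably read them from the locale files), and this is not the processor's actual code.

```python
import re

# Assumption: only the four terms quoted above; a real list would come from the locales.
EDITION_TERMS = ("ed", "éd", "aufl", "utg")


def looks_like_numeric_edition(value: str) -> bool:
    """True if the string is '<number-ish word> <edition term>' with exactly one space."""
    parts = value.strip().split(" ")
    if len(parts) != 2:
        # No space at all, or more than one: not the simple "12th ed." shape.
        return False
    number, descriptor = parts
    if not re.match(r"[0-9]", number):
        # The first word must begin with a digit.
        return False
    # The second word must begin with one of the localized edition terms.
    return descriptor.lower().startswith(EDITION_TERMS)


print(looks_like_numeric_edition("12th ed."))                             # True
print(looks_like_numeric_edition("3. Aufl."))                             # True
print(looks_like_numeric_edition("1st Vintage ed."))                      # False
print(looks_like_numeric_edition("1st Westminster/John Knox Press ed."))  # False
```

A bare "12" has no space, so it falls outside this check and would be left to the plain numeric test.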

Rintze · August 29, 2011

Does "is-numeric" parse "Edition 12"

citeproc-js does, and currently should: http://citationstyles.org/downloads/specification.html#choose

fbennett · August 29, 2011

I've checked in a processor version that recognizes quoted escapes, but I take the points made above.

There is one off-spec wrinkle in the current processor behavior that I have discussed with Rintze previously, in that it recognizes multiple numeric values, and will apply range collapsing and ordinal suffixes to each number in the set.

It should be possible to work out more clever behavior than quoted escaping, while retaining multiple values. Certainly we (I) need to revisit the is-numeric test, to make sure it dovetails nicely with the behavior of cs:number. (Just in case anyone out there is getting nervous, I won't make further changes until the details have been agreed all around.)

fbennett · August 31, 2011

I've implemented some experimental changes to explore this, and bundled the modified processor in the multilingual client. The changes are described here.

The code is currently more aggressive about dismissing lone descriptors than Dan's suggested fix, at least on single numbers; any lone descriptor will be dismissed (and is-numeric will test true), while multiple descriptors will always be retained (and is-numeric will test false).

I think this may work out alright, although the full description seems complex at first glance. It depends on how nearly this reflects what people would expect.
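
As a rough illustration of the single-number rule described above (a lone descriptor is dismissed, multiple descriptors are retained), here is a hedged Python sketch. It is one reading of the prose, not the processor code: the function name is hypothetical, "descriptor" is assumed to mean any whitespace-separated word that does not start with a digit, and multi-number input is deliberately left out.

```python
import re


def evaluate_single_number(value: str):
    """Return (is_numeric, rendered_value) for a field containing exactly one number."""
    words = value.split()
    numbers = [w for w in words if re.match(r"[0-9]", w)]
    descriptors = [w for w in words if not re.match(r"[0-9]", w)]
    if len(numbers) != 1:
        raise ValueError("this sketch only covers single-number input")
    if len(descriptors) <= 1:
        # Zero or one descriptor: dismiss it and keep just the digits.
        digits = re.match(r"[0-9]+", numbers[0]).group()
        return True, digits
    # Two or more descriptors: keep the field exactly as entered.
    return False, value


print(evaluate_single_number("12th ed."))         # (True, '12')
print(evaluate_single_number("Edition 12"))       # (True, '12')
print(evaluate_single_number("1st Vintage ed."))  # (False, '1st Vintage ed.')
```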

fbennett · September 2, 2011

I have adopted Dan's suggestion about sniffing for likely-looking descriptor strings on single-number input. I've thrown quite a few tests at the code, and I'm pretty sure that it (a) won't cause headaches for people with legacy data and (b) encourages clean input.

I'll bundle the changes with a formal processor release, with a note to developers. I think I have the bases covered, but it will be for the developers to decide whether to adopt the revision. For reference, a bundle of worked examples can be found here.