Requirement for string scan in CSL
In Bluebook citation, I have come up against two formatting cases that cannot be handled without looking inside the string content of a field. Both of these involve page numbers. As I dig through the style requirements I may come across other instances, but these two are already definite. Rule 3.3(a) of the 17th edition of the Bluebook provides as follows ...
(a) Pages. Give the page number or numbers before the date parenthetical, without any introductory abbreviation ("p." and "pp." are used only in internal cross-references [cross reference omitted]):
Use "at" if the page number may be confused with another part of the citation; use a comma to set off "at" from preceding numerals:
Arthur E. Sutherland, Constitutionalism in America 45 (1965).
[...]
Biographical Directory of the Governors of the United States 1978-1983, at 257 (Robert Sobel & John W. Raimo eds., 1983).
Thomas I. Emerson, Foreword to Catharine A. MacKinnon, Sexual Harassment of Working Women, at vii, ix (1979).
Satisfying this rule without the manual retouching of citations will require some means of scanning the content of the relevant fields. In this particular instance, the ability to identify whether the first or the last character in a field matches any of 0-9 would be sufficient. I don't know what the prospects are of getting this into CSL and the formatting engine, but it's the only way I can see of solving this problem.
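To make the request concrete, here is a minimal sketch of the kind of first/last character scan described above. It is plain JavaScript, not CSL, and the names `startsWithDigit`, `endsWithDigit` and `joinTitleAndPage` are hypothetical; nothing like this exists in CSL or in the formatting engine as far as I know. The decision rule (insert ", at" when the title ends in a digit or the page does not start with one) is my reading of the examples quoted from Rule 3.3(a), not an official statement of the rule.

```javascript
// Hypothetical helpers: test the first or last character of a field against 0-9.
function startsWithDigit(s) {
  return /^[0-9]/.test(s.trim());
}

function endsWithDigit(s) {
  return /[0-9]$/.test(s.trim());
}

// Join a title and a page reference.
// Assumption: ", at" is needed when the title ends in a numeral
// (the page could be misread as part of the title) or when the page
// does not begin with a numeral (e.g. roman-numbered front matter).
function joinTitleAndPage(title, page) {
  const needsAt = endsWithDigit(title) || !startsWithDigit(page);
  return title.trim() + (needsAt ? ", at " : " ") + page.trim();
}

console.log(joinTitleAndPage("Constitutionalism in America", "45"));
console.log(joinTitleAndPage(
  "Biographical Directory of the Governors of the United States 1978-1983",
  "257"
));
console.log(joinTitleAndPage("Sexual Harassment of Working Women", "vii, ix"));
```

Run against the three Bluebook examples, this leaves the Sutherland cite with a bare page number and adds ", at" to the other two.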
A lot of styles have a lot of stupid rules (as in, difficult to formalize in the language of computers), and so we have to exercise some judgment about the right balance of simplicity, consistency, and breadth.
Gaining general acceptance for a change of this scale would require agreement from the consortium that maintains the style (The Columbia Law Review Association, The Harvard Law Review Association, the University of Pennsylvania Law Review, and The Yale Law Journal) ... I would be doubtful about the prospect of success on that front.
If you are definite that this cannot be accommodated, I'll add it to the list of errata in the notes on the Zotero style for the present. Should I take it that that is where things stand?
(Just one minor correction: my name is Frank. ;)
I'm not making any categorical statements (certainly not about Zotero) but just observing that adding regular expression support or otherwise scanning content is adding a significant amount of complexity with unclear payoff.
I'm also noting that the difficulty is based on the fact that these rules are about the convenience of human authors, who can make contextual judgments in ways that are just hard to do here (and unnecessary with computers). But academia is, as a friend once said to me, more conservative than the Vatican ;-)
But ... it sounds like the logic here might be understood as: if there's a numeric variable after the page number, add the "at". In that case, it might be possible to do this currently using the "is-numeric" condition. See if you can get that to work and let us know.
I didn't know about the is-numeric condition. Growing pains. I'll have a go. No problem about trade-offs; I have quite a knack for writing awful code, and it's actually comforting when someone has the good sense to apply the brakes.
<choose>
  <if locator="page">
    <choose>
      <if is-numeric="locator">
        <text value=", XXat"/>
      </if>
    </choose>
  </if>
</choose>
<text variable="locator" prefix=" "/>
This is just a simple case for testing (the logic is backwards), but it always produces a bare number (" 23" or " xvii"). As a wild guess, the locator field seems to contain a couple of entities, and is-numeric seems to be checking against the label rather than the content. If that's right, and if I have not overlooked a means of discriminating the two, it's a bug.
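For comparison, this is roughly the behaviour I was expecting from the test above, sketched in standalone JavaScript. The functions `isNumericContent` and `renderLocator` are hypothetical and are not taken from csl.js; the regular expression is only my assumption of what "numeric" should mean for a locator (digits, optionally in ranges or lists), not the engine's actual definition.

```javascript
// Hypothetical stand-in for an is-numeric test that looks at the
// CONTENT of the variable rather than at its label.
function isNumericContent(value) {
  // Assumption: "23", "23-25" and "23, 27" all count as numeric.
  return /^\s*\d+(\s*[-\u2013,&]\s*\d+)*\s*$/.test(String(value));
}

// Mirrors the test fragment: emit the ", XXat" marker when the locator
// is numeric, then the locator itself with a single-space prefix.
function renderLocator(locator) {
  const marker = isNumericContent(locator) ? ", XXat" : "";
  return marker + " " + locator;
}

console.log(JSON.stringify(renderLocator("23")));
console.log(JSON.stringify(renderLocator("xvii")));
```

With a content-based check, "23" should pick up the marker and "xvii" should not, which is the distinction the test style fails to make.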
diff -r -u chrome.orig/content/zotero/xpcom/csl.js chrome/content/zotero/xpcom/csl.js
--- chrome.orig/content/zotero/xpcom/csl.js 2008-12-16 18:11:52.000000000 +0900
+++ chrome/content/zotero/xpcom/csl.js 2008-12-16 17:55:05.000000000 +0900
@@ -1882,7 +1882,9 @@
"issue":"issue",
"number-of-volumes":"numberOfVolumes",
"edition":"edition",
- "number":"number"
+ "number":"number",
+ "note":"extra",
+ "locator":"locator"
}
/*
* Gets a numeric object for a specific type. <number variable="edition" form="roman"/>
Your humble servant,
FB