Corrections to AGU style format
Hi,
I made some corrections to the AGU style. They initialize authors' first names and use "et al." in bibliographic references only when there are more than 10 authors rather than more than 2 (listing only the first author in that case).
I am waiting for SVN access so I can commit the corrections myself. In the meantime, here is a diff of the original and changed files:
*** agu-old.csl 2008-05-20 14:07:10.000000000 +0200
--- agu-new.csl 2008-05-20 14:31:08.000000000 +0200
***************
*** 25,31 ****
</macro>
<macro name="author">
<names variable="author">
! <name name-as-sort-order="first" and="text" sort-separator=", " delimiter=", " form="long" delimiter-precedes-last="always"/>
<label form="short" prefix=" (" suffix=".)" text-case="capitalize-first"/>
<substitute>
<names variable="editor"/>
--- 25,33 ----
</macro>
<macro name="author">
<names variable="author">
! <name name-as-sort-order="first" and="text" sort-separator=", "
! delimiter=", " form="long" delimiter-precedes-last="always"
! initialize-with=". "/>
<label form="short" prefix=" (" suffix=".)" text-case="capitalize-first"/>
<substitute>
<names variable="editor"/>
***************
*** 155,162 ****
</citation>
<bibliography>
<option name="hanging-indent" value="true"/>
! <option name="et-al-min" value="3"/>
! <option name="et-al-use-first" value="2"/>
<sort>
<key macro="author"/>
<key variable="title"/>
--- 157,164 ----
</citation>
<bibliography>
<option name="hanging-indent" value="true"/>
! <option name="et-al-min" value="11"/>
! <option name="et-al-use-first" value="1"/>
<sort>
<key macro="author"/>
<key variable="title"/>
***************
*** 206,209 ****
<text prefix=" " macro="access"/>
</layout>
</bibliography>
! </style>
\ No newline at end of file
--- 208,211 ----
<text prefix=" " macro="access"/>
</layout>
</bibliography>
! </style>
Changelog:
Modifying agu.csl to better conform to my understanding of
AGU style based on their style guide and experience with AGU
references. Changes consist of:
1) Make it so that all references use initials for authors' first
names. I am not sure if this is officially in the style document, but
I have never seen an AGU publication that didn't use initials for
author names.
2) Make it so that et al. is only used in bibliographic references
when you have more than 10 authors and that only the first author name
is used in this case, in accordance with
http://www.agu.org/pubs/AuthorRefSheet.pdf.
3) Set the disambiguate-add-givenname option to false for citations.
This MAY BE INCORRECT (not clear from above mentioned style guide - no
mention of what to do in ambiguous cases), but setting it to true
caused it to add the disambiguation in OpenOffice citations in weird
places. I think this might be a BUG in Zotero. I am guessing that
upon citation Zotero was using disambiguate with respect to all
references in my database, not just those in the document, AND that
it considered names like John L. Doe and J. L. Doe different (or
something like that), causing it to add initials when it really wasn't
necessary. Turning this off will produce the correct result in the
vast majority of cases, but will fail in just those cases where
disambiguate is supposed to be used.
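To make the intent of changes 1) and 2) concrete, here is a rough Python sketch of the author string they are meant to produce. It is only an illustration, not how Zotero's CSL processor works: the function names, the (family, given) tuple layout, and the comma placed before "et al." are my own assumptions.

```python
def initials(given):
    """Reduce given names to initials, e.g. "John L." -> "J. L."."""
    parts = given.replace(".", " ").split()
    return " ".join(p[0].upper() + "." for p in parts)


def format_authors(authors, et_al_min=11, et_al_use_first=1):
    """Format a list of (family, given) tuples, both hypothetical fields.

    Mirrors the options in the diff: the first name is inverted
    (name-as-sort-order="first"), given names are initialized, and
    "et al." is used once there are et_al_min or more authors.
    """
    names = []
    for i, (family, given) in enumerate(authors):
        ini = initials(given)
        if not ini:
            names.append(family)
        elif i == 0:
            names.append(f"{family}, {ini}")  # "Doe, J. L."
        else:
            names.append(f"{ini} {family}")   # "J. Smith"
    if len(names) >= et_al_min:
        # Assumption: a comma precedes "et al." in the output.
        return ", ".join(names[:et_al_use_first]) + ", et al."
    if len(names) == 1:
        return names[0]
    # delimiter-precedes-last="always" puts the delimiter before "and".
    return ", ".join(names[:-1]) + ", and " + names[-1]
```

Because "John L." and "J. L." reduce to the same initials here, this also shows the kind of normalization that would avoid treating John L. Doe and J. L. Doe as different people.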
1) Changed use of "&" in citations to "and".
2) Citation prefix and suffix are now [ and ].
3) Added italic to author names in citations and to volume in
bibliographic references.
I will continue modifying as needed without specific comments here unless I am told to do otherwise.
1) Setting disambiguate-add-names and disambiguate-add-givenname to false appears to do nothing, i.e., the citations continue to have initials and full names associated with references. This is in addition to the other issues: names are not equated correctly, and the decision to disambiguate is based on the entire Zotero database rather than on the references in the particular citation or document.
2) Sorting of references in bibliography doesn't do what one wants. This is in part because I can't figure out the correct way to implement the sorting used by AGU in the style sheet (if there is one - I think this may be a current limitation of CSL). AGU uses a particular sorting of references. From AuthorRefSheet.pdf, sorting should be:
1. First author alone, list chronologically, earliest work first.
2. One coauthor, list alphabetically by coauthor and then chronologically.
3. Two or more coauthors (i.e., cited as "et al." in text), list chronologically.
The name-as-sort-order="first" or "all" does not appear to allow enough flexibility for this type of sorting (where sorting depends on number of authors). Furthermore, small changes in the way author names are entered (such as lacking a period on an initial) cause it to treat authors as different and therefore not be ordered as desired.
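Outside of CSL, the three AGU rules are easy to state as a sort key. Here is a minimal Python sketch, assuming each reference is a dict with a hypothetical "authors" list of "Family, Initials" strings and a "year"; the names, data, and helper functions are invented for illustration. The normalize step is my guess at one way to keep a missing period from splitting one author into two.

```python
def normalize(name):
    """Fold case and drop periods/commas so "Doe, J" equals "Doe, J."."""
    cleaned = name.replace(".", " ").replace(",", " ")
    return " ".join(cleaned.lower().split())


def agu_sort_key(ref):
    """Build a tuple key: first author, group, coauthor, year."""
    authors = [normalize(a) for a in ref["authors"]]
    if len(authors) == 1:
        group, coauthor = 1, ""          # rule 1: first author alone
    elif len(authors) == 2:
        group, coauthor = 2, authors[1]  # rule 2: sort by the coauthor
    else:
        group, coauthor = 3, ""          # rule 3: "et al." works, by year only
    return (authors[0], group, coauthor, ref["year"])


# Made-up references, deliberately out of order.
refs = [
    {"authors": ["Doe, J.", "Smith, A.", "Lee, K."], "year": 2005},
    {"authors": ["Doe, J."], "year": 2006},
    {"authors": ["Doe, J", "Adams, B."], "year": 2004},  # missing period
    {"authors": ["Doe, J."], "year": 2003},
    {"authors": ["Doe, J.", "Smith, A."], "year": 2001},
    {"authors": ["Doe, J.", "Adams, B.", "Lee, K.", "Wu, L."], "year": 1999},
]

ordered = sorted(refs, key=agu_sort_key)
```

The point is that the key depends on the number of authors, which is exactly the piece that name-as-sort-order cannot express.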
You're correct: CSL does not support this, nor I imagine does any other similar styling format (except maybe BibTeX's BST, where you'd have to program the sorting rules).
The question is: can you figure out a solution to this that is simple to implement (in CSL, in particular)? I'm drawing a blank.
The other stuff is about Zotero, so I'll leave that aside.
http://www.ur097.ird.fr/team/dkaplan/papers/kaplan.et.al.2005.JGR.pdf
BibTeX supports this format. In fact, AGU has probably had a BibTeX format for decades and still receives many submissions in LaTeX. I think there are a number of other journals that use a similar sorting, not to mention that AGU publishes probably a dozen journals, so this should be supported by CSL somehow.
I think that the one thing that is missing from CSL to sort this appropriately is an "if" check on the number of authors. If this existed, you could create a special macro "author-for-sort" that spits out the normal authors if the number of authors is 1 or 2, but spits out "First Author, ZZZZZ" when there are more authors (though this probably doesn't internationalize well). Then you would use this macro as the sort key (I am a bit vague on how exactly <sort> works, but I think this is correct).
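As a rough illustration of the "author-for-sort" idea (in Python rather than CSL, since CSL has no such test at this point), the macro would behave something like the sketch below. The function name, the dict layout, and the sample data are all hypothetical.

```python
def author_for_sort(authors, sentinel="ZZZZZ"):
    """Return the string a hypothetical "author-for-sort" macro would emit.

    One or two authors: the normal author list.
    Three or more: the first author followed by a filler that is meant
    to sort after every real second author.
    """
    if len(authors) <= 2:
        return ", ".join(authors)
    return f"{authors[0]}, {sentinel}"


def sort_key(ref):
    # Sort on the macro output first, then chronologically.
    return (author_for_sort(ref["authors"]).lower(), ref["year"])


# Made-up references, deliberately out of order.
refs = [
    {"authors": ["Doe, J.", "Smith, A.", "Lee, K."], "year": 2001},
    {"authors": ["Doe, J."], "year": 2005},
    {"authors": ["Doe, J.", "Smith, A."], "year": 2003},
    {"authors": ["Doe, J.", "Adams, B."], "year": 2004},
    {"authors": ["Doe, J.", "Adams, B.", "Lee, K."], "year": 1999},
]

ordered = sorted(refs, key=sort_key)
```

This only works as long as no real coauthor sorts after the filler string, which is the internationalization worry: the right filler depends on the collation in use.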
I have also noticed some more oddities. The name-as-sort-order in name-related macros does things I didn't expect it to do. I expected from reading http://dev.zotero.org/csl_syntax_summary that it was related to the sorting of the references in the bibliography (i.e., whether to use all authors or just the first in determining the sorting). But according to the schema at http://xbiblio.svn.sourceforge.net/viewvc/xbiblio/csl/schema/trunk/csl.rnc?view=markup and according to what changing this actually does, it seems this controls whether or not to have the first name before or after the last name. This isn't clear from the text in http://dev.zotero.org/csl_syntax_summary, which perhaps should be modified.
I have also noted that changing this value in the editor macro does nothing to how editor names for articles in books are formatted in the bibliography (first names always appear before last names, no matter where the editor falls in the list of editors). Also, can someone explain how it is decided whether or not to use "et al." for a list of editors for an article in a book, as opposed to the list of authors? Does this just follow the same rules as those established for the author list? I think the number of editors listed is often more limited than the number of authors.
<sort>
<text macro="author"/>
<text macro="date"/>
</sort>
So the question is, how to add a second key that is "number of authors"? We could just add a variable:
<sort>
<text macro="author"/>
<text variable="author-count"/>
<text macro="date"/>
</sort>
Does that work (conceptually)?
As for "name-as-sort-order", this is designed to account for the fact that much of the world does not use Western name rules (where "first" name is often the family name, and hence one sorts on the first name).
I think conceptually the way to work this would be able to have the key depend on context and to have some filler that would automatically be after all other references in a group. For example, this isn't very well thought out, but perhaps something like the following pseudo-code might eventually work:
<sort>
  <text macro="first-author"/>
  <choose>
    <if author-count="1"/>
    <else-if author-count="2">
      <text macro="second-author"/>
    </else-if>
    <else>
      <text macro="text-that-will-trump-all-second-authors"/>
    </else>
  </choose>
  <text macro="year-date"/>
</sort>
Creating this sort of thing requires (1) a test on author count, (2) a way to get just one particular author from a list and (3) something that will force one reference to occur after another. Adding a general way to get at individual authors and the author-count seems like a good thing as this would allow you to create macros that decide which names to put in the citation/reference list, as opposed to the current way which basically uses a set of options that try to represent all possible choices. The current approach seems to be running into limitations and replacing the options with perhaps some standard macros seems like a natural evolution.
As for the comment on "name-as-sort-order", the current implementation and schema certainly don't give the impression that the function has to do with non-Western names. The value "all" in particular doesn't make much sense to me in this context. Furthermore, changing its value changes where first names of authors other than the first appear in the reference (i.e. before or after the last name). This could be useful for sorting in a non-Western context, but more likely people will use it because some reference formats use Doe, J., and J. Smith, whereas others use Doe, J., and Smith, J. Also, won't whether the first or last name is used as the principal sorting text depend on the author, not the format - i.e., you could have a mix of western and non-western authors, some of which use first, some use last. I know of no software that correctly deals with this case.
On the names, it probably needs to be better documented.
On a mix of Western and non-Western authors: that shouldn't be a problem, and the current design is in fact based on this assumption. There just needs to be a way to denote the way a name should be handled.
On no software dealing with this case correctly: correct. But the design of CSL should (I hope) make it possible for implementations to do the right thing. Certainly Zotero will at some point.
It would have been a mistake to go in another possible direction: explicit first/last/middle variables, and forcing styles to rely on them.
<sort>
  <key macro="first-author"/>
  <choose>
    <if author-count="1">
      <key variable="author-count"/>
    </if>
    <else-if author-count="2">
      <key variable="author-count"/>
      <key macro="second-author"/>
    </else-if>
    <else>
      <key text="3"/>
    </else>
  </choose>
  <key macro="year-date"/>
</sort>
Something like this should work and actually seems feasible to program. The only additions I think are the ability to get at the number of the author and individual author names based on the order of the authors. If choose isn't directly allowed in the <sort>, you could probably move all the choosing into a macro.
... ORDER BY first_author, author_count, author, year, title
where: 'author_count' contains a number of 1 (single author), 2 (one coauthor), or 3 (more than one coauthor)
If you need to sort references with 3 or more authors by first author and then by year (i.e. ignoring any coauthors), then you're right that my proposed pattern isn't entirely correct for that case. It seems I misunderstood your original post, sorry.
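For what it's worth, the ORDER BY pattern can be adjusted so that coauthors are only consulted in the two-author group. Below is a small sketch using Python's built-in sqlite3 module. The table layout and sample rows are made up; only the column names first_author, author_count, author, year and title come from the pattern above.

```python
import sqlite3


def agu_order(rows):
    """Return titles in AGU order.

    rows: (first_author, author_count, author, year, title) tuples,
    where author_count is 1, 2, or 3 (3 meaning "more than one coauthor").
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE refs (first_author TEXT, author_count INTEGER, "
        "author TEXT, year INTEGER, title TEXT)"
    )
    con.executemany("INSERT INTO refs VALUES (?, ?, ?, ?, ?)", rows)
    # Only the two-author group is ordered by the full author string;
    # the other groups fall straight through to year.
    query = """
        SELECT title FROM refs
        ORDER BY first_author,
                 author_count,
                 CASE WHEN author_count = 2 THEN author ELSE '' END,
                 year,
                 title
    """
    titles = [row[0] for row in con.execute(query)]
    con.close()
    return titles


# Made-up rows, deliberately out of order.
rows = [
    ("Doe", 3, "Doe, Smith, Lee", 2001, "C"),
    ("Doe", 3, "Doe, Adams, Lee", 2004, "D"),
    ("Doe", 1, "Doe", 2005, "A"),
    ("Doe", 2, "Doe, Smith", 2000, "B2"),
    ("Doe", 2, "Doe, Adams", 2003, "B1"),
]
```

With the plain "author" column in the ORDER BY, the 2004 paper would sort ahead of the 2001 paper because of its coauthors, which is the case the pattern gets wrong.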
As for SQL, I wouldn't worry about it. I'd expect even if one was using a relational database, one would query for a list of results, and then sort internally (to whatever process you're using to format the stuff).
I think a more robust way to move forward is more or less what we worked out. By adding ways to get at the number of authors and individual authors you can replace most of the options with macros and ultimately accommodate a lot more formats I think.
I would push for that, but I could try the option approach if necessary. But I have already started thinking about the other journal formats I use regularly, and I think I can see more options down the road.
So if you can find the time, I'd appreciate if you could consider posting a formal proposal with both options, preferably on the xbib/csl devel list so that other implementers might be involved in the discussion.
http://sourceforge.net/mailarchive/forum.php?forum_name=xbiblio-devel&max_rows=25&style=nested&viewmonth=200805&viewday=26
Test cases:
http://bitbucket.org/fbennett/citeproc-js/src/tip/tests/std/humans/sort_AguStyle.txt
http://bitbucket.org/fbennett/citeproc-js/src/tip/tests/std/humans/sort_AguStyleReverseGroups.txt
(Note that the URLs of these test cases will change fairly soon; the proper location for these files is the CSL repository on SourceForge.)
Frank Bennett
This looks great. Has this been integrated into the released CSL processor that comes with Zotero 2.0 yet? If so, has it also been integrated into the agu.csl style?
Thanks,
David
See also https://github.com/citation-style-language/styles/commit/1d4482d1242b4a07a4baa4bcf5a7d6bd3eaa6374