Corrections to AGU style format

Hi,

I made some corrections to the AGU style. They deal with initializing authors' first names and using et al. for bibliographic references with more than 10 authors instead of more than 2 (and listing only the first author in this case).

I am waiting for SVN access to make the corrections myself. In the meantime, here is a diff of the original and changed files:

*** agu-old.csl 2008-05-20 14:07:10.000000000 +0200
--- agu-new.csl 2008-05-20 14:31:08.000000000 +0200
***************
*** 25,31 ****
</macro>
<macro name="author">
<names variable="author">
! <name name-as-sort-order="first" and="text" sort-separator=", " delimiter=", " form="long" delimiter-precedes-last="always"/>
<label form="short" prefix=" (" suffix=".)" text-case="capitalize-first"/>
<substitute>
<names variable="editor"/>
--- 25,33 ----
</macro>
<macro name="author">
<names variable="author">
! <name name-as-sort-order="first" and="text" sort-separator=", "
! delimiter=", " form="long" delimiter-precedes-last="always"
! initialize-with=". "/>
<label form="short" prefix=" (" suffix=".)" text-case="capitalize-first"/>
<substitute>
<names variable="editor"/>
***************
*** 155,162 ****
</citation>
<bibliography>
<option name="hanging-indent" value="true"/>
! <option name="et-al-min" value="3"/>
! <option name="et-al-use-first" value="2"/>
<sort>
<key macro="author"/>
<key variable="title"/>
--- 157,164 ----
</citation>
<bibliography>
<option name="hanging-indent" value="true"/>
! <option name="et-al-min" value="11"/>
! <option name="et-al-use-first" value="1"/>
<sort>
<key macro="author"/>
<key variable="title"/>
***************
*** 206,209 ****
<text prefix=" " macro="access"/>
</layout>
</bibliography>
! </style>
\ No newline at end of file
--- 208,211 ----
<text prefix=" " macro="access"/>
</layout>
</bibliography>
! </style>
  • I have committed changes to the AGU style to the SVN repository. My changelog entry is copied below. Please comment. In particular, I am not sure what to think about the third change mentioned regarding disambiguation of authors in citations - this might indicate a Zotero bug. Furthermore, this brings up a good point, namely that it would be very useful in Zotero to have the functionality to look for duplicate and incorrect names (e.g. a function that would find J. L. Doe and change it into John L. Doe if you were certain these were actually the same person). I think the old Papyrus software (and possibly EndNote - I don't know) had this functionality and it was very useful.

    Changelog:


    Modifying agu.csl to better conform to my understanding of
    AGU style based on their style guide and experience with AGU
    references. Changes consist of:

    1) Make it so that all references use initials for authors' first
    names. I am not sure if this is officially in the style document, but
    I have never seen an AGU publication that didn't use initials for
    author names.

    2) Make it so that et al. is only used in bibliographic references
    when you have more than 10 authors and that only the first author name
    is used in this case, in accordance with
    http://www.agu.org/pubs/AuthorRefSheet.pdf.

    3) Set the disambiguate-add-givenname option to false for citations.
    This MAY BE INCORRECT (not clear from the above-mentioned style guide -
    no mention of what to do in ambiguous cases), but setting it to true
    caused it to add the disambiguation in OpenOffice citations in weird
    places. I think this might be a BUG in Zotero. I am guessing that
    upon citation Zotero was using disambiguate with respect to all
    references in my database, not just those in the document, AND that
    it considered names like John L. Doe and J. L. Doe different (or
    something like that), causing it to add initials when it really wasn't
    necessary. Turning this off will produce the correct result in the
    vast majority of cases, but will fail in just those cases where
    disambiguate is supposed to be used.
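
Changes 1 and 2 in the changelog above can be illustrated outside of CSL. The following is a minimal Python sketch, assuming authors are available as (family, given) pairs; the function names are hypothetical and are not part of Zotero or any CSL processor. The default thresholds mirror et-al-min="11" and et-al-use-first="1" from the diff.

```python
def initialize(given, suffix=". "):
    """Reduce given names to initials, e.g. 'John L.' -> 'J. L.'."""
    parts = given.replace(".", " ").split()
    return "".join(part[0] + suffix for part in parts).strip()


def names_for_bibliography(authors, et_al_min=11, et_al_use_first=1):
    """Return (formatted names, whether 'et al.' should follow).

    authors: list of (family, given) tuples in publication order.
    """
    # Truncate only once the author count reaches et_al_min.
    shown = authors if len(authors) < et_al_min else authors[:et_al_use_first]
    formatted = [f"{family}, {initialize(given)}" for family, given in shown]
    return formatted, len(shown) < len(authors)
```

With these defaults a ten-author reference lists every author, while an eleven-author reference lists only the first author followed by et al.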
  • More changes:

    1) Changed use of "&" in citations to "and".
    2) Citation prefix and suffix are now [ and ].
    3) Added italics to author names in citations and to volume in
    bibliographic references.

    I will continue modifying as needed without specific comments here unless I am told to do otherwise.
  • I have noticed some more problems with citation and bibliography generation that appear to be general (i.e., not strictly related to this format):

    1) Setting disambiguate-add-names and disambiguate-add-givenname to false appears to do nothing, i.e., the citations continue to have initials and full names associated with references. This is in addition to the other issues related to not equating names correctly and deciding when to disambiguate based on the entire Zotero database rather than the references in the particular citation or document.

    2) Sorting of references in bibliography doesn't do what one wants. This is in part because I can't figure out the correct way to implement the sorting used by AGU in the style sheet (if there is one - I think this may be a current limitation of CSL). AGU uses a particular sorting of references. From AuthorRefSheet.pdf, sorting should be:

    1. First author alone, list chronologically, earliest work first.
    2. One coauthor, list alphabetically by coauthor and then chronologically.
    3. Two or more coauthors (i.e., cited as "et al." in text), list chronologically.

    The name-as-sort-order="first" or "all" does not appear to allow enough flexibility for this type of sorting (where sorting depends on number of authors). Furthermore, small changes in the way author names are entered (such as lacking a period on an initial) cause it to treat authors as different and therefore not be ordered as desired.
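
The last point, that a missing period on an initial makes two entries count as different authors, is the kind of problem a normalized comparison key would avoid. A minimal Python sketch, assuming names are stored as separate family and given strings; `name_key` is a hypothetical helper, not a Zotero or CSL function.

```python
def name_key(family, given):
    """Comparison key that ignores case, periods, and spacing in given names.

    Given names are reduced to initials, so 'J. L.', 'J L', and 'John L.'
    all map to the same key.
    """
    initials = "".join(part[0] for part in given.replace(".", " ").split())
    return (family.strip().casefold(), initials.casefold())
```

Reducing given names to initials also equates "John L. Doe" with "Jane L. Doe", so a key like this is only safe for grouping and sorting, not for silently merging people.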
  • What kind of a sadist came up with these sorting rules???

    You're correct: CSL does not support this, nor I imagine does any other similar styling format (except maybe BibTeX's BST, where you'd have to program the sorting rules).

    The question is: can you figure out a solution to this that is simple to implement (in CSL, in particular)? I'm drawing a blank.

    The other stuff is about Zotero, so I'll leave that aside.
  • It isn't actually a sadist, just looks that way. Sorting this way makes some sense as it makes it easier for the reader to associate an "et al." citation in the text with the appropriate reference as references with more than two authors that have the same first author are grouped together in the bibliography (it may not be clear from my comment above that references are globally grouped in terms of the author last names, but the rules apply only when the first (and perhaps second) authors are the same). The following link is to one of my papers published in an AGU journal for an example of the format:

    http://www.ur097.ird.fr/team/dkaplan/papers/kaplan.et.al.2005.JGR.pdf

    BibTeX supports this format. In fact, AGU has probably had a BibTeX format for decades and still receives many submissions in LaTeX. I think there are a number of other journals that use a similar sorting, not to mention that AGU publishes probably a dozen journals, so this should be supported by CSL somehow.

    I think that the one thing that is missing from CSL to sort this appropriately is an "if" check on the number of authors. If this exists, then you create a special macro "author-for-sort" that will spit out the normal authors if the number of authors is 1 or 2, but will spit out "First Author, ZZZZZ" for cases with more authors (but this probably doesn't internationalize well). Then you use this macro as the sort key (I am a bit vague on how exactly <sort> works, but I think this is correct).
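
The "author-for-sort" idea can be tried out in Python before worrying about how CSL would express it. This is only a sketch: the dictionary layout and function names are made up for illustration, and the filler is the literal "ZZZZZ" suggested above (lowercased), which would break for a coauthor whose family name sorts after it.

```python
SENTINEL = "zzzzz"  # assumed filler; must sort after every real family name


def author_for_sort(families):
    """families: author family names in publication order."""
    if len(families) <= 2:
        # One or two authors: sort on the real names.
        return tuple(name.casefold() for name in families)
    # Three or more authors: first author plus the filler.
    return (families[0].casefold(), SENTINEL)


def agu_sort(refs):
    """refs: list of dicts with 'authors' (family names) and 'year'."""
    return sorted(refs, key=lambda r: (author_for_sort(r["authors"]), r["year"]))
```

Because a shorter tuple sorts before a longer one with the same prefix, single-author works come first, then two-author works by coauthor, then everything with three or more authors in chronological order.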
  • I have committed some more changes to the agu.csl, mainly relating to formatting and attempting to make sorting as good as currently possible.

    I have also noticed some more oddities. The name-as-sort-order in name-related macros does things I didn't expect it to do. I expected from reading http://dev.zotero.org/csl_syntax_summary that it was related to the sorting of the references in the bibliography (i.e., whether to use all authors or just the first in determining the sorting). But according to the schema at http://xbiblio.svn.sourceforge.net/viewvc/xbiblio/csl/schema/trunk/csl.rnc?view=markup and according to what changing this actually does, it seems this controls whether or not to have the first name before or after the last name. This isn't clear from the text in http://dev.zotero.org/csl_syntax_summary, which perhaps should be modified.

    I have also noted that changing this value in the editor macro does nothing to how editor names for articles in books are formatted in the bibliography (first names always appear before last, no matter what position the editor has in the list of editors). Also, can someone explain how it is decided whether or not to use "et al." for a list of editors for an article in a book, as opposed to the list of authors? Does this just follow the same rules as those established for the author list? I think that often the number of editors listed is more limited than the number of authors.
  • edited May 23, 2008
    I think that the one thing that is missing from CSL to sort this appropriately is an "if" check on the number of authors.
    Consider that one sorts like:

    <sort>
    <text macro="author"/>
    <text macro="date"/>
    </sort>

    So the question is, how to add a second key that is "number of authors"? We could just add a variable:

    <sort>
    <text macro="author"/>
    <text variable="author-count"/>
    <text macro="date"/>
    </sort>


    Does that work (conceptually)?

    As for "name-as-sort-order", this is designed to account for the fact that much of the world does not use Western name rules (where "first" name is often the family name, and hence one sorts on the first name).
  • Hmmm, as for the first comment regarding sort for AGU-like styles, I think that won't be enough. AGU-style sorting treats 3 or more authors as the same. Furthermore, any references having 3 or more authors will be sorted immediately after any references having two or fewer authors that have the same first author, regardless of what the second author's name is on the paper with 3 or more.

    I think conceptually the way to work this would be to have the key depend on context and to have some filler that would automatically sort after all other references in a group. For example, this isn't very well thought out, but perhaps something like the following pseudo-code might eventually work:

    <sort>
    <text macro="first-author"/>
    <choose>
    <if author-count="1"/>
    <else-if author-count="2">
    <text macro="second-author"/>
    </else-if>
    <else>
    <text macro="text-that-will-trump-all-second-authors"/>
    </else>
    </choose>
    <text macro="year-date"/>
    </sort>

    Creating this sort of thing requires (1) a test on author count, (2) a way to get just one particular author from a list and (3) something that will force one reference to occur after another. Adding a general way to get at individual authors and the author-count seems like a good thing as this would allow you to create macros that decide which names to put in the citation/reference list, as opposed to the current way which basically uses a set of options that try to represent all possible choices. The current approach seems to be running into limitations and replacing the options with perhaps some standard macros seems like a natural evolution.

    As for the comment on "name-as-sort-order", the current implementation and schema certainly don't give the impression that the function has to do with non-Western names. The value "all" in particular doesn't make much sense to me in this context. Furthermore, changing its value changes where first names of authors other than the first appear in the reference (i.e. before or after the last name). This could be useful for sorting in a non-Western context, but more likely people will use it because some reference formats use Doe, J., and J. Smith, whereas others use Doe, J., and Smith, J. Also, won't whether the first or last name is used as the principal sorting text depend on the author, not the format - i.e., you could have a mix of western and non-western authors, some of which use first, some use last. I know of no software that correctly deals with this case.
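
The observed behaviour of name-as-sort-order described above (it controls which names are inverted to "Family, Given" in the output) can be mimicked in a few lines of Python. The function and its arguments are hypothetical and only model the two values discussed here; this is not how any CSL processor is implemented.

```python
def format_names(authors, name_as_sort_order=None):
    """Join names, inverting them to 'Family, Given' where requested.

    authors: list of (family, given) tuples.
    name_as_sort_order: None, "first" (invert only the first name),
    or "all" (invert every name).
    """
    out = []
    for i, (family, given) in enumerate(authors):
        invert = name_as_sort_order == "all" or (
            name_as_sort_order == "first" and i == 0
        )
        out.append(f"{family}, {given}" if invert else f"{given} {family}")
    if len(out) == 1:
        return out[0]
    return ", ".join(out[:-1]) + ", and " + out[-1]
```

For the two authors J. Doe and J. Smith, "first" gives "Doe, J., and J. Smith" and "all" gives "Doe, J., and Smith, J.", which are the two reference formats contrasted above.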
  • I'll need to think more on the sorting issue.

    On the names, it probably needs to be better documented. Re:
    Also, won't whether the first or last name is used as the principal sorting text depend on the author, not the format - i.e., you could have a mix of western and non-western authors, some of which use first, some use last.
    That shouldn't be a problem, and the current design is in fact based on this assumption. There just needs to be a way to denote the way a name should be handled.
    I know of no software that correctly deals with this case.
    Correct. But the design of CSL should (I hope) make it possible for implementations to do the right thing. Certainly Zotero will at some point.

    It would have been a mistake to go in another possible direction: explicit first/last/middle variables, and forcing styles to rely on them.
  • I thought a bit more about how to denote this sorting. Having the author-count as a key in the sort is the correct way to go, except that author-count must max out at 3. I think a more appropriate pseudo-code is:

    <sort>
    <key macro="first-author"/>
    <choose>
    <if author-count="1">
    <key variable="author-count"/>
    </if>
    <else-if author-count="2">
    <key variable="author-count"/>
    <key macro="second-author"/>
    </else-if>
    <else>
    <key text="3"/>
    </else>
    </choose>
    <key macro="year-date"/>
    </sort>

    Something like this should work and actually seems feasible to program. The only additions I think are the ability to get at the number of the author and individual author names based on the order of the authors. If choose isn't directly allowed in the <sort>, you could probably move all the choosing into a macro.
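
To check that a capped author-count really produces the AGU ordering, the same idea can be written as a Python sort key. This is a sketch under stated assumptions (references as dictionaries holding a list of family names and a year; `agu_key` is a made-up name), not a description of how a CSL processor would evaluate <sort>.

```python
def agu_key(ref):
    """Sort key: first author, capped author count, second author (pairs only), year."""
    authors = [name.casefold() for name in ref["authors"]]
    count = min(len(authors), 3)  # 1, 2, or "3 or more"
    # The second author only matters when there are exactly two authors.
    second = authors[1] if count == 2 else ""
    return (authors[0], count, second, ref["year"])
```

Passing this to `sorted(refs, key=agu_key)` groups references by first author, then places single-author works, two-author works (by coauthor), and works with three or more authors (by year) in that order.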
  • Having the author-count as a key in the sort is the correct way to go, except that author-count must max out at 3
    Right, in our app we use:

    ... ORDER BY first_author, author_count, author, year, title
    where: 'author_count' contains a number of 1 (single author), 2 (one coauthor), or 3 (more than one coauthor)
  • Which "app" are you referring to? The backend SQL search in Zotero?
  • Also, I don't think the query you mention will get the ordering quite right as the second author is only important for ordering if there are only two authors. If there are 3 or more, year is used and authors 2-infinity are ignored for sorting.
  • I was referring to a third-party app (refbase), sorry for the confusion. Note that, in my above example, 'author' was meant to contain a list of all authors.
    If you need to sort 3 or more authors by first author, then by year (i.e. ignoring any coauthors), then it's correct that my proposed pattern isn't entirely correct for 3 or more authors. Seems I misunderstood your original post, sorry.
  • Don't apologize, it is good to work these things through. I think the approach of thinking about how this would be coded as SQL is a good one, since CSL is probably eventually translated into something like an SQL query. How would this type of ordering be achieved with an SQL query? How would a general context-dependent "if" be coded into an SQL ordering? I think these are possible, but somewhat hard.
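
On the SQL question: a context-dependent "if" in an ordering can be written as a CASE expression inside ORDER BY. The sketch below uses Python's built-in sqlite3 module with an invented `refs` table (columns first_author, second_author, author_count, year); it is not the refbase or Zotero schema.

```python
import sqlite3


def agu_order(rows):
    """rows: (first_author, second_author, author_count, year) tuples."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE refs (first_author TEXT, second_author TEXT, "
        "author_count INTEGER, year INTEGER)"
    )
    con.executemany("INSERT INTO refs VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT first_author, second_author, author_count, year
        FROM refs
        ORDER BY first_author,
                 -- cap the author count at 3 ("3 or more")
                 CASE WHEN author_count >= 3 THEN 3 ELSE author_count END,
                 -- the second author only matters for two-author works
                 CASE WHEN author_count = 2 THEN second_author ELSE '' END,
                 year
    """
    result = con.execute(query).fetchall()
    con.close()
    return result
```

The two CASE expressions play the role of the capped author-count and the conditional second-author key, so works with three or more authors fall back to chronological order within a first author.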
  • @dmk: my impulse is that this is getting into too-complicated territory. Perhaps an analogous issue is et al handling, which we configure with simple options? Could you imagine one or more options that would work?

    As for SQL, I wouldn't worry about it. I'd expect even if one was using a relational database, one would query for a list of results, and then sort internally (to whatever process you're using to format the stuff).
  • This probably could be done with options, but I honestly don't think it is a good approach for moving forward. I am fairly certain there will be other orderings for other formats that will require still more options.... There would also need to be more options for specifying which part of the name to put first in the formatting of the reference for first versus non-first authors and options for which part to use for sorting. EndNote and BibTeX formatting are fairly complicated and I think this complication is justified because there are a lot of different formats out there.

    I think a more robust way to move forward is more or less what we worked out. By adding ways to get at the number of authors and individual authors you can replace most of the options with macros and ultimately accommodate a lot more formats I think.

    I would push for that, but I could try the option approach if necessary. But I have already started thinking about the other journal formats I use regularly, and I think I can see more options down the road.
  • I'm not saying we can't do what you suggest. I'm saying we need to consider different options, and weigh the pros and cons of each.

    So if you can find the time, I'd appreciate if you could consider posting a formal proposal with both options, preferably on the xbib/csl devel list so that other implementers might be involved in the discussion.
  • I have posted a comment to the xbib/csl devel list:

    http://sourceforge.net/mailarchive/forum.php?forum_name=xbiblio-devel&max_rows=25&style=nested&viewmonth=200805&viewday=26
  • How are things going with this?
  • edited December 11, 2009
    I'm happy to report that the AGU sort algorithm has been implemented in the new CSL processor, using additional functionality offered in CSL v.0.9. The processor is not yet ready for deployment, but it's getting there. As a bit of reassurance, here is a link to the relevant test cases, which (run successfully and) prove that the new processor will be able to handle this sort:

    Test cases:
    http://bitbucket.org/fbennett/citeproc-js/src/tip/tests/std/humans/sort_AguStyle.txt
    http://bitbucket.org/fbennett/citeproc-js/src/tip/tests/std/humans/sort_AguStyleReverseGroups.txt

    (Note that the URLs of these test cases will change fairly soon; the proper location for these files is the CSL repository on SourceForge.)

    Frank Bennett
  • Hi,

    This looks great. Has this been integrated into the released CSL processor that comes with Zotero 2.0 yet? If so, has this been integrated into the agu.csl style?

    Thanks,
    David
  • edited December 3, 2009
    Getting synchronization nailed down solid has been the priority for Zotero 2.0, and it is still on the "old" CSL processor. The new kit will be included in version 2.1, which will probably start to surface sometime early next year. So a bit more waiting time, but meanwhile I can report that work on the new processor itself is pretty well finished. The true stress test will be in the Zotero deployment, but several other projects are working with it, and don't seem to be encountering any huge problems. As Steve McQueen says in The Magnificent Seven: "So far, so good."
  • edited October 10, 2011
    I recently noticed that AGU style still didn't sort as required by AGU, even though this already has been possible for a while (since CSL 1.0 and Zotero 2.1). The newest version should sort correctly.

    See also https://github.com/citation-style-language/styles/commit/1d4482d1242b4a07a4baa4bcf5a7d6bd3eaa6374