Complex reference ordering rules
Hi,
I am happy to be directed to a discussion my searching did not find, but I wanted to ask about ordering of references. I have searched around, but I have not found a solution.
I have a reference style I have been asked to implement, and the rules for ordering the reference list are somewhat complicated, or at least they seem that way to me.
The rules are:
1. Alphabetically by first author, within that by year, within that by order cited. I think this happens already. BUT...
2. If we have several papers with same first author, list 2-author papers alphabetical by second author and within that by year. BUT...
3. Then go to 3+ author papers, and order by year and then by order in main text.
This is not as arbitrary as it looks. 1 and 2 author papers are cited as (Jones 1990) and (Jones & Brown 1992), but 3+ authors are (Jones et al 1990), so because the 2nd author is not shown in the citation, the 2nd author is not used for ordering in the ref list.
So if we had a bunch of stuff from AB Jones, we could have
Jones AB & Brown LM (2004). title etc...
Jones AB & Smith CD (1989). [Smith comes after Brown]
Jones AB, Brown LM & Smith CD (1989). [3+ author paper comes after all 2 author are sorted]
Jones AB, Wilson RS, Brown LM & Smith CD (1990a). [ordered by year and then by appearance in text -- this and the next 2 all appear as (Jones et al 1990X) in citations, so are disambiguated by letters, X = a, b, c]
Jones AB, Brown LM & Smith CD (1990b).
Jones AB, Wilson RS & Smith CD (1990c).
Jones AB, Wilson RS & Brown LM (2000). [2000 is later than 1990]
Jones M (1995). [Jones M comes after Jones AB]
Jones M (2010).
So what we have here is:
(1) Sort by first author
(2) Group by number of authors into 3 groups (1, 2, 3 or more), and sort by group 1 then 2 then 3.
(3) Sort 1-author papers by year then by order cited
(4) Sort 2-author papers by 2nd author and then by year then by order cited
(5) Sort 3+ author papers by year and then by order cited
Is this even possible?
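To make the intended ordering concrete, here is a minimal Python sketch of rules (1) to (5) as a single composite sort key. It is not CSL and says nothing about whether a CSL processor can express the same thing; the `Ref` type, its `cited` field (position of first citation in the text), and `sort_key` are hypothetical names made up for illustration. The sample data is the Jones list above, deliberately shuffled.

```python
from typing import NamedTuple, Tuple


class Ref(NamedTuple):
    authors: Tuple[str, ...]  # e.g. ("Jones AB", "Brown LM")
    year: int
    cited: int                # hypothetical: order of first citation in the text


def sort_key(ref: Ref) -> tuple:
    # Rule (2): three groups only, so 4 authors count the same as 3.
    group = min(len(ref.authors), 3)
    # Rule (4): the second author matters only for 2-author papers.
    second = ref.authors[1].lower() if group == 2 else ""
    # Rules (1), (2), (4), then year and citation order (rules 3 and 5).
    return (ref.authors[0].lower(), group, second, ref.year, ref.cited)


# The example list from the post, in scrambled order.
refs = [
    Ref(("Jones M",), 2010, 9),
    Ref(("Jones AB", "Wilson RS", "Brown LM"), 2000, 7),
    Ref(("Jones AB", "Wilson RS", "Smith CD"), 1990, 6),
    Ref(("Jones AB", "Brown LM", "Smith CD"), 1990, 5),
    Ref(("Jones AB", "Wilson RS", "Brown LM", "Smith CD"), 1990, 4),
    Ref(("Jones AB", "Brown LM", "Smith CD"), 1989, 3),
    Ref(("Jones AB", "Smith CD"), 1989, 2),
    Ref(("Jones AB", "Brown LM"), 2004, 1),
    Ref(("Jones M",), 1995, 8),
]

ordered = sorted(refs, key=sort_key)
for ref in ordered:
    print(" & ".join(ref.authors), ref.year)
```

The point of the sketch is that the rules reduce to one fixed key sequence, as long as the second-author key can be made empty for 3+ author papers.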
https://github.com/citation-style-language/styles/blob/master/american-geophysical-union.csl#L677
<key macro="author" names-min="1" names-use-first="1"/>
<key macro="author-count" names-use-first="3"/>
The first line sorts by the first author only; the second sorts by number of authors, but only counts the first 3, so it cannot give a value higher than 3.
I can then sort on 2nd author name or year, but I'm struggling to make 2-author papers sort on 2nd author name and then year while 3-author papers sort only on year. I can't imagine how to sort differently based on number of authors.
Thank you for your advice. It is much appreciated. I have a style which looks very neat on the page, but really suits manual formatting. It has conditionals that make it hard. For example, the citing part of it says that you separate citations with commas unless one of them has commas in it -- then you use semicolons. So for example you'd have (Smith 2006, Jones 2007), but you'd have (Smith 2006, 2007; Jones 2007) -- semicolon because Smith 2006, 2007 needs a comma. I am trying to convince the custodians to just use a semicolon all the time...
after-collapse-delimiter=", "
I have done some tests. So, without your suggestion but with the delimiter set to a semicolon in the cs:layout element:
<layout delimiter="; " prefix="(" suffix=")">
I get semicolons between all entries, but commas within collapsed ones (version 1):
(Jones AB & Smith 1989; Jones AB et al 1989, 1990; Jones MN 1995; Jones AB 2004; Jones AB & Brown 2004; Campbell & Pedersen 2007)
(Jones AB & Smith 1989; Jones AB & Brown 2004) --- neither entry has comma
The first follows the style correctly. Because Jones AB et al has 2 years separated by a comma, all entries are separated by semicolons (which I want when any one of them includes a comma). The second example is wrong: neither entry has a comma, so the separator should be a comma. Sadly, I cannot base the separator following an entry on whether that entry is collapsed or not, because I have to change the separator between all entries if any one of them is collapsed in such a way as to have a comma.
If I add the after-collapse-delimiter specifier to the <citation> line, I get (version 2):
(Jones AB & Smith 1989, Jones AB et al 1989; 1990, Jones MN 1995, Jones AB 2004, Jones AB & Brown 2004, Campbell & Pedersen 2007)
(Jones AB & Smith 1989, Jones AB & Brown 2004)
Which is now correct for the second one, but the reverse of what I need for the first one -- the years 1989 and 1990 are separated by a semicolon and the rest of the entries by commas. I did not explain clearly in my posting above, I see on rereading, for which I apologise.
If I set a comma in the cs:layout delimiter field and a semicolon in the after-collapse-delimiter, I get the same as my first version above, not the reverse of version 2.
In fact, I am finding that all entries are being separated by the after-collapse-delimiter, whether they are collapsed or not, and the delimiter set in cs:layout is only affecting the pairs of years in the collapsed entry -- that is, between 1989 and 1990, unless the after-collapse-delimiter is not set, in which case they all follow the cs:layout delimiter. This seems to suggest that the after-collapse-delimiter is overriding the other one more often than it should or my entries are all somehow tagged as collapsed, even when not.
Question: can CSL format all entries in a citation based on the properties of any one? That is, if any one entry is collapsed to give a comma (eg (Wu 2008, 2009)), then separate all entries by semicolons, whether the other ones are collapsed or not... otherwise, use commas?
after-collapse-delimiter="; "
right?
First key sorts by the first author
Second key sorts items with 1 author before 2 authors before 3 or more authors
Third line sorts by up to two author names (i.e. alphabetically by 2nd author), but only by the first author for 3+ authors
Fourth line sorts by date.
That seems like it's exactly what you're describing.
I really appreciate your engaging with this. I am also working with the style custodians to see if we can simplify some of their requirements.
If I have a citation like this:
(Wu 2008, Wu 2009, Smith 2010, Jones 2011)
It must collapse like this:
(Wu 2008, 2009; Smith 2010; Jones 2011)
That is, because one entry (Wu) has a comma, all entries are separated by semicolons.
If I have a citation like this:
(Wu 2008, Smith 2010, Jones 2011)
It is already correct -- no entries have commas within them, so we separate using commas. The idea behind the style is an absolute minimum of clutter. A comma is less 'busy' than a semicolon, so is used where possible.
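The rule described above can be sketched in a few lines of Python. This is only an illustration of the logic the written style asks for, not a claim about what CSL or any CSL processor can do; `format_citation` is a hypothetical helper, and it assumes the cites arrive as (author label, year) pairs with same-author cites already adjacent.

```python
from itertools import groupby
from typing import List, Tuple


def format_citation(cites: List[Tuple[str, int]]) -> str:
    """Collapse adjacent same-author cites, then pick one separator for all."""
    groups = []
    for name, items in groupby(cites, key=lambda c: c[0]):
        years = [str(year) for _, year in items]
        groups.append((name, years))

    # If any entry collapsed to several years (and so contains a comma),
    # every entry is separated by a semicolon; otherwise commas are used.
    any_collapsed = any(len(years) > 1 for _, years in groups)
    sep = "; " if any_collapsed else ", "

    body = sep.join(f"{name} {', '.join(years)}" for name, years in groups)
    return f"({body})"


print(format_citation([("Wu", 2008), ("Wu", 2009), ("Smith", 2010), ("Jones", 2011)]))
print(format_citation([("Wu", 2008), ("Smith", 2010), ("Jones", 2011)]))
```

The separator depends on a property of the whole citation rather than of each entry, which is what makes it awkward to express with per-entry delimiters.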
I've spoken to the editor, and the reasons are as follows:
In a citation like your second example:
(Wu 2008, 2009; Smith 2010, Jones 2011)
the comma is doing 2 different things -- separating years in one case and separating separate entries in another. They feel that, at least within a single grouped citation, punctuation should have a consistent function, or readability is impaired because the reader has to switch in interpreting the punctuation. 'Bad information design,' was their phrase (I know you are not endorsing that example, by the way, just saying it's possible.)
I then said, why not always separate with a semicolon, whether you have collapsed entries or not? (That's what I would prefer.)
They said using commas where possible reduces clutter, which is the thrust of their corporate style. As simple as possible -- not simple for the implementer, but for the reader.
I think readability is in the eye of the beholder, somewhat. I can kind of see their point. I would find it a bit odd to use different marks for the same job inside the same set of parentheses. But I reckon changing from commas in one group of citations to semicolons in the next is also pretty undesirable! So, I am going to standardise on semicolons, and then just document the departure from the written style. People will use the CSL stylesheet anyway and it will be the de facto standard.
I really appreciate your time spent on this. This is the most responsive and useful forum I've ever seen!