Text substitution
Hi -
I'm designing a citation style for the Astrophysical Journal (ApJ). This style requires certain journal abbreviations (for example, ApJ!). However, most citation sources spell out the journal name (Astrophysical Journal). I don't want to change the Zotero database entries, because then they won't be correct for other styles. Instead, what I want to do is translate from the full names to the shortened names. Basically something like:
<if publication = "Astrophysical Journal"> <text value="ApJ">
...
But I can't find any way to actually compare a variable *value* with something. I can only check whether or not a variable has a value at all. Is there some way to do this?
Check e.g. the Nature style to see how this is called in .csl.
Style-specific abbreviations will almost certainly not make it into CSL-1.0. It'd be nice to come up with a solution to this.
For everything else - are journal abbreviations really style dependent? The idea would be that there is some fixed rule to create abbreviations, I guess. Since we've seen a number of styles that also want that for authors (in sometimes bizarre ways), I guess allowing users to integrate algorithms to create abbreviations would be good. Maybe very hard?
If you write for ApJ, you'd abbreviate "The Astrophysical Journal" to "ApJ".
If you write for Nature, you'd abbreviate it as "Astrophys. J."
IIRC, EndNote handles this situation by allowing you to make a database of journal names, each of which can have three or so abbreviations. You then select which abbreviation to use. This is somewhat crude & not ideal. Zotero could potentially do the same, outside of CSL.
But I don't know why this couldn't be handled inside CSL, so that other CSL users could benefit. Why "hard?" I think one good design would be to point to some URI in the CSL file that had a list of substitution patterns to follow. Generating these initial lists could be done automatically for some journals (due, in part, to open data from the data incubator project & similar free use sources) & could be slowly cleaned up by those who use the styles.
So, right: a) it won't be in 1.0, and b) I can't promise it will ever be in CSL.
Yet I doubt we'll ever change that either - or at least not in the medium future.
So we'll just have to do our best to accommodate as much as reasonably possible of the existing styles.
Admittedly, "reasonable" is open to interpretation, but journal abbreviations seem quite common in the hard sciences.
What does BibTeX do with that, btw?
Others use a reference manager (e.g. JabRef & probably even EndNote) that has multiple abbreviation lists & will make a new .bib for each journal.
One could put the abbreviations in the .bst file. But those are somewhat hard to write & the input from a .bib file is not standard & those lists aren't maintained.
makebst does come with a few lists, so some newer styles written w/ makebst now come with style-specific abbreviations.
I'd have to check on BibLaTeX....
Maybe this is a science thing; the range of possible journals is too large in many other fields to be having to account for every possible variation, and to do so would defeat the purpose anyway.
I don't actually think, though, that this has much of any benefit for readers (I find it really annoying to have to look up acronyms, for example); more likely a cost-saving measure for print publishers.
That said, feel free to submit a ticket to xbib, with a suggested solution.
But I'll say upfront that there are some devil-in-details issues here.
For example, I have always resisted adding regular expression support to CSL. I will almost surely continue to resist that.
Also, it might as well be a general abbreviation solution, considering that historians and others, for example, sometimes use these sorts of abbreviations for archival collections, or committees and such. Maybe:
<string-substitute match="some-science-journal">
<text value="SSJ"/>
</string-substitute>
That might make sense for CSL, but might have some performance implications if all strings had to be run through this sort of a process.
Hmm. Algorithms are never going to work for this one.
The processor API can be set up to accept a hashed list of abbreviations that is maintained elsewhere. Inside the processor, applying the substitution to journal titles is trivial, and would have no impact on the CSL schema and almost no impact on performance. So that part is easy. In fact, I'll implement it this afternoon. :)
The heavy lifting is in composing and maintaining the per-journal abbreviation lists themselves. That would require a smooth online revision mechanism and a process of review. After that's in place, you just need a means of delivering the lists to the processor together with journal CSL files. No list, no abbreviations. Simple.
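A minimal sketch of the processor-side lookup described above, written in Python rather than in the processor's own code. The function name abbreviate_title and the dict layout are assumptions for illustration, not the actual processor API; the single entry is the ApJ example from earlier in the thread.

```python
from typing import Mapping, Optional


def abbreviate_title(
    title: str, abbreviations: Optional[Mapping[str, str]] = None
) -> str:
    """Return the abbreviation for a journal title, or the title unchanged."""
    if not abbreviations:
        # No list installed: no abbreviations.
        return title
    # One hash lookup per title; journals missing from the list pass through.
    return abbreviations.get(title, title)


# Hypothetical list that would ship alongside an ApJ style.
apj_list = {"The Astrophysical Journal": "ApJ"}

print(abbreviate_title("The Astrophysical Journal", apj_list))
print(abbreviate_title("The Astrophysical Journal"))
```

Because the lookup is an exact match on the full title, variants such as "Astrophysical Journal" without the article would each need their own entry in the list.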
If we are really worried about the high rate of initial changes to abbreviation lists and/or the inherent overlap, perhaps that's one reason to think about linking to the substitution lists from the CSL file.
The idea behind keeping it off CSL markup is that, if you have one list per journal, and the lists can change independently, then the link is implicit. There might be lots of shared lists and shared sublists and whatnot, but there's nothing gained by coding those relations into the CSL files. Basically you just need some mechanism for answering the question, "What abbreviations should I use with this style?"
If an archive or interface is set up someplace, I'll be happy to feed in the stuff for Bluebook.
Am wondering if it's at all feasible to put in data, as with the linked periodical data stuff I've previously mentioned.
Wow! I wasn't expecting such a lively debate on what I thought was a simple question. I will throw in my two cents since I started the whole thing:
1) Yes, it's annoying that journals use different citation styles, and yes, it's annoying that journals require different abbreviations. However, isn't that the whole *point* of CSL? If there were only one citation style, we wouldn't need CSL at all. As a random user, I would want CSL to be sophisticated enough that it could handle what I need to do to satisfy someone's writing guidelines. The guidelines aren't going to change just because I'm using CSL. If CSL can't do what I need to do as an author, I will just have to create/edit the bibliography by hand, which defeats the whole purpose. I need to emphasize that these abbreviations are NOT OPTIONAL when submitting papers.
2) I'm finding it hard to believe that performance is really a big issue here. What kind of bibliography is someone going to have to create that running a few dozen IF statements or regular expressions is going to cause a noticeable delay?
3) While I understand people are trying to solve the Big Picture problem, I just want to write a paper that's due in a few weeks :-) No one actually answered the original question - can I do a comparison of a variable against a constant string? That certainly works for me in the short term, and probably even in the long term. It also seems like a feature that could have many applications in various styles, and wouldn't be difficult to implement.
Thanks,
Rob
1) Am not saying this example necessarily applies, but we do have to balance competing issues here.
2) Well, the solution I am contemplating here wouldn't involve any if statements: it would run all strings (maybe from a limited list of variables) through the function. But you're probably right.
3) No. You'll just have to run a manual search-and-replace (or script) at the end when you submit.
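For the short-term workaround, a search-and-replace script over the finished bibliography text could look roughly like this. It is only a sketch: the replacement table, the function name abbreviate_bibliography, and the sample reference are all made up for illustration, and you would extend the table with the journals you actually cite.

```python
# Hypothetical replacement table: full journal name -> required abbreviation.
REPLACEMENTS = {
    "The Astrophysical Journal": "ApJ",
    "Astrophysical Journal": "ApJ",
}


def abbreviate_bibliography(text: str, replacements: dict = REPLACEMENTS) -> str:
    """Replace full journal names in already-formatted bibliography text."""
    # Longest names first, so "The Astrophysical Journal" is handled
    # before the shorter "Astrophysical Journal" that it contains.
    for full_name in sorted(replacements, key=len, reverse=True):
        text = text.replace(full_name, replacements[full_name])
    return text


# Made-up reference, for illustration only.
entry = "Doe, J. 2009, The Astrophysical Journal, 1, 1"
print(abbreviate_bibliography(entry))
```

This is plain string replacement, so it will also rewrite a journal name that happens to appear inside an article title. Check the output before submitting.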
Don't get me wrong - I love Zotero and the CSL plugins, and it has really changed the way I work.
Rob
<text variable="container-title" form="short"/>
<text variable="container-title" abbreviation="BIOSIS"/>
The first line would use the abbreviated journal field. The second line can use any identifying information (ISSN, journal name, journal abbreviation) to look up the correct abbreviation in a list of journal abbreviations (or the journal abbreviation can be generated using an algorithm).
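A rough sketch of how the second kind of lookup might work, assuming each journal record carries an ISSN, a full title, and one abbreviation per named list. The record layout, the list names "apj" and "nature", and the ISSN are placeholders invented for this example; only the title and the two abbreviations come from this thread.

```python
from typing import Dict, Iterable, Optional

# Placeholder record: the ISSN is made up, the names come from this thread.
RECORDS = [
    {
        "issn": "0000-0000",
        "title": "The Astrophysical Journal",
        "abbreviations": {"apj": "ApJ", "nature": "Astrophys. J."},
    },
]


def build_index(records: Iterable[dict]) -> Dict[str, dict]:
    """Index each record under its ISSN, full title, and known abbreviations."""
    index: Dict[str, dict] = {}
    for record in records:
        keys = [record["issn"], record["title"], *record["abbreviations"].values()]
        for key in keys:
            # Lower-case the keys so the lookup is case-insensitive.
            index[key.lower()] = record
    return index


def lookup(index: Dict[str, dict], identifier: str, list_name: str) -> Optional[str]:
    """Find a journal by any identifier and return its abbreviation in one list."""
    record = index.get(identifier.lower())
    if record is None:
        return None
    return record["abbreviations"].get(list_name)


INDEX = build_index(RECORDS)
print(lookup(INDEX, "0000-0000", "apj"))
print(lookup(INDEX, "Astrophys. J.", "apj"))
```

Indexing every identifier up front means an item that only has the Nature-style abbreviation stored can still be rendered with the ApJ one.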
(BIOSIS is a commonly used abbreviation list in the life sciences, which, very strangely, isn't made available online: http://www.library.illinois.edu/biotech/j-abbrev.html. BTW, could we expect any legal hassle if we shipped these lists with CSL?)
I'm also relatively unconcerned about legal hassles: there are multiple open source projects that ship with abbreviation lists & there are lists that are maintained privately by libraries, at least one of which would likely grant permission for use. The more narrow, journal-specific lists are often too short and lack any sort of novelty to retain a US copyright & the journals would have a material benefit and no material harm from having them adopted. We won't be able to "copy" verbatim from some of the commercial/subscription lists, but I don't think we have to.
The tests illustrate the data format of the list (a simple JSON key/value list, with the full name of the journal as key). There's a trivial API for installing a list in the processor, which can be adapted to whatever scheme emerges from this discussion.
EDIT: Adjusted URL of link to tests.
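To make the data format concrete: the linked tests are not reproduced here, so the following is only an assumed shape, a flat JSON object with the full journal name as key and the abbreviation as value. The loader name load_abbreviations is hypothetical and is not the processor's API. The two lists use the ApJ and Nature abbreviations quoted earlier in the thread.

```python
import json

# Assumed shape: one flat JSON object per style, full name -> abbreviation.
APJ_JSON = '{"The Astrophysical Journal": "ApJ"}'
NATURE_JSON = '{"The Astrophysical Journal": "Astrophys. J."}'


def load_abbreviations(json_text: str) -> dict:
    """Parse a JSON key/value list into a plain dict of strings."""
    data = json.loads(json_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object of full name -> abbreviation")
    return {str(name): str(abbrev) for name, abbrev in data.items()}


apj = load_abbreviations(APJ_JSON)
nature = load_abbreviations(NATURE_JSON)

# The same journal, abbreviated differently depending on the installed list.
print(apj["The Astrophysical Journal"])
print(nature["The Astrophysical Journal"])
```

Keeping one small list per style keeps the style-to-list link implicit, as discussed above: whichever list is installed decides the output.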
I know it's more work and someone has got to do it, but this really seems like the simplest solution that solves the most problems.
The way things are at the moment is a pretty terrible band-aid solution: without the ability to at least batch edit the abbreviation fields, it's tedious work making sure everything is proper even for a single journal!