Quick fix for BibTeX bug

New Zotero user here, trying to get the BibTeX support up and running for my library. The BibTeX exporter doesn't currently respect the "first name" and "last name" distinctions that we can type into the Zotero client. It's a one-line fix, though.

If you change the original code:

BibTeX.js:
...
for each(var creator in item.creators) {
    var creatorString = creator.lastName;

    if (creator.firstName) {
        creatorString = creator.firstName + " " + creator.lastName;
    }
...

to:


for each(var creator in item.creators) {
    var creatorString = creator.lastName;

    if (creator.firstName) {
        creatorString = creator.lastName + ", " + creator.firstName;
    }

then the BibTeX processor itself should do all your dirty work for you, as it has support built-in for comma-separated names. This will make things like "First name: Simon", "Last name: Conway Morris" work correctly, without you having to do anything like add braces in the translator (which breaks things like "van der Sloot") or add braces in Zotero itself (which breaks sorting).
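To make the proposed format concrete, here is a minimal standalone sketch in modern JavaScript. It is not the translator's actual code: `formatCreator` and `formatAuthorField` are hypothetical names, the legacy `for each` loop is replaced by `map`, and creators are assumed to be plain objects with a `lastName` and an optional `firstName`. Names are joined with " and ", the separator BibTeX expects between authors.

```javascript
// Hypothetical helper (not part of BibTeX.js): format one creator as
// "Last, First" so BibTeX can split the name at the comma instead of
// guessing the boundary from spaces.
function formatCreator(creator) {
  if (creator.firstName) {
    return creator.lastName + ", " + creator.firstName;
  }
  // Single-field names are emitted unchanged.
  return creator.lastName;
}

// Hypothetical helper: join all creators with " and ", the separator
// BibTeX uses between names in an author field.
function formatAuthorField(creators) {
  return creators.map(formatCreator).join(" and ");
}

const creators = [
  { firstName: "Simon", lastName: "Conway Morris" },
  { firstName: "Peter", lastName: "Godfrey-Smith" },
];

// Prints: Conway Morris, Simon and Godfrey-Smith, Peter
console.log(formatAuthorField(creators));
```

With the comma in place, the two-word last name "Conway Morris" needs no braces in either the translator or the Zotero field.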

I'd submit a patch if I knew where. Or, rather, I suppose I just did submit a patch. Is this the right place to put it?

  • I feel like this has been discussed, and that the local BibTeX experts have a good reason for the current behavior, but I can't recall what it is. Hopefully one of them will show up and explain.
  • The one thing that *can* go wrong with that format, it seems to me, is suffixes -- you'd need to specify John van Doe Jr. as "van Doe, Jr." for last, and "John" for first. But I'd think anybody who's really worried about quality BibTeX export could just *do* that...
  • I don't know if there's a reason not to do this. It is better than bracketing name parts, as had been previously suggested.
  • So given the current recommended practice for entering particles:

    lastName: Johnland
    firstName: John, de

    and suffixes:

    lastName: Frankson, Sr.
    firstName: Frank

    we'd get author lists like:
    Johnland, John, de and Frankson, Sr., Frank
    instead of the current:
    John, de Johnland and Frank Frankson, Sr.

    I suppose that's no worse.
  • "So given the current recommended practice for entering particles"
    Who recommends this, where is it recommended, and why is it recommended? Is this to preserve sorting within the zotero interface? My understanding was that the updated CSL parser would be able to handle particles in the lastName...
  • Hmm, ok, I didn't know that was the current recommended practice. (I've always considered the "de/van/von/van der/etc" part of the last name...)

    Say you had a worst case scenario name, "ll, ss, ff, vv" ("Johnland, Sr., John, de"). BibTeX would see that and parse it as:

    Last: Johnland
    First: John, de
    von: (blank)
    Suffix: Sr.

    So yeah, not great, but it's an easy tweak if you're worried about it.
  • The particle issues are discussed in citeproc-js's Dirty Tricks: http://gsl-nagoya-u.net/http/pub/citeproc-doc.html#dirty-names

    Dropping particles have to be in the first name, it seems.

    Of course, this is all a bit of a hack-- I still hope to see a more robust name entry system in a future release of Zotero, so we can explicitly set these parts as necessary.
  • Ahh, now there's a distinction BibTeX doesn't have -- between "dropping" and "non-dropping" particles. For BibTeX, everything in the "von part" (which is "de, de la, van, von, van der, von der, St", and a few other things) is "non-dropping" in the CSL sense. And hence putting those particles in the last name field is actually appropriate. As for the "dropping" particles, entering them the way the CSL docs say ("Humboldt", "Alexander von") will work just fine in BibTeX.

    So actually, BibTeX is completely compatible with that documentation file you sent along...
  • Not to be overly topical, but a recommendation like that is not much better than telling LaTeX users to put braces in Zotero fields...
  • Completely a fair criticism. Unfortunately, names support is hard. The only part of the hack that is particularly dirty is for dropping particles, so most people can use the two-field names rather naively and get correct output.
  • Non-dropping particles are also (much) more common than the dropping ones.
  • Sure. Just pointing out that this is something that should be fixed properly if feasible.
  • To fully define structured names, you'd need 5 fields (given, dropping particle, non-dropping particle, family and suffix).
  • So are we OK with changing to comma-delimited, inverted names in our BibTeX export?
  • Given that this change covers basically every case that Zotero currently lets you input, I certainly don't see why we wouldn't! (Of course, I'm a bit biased here, as it's making trouble for my library.)

    (I don't, on a related note, have a good idea for what to do with the more general names problem -- users would almost certainly be confused by *five* fields into which you can type inputs for every single name, especially users in languages that don't have particles, or that don't maintain a distinction between dropping and non-dropping particles. OTOH, there's no reliable way to detect the difference between dropping and non-dropping particles, so there's no clear way to deal with that problem with more code.)
  • I'm mainly trying to be conservative here, since we have tens (hundreds?) of thousands of people who use Zotero's BibTeX export on a regular basis.
  • I think it is a good idea & it had actually been on my todo list for a while.
  • Please go to http://github.com/ajlyon/zotero-bits/raw/master/BibTeX.js and save the file to the translators directory of your Zotero data directory (http://www.zotero.org/support/zotero_data).

    It should work the way you request. If this works for you, please post here so that I can submit this change to be pushed to all users.
  • Aha. Found a bug (may be the reason you haven't done this before). I can't find the code that wraps things in braces, but the author name "Godfrey-Smith, Peter" gets turned into "{Godfrey-Smith,} Peter", which makes BibTeX ignore the comma.

    Where are those braces applied? They need to be applied *before* the comma-joining happens.
  • They're applied in the writeField(..) function, which adds braces around anything with a non-word-initial capital letter (incidentally, that algorithm isn't internationalized and won't handle non-initial capitals beyond those in ASCII).

    The easiest approach is to put a space between the last name and the comma, or we could add a special case for creators that would move the comma back out.
  • "incidentally, that algorithm isn't internationalized and won't handle non-initial capitals beyond those in ASCII"
    Not a horrible assumption for most BibTeX that is generated, but feel free to fix it.
    "The easiest approach is to put a space between the last name and the comma, or we could add a special case for creators that would move the comma back out."
    Do we need to brace the commas for any other field? If not:

    https://github.com/karnesky/zotero-bits/commit/a8a7f87bba141a6c56ff2e5d631312dba09feed6#BibTeX.js
  • Looking again, it's not that simple to handle arbitrary capital letters in regular expressions in JavaScript, so I suppose that issue can remain until someone actually reports running into problems as a result of it.

    Your proposed fix is something I considered, but I wasn't sure if there was ever a good reason to have a braced final comma. I can't think of any reason either-- so let's try this version:

    Please go to http://github.com/ajlyon/zotero-bits/raw/master/BibTeX.js and save the file to the translators directory of your Zotero data directory (http://www.zotero.org/support/zotero_data).
  • That works perfectly on the dataset that I've got available to me at the moment (the latest draft of an article I'm working on).
  • Ok. This is now in the repository. Hopefully this won't cause any issues...
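
For readers following the brace problem discussed above, here is a rough sketch of how the comma can end up inside the braces and how a creator-specific special case can move it back out. This is an illustration only, not the code from the linked commit or the real `writeField(..)`: `protectCapitals` and `protectCreator` are hypothetical names, and the regular expressions are assumptions that, like the original algorithm as described, only look at ASCII capitals.

```javascript
// Hypothetical stand-in for the capital-protecting step described in the
// thread: wrap every whitespace-delimited token that contains an ASCII
// capital letter after its first character in braces.
function protectCapitals(value) {
  return value.replace(/\S+/g, (token) =>
    /.[A-Z]/.test(token) ? "{" + token + "}" : token
  );
}

// Hypothetical special case for creators: move a comma that was swept
// inside the closing brace back outside, so BibTeX still sees it as the
// separator between last and first name.
function protectCreator(value) {
  return protectCapitals(value).replace(/,\}/g, "},");
}

// Prints: {Godfrey-Smith,} Peter   (comma is hidden from BibTeX)
console.log(protectCapitals("Godfrey-Smith, Peter"));

// Prints: {Godfrey-Smith}, Peter   (comma is visible again)
console.log(protectCreator("Godfrey-Smith, Peter"));
```

Names without a non-initial capital, such as "Conway Morris, Simon", pass through both functions unchanged.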