Help understanding a few passages from a doc of Better BibTeX for Zotero

Annie_of_the_Stars · May 18, 2021

Hello,

I'm reading the documentation for the Better BibTeX add-on. I've read [this doc](https://retorque.re/zotero-better-bibtex/citing/) and found that I couldn't make sense of a few passages.

The passages are the following:
> `postfix=`/`postfix+1=`: a pseudo-function that sets the citekey disambiguation postfix using an [sprintf-js](https://www.npmjs.com/package/sprintf-js) format spec for when a key is generated that already exists. Does not add any text to the citekey otherwise. You _must_ include _exactly_ one of the placeholders `%(n)s` (number), `%(a)s` (alpha, lowercase) or `%(A)s` (alpha, uppercase). For the rest of the disambiguator you can use things like padding and extra text as sprintf-js allows. With `+1` the disambiguator is always included, even if there is no need for it because no duplicates exist. The default format is `%(a)s`.

> `0`: an alias for `[postfix=-%(n)s]`. Emulates the disambiguator of the standard Zotero exports. When you use `[zotero]` in your pattern, `[zotero][0]` is implied

> `replace`=find (string), replace (string), mode? (‘string’ | ‘regex’): replaces text, case insensitive; `:replace=.etal,&etal` will replace `.EtAl` with `&etal`

> `select`=start? (number), n? (number): selects words from the value passed in. The format is `select=start,number` (1-based), so `select=1,4` would select the first four words. If `number` is not given, all words from `start` to the end of the list are selected.

> `substring`=start? (number), n? (number): (`substring=start,n`) selects `n` (default: all) characters starting at `start` (default: 1)

The first and second passages come from the "Functions" section, while the rest come from the "Filters" section.

Could someone give me an explanation as to what they are saying?

adamsmith · May 18, 2021

What are you looking for? It's much more efficient to help you with a specific issue than to try to translate technical documentation.

emilianoeheyns · May 18, 2021

Broadly, functions grab text from your item, filters transform that text. But while I applaud trying to understand the tools you use, pretty much what @adamsmith says: if you tell us what you want the citekey to look like, we can take a stab at telling you the pattern that will do it, and then the docs will probably make more sense. (you can blame me for the state of the docs. It's not my talent)

Annie_of_the_Stars · May 19, 2021

@adamsmith

I just wanted to understand the passages, that's all.

However, now that we are on the topic of citekeys, I'd like my citekeys to be a timestamp of the time at the moment of adding the item. For example, if I add an item on May 19, 2021, at 09:12:13, the citekey would be `20210519091213`.

I'd also like that if my citekeys would get duplicated, like say I'm batch adding items to Zotero, the seconds in the timestamps would change to prevent duplicates. For instance, an item would have the citekey `20210519091314` and another added at the same time would get `20210519091315`.
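
To make the request above concrete, here is a minimal standalone sketch of the behaviour being asked for. It is not something BBT provides; the function name `timestamp_key` and the `taken` set are made up for illustration, and the timestamp is assumed to be in UTC.

```python
from datetime import datetime, timedelta, timezone

KEY_FORMAT = "%Y%m%d%H%M%S"  # e.g. 20210519091314


def timestamp_key(added, taken):
    """Return a timestamp key, bumping the time by one second while the key is taken.

    `taken` is a hypothetical set of keys already in use; it is updated in place.
    """
    candidate = added
    while candidate.strftime(KEY_FORMAT) in taken:
        candidate += timedelta(seconds=1)
    key = candidate.strftime(KEY_FORMAT)
    taken.add(key)
    return key


taken = set()
added = datetime(2021, 5, 19, 9, 13, 14, tzinfo=timezone.utc)
first = timestamp_key(added, taken)   # "20210519091314"
second = timestamp_key(added, taken)  # "20210519091315", same moment, bumped by one second
```

Note that a bumped key no longer reflects the real time the item was added, which is one reason this kind of rule is awkward to build into a key generator.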

I think that I can make the first part of the citekey myself, but the other part about preventing duplicates doesn't seem to be possible using BBT's functions. What are your thoughts?

Annie_of_the_Stars · May 19, 2021

@emilianoeheyns

Thanks for the clarification on what functions and filters do. I didn't quite get that when I read the doc.

As for what I'm looking for, I just wanted to understand the doc, though would you mind reading my reply to @adamsmith? There I describe the type of citekey I'd like to have.

emilianoeheyns · May 19, 2021

I could try to re-describe those passages, but I think I'd just end up reproducing the text -- do you have more specific questions than "what does this mean"? I know the docs ain't great, but I have a blind spot here -- the docs make sense from my perspective, and I don't know your perspective.

I'll take a look at adding dateadded to the functions up to the second mark, but not a rule to use-seconds-when-necessary-to-disambiguate. It'd be trivial, though, to write a bit of JavaScript for the JavaScript runner that pins the citekey to dateadded.

emilianoeheyns · May 19, 2021

Alright, if that wording is better than what it was, I've added it to the site, that will be live in 5 minutes or so.

WRT your response, the short answer is "currently, you can't", and the slightly longer answer is "I can get you partway there".

Annie_of_the_Stars · May 19, 2021

@emilianoeheyns

In regards to my desired citekeys, it's a shame that there's no way to make them automatically. However, you were right: they are possible if I resort to workarounds.

What I did was to make the pattern `[Extra:select=2,1]`, add it to the citation key format, then add to the `Extra` field of an item the text `Citekey: [some timestamp]`. This doesn't conflict with other text in the field and generates the type of citekey I wanted.
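
As a rough illustration of why `select=2,1` picks out the timestamp here, the following sketch mimics the documented `select=start,number` behaviour (1-based, rest of the list when `number` is omitted). Splitting on whitespace is an assumption about how BBT separates words, and the function is a stand-in, not BBT's code.

```python
def select(value, start=1, n=None):
    """Pick `n` words beginning at 1-based position `start`; all remaining words if `n` is None."""
    words = value.split()  # assumption: words are separated by whitespace
    begin = start - 1
    end = None if n is None else begin + n
    return words[begin:end]


extra = "Citekey: 20210519091213"
picked = select(extra, 2, 1)                          # ["20210519091213"]
first_four = select("one two three four five", 1, 4)  # ["one", "two", "three", "four"]
tail = select("one two three four five", 3)           # ["three", "four", "five"]
```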

As for the passages I didn't understand, would you believe me if I told you that I now understand them? I couldn't make sense of them before no matter how I read them, but after engaging with you two and toying a little with the citation key format, they are clearer to me.

Thanks a ton to @adamsmith and you for your immeasurable help!

emilianoeheyns · May 19, 2021

I can get you a down-to-the-second dateadded function, but not a down-to-the-minute-but-down-to-the-second-on-conflict function. If you're OK with always down-to-the-second, I have no issue with that.

> What I did was to make the pattern `[Extra:select=2,1]`, add it to the citation key format, then add to the `Extra` field of an item the text `Citekey: [some timestamp]`. This doesn't conflict with other text in the field and generates the type of citekey I wanted.

If you're adding things to the extra field, you might as well just add `Citation Key: [some timestamp]`, because that's an instruction to BBT to not generate a citekey but just take whatever's there. Side benefit: this will sync.

emilianoeheyns · May 19, 2021

@adamsmith, just as a point of interest, do you know if dateAdded is in UTC, or in local time, when I request it from an item (perhaps separately from how it sits in the DB)?

dstillman · May 19, 2021

It's UTC.

Annie_of_the_Stars · May 19, 2021

@emilianoeheyns

You are right: what I made doesn't generate citekeys; it just takes what is in the `Extra` field.

About adding `Citation Key: [some timestamp]` to `Extra`, may I ask why? Is there any difference between "Citation Key" and "Citekey"?

Regarding the topic of workarounds, now you got me interested in your workaround. I could use that and still use BBT's way of resolving conflicts between citekeys, which would automatize the process quite a lot for me. Could you tell me more about it?

Also, may I add that the function was down-to-the-minute? That would shorten the citekey a little, and that'd be good for my purposes.

emilianoeheyns · May 19, 2021

> You are right: what I made doesn't generate citekeys; it just takes what is in the `Extra` field.

Technically that would still be a generated key from BBT's POV.

> About adding `Citation Key: [some timestamp]` to `Extra`, may I ask why? Is there any difference between "Citation Key" and "Citekey"?

`Citekey` doesn't mean anything special to BBT, `Citation Key` does; the latter means "use this citekey, don't generate", and it will also work for the regular Bib(La)TeX exporter, so it will show up on Overleaf for example. In BBT terms this is called "pinning the citekey", and you'll see a pin displayed next to the citekey in Zotero. In this case, the `[Extra:select=2,1]` pattern won't actually do anything, as pinning disables pattern-based generation for that item.
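
To illustrate the difference, here is a small sketch of how a pinned key could be read from the Extra field. The regular expression, the case-insensitive match, and the `pinned_key` helper are assumptions for illustration; BBT's actual parsing may differ.

```python
import re

# Hypothetical matcher for a "Citation Key: <key>" line in Extra.
PIN_LINE = re.compile(r"^\s*citation key\s*:\s*(\S+)\s*$", re.IGNORECASE)


def pinned_key(extra):
    """Return the pinned citekey found in the Extra text, or None if there is none."""
    for line in extra.splitlines():
        match = PIN_LINE.match(line)
        if match:
            return match.group(1)
    return None


with_pin = pinned_key("Citation Key: 20210519091213\nsome other note")  # "20210519091213"
without_pin = pinned_key("Citekey: 20210519091213")                     # None, not a pin instruction
```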

> Regarding the topic of workarounds, now you got me interested in your workaround. I could use that and still use BBT's way of resolving conflicts between citekeys, which would automatize the process quite a lot for me. Could you tell me more about it?

The disambiguation simply adds one of three types of postfixes to the generated key:

- a number
- a lowercase letter, overflowing into multi-letter postfixes (so if you've used up a to z it'll proceed into aa etc)
- an uppercase letter, overflowing into multi-letter postfixes (so if you've used up A to Z it'll proceed into AA etc)

You can add text around the disambiguator, eg to emulate the standard Zotero keys by making the postfix -1, -2 etc. But these are the only options, and the choice of disambiguator is not under further control; it will just pick the first available one. If you have eg `[postfix=-%(n)s]`, and heyns2021 and heyns2021-2 exist, the next entry to generate heyns2021 will get heyns2021-1.
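
The "first available" rule can be sketched roughly as follows. This is an illustration under assumptions, not BBT's implementation: Python's `%`-formatting stands in for sprintf-js (the two agree for simple named specs such as `%(n)s`), and the helper names `alpha` and `disambiguate` are invented here.

```python
from itertools import count


def alpha(n):
    """Map 1 -> a, 26 -> z, 27 -> aa, and so on (letters overflow into multi-letter postfixes)."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("a") + rem) + letters
    return letters


def disambiguate(base, existing, fmt="%(a)s"):
    """Return `base` if unused, else `base` plus the first postfix that is not yet taken."""
    if base not in existing:
        return base
    for n in count(1):
        postfix = fmt % {"n": n, "a": alpha(n), "A": alpha(n).upper()}
        if base + postfix not in existing:
            return base + postfix


numbered = disambiguate("heyns2021", {"heyns2021", "heyns2021-2"}, "-%(n)s")  # "heyns2021-1"
lettered = disambiguate("heyns2021", {"heyns2021", "heyns2021a"})             # "heyns2021b"
```

The first call reproduces the example above: the gap at `-1` is filled before anything after `-2` is considered.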

Pinned keys are exempt from all of this. If you pin a key, BBT will use that as-is, even if you have 50 copies of that key pinned.

> Also, may I add that the function was down-to-the-minute? That would shorten the citekey a little, and that'd be good for my purposes.

Oh it doesn't actually matter what comes after the down-to-the; there is a date-formatting filter that you could use to format the date however you please. I just meant that the formatting cannot be variable in length.

emilianoeheyns · May 19, 2021

If you use down-to-the-minute (which you can't yet, I'd have to add a function for it), you can sort of get the best of both worlds by using delayed auto-pinning; the key will automatically be pinned a configurable number of seconds after creation, but by delaying it you do get disambiguation applied first. If you let it do its thing, that should prevent duplicates even if articles are added within the same minute.

Annie_of_the_Stars · May 20, 2021

@emilianoeheyns

The down-to-the-minute function alongside delayed auto-pinning would be a great workflow for me. However, don't bother to make the function just for my use case. I'll just use what is available or do it manually.

Oh, and on second thought, my idea about increasing the seconds was indeed a bit of a crazy idea.

Regarding the date-formatting filter, is it possible to achieve what I want using the filter and the `Date Added` field plus the delayed auto-pinning? I made the pattern `[Date Added:format-date=YYYYMMDDHHmmss]` but it doesn't do anything.

If it's not possible, then I'll just add the citekeys manually using the format you described: `Citation Key: [some timestamp]`. I can make that quick enough to appear "automatic" with Ubuntu's shortcuts or AutoKey, so I guess I kind of get what I was looking for.

emilianoeheyns · May 20, 2021

You'd need to use DateAdded, but also the :format-date filter currently discards time information.

I can make the change that makes this possible, but you'll need to open an issue on GitHub for it. My support/dev infra builds on GH issue workflows. It's not a big change.

Annie_of_the_Stars · May 20, 2021

@emilianoeheyns

First and foremost, I must thank you for your follow-ups. You've been very kind and attentive to me, and I appreciate that.

With regard to the filter, how much time and effort would it take you to make it, and would it be beneficial for others, including you? I'm kind of interested in automatizing my citekeys, but if this is only going to help me, then I'd prefer doing things manually so you don't have to go to any trouble.

Annie_of_the_Stars · May 20, 2021

@emilianoeheyns

I've only now realized that you mentioned in your last comment that this isn't a big deal, so I'll just open an issue in GitHub. Sorry for not paying attention to that.

emilianoeheyns · May 20, 2021

No worries.

emilianoeheyns · May 20, 2021

Format-date can already format times, but dateAdded is not a regular Zotero item field. I made a special case for getting that off the item, and that was it; it was a less-than-one-line fix.

TBH I don't think this'll be used much, but philosophically, it is fitting that the capitalized attribute access can grab any simple value on the item (so not attachments or tags for example), so why not dateAdded/dateModified.

Annie_of_the_Stars · May 22, 2021

Thanks! I'll make the most out of this.

Could you give me some pointers on why you think the filter won't be used? Or time-based citekeys in general?

I find time-based IDs to be a very efficient tool to identify things because they are always roughly of the same length and are independent of everything other than the time of creation.

I'll take a shot in the dark here and guess that their readability is what makes time-based citekeys so rare.

emilianoeheyns · May 22, 2021

I didn't mean the filter wouldn't be used -- I made no change to the filter BTW -- but I don't know anyone except now you who'd want to use the dateadded function in the citekey, never mind the citekey being just the dateadded.

I'm not contesting that they work for you, but everyone I know wants the article being cited to be directly identifiable from the citekey, so that if you read the TeX source, it makes sense as part of a sentence -- I don't see how the dateadded could serve that function. I'm not entirely sure how to interpret the word "readable" here; sure, these dateadded-keys are readable, in the same sense that "chembatal" is readable, but it doesn't convey anything to me that tells me something I need in the context I would use citekeys.

No judgement. Just surprised.

Annie_of_the_Stars · May 23, 2021

To be honest, I thought the only purpose of citekeys was to find items in a reference manager, not to identify items from the citekeys. This makes me question my choice of citation key format.

However, I realize that settling on a format will take a while to accomplish, and I have limited time. Therefore, I'll keep things simple, stick to what I've chosen, then change the format later on if needed.

Thanks for sharing your thoughts on this.

emilianoeheyns · May 23, 2021

I mean no slight, but that would indicate to me you haven't actually used TeX much. Citekeys are a meaningful part of the document, not just lookup keys (even though they're of course also lookup keys). Everyone I know who uses LaTeX uses it to write sentences like

according to \citeauthor{nozick1974}
and \citeauthor{kagan1997}, experiences must be veridical to contribute to well-being.

which means I can read the manuscript and understand what it says. If that were

according to \citeauthor{1621801224}
and \citeauthor{1653337224}, experiences must be veridical to contribute to well-being.

that wouldn't make sense to me.

Annie_of_the_Stars · May 30, 2021

@emilianoeheyns

You know what, I've made up my mind; I'll use citekeys like those in the first example instead of time-based ones.

Regarding my TeX usage, I'm pretty new to this realm of writing and referencing. I'm learning Zotero and setting up things like citekeys so I can provide accurate references in my Zettelkasten. But, I might learn TeX in the future. It sounds interesting.

Thank you so much for all your comments. You've saved me from using a citation key format that could have possibly been harmful.

Annie_of_the_Stars · May 30, 2021

By the way, do you use the delayed auto-pinning you mentioned? If so, what value would you recommend?

emilianoeheyns · May 30, 2021

I don't use it myself, no. The delay is meant for cases where eg you want keys that sync, but straight autopinning would mean as the item is created, it'd immediately get a fixed key. Some people want a few seconds or minutes to go over the item and fix it where necessary before the key gets pinned - however much time you need, that's what you should set it to.

jvoros · May 31, 2021

Not autopinning is very useful because, as emiliano says, it gives you a chance to modify the citekey to something that might need to be altered or shortened. See, for example, the pattern I use, described at: https://forums.zotero.org/discussion/comment/381533/#Comment_381533 with the modification emiliano suggested in a followup post to that. I still need to remove the "-" in a hyphenated first-author name, as I have not yet figured out how to modify the pattern to get rid of it (without breaking the key, that is ;-)

That format means I always pretty-much know what the citekey will be so, when accessing it via notes in my zettelkasten (implemented via Zettlr), the key-stroke sequence always brings up only a handful of options from which one can choose to add the link. Some more info about implementation of a zettelkasten with Zettlr/Zotero can be found at a blog post (lower down the page somewhat).

The take-away is that a simple, consistent and informative pattern for citekeys will save you endless hours of effort -- luckily I started with a good one 30 years ago, so it has only gotten easier over time... Best of luck!

Annie_of_the_Stars · June 1, 2021

@emilianoeheyns
Ah, I see. In that case, I won't use delayed auto pinning. I'll stick to the routine you've explained instead.

Thank you once again for sharing your thoughts.

Annie_of_the_Stars · June 1, 2021

@jvoros
Well, what better way to settle on a good citekey than to rob yours? Hope you don't mind. Thanks!

Also, I'll look into integration between software in the future. I didn't think of that, and it might be useful.

jvoros · June 2, 2021

Rob away! The point is to use one that requires the least amount of brainpower to remember or use -- and use the leftover to put into research and writing ;-)