Help understanding a few passages from a doc of Better BibTeX for Zotero
Hello,
I'm reading the documentation for the Better BibTeX add-on. I've read [this doc](https://retorque.re/zotero-better-bibtex/citing/) and found that I couldn't make sense of a few passages.
The passages are the following:
> `postfix=`/`postfix+1=`: a pseudo-function that sets the citekey disambiguation postfix using an [sprintf-js](https://www.npmjs.com/package/sprintf-js) format spec for when a key is generated that already exists. Does not add any text to the citekey otherwise. You _must_ include _exactly_ one of the placeholders `%(n)s` (number), `%(a)s` (alpha, lowercase) or `%(A)s` (alpha, uppercase). For the rest of the disambiguator you can use things like padding and extra text as sprintf-js allows. With `+1` the disambiguator is always included, even if there is no need for it because no duplicates exist. The default format is `%(a)s`.
> `0`: an alias for `[postfix=-%(n)s]`. Emulates the disambiguator of the standard Zotero exports. When you use `[zotero]` in your pattern, `[zotero][0]` is implied
> `replace`\=find (string), replace (string), mode? (‘string’ | ‘regex’): replaces text, case insensitive; `:replace=.etal,&etal` will replace `.EtAl` with `&etal`
> `select`\=start? (number), n? (number): selects words from the value passed in. The format is `select=start,number` (1-based), so `select=1,4` would select the first four words. If `number` is not given, all words from `start` to the end of the list are selected.
> `substring`\=start? (number), n? (number): (`substring=start,n`) selects `n` (default: all) characters starting at `start` (default: 1)
The first and second passages come from the "Functions" section, while the rest from the "Filters" section.
Could someone give me an explanation as to what they are saying?
I just wanted to understand the passages, that's all.
However, now that we are on the topic of citekeys, I'd like my citekeys to be a timestamp of the time at the moment of adding the item. For example, if I add an item on May 19, 2021, at 09:12:13, the citekey would be `20210519091213`.
I'd also like duplicates to be avoided: if two citekeys would collide, say because I'm batch-adding items to Zotero, the seconds in the timestamp would change. For instance, one item would get the citekey `20210519091314` and another added at the same time would get `20210519091315`.
I think that I can make the first part of the citekey myself, but the other part about preventing duplicates doesn't seem to be possible using BBT's functions. What are your thoughts?
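To make the idea concrete, here is a minimal Python sketch of the scheme described above. It is not something BBT provides; `timestamp_citekey` and the `taken` set are hypothetical names used only for illustration, and the sketch assumes the set of existing keys is known up front.

```python
from datetime import datetime, timedelta

KEY_FORMAT = "%Y%m%d%H%M%S"  # e.g. 20210519091213


def timestamp_citekey(added: datetime, taken: set) -> str:
    """Return a timestamp key, bumping by one second until it is unused.

    Hypothetical helper: BBT has no such rule; this only illustrates the idea.
    """
    candidate = added
    key = candidate.strftime(KEY_FORMAT)
    while key in taken:
        candidate += timedelta(seconds=1)
        key = candidate.strftime(KEY_FORMAT)
    taken.add(key)
    return key


taken = set()
added = datetime(2021, 5, 19, 9, 13, 14)
first = timestamp_citekey(added, taken)   # 20210519091314
second = timestamp_citekey(added, taken)  # 20210519091315
print(first, second)
```

One consequence of this design is that a bumped key no longer matches the item's real time of addition, so the key stops being a faithful timestamp as soon as a collision occurs.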
Thanks for the clarification on what functions and filters do. I didn't quite get that when I read the doc.
As for what I'm looking for, I just wanted to understand the doc, though would you mind reading my reply to @adamsmith? There I describe the type of citekey I'd like to have.
I'll take a look at adding dateadded to the functions up to the second mark, but not a rule to use-seconds-when-necessary-to-disambiguate. It'd be trivial, though, to write a bit of JavaScript for the JavaScript runner that pins the citekey to dateadded.
WRT your response, the short answer is "currently, you can't", and the slightly longer answer is "I can get you partway there".
In regards to my desired citekeys, it's a shame that there's no way to make them automatically. However, you were right: they are possible if I resort to workarounds.
What I did was to make the pattern `[Extra:select=2,1]`, add it to the citation key format, then add to the `Extra` field of an item the text `Citekey: [some timestamp]`. This doesn't conflict with other text in the field and generates the type of citekey I wanted.
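As a rough illustration of why `select=2,1` picks out the timestamp, here is a small Python emulation of the `select` and `substring` filters as the quoted documentation describes them (1-based, `n` optional). The function names are hypothetical, and splitting on whitespace is an assumption; BBT's actual word splitting and key cleaning may differ.

```python
def select_words(value: str, start: int = 1, n=None) -> str:
    """Emulate `select=start,n`: take n words from a 1-based start.

    If n is omitted, all words from start to the end are kept.
    Assumption: words are separated by whitespace.
    """
    words = value.split()
    begin = start - 1
    end = None if n is None else begin + n
    return " ".join(words[begin:end])


def substring(value: str, start: int = 1, n=None) -> str:
    """Emulate `substring=start,n`: take n characters from a 1-based start."""
    begin = start - 1
    end = None if n is None else begin + n
    return value[begin:end]


extra = "Citekey: 20210519091213"
print(select_words(extra, 2, 1))                # 20210519091213
print(select_words("a b c d e", 1, 4))          # a b c d
print(substring(select_words(extra, 2, 1), 1, 12))  # 202105190912
```

The last line shows how the two filters could be chained to shorten the timestamp to the minute.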
As for the passages I didn't understand, would you believe me if I told you that I now understand them? I couldn't make sense of them before no matter how I read them, but after engaging with you two and toying a little with the citation key format, they are clearer to me.
Thanks a ton to @adamsmith and you for your immeasurable help!
If you're adding text to the `extra` field anyway, you might as well just add `Citation Key: [some timestamp]`, because that's an instruction to BBT to not generate a citekey but just take whatever's there. Side benefit: this will sync.
Is `dateAdded` in UTC, or in local time, when I request it from an item (perhaps separately from how it sits in the DB)?
You are right: what I made doesn't generate citekeys; it just takes what is in the `Extra` field.
About adding `Citation Key: [some timestamp]` to `Extra`, may I ask why? Is there any difference between "Citation Key" and "Citekey"?
Regarding the topic of workarounds, now you got me interested in your workaround. I could use that and still use BBT's way of resolving conflicts between citekeys, which would automatize the process quite a lot for me. Could you tell me more about it?
Also, may I ask that the function be down-to-the-minute? That would shorten the citekey a little, and that'd be good for my purposes.
Technically that would still be a generated key from BBT's POV.
`Citekey` doesn't mean anything special to BBT, `Citation Key` does; the latter means "use this citekey, don't generate", and it will also work for the regular Bib(La)TeX exporter, so it will show up on Overleaf for example. In BBT terms this is called "pinning the citekey", and you'll see a pin displayed next to the citekey in Zotero. In this case, the `Extra:select` etc won't actually do anything, as this disables pattern-based generation for that item.
The disambiguation simply adds one of three types of postfixes to the generated key:

- a number (`1`, `2`, etc)
- `a` to `z`, after which it'll proceed into `aa` etc
- `A` to `Z`, after which it'll proceed into `AA` etc

You can add text around the disambiguator, eg to make the postfix `-1`, `-2` etc to emulate the standard Zotero keys. But these are the only options, and the choice of disambiguator is not under further control; it will just pick the first available. If you have eg `[postfix=-%(n)s]`, and `heyns2021` and `heyns2021-2` exist, the next entry to generate `heyns2021` will get `heyns2021-1`.
Pinned keys are exempt from all of this. If you pin a key, BBT will use that as-is, even if you have 50 copies of that key pinned.
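The "pick the first available" behaviour can be sketched in a few lines of Python. This is only an illustration of the description above, not BBT's code: `alpha` and `disambiguate` are hypothetical names, Python's `%` formatting stands in for sprintf-js, and the exact letter sequence after `z` (`aa`, `ab`, ...) is an assumption. The `postfix+1` variant is not modelled.

```python
import string


def alpha(n: int, letters: str = string.ascii_lowercase) -> str:
    """Map 1 -> a, 26 -> z, 27 -> aa, 28 -> ab (assumed progression)."""
    out = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        out = letters[rem] + out
    return out


def disambiguate(base: str, existing: set, fmt: str = "%(a)s") -> str:
    """Return base if unused, else base plus the first free postfix."""
    if base not in existing:
        return base
    n = 1
    while True:
        postfix = fmt % {
            "n": n,
            "a": alpha(n),
            "A": alpha(n, string.ascii_uppercase),
        }
        candidate = base + postfix
        if candidate not in existing:
            return candidate
        n += 1


existing = {"heyns2021", "heyns2021-2"}
print(disambiguate("heyns2021", existing, "-%(n)s"))  # heyns2021-1
print(disambiguate("heyns2021", {"heyns2021"}))       # heyns2021a
```

Because the search always restarts from the first postfix, a gap such as the missing `heyns2021-1` gets filled before any higher number is used, which matches the example given above.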
Oh it doesn't actually matter what comes after the down-to-the; there is a date-formatting filter that you could use to format the date however you please. I just meant that the formatting cannot be variable in length.
The down-to-the-minute function alongside delayed auto-pinning would be a great workflow for me. However, don't bother to make the function just for my use case. I'll just use what is available or do it manually.
Oh, and on second thought, my idea about increasing the seconds was indeed a bit of a crazy idea.
Regarding the date-formatting filter, is it possible to achieve what I want using the filter and the `Date Added` field plus the delayed auto-pinning? I made the pattern `[Date Added:format-date=YYYYMMDDHHmmss]` but it doesn't do anything.
If it's not possible, then I'll just add the citekeys manually using the format you described: `Citation Key: [some timestamp]`. I can make that quick enough to appear "automatic" with Ubuntu's shortcuts or AutoKey, so I guess I kind of get what I was looking for.
Not only is `DateAdded` not available yet, but the `:format-date` filter also currently discards time information.
I can make the change that makes this possible, but you'll need to open an issue on GitHub for it. My support/dev infra builds on GH issue workflows. It's not a big change.
First and foremost, I must thank you for your follow-ups. You've been very kind and attentive to me, and I appreciate that.
With regards to the filter, how much time and effort would it take you to make it, and would it be beneficial for others, including you? I'm kind of interested in automatizing my citekeys, but if this is only going to help me, then I'd prefer doing things manually so you don't have to go through any troubles.
I've only now realized that you mentioned in your last comment that this isn't a big deal, so I'll just open an issue in GitHub. Sorry for not paying attention to that.
TBH I don't think this'll be used much, but philosophically, it is fitting that the capitalized attribute access can grab any simple value on the item (so not attachments or tags for example), so why not dateAdded/dateModified.
Could you give me some pointers on why you think the filter won't be used? Or time-based citekeys in general?
I find time-based IDs to be a very efficient tool to identify things because they are always roughly of the same length and are independent of everything other than the time of creation.
I'll take a shot in the dark here and guess that their readability is what makes time-based citekeys so rare.
I'm not contesting that they work for you, but everyone I know wants the article being cited to be directly identifiable from the citekey, so that if you read the TeX source, it makes sense as part of a sentence -- I don't see how the dateadded could serve that function. I'm not entirely sure how to interpret the word "readable" here; sure, these dateadded-keys are readable, in the same sense that "chembatal" is readable, but it doesn't convey anything to me that tells me something I need in the context I would use citekeys.
No judgement. Just surprised.
However, I realize that settling on a format will take a while to accomplish, and I have limited time. Therefore, I'll keep things simple, stick to what I've chosen, then change the format later on if needed.
Thanks for sharing your thoughts on this.
You know what, I've made up my mind; I'll use citekeys like those in the first example instead of time-based ones.
Regarding my TeX usage, I'm pretty new to this realm of writing and referencing. I'm learning Zotero and setting up things like citekeys so I can provide accurate references in my Zettelkasten. But, I might learn TeX in the future. It sounds interesting.
Thank you so much for all your comments. You've saved me from using a citation key format that could have possibly been harmful.
That format means I always pretty-much know what the citekey will be so, when accessing it via notes in my zettelkasten (implemented via Zettlr), the key-stroke sequence always brings up only a handful of options from which one can choose to add the link. Some more info about implementation of a zettelkasten with Zettlr/Zotero can be found at a blog post (lower down the page somewhat).
The take-away is that a simple, consistent and informative pattern for citekeys will save you endless hours of effort -- luckily I started with a good one 30 years ago, so it has only gotten easier over time... Best of luck!
Ah, I see. In that case, I won't use delayed auto pinning. I'll stick to the routine you've explained instead.
Thank you once again for sharing your thoughts.
Well, what better way to settle on a good citekey than to rob yours? Hope you don't mind. Thanks!
Also, I'll look into integration between software in the future. I didn't think of that, and it might be useful.