Improving BibTeX export

I love Zotero, but I was disturbed by its poor BibTeX support, so I spent quite some time improving the BibTeX export. It should be very useful for those who use BibTeX/LaTeX a lot, in particular for scientists like me. But I don't know where I should post my modified code.

Here is the brief list of improvements from my hard work:
1. much more flexible (user-controlled) bibtex key generation: from author names, initials, title, journal, year, volume, pages, etc.
2. the suffix for key collisions can be numeric or alphabetic
3. key format strings can be read from prefs (extensions.zotero.bibtex....)
4. attachments like pdf files can be exported as real path links only, so the resulting bibtex file works easily with external applications like JabRef and Mendeley.
5. unicode conversions for greek letters
6. a user-specified field like callNumber can be used to export pre-stored keys
... and other improvements.

It's been working pretty well for me, and I'd like to share it with you guys.
Please drop me a note if somebody knows how to get these improvements added to the next release of zotero.

spartan
  • post to zotero dev
    http://groups.google.com/group/zotero-dev
    and link to your code, which you can most easily share on github.
  • Thanks for the quick reply. I just posted it again on zotero-dev.
    I've never used github before, but I'll try to post my code there.
  • Here is the link:
    Public Clone URL: git://gist.github.com/956623.git


    The readme file is attached below for your convenience:

    ===============
    For the better of the research world that enjoys Zotero, Latex/BibTeX,
    JabRef, Mendeley, SciPlore MindMapping, etc.

    updated: May 4, 2011
    ************************
    * spartanroc@gmail.com *
    ************************

    ===============
    List of files:

    readme -- this file
    bibtex_js_and_path.gz -- tarball of everything
    BibTeXTan.js -- to replace the translator BibTeX.js
    BibTeXTan.patch -- patch file relative to BibTeX.js of zotero v2.1.6
    BibTeXKeyOnly.js -- new translator for quickcopy / drag&drop export of cite keys
    BibTeXTanKeyOnly.patch -- patch file relative to BibTeXTan.js
    item_local.js -- to replace content/zotero/xpcom/translation/item_local.js
    item_local.patch -- patch file to that of zotero v2.1.6
    translate.js -- to replace content/zotero/xpcom/translation/translate.js
    translate.patch -- patch file to that of zotero v2.1.6


    ===============
    New Features and Improvements on BibTeX Export:

    1. very flexible bibtex key format definition using the fields of creators,
    title, journal, year, volume, pages. including options of lower/upper cases,
    initials, lengths, n-th author/word picking, etc. it is also easy to extend
    to include new fields for key generation.

    2. to resolve possible key collisions, users can specify their own suffix
    either alphabetic or numeric.

    3. a special field (e.g., callNumber or shortTitle) can be dedicated to exporting
    keys that are pre-assigned or stored, for keys that are better set manually. Until
    zotero provides a dedicated BibTeX key field or some type of local id field,
    this is still a very appealing hack for most latex/bibtex users.

    4. attachments like pdf files can be exported as file links only. The local
    real path is stored in the file field. No need to export the large files
    themselves. Keeping two copies on the machine is not good, in particular
    for people who like to make comments on pdf files. This also makes zotero
    cooperate better with other external applications like JabRef and SciPlore.
    The real file export still works as expected if the option of exportFileData is
    set to true. Only a small change in item_local.js is needed for such a benefit.

    5. unicode conversions for greek letters.

    6. the strings controlling features 1, 2, and 3 are easily modified through
    browser prefs: extensions.zotero.bibtexKeyFormat,
    extensions.zotero.bibtexKeyCollisionFormat, extensions.zotero.bibtexKeyField.
    No need to touch the translator itself for user's own definition. No formats
    will be lost after upgrade. All of these only require a small modification in
    translate.js. It also opens up access to prefs for all the other translators
    which could be potentially very useful for others as well.

    7. A stripped-down, cite-key-only translator is provided as well, similar to
    what Andrew Leifer did. This is very convenient for quickCopy or drag&drop in
    latex editing.

    8. some other minor improvements for bibtex export such as better treatment for
    latex special characters like $,\,_,^, etc.
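
Features 5 and 8 can be pictured as a single character-mapping pass. The sketch below is only an illustration of that idea: the function name `toLatex`, both tables, and the chosen LaTeX replacements are assumptions, not taken from BibTeXTan.js, and only a handful of Greek letters are listed.

```javascript
// Hypothetical tables; the translator's real mappings are not shown in the readme.
const LATEX_SPECIALS = {
  "\\": "\\textbackslash{}",
  "$": "\\$",
  "_": "\\_",
  "^": "\\textasciicircum{}"
};
const GREEK = {
  "\u03B1": "$\\alpha$",
  "\u03B2": "$\\beta$",
  "\u03B3": "$\\gamma$",
  "\u0394": "$\\Delta$",
  "\u03A9": "$\\Omega$"
};

function toLatex(text) {
  // One pass over the input characters, so the backslashes and dollar signs
  // produced by a replacement are never escaped a second time.
  return Array.from(text, ch => LATEX_SPECIALS[ch] || GREEK[ch] || ch).join("");
}
```

With these tables, `toLatex("α_1")` yields `$\alpha$\_1`. Doing the mapping per character, rather than with a chain of global replaces, avoids re-escaping the output of an earlier replace.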


    ===============
    How is the Key Format String defined?

    var citeKeyFormat = Zotero.getPrefs("bibtexKeyFormat") ? Zotero.getPrefs("bibtexKeyFormat") : "%au%yr%ti";
    1. A general matching element is % followed by a two-letter label: for
    matching field and specifying the lower/upper case for 1st/remaining letters.
    2. Then it can be optionally followed by a number for the maximum length in
    number of characters. (default=0: means all characters)
    3. Last, it can also optionally be followed by a number within curly brackets
    for selecting the n-th word of the field (starting from the 0-th, the default).
    4. The special repetitive 2-letter labels like %TT, %aa, and %JJ group
    only the first letters of surnames/words in the corresponding fields.
    5. Separators/indicators including +,-,:,., etc, and all letters and numbers
    can be used in between the elements.

    Some examples:
    %au4%yr2%tt2
    %AU_%yr_%TI
    %Au4-%Au4{1}:%JJ%yr:Vol%vo:Pg%pg-%pg{1}
    %Au4{1} = first 4 letters of 2nd creator's surname (first letter capitalized)
    %ti6{2} = first 6 letters of 3rd word in title (all lowercase)
    %AA4 = initials of first 4 creators' surnames (uppercase)
    %tt6 = first letters of first 6 words in title (lowercase)
    %JJ5 = first letters of first 5 words in journal name (uppercase)
    %yr = %yr4 = four-digit year, %yr2 = two-digit year
    %vo = volume number
    %pg = first page number
    %pg{1} = last page number
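
The rules above can be sketched as a small tokenizer plus a helper that applies one element to a word-based field such as the title. This is a reading of the readme, not the code in BibTeXTan.js: `parseKeyFormat`, `applyCase`, `applyWordToken`, and the token property names are all hypothetical, and the grouped labels (%TT, %aa, %JJ) and creator handling are left out.

```javascript
// Split a format string into elements (%xx, optional length, optional {n})
// and the literal separators between them.
function parseKeyFormat(format) {
  const pattern = /%([A-Za-z]{2})(\d*)(?:\{(\d+)\})?/g;
  const tokens = [];
  let last = 0;
  let m;
  while ((m = pattern.exec(format)) !== null) {
    if (m.index > last) {
      tokens.push({ literal: format.slice(last, m.index) });
    }
    tokens.push({
      label: m[1],
      maxLength: m[2] === "" ? 0 : parseInt(m[2], 10), // 0 means all characters
      index: m[3] === undefined ? 0 : parseInt(m[3], 10) // 0-th word by default
    });
    last = pattern.lastIndex;
  }
  if (last < format.length) {
    tokens.push({ literal: format.slice(last) });
  }
  return tokens;
}

// Assumed reading of rule 1: the case of the label's first letter sets the case
// of the first character, the case of its second letter sets the rest.
function applyCase(label, word) {
  if (word === "") return word;
  const isUpper = c => c === c.toUpperCase();
  const first = isUpper(label[0]) ? word[0].toUpperCase() : word[0].toLowerCase();
  const rest = isUpper(label[1]) ? word.slice(1).toUpperCase() : word.slice(1).toLowerCase();
  return first + rest;
}

// Pick the n-th word of a field, truncate it, then apply the case rule.
function applyWordToken(token, fieldValue) {
  const words = fieldValue.split(/\s+/).filter(Boolean);
  const word = words[token.index] || "";
  const cut = token.maxLength > 0 ? word.slice(0, token.maxLength) : word;
  return applyCase(token.label, cut);
}
```

One consequence of this grammar is that a literal digit placed directly after an element would be read as that element's length, so separators such as `-` or `:` are the safer choice there.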


    ===============
    Collision Format: %a for alphabetic suffix, %n for numeric suffix
    var citeKeyCollisionFormat = Zotero.getPrefs("bibtexKeyCollisionFormat") ? Zotero.getPrefs("bibtexKeyCollisionFormat") : "-%n";
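
As a rough sketch of how such a collision format might be applied: the helper names below are hypothetical, and several behaviours are assumptions the readme does not confirm, namely that the first item keeps the bare key, that suffixes start at 1 or "a", and that the alphabetic form stops at "z" in this simplified version.

```javascript
// Expand %n / %a in the collision format for the n-th duplicate (n >= 1).
function formatSuffix(collisionFormat, n) {
  return collisionFormat
    .replace(/%n/g, String(n))
    .replace(/%a/g, () => {
      if (n < 1 || n > 26) throw new RangeError("this sketch only covers a..z");
      return String.fromCharCode(96 + n); // 1 -> "a", 26 -> "z"
    });
}

// Return baseKey if unused, otherwise the first suffixed variant not in usedKeys.
function makeUniqueKey(baseKey, usedKeys, collisionFormat) {
  if (!/%[na]/.test(collisionFormat)) {
    throw new Error("collision format needs %n or %a, otherwise keys never change");
  }
  let key = baseKey;
  for (let n = 1; usedKeys.has(key); n++) {
    key = baseKey + formatSuffix(collisionFormat, n);
  }
  usedKeys.add(key);
  return key;
}
```

Calling `makeUniqueKey("smith2011", used, "-%n")` three times on the same `Set` gives `smith2011`, `smith2011-1`, `smith2011-2` under these assumptions.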


    ===============
    Specified Key Field: e.g. callNumber for using the stored key
    var citeKeyField = Zotero.getPrefs("bibtexKeyField"); //e.g. callNumber
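
Note the two levels of naming here: the pref is looked up by its own name (bibtexKeyField), and its value is the name of the item field (e.g. callNumber) that holds the stored key. A minimal sketch of that lookup, using a plain object as a stand-in for the browser prefs; `getPref`, `chooseCiteKey`, and the fallback-to-generated-key behaviour are assumptions for illustration only.

```javascript
// Stand-in for the browser prefs; per the readme the real entry would be
// extensions.zotero.bibtexKeyField.
const prefs = { bibtexKeyField: "callNumber" };

function getPref(name) {
  return Object.prototype.hasOwnProperty.call(prefs, name) ? prefs[name] : undefined;
}

// Use the stored key if the configured field is non-empty, else generate one.
function chooseCiteKey(item, generateKey) {
  const keyField = getPref("bibtexKeyField"); // pref name, not the field name
  const stored = keyField ? item[keyField] : "";
  return stored ? stored : generateKey(item);
}
```

So the translator line stays as written above, and the field name is changed only by editing the pref's value.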
  • see more discussions in zotero-dev:
    http://groups.google.com/group/zotero-dev/browse_thread/thread/53d7c67357f74cef/26c751f6f0108e07
  • so when can we take advantage of this new functionality. I am eager to connect to Sciplore.
  • Regarding the "Collision Format" section, I'll just note that alphabetic suffixes for disambiguation are not as simple as they look at first, and you'd want to be careful with the algorithm before turning it loose on very large bodies of data, or files that generate many identical keys.

    The problem is that the "doubling up" (i.e. the step from "z" to "aa") can't easily be cast numerically (at least, I wasn't able to find a truly simple formula for it), because unlike numeric expressions, the counting system behind the sequence has no zero placeholder. It can be done (what I think is a correct algorithm for it is implemented in the citeproc-js citation processor used by Zotero), but it does want a little care.

    (People with stronger mathematical backgrounds than I will now pile in to tell me I'm wrong, but I'm always up for a learning experience. :)
  • This is pretty basic, and I haven't tried it at all.. but it might work.
    var key = "";
    var letters = "abcdefghijklmnopqrstuvwxyz".split(""); // length 26
    do {
        key = letters[N % 26] + key; // N is the number to convert
        N = (N - N % 26) / 26;
    } while ( N != 0 );
  • I coded it up to run, but no dice, unfortunately.

    25 = "z"
    26 = "ba"

    675 = "zz"
    676 = "baa"

    As I remember from a science video I saw as a kid:
    Persian: "I've invented the zero!"
    European: "What?"
    Persian: "Oh, nothing, nothing ..."
  • @fbennett: you already have this code running in citeproc-js for year-suffixes, right?
  • Yep, sorry for flying off on tangents. Here's the code.
  • edited May 9, 2011
    Continuing the tangent, and accounting for Persian mathematics:
    var N = 677;
    var key = "";
    var X;
    var letters = ['a','b','c','d','e','f','g','h','i',
                   'j','k','l','m','n','o','p','q',
                   'r','s','t','u','v','w','x','y','z']; // length 26
    do {
        X = ((N % 26) == 0) ? 26 : (N % 26);
        key = letters[X-1] + key;
        N = (N - X) / 26;
        Zotero.debug(N);
    } while ( N != 0 );
    Zotero.debug(key);

    18:20:24 ===>26<===(number)
    18:20:24 ===>0<===(number)
    18:20:24 za

    The idea was right...
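
For comparison, here are the two approaches from this thread wrapped as functions, so the "no zero placeholder" issue can be checked at the boundaries. The function names are made up for this sketch; the second one follows the loop posted above, with an added guard because that loop does not terminate sensibly for inputs below 1.

```javascript
const LETTERS = "abcdefghijklmnopqrstuvwxyz";

// Plain base 26 with "a" acting as the zero digit: 26 becomes "ba", never "aa".
function naiveSuffix(n) {
  let key = "";
  do {
    key = LETTERS[n % 26] + key;
    n = Math.floor(n / 26);
  } while (n !== 0);
  return key;
}

// Digits run 1..26 with no zero, so the sequence goes ..., "z", "aa", "ab", ...
function bijectiveSuffix(n) {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError("n must be a positive integer");
  }
  let key = "";
  while (n > 0) {
    const digit = n % 26 === 0 ? 26 : n % 26;
    key = LETTERS[digit - 1] + key;
    n = (n - digit) / 26;
  }
  return key;
}
```

Here `naiveSuffix(26)` gives `ba`, matching the failed run reported earlier, while `bijectiveSuffix(26)` gives `z`, `bijectiveSuffix(27)` gives `aa`, and `bijectiveSuffix(677)` gives `za` as in the debug output.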
  • Nice! This is much more compact and transparent than the code currently used in the processor. With your permission ... ?
  • edited May 9, 2011
    Of course. I'm not sure something this non-original would even be subject to copyright, and I sure hope it isn't subject to patent law, but I grant unlimited, worldwide and permanent licenses to any and all rights and interests in this code and the algorithm it implements to Frank Bennett, Jr. and to the Zotero project, with no restrictions whatsoever, to such an extent as is possible under law.
  • edited May 9, 2011
    I don't know about originality, but it's a lot better than the tangled mess that's in there at the moment. Thanks!

    ... and after this brief non-commercial interruption, we now return to the regularly scheduled topic of this thread ...
  • My solution for the key collision problem only handles suffixes of up to two characters, partly because I thought the key format would be poor anyway if you got more than 600 collisions for one key. But ajlyon clearly provides a much more elegant solution. Great job.

    Back to the topic... I updated my implementation. Do any of you know how long it takes for the zotero team to review the code? Who has the final word, or is it already killed?

    Thanks,
    spartan
  • Dan Stillman as the lead developer has the final word.
    From following the discussion, I don't think it's dead on arrival, certainly not all of it. As I understand it, the escaping of latex may still be a source of disagreement, even if triggered by a preference option, but I'd keep that discussion over at zotero-dev (I don't have any opinion on that, I'm just conveying an impression).

    In any case I wouldn't expect this to be implemented super quickly, especially as core code is concerned - the core team isn't big, resources are limited so you'll have to be a bit patient.

    As a more general note on procedure - in hindsight it may have been better to discuss your ideas for changes on zotero-dev before implementing them, thus avoiding some of the discord there: Of course Zotero is open source and you are free to play with the code, but if you hope for your changes to be implemented in Zotero, it's a good idea to start the process of communication earlier rather than later.
    That isn't to say that your contributions aren't appreciated, of course.
  • Spartanroc, your patches look great. But I have a very basic question. I have tried to install your files (May 4, 2011). I just copied the files in the translators folder... but they don't work. Are there other steps that I have to follow in order to enjoy these zotero improvements? I am using zotero 2.1.6. When I export a reference using BibTeXTan.js or BibTeXKeyOnly.js I get an error message.

    I really appreciate any help. Thank you in advance.
  • edited May 20, 2011
    As his notes indicate, his translators require modifications to the following core parts of zotero:
    content/zotero/xpcom/translation/item_local.js
    content/zotero/xpcom/translation/translate.js
    and he has distributed modified versions of those files and the patches needed to modify the files yourself. Naturally, this customization is not supported. The patches are small and should be safe (though read Avram's comment on that thread regarding the translate.js patch), but you're on your own if something breaks.
  • Thank you noksagt. Basic question again:

    I can't find those paths and files (item_local and translate) in my laptop (windows). How can I apply those patches... copy and paste?

    Thank you.
  • are you sure you want to do this?
    What you'd need to do is to unpack the zotero.jar java archive (any package manager will do) apply the patches and recompile as java - I believe that's usually done through the command line, though I'm sure there are gui tools -
    it's not rocket science, but I'm not sure if you really want to be fiddling around with your citation software if you don't at least somewhat know what you're doing?
  • edited May 20, 2011
    I only want to have this copy-paste feature for latex citing (it doesn't work in my laptop), and to fix the problems with Spanish accents in my references. Any simple ideas?
  • "copy-paste feature for latex citing (it doesn't work in my laptop)"
    What, specifically, do you mean?
    "fix the problems with Spanish accents in my references."
    What problems are you having? The proposed modifications add a few additional UTF-8->LaTeX entity transliterations (but only for Greek characters), but shouldn't change any Spanish accent handling...
  • edited May 20, 2011
    Thank you noksagt:

    1) I just want to copy a reference from Zotero to my latex editor and get something like "\cite{reference}". As far as I know BibTeXKeyOnly.js should do that, but it doesn't work in my laptop.

    2) I have a reference in Zotero where the author is for example "Sebastián". When I export the references, "Sebastián" remains in the .bib (zotero exported file), but when I build the pdf in my latex editor I get "Sebastin".

    3) Is there any way to define the cite key manually?

    Thank you in advance.
  • (1) http://forums.zotero.org/discussion/5094/drag-and-drop-bibtex-cite/
    (2) either use a bibtex/latex/pdf toolchain that supports utf-8 or select something other than utf-8 for your character encoding of exported BibTeX files in Zotero.
    (3) not yet
  • Thank you noksagt. 1 and 2 worked.
  • sdaza,

    noksagt gave you good advice. But I just wanted to point out that it is fairly easy to change the zotero core files if you know where your firefox profile folder is. The file zotero.jar is in the chrome directory of the zotero extension under your firefox profile folder. You can use any zip tool (e.g., 7-zip) to open it. Then you can use drag & drop to replace the files in the archive. No need for command line and compiling as adamsmith suggested.

    If you use my implementation, manual cite key can be stored in a field you can specify, for example, callNumber.

    spartan
  • There should be a lot of bibtex/latex users in the zotero community, right? Or do they shy away from it due to the poor bibtex support?

    I guess that many of these users would love my implementations. If you do, please show some support and make some comments. If you are tech savvy, please also join the discussions in zotero-dev: http://groups.google.com/group/zotero-dev/browse_thread/thread/53d7c67357f74cef/26c751f6f0108e07

    We as users need to push the developers to support bibtex better. I feel that my implementations may eventually go down the drain if not many users show interest. I am not a code developer but a normal user, so I am not going to update my modifications for new releases of zotero. Nor will I be able to answer emails/questions about manual patches in a timely way. But I'll, of course, try my best to help in this thread if and when I can.

    spartan
  • I badly need this too !

    I'm using my own work based on what is described there : http://forums.zotero.org/discussion/17680/tools-for-using-zotero-together-with-emacs/

    But a better integrated solution would be great !
  • Thank you Spartanroc. I installed your files and they worked perfectly. A question: how can I enable the manual cite key? What file do I have to modify?

    var citeKeyField = Zotero.getPrefs("bibtexKeyField"); //e.g. callNumber

    Thank you in advance!
  • edited May 23, 2011
    I wrote:

    var citeKeyField = Zotero.getPrefs("callNumber"); //e.g. callNumber

    Both in bibtex.js and bibtexkeyonly.js, but it didn't work.
    Thank you.