Rich text formatting in bibtex exports

Hi,

I have an extensive list of publications with biological and chemical names in the titles requiring rich text formatting (italics, subscript etc), and I have manually formatted all the titles with html tags.

While this works for Word, my primary means of bibliography management is by exporting to a bibtex file (for use with latex), and the formatting does not work in this case (the tags just appear as text).

What do I need to do to get the proper formatting? Please help!
  • How should they look in a bibtex file? The same HTML tags, or LaTeX markup (remind me how that would look)? I don't know if there are objections to putting this into the general bibtex translator - though I wouldn't know what they'd be - but I could definitely make a translator available for you to custom install that does this right.
  • Eventually, the HTML file should be converted to LaTeX markup. Someone refresh my memory as to the HTML formatting we allow in these fields? The only reason this hasn't yet been done is time. Example of subscript:
    $_{\textrm{SUBSCRIPTED}}$
    Italics can be either presentational or semantic in both HTML and LaTeX:
    \textit{presentational, italicized text}
    \emph{semantic, emphasized text}
  • edited August 21, 2012
    http://www.zotero.org/support/kb/rich_text_bibliography

    (formerly, we suggested users use <sc/> for small-caps)
  • Here is what we allow:
    http://www.zotero.org/support/kb/rich_text_bibliography
    <i> <b> <sub> <sup> and <span style="font-variant:small-caps;"> (ugh).

    Since we use the presentational <i> in HTML, we should probably use \textit in TeX.
  • edited August 21, 2012
    Since we use the presentational <i> in HTML, we should probably use \textit in TeX.
    I agree, though half-heartedly. I'd imagine that most cases of title-based markup (foreign-language/species names, etc.) should really use semantic emphasis. We should decide on what behavior is expected for BST styles that already put titles in italics & make sure that things work (this is also an edge problem for the sub-/super-scripting: we need to break out of the math style, but assume a regular (non-italics/non-bold) roman font). Anyway, as a LaTeX refresher:
    $^{\textrm{SUPERSCRIPTED}}$
    \textbf{BOLD}
    \textsc{SMALL-CAPS}
  • edited August 23, 2012
    Thank you for the comments everyone. I am guessing if I change the html tags to the appropriate tex tags, this will then break the rich text formatting for use with Word?

    Although I want to use my bibliography with latex, I still have to use Word once in a while. So ideally the solution must work in all scenarios.

    If I understand correctly, the bibtex exporter (in zotero) will eventually be able to correctly translate the html tags?

    Edit:
    Is there any way to edit the library entries in bulk? For example:
    Find: <i>
    Replace with: \textit{
  • If I understand correctly, the bibtex exporter (in zotero) will eventually be able to correctly translate the html tags?
    yes
    Is there any way to edit the library entries in bulk? For example:
    Find: <i>
    Replace with: \textit{
    no, not currently.
  • adamsmith,

    You mentioned making a custom translator earlier (I missed it between all the comments). That would certainly be of great help, as my current plan involves manually editing the html tags of a few hundred references (which I manually put in, in the first place!) :-p

    You may have already remembered/realized the relevant tags:

    Italics-
    html: <i>" "</i> to latex: \textit{" "}

    Subscript-
    html: <sub>" "</sub> to latex: $_{" "}$

    Superscript-
    html: <sup>" "</sup> to latex: $^{" "}$

    There's more- bold and underlined, but I don't think these are ever used.

    I am however already using the custom bibtex exporter plugin for automatic updating, AutoZotBib.
  • edited August 28, 2012
    You mentioned making a custom translator earlier
    Given that there's no reason not to include this in the stock translator, I imagine it will (eventually) be & there's no need to make it a custom translator (except for testing).
    You may have already remembered/realized the relevant tags
    These are slightly wrong (they leave stuff in math font). You missed smallcaps (and, as you said, bold). Can underline be used? It wasn't documented. Anyway, all in one place:
    $^{\textrm{SUPERSCRIPTED}}$
    $_{\textrm{SUBSCRIPTED}}$
    \textit{presentational, italicized text}
    \textbf{BOLD}
    \textsc{SMALL-CAPS}
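
    The mappings listed above can be tried out in a small test document. This is only a sketch of the LaTeX side, not of how the Zotero translator produces it: the sample words are invented, and loading amsmath is an assumption added here (without it, \textrm inside a sub-/superscript is typeset at full text size). The last two lines illustrate the edge case raised earlier about titles that are already in italics.

    ```latex
    \documentclass{article}
    % amsmath is an addition, not something the thread mentions: without it,
    % \textrm inside a sub-/superscript stays at full text size.
    \usepackage{amsmath}
    \begin{document}
    % Comments name the Zotero rich-text tag; the sample words are invented.

    \textit{Escherichia coli} growth rates   % <i>

    \textbf{Bold} text                       % <b>

    \textsc{Small Caps} text                 % small-caps span

    H$_{\textrm{2}}$O                        % <sub>

    Ca$^{\textrm{2+}}$ signalling            % <sup>

    % Edge case: \textrm only selects the roman family, so inside an
    % italic title the subscript inherits the italic shape.
    \textit{Uptake of CO$_{\textrm{2}}$ by algae}

    % \textnormal resets family, series and shape to the document default.
    \textit{Uptake of CO$_{\textnormal{2}}$ by algae}
    \end{document}
    ```

    Whether \textrm or \textnormal is the better choice for the exporter depends on how the bibliography style sets titles, so treat the last line as one option rather than a recommendation.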
  • Given that there's no reason not to include this in the stock translator, I imagine it will (eventually) be & there's no need to make it a custom translator (except for testing).
    agreed - I'll try to get to this and one other bibtex issue I've wanted to deal with for a while asap. Once it's done I'll post here and you can install the translator - it might take a little extra time until it goes through review and makes it into Zotero, but that's definitely the goal.
  • I've issued a pull request for an updated translator - but since this is bibtex and thus quite crucial, expect this to take some time.

    It'd be very helpful if you could test this in the meantime, though.
    The file is here
    https://github.com/adam3smith/translators/raw/bibtex2/BibTeX.js

    download it and place it in the translator folder of your Zotero data directory:
    http://www.zotero.org/support/zotero_data
    replacing the old translator. You may have to restart Firefox/Zotero.

    If anything looks weird, let us know. Testing this is risk-free; you can always restore the original translator by resetting translators from the advanced tab of the Zotero preferences.

    @noksagt - obviously, if you'd take a look as well that'd be greatly appreciated. I'm always on somewhat shaky feet with bibtex.
  • Update - actually, this is now up on Zotero, so you can just update your translator - either by waiting 24 hours or by updating from the general tab of the Zotero preferences.
    Any problems, of course, still let us know.
  • edited November 29, 2012
    Hello adamsmith!

    Sorry about the late response (I had given up after learning that it might take a while, and did not see your later updates).

    Thank you for the updated translator. I only just got around to updating firefox and zotero after a long time, and the italics, superscripts etc. appear as expected.

    I do have a new problem now :(
    I am using biber/biblatex in pdflatex to typeset my references, and it seems biber is stumbling over the accented characters and symbols when using the new .bib, while it seems to have no problems with the .bib generated by the old translator. Will have to sort that one out for now.

    Thank you once again for the update! I really wanted to see this issue solved, but I had not expected it to be resolved and updated so fast!

    Edit: The issue was solved by adding \usepackage[utf8]{inputenc} to the preamble. I wonder why this problem occurred in the first place!
  • which charset are you exporting to? biber/biblatex should just work with utf-8, no?
  • edited November 29, 2012
    The .bib file was always exported using UTF-8. It seems pdflatex does not work well with utf-8, and therefore requires the inputenc package with the utf8 option.

    On the other hand, like I said, I did not see this problem with pdflatex/biblatex/biber and the older .bib produced by zotero (3.0.8). So I am not sure what exactly might be going on here. Using the older .bib file (which was also utf-8 encoded) still produces the accents just fine (but without the new rich text formatting). So clearly something is different with the new .bib file that isn't sitting well with either pdflatex or biber (unless the inputenc package is used).
  • if you can narrow that down we're happy to try to fix it - but the encoding of accented characters hasn't changed afaik - are you sure that's the cause of the problem?
  • edited November 29, 2012
    Adding \usepackage[utf8]{inputenc} to the preamble appears to have fixed* the problem, so it seems as if it might be an encoding issue.

    I have no clue how to narrow down the problem, but if anyone has any suggestions, I will try them to find the source of this odd behavior.

    *(it seems β is still a problem- weird!)
    Here is the error message for β:
    Package inputenc Error: Unicode char \u8:β not set up for use with LaTeX
  • I have no clue how to narrow down the problem, but if anyone has any suggestions, I will try them to find the source of this odd behavior.
    You could provide a minimal example by showing the same entry for a single bibliographic record from the two files and a bare bones .tex file.

    http://www.tex.ac.uk/cgi-bin/texfaq2html?label=minxampl
  • edited December 6, 2012
    Hi noksagt,

    Here are the MWEs:

    The old .bib output:
    @ARTICLE{lopez-nicolas_kinetic_2007,
    author = {{L\'{o}pez-Nicol\'{a}s}, Jos\'{e} M. and {P\'{e}rez-L\'{o}pez}, Antonio
    J. and {Carbonell-Barrachina}, \'{A}ngel and {Garc\'{i}a-Carmona},
    Francisco},
    title = {Kinetic study of the activation of banana juice enzymatic browning
    by the addition of maltosyl-$\beta$-cyclodextrin},
    year = {2007},
    volume = {55},
    number = {23},
    month = nov,
    pages = {9655--9662},
    doi = {10.1021/jf0713399},
    url = {http://dx.doi.org/10.1021/jf0713399},
    abstract = {In recent years, the use of cyclodextrins {(CDs)} as antibrowning
    agents in fruit juices has received growning attention. However,
    there has been no detailed study of the behavior of these molecules
    as substances, which can lead to the darkening of foods. In this
    paper, when the color of fresh banana juice was evaluated in the
    presence of different {CDs}, the evolution of several color parameters
    was the opposite of that observed in other fruit juices. Moreover,
    a kinetic model based on the complexation by {CDs} of the natural
    browning inhibitors present in banana is developed for the first
    time to clarify the enzymatic browning activation of banana juice.
    Finally, the apparent complexation constant between the natural polyphenoloxidase
    inhibitors present in banana juice and {maltosyl-$\beta$-CD} was
    calculated {(Kci} = 27.026 {\textpm} 0.212 {mM-1).}},
    journal = {Journal of Agricultural and Food Chemistry}
    }
    The .tex MWE:
    \documentclass{article}
    \usepackage[backend=biber]{biblatex}
    \addbibresource{old.bib}
    \begin{document}
    This is how an entry \cite{lopez-nicolas_kinetic_2007} appeared with .bib file from the old translator.
    \printbibliography
    \end{document}
    Partial log file, for comparison with later log files (no error messages here):

    .
    .
    .
    ("/path\biblatex.cfg"
    File: biblatex.cfg
    )))
    Package biblatex Info: Trying to load language 'english'...
    Package biblatex Info: ... file 'english.lbx' found.

    ("path\english.lbx"
    File: english.lbx 2012/10/29 v2.3 biblatex localization (PK/JW/AB)
    )
    \@quotelevel=\count203
    \@quotereset=\count204

    (path\oldstyle.aux)
    LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 12.
    LaTeX Font Info: ... okay on input line 12.
    LaTeX Font Info: Checking defaults for T1/cmr/m/n on input line 12.
    LaTeX Font Info: ... okay on input line 12.
    LaTeX Font Info: Checking defaults for OT1/cmr/m/n on input line 12.
    LaTeX Font Info: ... okay on input line 12.
    LaTeX Font Info: Checking defaults for OMS/cmsy/m/n on input line 12.
    LaTeX Font Info: ... okay on input line 12.
    LaTeX Font Info: Checking defaults for OMX/cmex/m/n on input line 12.
    LaTeX Font Info: ... okay on input line 12.
    LaTeX Font Info: Checking defaults for U/cmr/m/n on input line 12.
    LaTeX Font Info: ... okay on input line 12.
    Package biblatex Info: No input encoding detected.
    (biblatex) Assuming 'ascii'.
    Package biblatex Info: Automatic encoding selection.
    (biblatex) Assuming data encoding 'ascii'.
    Package biblatex Info: Trying to load bibliographic data...
    Package biblatex Info: ... file 'oldstyle.bbl' found.

    (C:\path\oldstyle.bbl)
    Package biblatex Info: Reference section=0 on input line 12.
    Package biblatex Info: Reference segment=0 on input line 12.
    LaTeX Font Info: External font `cmex10' loaded for size
    (Font) <7> on input line 15.
    LaTeX Font Info: External font `cmex10' loaded for size
    (Font) <5> on input line 15.
    [1

    {path/pdftex.map}]
    (path\oldstyle.aux)
    Package logreq Info: Writing requests to 'oldstyle.run.xml'.
    )
    The output (accented characters and symbols appear correctly):
    José M. López-Nicolás et al. “Kinetic study of the activation of banana
    juice enzymatic browning by the addition of maltosyl-β-cyclodextrin”. In:
    Journal of Agricultural and Food Chemistry 55.23 (Nov. 2007), pp. 9655–9662.
    doi: 10.1021/jf0713399. url: http://dx.doi.org/10.1021/jf0713399.
    I am not sure if this is the best way to post this information. I am splitting the MWEs across posts to avoid making one humongous post. Please let me know if there is a better way to do this.
  • edited December 6, 2012
    The new .bib output:
    @ARTICLE{lopez-nicolas_kinetic_2007,
    author = {López-Nicolás, José M. and Pérez-López, Antonio J. and Carbonell-Barrachina,
    Ángel and García-Carmona, Francisco},
    title = {Kinetic study of the activation of banana juice enzymatic browning
    by the addition of maltosyl-β-cyclodextrin},
    year = {2007},
    volume = {55},
    number = {23},
    month = nov,
    pages = {9655--9662},
    doi = {10.1021/jf0713399},
    url = {http://dx.doi.org/10.1021/jf0713399},
    urldate = {2011-06-10},
    abstract = {...},
    journal = {Journal of Agricultural and Food Chemistry}
    }
    The .tex MWE:
    \documentclass{article}
    \usepackage[backend=biber]{biblatex}
    \addbibresource{new.bib}
    \begin{document}
    This is how an entry \cite{lopez-nicolas_kinetic_2007} appears with .bib file from the new translator.
    \printbibliography
    \end{document}
    Partial log file:
    ("path\biblatex.cfg"
    File: biblatex.cfg
    )))
    Package biblatex Info: Trying to load language 'english'...
    Package biblatex Info: ... file 'english.lbx' found.

    ("path\english.lbx"
    File: english.lbx 2012/10/29 v2.3 biblatex localization (PK/JW/AB)
    )
    \@quotelevel=\count203
    \@quotereset=\count204

    (path\newstyle.aux)
    LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 12.
    .
    .
    (same as above)
    .
    .
    LaTeX Font Info: ... okay on input line 12.
    Package biblatex Info: No input encoding detected.
    (biblatex) Assuming 'ascii'.
    Package biblatex Info: Automatic encoding selection.
    (biblatex) Assuming data encoding 'ascii'.
    Package biblatex Info: Trying to load bibliographic data...
    Package biblatex Info: ... file 'newstyle.bbl' found.

    (path\newstyle.bbl)
    Package biblatex Info: Reference section=0 on input line 12.
    Package biblatex Info: Reference segment=0 on input line 12.

    ! Undefined control sequence.
    <argument> Jos\x
    {fffd}\x {fffd}\bibnamedelima M.

    l.16 \end
    {document}
    The control sequence at the end of the top line
    of your error message was never \def'ed. If you have
    misspelled it (e.g., `\hobx'), type `I' and the correct
    spelling (e.g., `I\hbox'). Otherwise just continue,
    and I'll forget about whatever was undefined.

    ! Undefined control sequence.
    <argument> Jos\x {fffd}\x
    {fffd}\bibnamedelima M.

    l.16 \end
    {document}
    The control sequence at the end of the top line
    of your error message was never \def'ed. If you have
    misspelled it (e.g., `\hobx'), type `I' and the correct
    spelling (e.g., `I\hbox'). Otherwise just continue,
    and I'll forget about whatever was undefined.

    ! Undefined control sequence.
    <argument> L\x
    {fffd}\x {fffd}pez-Nicol\x {fffd}\x {fffd}s

    l.16 \end
    {document}
    The control sequence at the end of the top line
    of your error message was never \def'ed. If you have
    misspelled it (e.g., `\hobx'), type `I' and the correct
    spelling (e.g., `I\hbox'). Otherwise just continue,
    and I'll forget about whatever was undefined.

    ! Undefined control sequence.
    <argument> L\x {fffd}\x
    {fffd}pez-Nicol\x {fffd}\x {fffd}s

    l.16 \end
    {document}
    The control sequence at the end of the top line
    of your error message was never \def'ed. If you have
    misspelled it (e.g., `\hobx'), type `I' and the correct
    spelling (e.g., `I\hbox'). Otherwise just continue,
    and I'll forget about whatever was undefined.

    ! Undefined control sequence.
    <argument> L\x {fffd}\x {fffd}pez-Nicol\x
    {fffd}\x {fffd}s

    l.16 \end
    {document}
    The control sequence at the end of the top line
    of your error message was never \def'ed. If you have
    misspelled it (e.g., `\hobx'), type `I' and the correct
    spelling (e.g., `I\hbox'). Otherwise just continue,
    and I'll forget about whatever was undefined.

    ! Undefined control sequence.
    <argument> ...{fffd}\x {fffd}pez-Nicol\x {fffd}\x
    {fffd}s

    l.16 \end
    {document}
    The control sequence at the end of the top line
    of your error message was never \def'ed. If you have
    misspelled it (e.g., `\hobx'), type `I' and the correct
    spelling (e.g., `I\hbox'). Otherwise just continue,
    and I'll forget about whatever was undefined.

    ! Undefined control sequence.
    <argument> ...ning by the addition of maltosyl-\x
    {fffd}\x {fffd}-cyclodextrin

    l.16 \end
    {document}
    The control sequence at the end of the top line
    of your error message was never \def'ed. If you have
    misspelled it (e.g., `\hobx'), type `I' and the correct
    spelling (e.g., `I\hbox'). Otherwise just continue,
    and I'll forget about whatever was undefined.

    ! Undefined control sequence.
    <argument> ...he addition of maltosyl-\x {fffd}\x
    {fffd}-cyclodextrin

    l.16 \end
    {document}
    The control sequence at the end of the top line
    of your error message was never \def'ed. If you have
    misspelled it (e.g., `\hobx'), type `I' and the correct
    spelling (e.g., `I\hbox'). Otherwise just continue,
    and I'll forget about whatever was undefined.

    LaTeX Font Info: External font `cmex10' loaded for size
    (Font) <7> on input line 15.
    LaTeX Font Info: External font `cmex10' loaded for size
    (Font) <5> on input line 15.
    [1

    {path/pdftex.map}]
    (path\newstyle.aux)
    Package logreq Info: Writing requests to 'newstyle.run.xml'.
    )
    The output:
    Josfffdfffd M. Lfffdfffdpez-Nicolfffdfffds et al. “Kinetic study of the activation of banana juice enzymatic browning by the addition of maltosyl- fffdfffd-cyclodextrin”. In: Journal of Agricultural and Food Chemistry 55.23 (Nov. 2007), pp. 9655–9662. doi: 10.1021/jf0713399. url: http://dx.doi.org/10.1021/jf0713399 (visited on 06/10/2011).
  • edited December 6, 2012
    And finally, the .tex MWE with explicit UTF-8 support:
    \documentclass{article}
    \usepackage[utf8]{inputenc}
    \usepackage[backend=biber]{biblatex}
    \addbibresource{new.bib}
    \begin{document}
    This is how an entry \cite{lopez-nicolas_kinetic_2007} appears with .bib file from the new translator when using the inputenc package for utf8.
    \printbibliography
    \end{document}
    Partial log file:

    .
    .
    .
    ("path\biblatex.cfg"
    File: biblatex.cfg
    )))
    Package biblatex Info: Trying to load language 'english'...
    Package biblatex Info: ... file 'english.lbx' found.

    ("path\english.lbx"
    File: english.lbx 2012/10/29 v2.3 biblatex localization (PK/JW/AB)
    )
    \@quotelevel=\count203
    \@quotereset=\count204

    (path\newutf.aux)
    LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 12.
    .
    .
    (same as above)
    .
    .
    LaTeX Font Info: ... okay on input line 12.
    Package biblatex Info: Input encoding 'utf8' detected.
    Package biblatex Info: Automatic encoding selection.
    (biblatex) Assuming data encoding 'utf8'.
    Package biblatex Info: Trying to load bibliographic data...
    Package biblatex Info: ... file 'newutf.bbl' found.

    (path\newutf.bbl)
    Package biblatex Info: Reference section=0 on input line 12.
    Package biblatex Info: Reference segment=0 on input line 12.


    ! Package inputenc Error: Unicode char \u8:β not set up for use with LaTeX.

    See the inputenc package documentation for explanation.
    Type H <return> for immediate help.
    ...

    l.16 \end
    {document}
    Your command was ignored.
    Type I <command> <return> to replace it with another command,
    or <return> to continue without it.

    LaTeX Font Info: External font `cmex10' loaded for size
    (Font) <7> on input line 15.
    LaTeX Font Info: External font `cmex10' loaded for size
    (Font) <5> on input line 15.
    [1

    {path/pdftex.map}]
    (path\newutf.aux)
    Package logreq Info: Writing requests to 'newutf.run.xml'.
    )
    The output:
    José M. López-Nicolás et al. “Kinetic study of the activation of banana juice
    enzymatic browning by the addition of maltosyl--cyclodextrin”. In: Journal of Agricultural and Food Chemistry 55.23 (Nov. 2007), pp. 9655–9662. doi:
    10.1021/jf0713399. url: http://dx.doi.org/10.1021/jf0713399
    (visited on 06/10/2011).
    I hope these help identify or narrow down the source of the issue. I haven't posted the full logs but can provide them if required.
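
    The remaining inputenc error can usually be worked around by telling inputenc what to do with that one character. A minimal sketch, assuming the same new.bib as in the MWE and that β (U+03B2) is the only character inputenc complains about; \DeclareUnicodeCharacter is provided by the utf8 option of inputenc:

    ```latex
    \documentclass{article}
    \usepackage[utf8]{inputenc}
    % Map U+03B2 (Greek small letter beta) to the math symbol.
    % \ensuremath lets the mapping work in both text and math mode.
    \DeclareUnicodeCharacter{03B2}{\ensuremath{\beta}}
    \usepackage[backend=biber]{biblatex}
    \addbibresource{new.bib}
    \begin{document}
    \cite{lopez-nicolas_kinetic_2007}
    \printbibliography
    \end{document}
    ```

    Any other Greek letter or symbol that triggers the same error would need its own declaration along the same lines.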
  • I guess the new .bib output from zotero is different from earlier:

    L\'{o}pez (old output) vs López (new output)
    and $\beta$ vs β

    which is causing the problem.
  • Zotero can do both (and that's been the case for a long time). By default it exports UTF-8 (which I understood biblatex was supposed to be able to handle with inputenc? noksagt will know more about that), but if you select "Display Character Encoding..." in the export tab of the preferences and then select ISO 8859-1 on export, it will escape special characters.
  • On a related topic, I'm having some issues with the export of rich text to bibtex. If I apply rich text to a title, it shows fine when using "Create Bibliography from Selected Item" but not when exporting to bibtex.

    Using the quick start guide item that Zotero comes with as an example, if I give the title as

    Zotero Quick Start <i>Guide</i>

    I get the following output for bibtex

    @misc{center_for_history_and_new_media_zotero_????,
    title = {Zotero Quick Start \{\textless\}i\{\textgreater\}Guide\{\textless\}/i\{\textgreater\}},
    url = {http://zotero.org/support/quick_start_guide},
    author = {\{Center for History and New Media\}},
    howpublished = {http://zotero.org/support/quick\_start\_guide},
    annote = {Welcome to Zotero!
    View the Quick Start Guide to learn how to begin collecting, managing, citing, and sharing your research sources.
    Thanks for installing Zotero.}
    },

    Any help will be greatly appreciated as I need to be able to apply italics to items for my work.

    Thanks in advance
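
    For comparison, here is a sketch of what the title field would be expected to contain once the tag is translated rather than escaped, following the <i> to \textit mapping agreed earlier in the thread. The file name expected.bib and the entry key quickstart are made up for this example, and the entry is trimmed to three fields:

    ```latex
    % Write a small .bib file next to the document (hypothetical name and key).
    \begin{filecontents*}{expected.bib}
    @misc{quickstart,
      title = {Zotero Quick Start \textit{Guide}},
      author = {{Center for History and New Media}},
      url = {http://zotero.org/support/quick_start_guide}
    }
    \end{filecontents*}
    \documentclass{article}
    \usepackage[backend=biber]{biblatex}
    \addbibresource{expected.bib}
    \begin{document}
    \cite{quickstart}
    \printbibliography
    \end{document}
    ```

    In the broken export above, the angle brackets are instead escaped as \textless and \textgreater, so the tag is printed literally.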
  • confirmed - I broke this somewhere. grumble.
  • It seems to be an issue for all 6 cases tested for in the code.
  • yeah, the tests unfortunately only go the other way around and that works, that's why it slipped through.
  • edited December 12, 2012
    I tried exporting with ISO 8859-1, and it worked without having to use \usepackage[utf8]{inputenc}, but the beta symbol is still a problem regardless of whether the utf8/inputenc package is loaded or not (although, it now appears as a ? instead of disappearing altogether, and no error messages are generated).
  • does Zotero export $\beta$ or β in ISO 8859?
  • Neither! The bibtex file reads ? instead of $\beta$ or β with ISO 8859-1.

    for completeness:

    it currently exports as β with utf8, but used to export as $\beta$ before the update (with utf8).