Bibtex export breaks page numbers

When I import a valid BibTeX file into Zotero and then export the same item, Zotero breaks the en dash between page numbers.

I import this

@article{cusumano2008changing,
title={{The changing software business: Moving from products to services}},
author={Cusumano, M.A.},
journal={Computer},
pages={20--27},
year={2008},
publisher={IEEE Computer Society}
}

and exporting gives me this

@article{cusumano_changing_2008,
title = {The changing software business: Moving from products to services},
journal = {Computer},
author = {M. A Cusumano},
year = {2008},
pages = {20–27}
}

It might not be visible in the forum, but the "long dash" that appears instead of the two "short dashes" is not a valid LaTeX symbol and prevents the BibTeX file from compiling.
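
As a stopgap, the exported file can be post-processed so that page ranges use the ASCII `--` form again. The following is only a minimal sketch, not anything Zotero provides: the function name `ascii_page_ranges` and the regular expression are assumptions made for illustration, and it only handles a `pages` field whose value sits on a single line inside braces or quotes.

```python
import re

EN_DASH = "\u2013"

# Hypothetical pattern: a BibTeX "pages" field on a single line, e.g. pages = {20–27}
# The \b keeps it from matching other fields that merely end in "pages" (e.g. numpages).
PAGES_FIELD = re.compile(r'(\bpages\s*=\s*[{"])([^}"]*)([}"])', re.IGNORECASE)


def ascii_page_ranges(bibtex_text: str) -> str:
    """Replace en dashes with '--' inside pages fields only; leave other fields alone."""
    def fix(match: re.Match) -> str:
        opening, value, closing = match.groups()
        return opening + value.replace(EN_DASH, "--") + closing

    return PAGES_FIELD.sub(fix, bibtex_text)


exported = (
    "@article{cusumano_changing_2008,\n"
    "  year = {2008},\n"
    "  pages = {20\u201327}\n"
    "}"
)
print(ascii_page_ranges(exported))
```

Only the `pages` field is rewritten, so an en dash in a title or abstract is left as it was exported.
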
  • If I export the imported item as RIS, I get

    TY - JOUR
    ID - 754
    T1 - The changing software business: Moving from products to services
    JF - Computer
    A1 - Cusumano,M. A
    PY - 2008///
    SP - 20–27
    EP - 20–27
    ER -

    So apparently it is the import feature that is not working properly
  • It is a tough call. I'd actually say that import is working fine: '--' is an en dash, and should be converted as such on import (unless there is some agreement that only hyphens should be permitted in the 'pages:' field, but I don't think there is).

    I'd also say that BibTeX export works fine. The default is to export as UTF-8. There is no reason to do the conversion back if your TeX toolchain supports UTF-8. You probably want to export to ISO-8859-1, though & the en dash will then be converted for you.

    I'd say that the RIS export is broken if it doesn't split pages on en dashes (assuming that we'd want to support en dashes in the pages field).
  • edited October 1, 2009
    The problem is that my TeX compiler does not support UTF-8, and even if it did, there is no guarantee that the people I work with have UTF-8-capable TeX environments. Moreover, I see little downside in the reverse conversion.

    As for splitting pages on an en dash, I think it should be supported, since it is used in BibTeX.

    Is there a way to force the export to be ISO-8859-1?

    Edit: found this http://www.zotero.org/support/preferences/advanced
  • Is it possible to change the default encoding from UTF-8 to ISO-8859-1? The problem here is that we have a very heterogeneous computing environment and using UTF-8 breaks things.
  • If you enable the character set selector, it should remember the last character set you used. It sets the about:config setting 'extensions.zotero.export.translatorSettings' to: {"exportCharset":"ISO-8859-1"}. This does not seem to be applied to quick copy, though.
  • Indeed, it does remember that. Would it be better to have ISO-8859-1 as default instead of UTF, since

    Also, according to Wikipedia, the en dash is the proper punctuation mark for ranges of numbers http://en.wikipedia.org/wiki/Dash#En_dash, so it would make sense for it to be supported in page number fields. It would also be nice if hyphens were converted to en dashes in this field.
  • Would it be better to have ISO-8859-1 as default instead of UTF
    You didn't actually finish your reasoning behind this. This topic has been long-debated on here and on the dev list. There are good arguments in favor of either approach.

    My preference has always been to use the same default preference as JabRef & the other popular BibTeX-based reference managers. This is still, AFAIK, ISO-8859-1. This would be supported by all toolchains.

    I am not really sure why the default export encoding for BibTeX, in particular, is UTF-8 at present. It might just be that it uses the same preference as all exporters do. It might be that the argument to support export of CJK (which can't be supported through transliteration) is compelling. Or it might be some users' preference for more modern tools that do support UTF-8.

    In any case, this is a FAQ & we should do a better job of publishing our reasoning behind the choice & how to select the other option.
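
The RIS point raised in the thread (SP and EP both coming out as "20–27") comes down to splitting the pages value on whichever dash it contains. A rough sketch of such a split follows; the helper name `split_pages` is hypothetical and this is not Zotero's translator code, just an illustration that accepts an en dash, a double hyphen, or a single hyphen as the separator. For a single page it returns the same value for start and end, which is an assumption rather than anything the RIS format requires.

```python
import re

# Separator between first and last page: en dash, "--", or "-", with optional spaces.
PAGE_SEPARATOR = re.compile(r"\s*(?:\u2013|--?)\s*")


def split_pages(pages: str) -> tuple[str, str]:
    """Return (start, end) for values such as '20–27', '20--27' or '20-27'."""
    parts = PAGE_SEPARATOR.split(pages.strip(), maxsplit=1)
    start = parts[0]
    # Assumption: a single page number is used as both start and end.
    end = parts[1] if len(parts) > 1 else start
    return start, end


start, end = split_pages("20\u201327")
print("SP -", start)
print("EP -", end)
```

With the value from the original post, this yields a start page of 20 and an end page of 27 instead of repeating the whole range in both fields.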
