Apostrophes and dashes turning into odd characters in references

Some apostrophes and dashes are turning into odd characters (â, an uppercase A with an accent, and an uppercase S with a tail) in my references. The dash problem is mostly in the page numbers, but it can also happen in titles; the apostrophe problem happens in titles.

I'm using Zotero Standalone with TeXstudio. As messing suggests here https://forums.zotero.org/discussion/4331/bibtex-importexport-problems/ I tried modifying BibTeX.js so that the entry reads "\u2013":"-", // EN DASH (and similarly for the other characters), but that didn't fix it.

I can manually retype the character in Zotero and re-export, and that solves it. After retyping, the apostrophe looks more vertical; I can see no change to the dash. I also noticed that the corrupted dashes only result from importing via the folder icon (that is, when I have a list of results from Google Scholar, click the folder icon in the address bar, and check the boxes for individual articles, as opposed to clicking on a result and hitting the paper icon, which imports only that reference). The apostrophes are corrupted either way. This is true independent of browser (Firefox or Chrome). The add-on is up to date in both browsers.

Are there any suggestions as to how to correct these references (other than manually editing) or prevent the corruption in future references?

Thanks,
Dustin
  • If you open the BibTeX file in a text editor like Notepad++, do the characters look right? Also see http://tex.stackexchange.com/questions/154278/texstudio-changed-encoding
  • The BibTeX looks fine. The Stack Exchange link was very helpful. All the odd characters are gone. There's just the relatively minor problem that some of the references have a single - rather than -- in the page range. The former results in nothing between the first page number and the last.
  • Can you export such a reference to Zotero RDF, post the contents of the resulting file at http://gist.github.com and provide the link here?
  • I think I was able to do what you suggested:
    https://gist.github.com/anonymous/6bdc2aeb49f19b8bc612

    Thanks for the help.
  • Does the dash in the page range appear correctly in Zotero? Can you copy-paste just the Pages field here directly?
  • In Zotero the Pages field is "39–58". That dash disappears. However, the one in "201-214" doesn't.
  • OK. I see what's happening. Will have this fixed shortly.

    One more question: for the "39–58" entry, what is listed in the Library Catalog field? That's using an em-dash to separate pages in a page range, which is quite unusual for a range, so we may also want to fix this on import.
  • Both of those have "Google Scholar" in the Library Catalog field.
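
For readers following the page-range discussion above, here is a minimal Python sketch of the kind of normalization being talked about: any dash-like character between page numbers is replaced with the "--" that BibTeX page ranges conventionally use. The names normalize_pages and DASHES, and the exact set of characters handled, are assumptions made for illustration; this is not Zotero's actual code.

```python
import re

# Dash-like characters that may separate page numbers.
# This set is an assumption for illustration, not Zotero's actual list.
DASHES = (
    "-"        # HYPHEN-MINUS
    "\u2010"   # HYPHEN
    "\u2011"   # NON-BREAKING HYPHEN
    "\u2012"   # FIGURE DASH
    "\u2013"   # EN DASH
    "\u2014"   # EM DASH
    "\u2015"   # HORIZONTAL BAR
    "\u2212"   # MINUS SIGN
)

# One or more dash-like characters, with optional surrounding whitespace.
_RANGE_SEPARATOR = re.compile(r"\s*[" + re.escape(DASHES) + r"]+\s*")


def normalize_pages(pages: str) -> str:
    """Replace every run of dash-like characters with '--'."""
    return _RANGE_SEPARATOR.sub("--", pages.strip())


if __name__ == "__main__":
    for value in ["39\u201358", "201-214", "39--58", "12"]:
        print(repr(value), "->", repr(normalize_pages(value)))
```

This is deliberately naive: it would also rewrite hyphens inside page identifiers that are not ranges, so a real exporter would likely need to be more careful.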
  • I have had similar problems with Zotero-exported BibTeX or BibLaTeX files: dashes and apostrophes do not render correctly in DVI or PDF, although they almost always show up correctly in the source (using LEd and TeXstudio). I pick up references mostly from Google Scholar and occasionally from the journals directly.

    I have been manually doing search-and-replace for dashes and apostrophes in automatically picked up references.

    Would appreciate any suggestions ...
  • As the Stack Exchange link that aurimas posted suggests, adding the lines:

    % !TEX TS-program = lualatex
    % !TEX encoding = UTF-8 Unicode
    % !TEX spellcheck = de_DE

    at the beginning of your document may resolve much of it.
  • Zotero will export in UTF-8 (there's a way to export with Unicode characters escaped for BibTeX), and that should display properly when interpreted as UTF-8. If you're seeing weird characters, the most likely scenario is that whatever software you are using to read the BibTeX file is interpreting it as Windows-1252 encoding (or some other single-byte encoding). You would just need to tell it to read the file as UTF-8.

    Regarding dashes in page ranges, we were not properly converting various dashes (hyphens, en-dashes, em-dashes, etc.) to the "--" in BibTeX. We'll fix that in a bit.
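
The encoding mismatch described in the comment above can be reproduced in a few lines of Python. The sketch below encodes a typographic apostrophe and an en dash as UTF-8, then decodes the bytes as Windows-1252, which yields the kind of "â" sequences reported in this thread. The function names are hypothetical, and the exact garbage characters you see depend on which single-byte encoding your editor or TeX setup assumes.

```python
def misread_as_cp1252(text: str) -> str:
    """Simulate reading UTF-8 bytes as if they were Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(garbled: str) -> str:
    """Undo the misreading, assuming cp1252 really was the wrong guess."""
    return garbled.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    apostrophe = "\u2019"  # RIGHT SINGLE QUOTATION MARK
    en_dash = "\u2013"     # EN DASH

    # Each character is three bytes in UTF-8 (E2 80 99 and E2 80 93),
    # so each one turns into three unrelated characters when misread.
    print(repr(misread_as_cp1252(apostrophe)))
    print(repr(misread_as_cp1252(en_dash)))

    # Round trip: the original text comes back once the bytes are
    # interpreted as UTF-8 again.
    print(repair(misread_as_cp1252("39\u201358")) == "39\u201358")
```

The practical fix remains the one given above: tell the software to read the file as UTF-8 rather than trying to repair the text after the fact.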
  • Within the BibTeX.js file, does the var mappingTable mean that when you find the thing on the left, you replace it with the thing on the right?
  • There are two mapping tables, one for import and one for export. I don't recall which one is which, but the general idea is as you describe. That table only applies if you're exporting in a non-UTF-8 encoding, though.
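
As a rough illustration of the "thing on the left is replaced by the thing on the right" idea, here is a small Python sketch of a character mapping table applied on export. The table contents and the names EXPORT_MAPPING and apply_mapping are hypothetical; the real tables live in BibTeX.js and are not reproduced here.

```python
# Hypothetical export table: Unicode character -> LaTeX-friendly text.
# These three entries are illustrative only, not copied from BibTeX.js.
EXPORT_MAPPING = {
    "\u2013": "--",   # EN DASH
    "\u2014": "---",  # EM DASH
    "\u2019": "'",    # RIGHT SINGLE QUOTATION MARK
}


def apply_mapping(text: str, table: dict = EXPORT_MAPPING) -> str:
    """Replace each character found in the table; keep all others as-is."""
    return "".join(table.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(apply_mapping("Darwin\u2019s finches, pp. 39\u201358"))
```

An import table would simply run in the opposite direction, mapping the LaTeX sequences back to Unicode characters.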