ADSABS: Special characters imported strangely

The name Éanna is imported strangely, as �anna.

A workaround is to manually type (or paste) Éanna into the Author field.
  • The NASA ADS translator uses the RIS translator, which has only limited character set support at the moment.
  • The site also incorrectly declares the served RIS file as UTF-8 when it is actually Windows-1252 or ISO-8859-1, so even Firefox decodes it incorrectly. According to various versions of the spec, RIS files can be IBM850 or Windows-1252, and in practice they are often UTF-8, so a wrong charset declaration (or none at all, which is the case with many sites) makes it difficult to import correctly.

    Firefox does get it right when opening the same file from disk (where no charset is given), but we haven't yet found a good way of accessing that detector code, so this may require implementing some basic charset detection among those three encodings (IBM850, Windows-1252, and UTF-8).

    More details on the ticket.
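
To illustrate what "basic charset detection" among the three encodings might look like, here is a rough Python sketch. It is not the translator's actual code. The function names (`guess_ris_charset`, `letter_score`) and the scoring heuristic are assumptions made for this example: try strict UTF-8 first, then prefer whichever of Windows-1252 or IBM850 turns the high bytes into letters rather than box-drawing or other symbols.

```python
import unicodedata

# Legacy encodings to try when the bytes are not valid UTF-8.
# Order matters only for ties: the first candidate wins.
CANDIDATES = ("cp1252", "cp850")


def letter_score(text):
    """Score a decoding: +1 per non-ASCII letter, -1 per other non-ASCII char."""
    score = 0
    for ch in text:
        if ord(ch) < 128:
            continue  # ASCII looks the same in all three encodings
        score += 1 if unicodedata.category(ch).startswith("L") else -1
    return score


def guess_ris_charset(data):
    """Guess 'utf-8', 'cp1252', or 'cp850' for raw RIS bytes (heuristic only)."""
    try:
        data.decode("utf-8")  # strict: fails on most legacy-encoded accents
        return "utf-8"
    except UnicodeDecodeError:
        pass

    best, best_score = None, None
    for enc in CANDIDATES:
        try:
            text = data.decode(enc)
        except UnicodeDecodeError:
            continue  # bytes undefined in this code page rule it out
        score = letter_score(text)
        if best_score is None or score > best_score:
            best, best_score = enc, score
    return best if best is not None else "cp1252"


if __name__ == "__main__":
    raw = "AU  - Éanna".encode("cp1252")
    enc = guess_ris_charset(raw)
    print(enc, raw.decode(enc))
```

A heuristic like this can still guess wrong on short or unusual input, so the manual workaround above remains useful as a fallback.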