WorldCat - losing Unicode characters in title string
I'm having a problem adding references from WorldCat: it seems to strip Unicode characters from the title string and replace them with their nearest plain-ASCII equivalents, which won't do.
Adding this:
http://www.worldcat.org/oclc/34553157&referer=brief_results
gives the title in Zotero as
O niepodleglosc i granice : 1914-1921
when it should be
O niepodległość i granice
I think it's a problem with the WorldCat translator, because when I add the same book from the Library of Congress:
http://lccn.loc.gov/95141898
the title comes through properly as
O niepodległość i granice
(although I note that it does mess up the first author, putting them in as 1922-, (first).)
-
dstillman: The accents aren't available in the RIS file WorldCat provides.
-
jofish: Well, that sucks! Is this a problem with the RIS spec not properly supporting Unicode, or is it something that the right people at WorldCat might fix if I ask them nicely? (Do you happen to know who those people are?)
-
dstillman: As far as I know, Unicode support was never officially added to the RIS spec; the only encodings we've seen listed are IBM850 and Windows ANSI. (This is discussed in greater detail elsewhere in these forums.) Some programs (and possibly sites) do output UTF-8, though, and Zotero (at least post-1.0.7, with reset translators) should fully support importing and exporting UTF-8 RIS. But the lack of universal Unicode support in RIS may have something to do with WorldCat transliterating the characters.
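To make the encoding point above concrete, here is a minimal Python sketch. It is not Zotero's translator code, and the helper names (`decode_ris`, `ris_titles`, `encodable`) are hypothetical. It assumes a RIS export is either UTF-8 or, failing that, Windows ANSI (taken here to mean cp1252), and that the title sits in a `TI` or `T1` field. It also shows that the Polish letters in this title cannot be represented in either of the two legacy encodings mentioned, which may be one reason an exporter falls back to plain ASCII.

```python
import re

# A RIS line is a two-character tag, two spaces, a hyphen, then the value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")


def decode_ris(raw: bytes) -> str:
    """Decode RIS bytes, trying UTF-8 first.

    Assumption: anything that is not valid UTF-8 is treated as
    Windows ANSI (cp1252). A real importer may need other fallbacks.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("cp1252", errors="replace")


def ris_titles(text: str) -> list:
    """Return the values of all TI/T1 (title) fields in a RIS document."""
    titles = []
    for line in text.splitlines():
        match = RIS_LINE.match(line)
        if match and match.group(1) in ("TI", "T1"):
            titles.append(match.group(2).strip())
    return titles


def encodable(text: str, encoding: str) -> bool:
    """Report whether every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    title = "O niepodległość i granice"
    sample = ("TY  - BOOK\r\nTI  - " + title + "\r\nER  - \r\n").encode("utf-8")
    print(ris_titles(decode_ris(sample)))
    for enc in ("utf-8", "cp1252", "cp850"):
        print(enc, encodable(title, enc))
```

If the title printed from a UTF-8 export still has no diacritics, the characters were already lost before the file was written, and no amount of decoding on the import side will bring them back.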