Possible bug: em dash character fails to import to Title field (ACM DL)
Hello,
I just imported a recent publication from the ACM Digital Library - https://dl.acm.org/doi/abs/10.1145/3334480.3381828 - and noticed that the em dash in the title did not show up in the Title field in Zotero. Instead it was replaced by a square with what appears to be "00 97" over two lines. I can't paste that in here, unfortunately.
I assume this is something to do with a character set failure somewhere, but I'm not sure quite where along the chain between the ACM DL and Zotero it is.
I don't know if this needs action from Zotero's end, but thought I'd flag it as a possible bug just in case.
sdvr
They're serving a control character in place of the dash, which is what happens when Windows-1252 is treated as Unicode. It's complicated by the fact that (as explained in the linked comment), since this is a common mistake, web browsers forgivingly render these characters as dashes when they're served as HTML entities (&#151;) instead of raw UTF-8 characters, as is the case in the main header on the page. That's the only reason the title is displayed correctly there.
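For anyone who wants to see the mismatch for themselves, here is a rough Python sketch. It is not what Zotero or the ACM site actually does; the function name repair_c1_controls and the sample title are made up for illustration. It shows that byte 0x97 (decimal 151) is an em dash in Windows-1252 but the control character U+0097 when read as a Unicode code point, and that an HTML5-style entity decoder applies the same forgiving remapping browsers do.

```python
import html


def repair_c1_controls(text: str) -> str:
    """Replace C1 control characters (U+0080 to U+009F) with the characters
    that Windows-1252 assigns to the same byte values, where one exists.

    Hypothetical helper for illustration only.
    """
    out = []
    for ch in text:
        if 0x80 <= ord(ch) <= 0x9F:
            try:
                ch = bytes([ord(ch)]).decode("cp1252")
            except UnicodeDecodeError:
                # A few bytes (e.g. 0x81) are unassigned in Windows-1252;
                # leave those characters untouched.
                pass
        out.append(ch)
    return "".join(out)


# Byte 0x97 read as Windows-1252 is the em dash (U+2014).
as_cp1252 = b"\x97".decode("cp1252")

# The same number used directly as a code point is a control character.
as_code_point = chr(0x97)

# html.unescape follows the HTML5 rules, so it maps &#151; to the em dash,
# which mirrors the forgiving behavior of browsers.
from_entity = html.unescape("&#151;")

# Made-up title containing the stray control character.
broken_title = "Part One \x97 Part Two"
fixed_title = repair_c1_controls(broken_title)
```

A repair like this is only a guess at what the publisher meant, so it would be safer applied to known-bad sources than to every imported title.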