Possible bug: em dash character fails to import to Title field (ACM DL)

Hello,

I just imported a recent publication from the ACM Digital Library - https://dl.acm.org/doi/abs/10.1145/3334480.3381828 - and noticed that the long em dash in the title did not show up in the Title field in Zotero. Instead it was replaced by a square with what appears to be "00— 97" over two lines. I can't paste that in here, unfortunately.

I assume this is something to do with a character set failure somewhere, but I'm not sure quite where along the chain between the ACM DL and Zotero it is.

I don't know if this needs actions from Zotero's end, but thought I'd flag it as a possible bug just in case.

sdvr
  • Yes, it's a bug on their end. You can tell this by looking at the page title (if you have a title bar enabled in your browser, or you hover over the page's tab), or the title in the breadcrumbs line at the top of the page, or by looking at their own outputs from the "Export Citation" button, all of which are missing the dash.

    They're serving a control character in place of the dash, which is what happens when Windows-1252 byte values are treated as Unicode code points. It's complicated by the fact that (as explained in the linked comment), since this is a common mistake, web browsers forgivingly render these characters as dashes when they're served as HTML entities (&#151;) instead of raw UTF-8 characters. The main header on the page uses the entity form, and that's the only reason the title is displayed correctly there.
  • Thanks @dstillman - I appreciate the explanation. I'll see if I can pass this on to the relevant folks at the ACM to look into getting it fixed.
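
For anyone who wants to see the mix-up described in the reply above, here is a minimal Python sketch. It is only an illustration, not what the ACM DL or Zotero actually runs: the helper `repair_c1` and the sample string are hypothetical, and the sketch assumes the bad step is equivalent to reading a Windows-1252 byte as Latin-1.

```python
import html
import unicodedata

EM_DASH = "\u2014"

# In Windows-1252 the em dash is the single byte 0x97.
raw = EM_DASH.encode("cp1252")

# Treating that byte value as a Unicode code point (the same as decoding
# it as Latin-1) gives U+0097, a C1 control character, not a dash.
mangled = raw.decode("latin-1")


def repair_c1(text: str) -> str:
    """Hypothetical helper: undo the mix-up for text in which every
    character fits in one byte, by re-reading the bytes as Windows-1252."""
    return text.encode("latin-1").decode("cp1252")


# HTML parsers remap the numeric reference &#151; to a real em dash, which
# is why the entity form still displays correctly. Python's html.unescape
# follows the same HTML5 rule.
entity = html.unescape("&#151;")

print(repr(raw), hex(ord(mangled)), unicodedata.category(mangled))
print(repr(repair_c1("Title \x97 Subtitle")), repr(entity))
```

A repair like `repair_c1` only works when the whole string went through the same wrong decoding. It would corrupt text that already contains correct characters in the U+0080 to U+00FF range, so the real fix still belongs on the publisher's side.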