Translator papercut: non-breaking space entities

More and more I am finding literal entity strings such as &nbsp; within the abstracts of items I'm downloading.

A similar problem (maybe the opposite of this one) arises from PubMed imports. There are many abstracts where numbers and words are jammed together without a space, and I presume that somewhere in the publishing chain there was a non-breaking space character that was ignored or filtered out.

For me, at least, this is a minor annoyance but I thought that it was worth mentioning.
  • Can you provide example URLs?
  • Here is one example:

    Many of the Sabinet journals have this problem, but it is by no means limited to Sabinet. I'll follow this with examples from PubMed in a few hours.

  • edited October 17, 2017
    Yeah, that's a problem with the site's embedded metadata.

    If you view the page source, you can see that the meta tags contain, e.g., &amp;nbsp;. That's a double-encoded entity: they're encoding the ampersand as &amp;, but the ampersand is supposed to be part of the HTML entity &nbsp;. When Zotero decodes &amp;nbsp;, it becomes, correctly, the literal string &nbsp; instead of a nonbreaking space.
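
To illustrate the double-encoding described above, here is a minimal Python sketch using the standard library's html.unescape. It is not Zotero's code; the sample string "5&amp;nbsp;mg" and the names meta_value, decoded_once, and decoded_twice are made up for the example.

```python
import html

NBSP = "\u00a0"  # the actual non-breaking space character

# Hypothetical meta tag content as it appears in the page source:
# the ampersand of "&nbsp;" has itself been encoded as "&amp;".
meta_value = "5&amp;nbsp;mg"

# A correct single decode turns "&amp;" into "&" and stops there,
# which leaves the literal text "&nbsp;" in the abstract.
decoded_once = html.unescape(meta_value)

# Only a second decode (which a client should not apply blindly)
# would produce the non-breaking space the publisher intended.
decoded_twice = html.unescape(decoded_once)

assert decoded_once == "5&nbsp;mg"
assert decoded_twice == "5" + NBSP + "mg"
```

Decoding once is the right behavior for correctly encoded metadata, which is why the literal string shows up: the fix belongs in the site's markup, where the entity should be written only once.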