problem with accented characters
After highlighting text in a PDF file in Portuguese, I realized that it was not being imported correctly into annotations. It's not a problem in general, since with some PDFs I don't have any issues, but I wonder if there's a way to fix it.
ex: arbitr ́ario (arbitrário) , n ̃ao (não)
In Word/Writer it appears to be pasted correctly, but the grammar checker always flags it as an error.
I looked for a PDF reader setting that controls how characters are handled, such as the charset used, but couldn't find one to change. Advanced Preferences has intl.fallbackCharsetList.ISO-8859-1 = windows-1252, but windows-1252 is a Portuguese-compatible charset, so I didn't change anything.
https://s3.amazonaws.com/zotero.org/images/forums/u4229194/9use0yp4hj2pwprj61ca.png
It seems a minor problem, but we use the tilde a lot in Portuguese.
The background here is most likely that accented characters can be encoded in two ways: composed (base character and accent in a single code point) or decomposed (a base character in one code point followed by a combining accent in a second code point, in that order). Depending on the system and software you use, text is often normalised to one or the other. Many modern fonts include the information needed to compose the accented form from its components (i.e. where the accent has to go above or below the base glyph). Aleph, as old as it is, can't handle that correctly, so you see the underlined tilde, which shows that it is a combining tilde, encoded differently from a non-combining tilde (~). When you publish this record to Primo or another OPAC, you should see it correctly composed because the browser knows how to handle it.
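To make the composed/decomposed distinction concrete, here is a minimal Python sketch using the standard `unicodedata` module. The helper name `describe` and the sample strings are my own illustration, not something taken from Zotero or Aleph.

```python
import unicodedata

def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text (illustrative helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}" for ch in text]

composed = "n\u00e3o"      # "não" with ã as a single code point
decomposed = "na\u0303o"   # "não" as a + COMBINING TILDE

# Both usually render the same, but they are different sequences of code points.
print(describe(composed))
print(describe(decomposed))
print(composed == decomposed)

# NFC composes, NFD decomposes; after normalising, the two strings compare equal.
print(unicodedata.normalize("NFC", decomposed) == composed)
print(unicodedata.normalize("NFD", composed) == decomposed)
```

Running `describe` on text pasted from an annotation is a quick way to check which form you actually have.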
EDIT: I vaguely remember that Aleph can in fact handle it and that it's only the font that doesn't provide the a+tilde composition. IIRC, setting Aleph to use a different font can fix it, and I did that once. But we don't have Aleph any more, so I can't tell you exactly what to do.
I've found a workaround: I highlight the needed text and create a note. By copying and pasting from the annotations, the characters display correctly in Aleph.
Do you know why Word also recognizes the characters as incorrect? I'm using Office 365, which is newer or at least more up-to-date software.
I usually bring citations from annotations, which resolves the issue in Word, but not everyone does this, especially since annotations often disrupt the custom styles used in the document.
It seems that the Word spell checker doesn't handle decomposed forms in any language. Just tried it with concentração, as well as with English fiancée and German möglich.
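If the pasted text really is in decomposed form, one possible workaround is to convert it to the composed form before it reaches Word. The sketch below is only an illustration: `to_composed` is a hypothetical helper name, the sample words are the three mentioned above, and I have not verified that Word's spell checker accepts the result. It also only helps when the combining mark directly follows its base letter; it would not repair text where the PDF extraction put the accent somewhere else.

```python
import unicodedata

def to_composed(text):
    """Convert text to NFC, the composed form (hypothetical helper name)."""
    return unicodedata.normalize("NFC", text)

# Decomposed spellings of concentração, fiancée and möglich.
samples = ["concentrac\u0327a\u0303o", "fiance\u0301e", "mo\u0308glich"]

for word in samples:
    fixed = to_composed(word)
    # is_normalized requires Python 3.8 or newer.
    was_nfc = unicodedata.is_normalized("NFC", word)
    print(f"{fixed}: {len(word)} code points -> {len(fixed)}, already NFC: {was_nfc}")
```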