Importing Umlaute

Dear all,

I noticed that there seems to be a problem with imported records containing Umlaute (ä, ö, ü). The data in Zotero looks fine after import, but when I cite it in a LibO document, the diacritics are missing (a, o, u) and there is a small arrow under the character, pointing to the right. When I go into the database and try to delete the Umlaut in question with backspace, it first turns into a plain vowel. A second backspace deletes that vowel, and an Umlaut typed in its place then works just fine. Not sure if this is a Zotero or a LibO issue...
  • Where are you importing from? WorldCat?
  • There have been several such instances lately and, unfortunately, I didn't keep track of them. The most recent one states "Open WorldCat" in the database entry.
  • We know about this in WorldCat and it is, unfortunately, an encoding issue without a good current solution (related to precomposed vs. decomposed characters, if you care for the details). To the best of my knowledge this occurs only on Open WorldCat.
  • ok, thanks for the information anyway!
  • For me, this character problem also occurs when I'm logged in to the full WorldCat at my university.

    About a year ago, the OCLC folks said that this was "unavoidable" because of the way that records are added by member libraries. I demonstrated the problem to members of the OCLC technical staff at an ALA convention. I used Zotero to demonstrate the encoding problem and they agreed that the situation was unfortunate. They suggested that I cut and paste the problem names or words into a text editor (TextEdit, or BBE) set to UTF-8 and copy the name from the text editor into Zotero. It is a lot of effort but it does the job.
  • By full WorldCat you mean the FirstSearch version? I believe we're handling that there. What would be an example? (An OCLC record number would be best.) And this would work better with Zotero for Firefox than Standalone at this point.
  • edited December 16, 2013
    Frankly, WorldCat is not doing anything wrong here. Both composed and decomposed Unicode are valid, and Zotero should be handling this better. There are a number of discussions on this, and it's just a matter of normalizing the strings as they enter Zotero. The one issue that I've discussed in this GitHub thread is which form to choose. I am fairly confident that NFD is what Zotero wants. The only downside to NFD is that there is some potential loss of information (e.g. the angstrom sign "Å" and the Swedish letter "Å" become the same symbol).

    Edit: nvm, I was wrong. There is no issue. We just need to implement this.
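
For anyone who wants to see the composed vs. decomposed difference discussed above, here is a minimal sketch using Python's standard `unicodedata` module. It is not Zotero's code; `normalize_field` is a hypothetical helper name, and the choice of NFC as its default is only an assumption for illustration. It also shows that the angstrom sign and the letter "Å" end up identical under either normalization form.

```python
import unicodedata

# The same visible "ü", encoded two different ways.
composed = "\u00fc"      # single precomposed code point
decomposed = "u\u0308"   # "u" followed by COMBINING DIAERESIS


def normalize_field(value: str, form: str = "NFC") -> str:
    """Hypothetical helper: normalize a string to one Unicode form."""
    return unicodedata.normalize(form, value)


# The two encodings look alike but do not compare equal as strings.
print(composed == decomposed)

# After normalization to a single form they do compare equal.
print(normalize_field(decomposed) == composed)
print(normalize_field(composed, "NFD") == decomposed)

# ANGSTROM SIGN (U+212B) vs. LATIN CAPITAL LETTER A WITH RING ABOVE (U+00C5):
# canonical normalization maps both to the same sequence in NFC and in NFD.
angstrom_sign = "\u212b"
letter_a_ring = "\u00c5"
print(normalize_field(angstrom_sign, "NFC") == letter_a_ring)
print(normalize_field(angstrom_sign, "NFD") == normalize_field(letter_a_ring, "NFD"))
```

As a manual workaround, running the affected names through something like this before pasting them into Zotero should have the same effect as the text-editor round trip described earlier in the thread.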