Search not finding titles with accented characters

This discussion was created from comments split from: Browser window UI errors (macOS).
  • I'm now on 5.0.22-beta.5+c0143300c. Success: the search no longer hangs following the above steps.

    It hasn't, however, solved the problem of only five results being returned (search all fields and tags) when twelve items fulfil the search requirement. I've copied and pasted the journal title into each record, synced and restarted, but no more records are found. If I search title, creator, year instead, only three records are found (expected: either twelve or none, depending on the definition of "title"; I don't understand this behaviour).

    By typing in the journal title letter by letter into search all fields and tags, "Annales du Service des Antiquit" returns twelve results. Adding one letter "Annales du Service des Antiquité" loses six results, which remain until the next accented letter is typed "Annales du Service des Antiquités de l'É" and a final result disappears, leaving five. This despite cutting and pasting the journal name identically into each record.

    Only 2/3 of my items are in English, which makes this a worrying problem.

    Thank you for the help!
  • Are you certain that you have the title typed exactly the same (with the accented characters) for all the items? Zotero doesn’t currently normalize accents when returning search results, so that would be the likeliest explanation.
  • Hi bwiernik, as I said above, I copy and then paste the journal title into each record so that they are all identical (you can actually see me doing it in the video above). I have done this several times (i.e. checking to make sure I'm copying from a record that "works", then pasting over all of them one by one). At one point I managed to get ONE extra record to show up, but no luck improving the results since. If there is a problem with the accent, I would expect NONE of the results to show up. The partial nature of the results in both search modes (all fields and tags/title, creator, year) puzzles me.
  • Was that the original value of those fields (ignoring normalization)? If you create a new item and paste the value into it, does it show up in a search? If so, what happens if you change the value in one of the items that's not showing up to something else, save it, and then change it back to the pasted value?
  • "Was that the original value of those fields (ignoring normalization)?" Not sure I understand this question, but all the records are long-term residents of my database and have always looked identical to one another. I didn't intend to change any of them until this search problem occurred. I wonder if the age of these items could be part of the problem? A change in Unicode, an old font ... I can't even remember HOW I added them: manually, copied from Word, imported from the web, no idea.

    "If you create a new item and paste the value into it, does it show up in a search?" Yes, it does.

    "If so, what happens if you change the value in one of the items that's not showing up to something else, save it, and then change it back to the pasted value?" It now joins the search results.

    Additional: when I paste Annales du Service des Antiquités de l'Égypte over Annales du Service des Antiquités de l'Égypte, I have tried using a "paste values" format stripper and that does NOT help. I just wondered if some formatting was hanging around somehow.

    I have kept the rest of the malfunctioning items untouched in case more tests are needed...!
  • edited October 14, 2017
    In Unicode, strings with accents that look identical can be composed in different ways, and different software handles this differently. Zotero has normalized strings going into and out of the database for a few years, but existing values haven't yet been converted, and that's what's used for searching. So when you pasted a value over a field that looked identical, it probably didn't register as a change, even though it was a different normalization, and so the value in the database wasn't changed and the item didn't match the search. When you paste into a new item, or when you temporarily changed the value to something else, the new normalized value goes into the database and it matches the search.
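To make the explanation above concrete, here is a minimal Python sketch of how two strings that look identical can differ in composition, and why a plain substring search then misses one of them. It is an illustration only, not Zotero's code: the helper names (`nfc`, `naive_contains`, `normalized_contains`) are hypothetical, and the assumption that the older stored values are in decomposed form is a guess made for the example.

```python
import unicodedata

# Two spellings that render identically on screen.
# Composed (NFC): each accented letter is a single code point.
composed = "Annales du Service des Antiquit\u00e9s de l'\u00c9gypte"
# Decomposed (NFD): base letter followed by U+0301 COMBINING ACUTE ACCENT.
# Assumption for this sketch: an older stored value might look like this.
decomposed = "Annales du Service des Antiquite\u0301s de l'E\u0301gypte"


def nfc(text):
    """Return the NFC (composed) form of a string."""
    return unicodedata.normalize("NFC", text)


def naive_contains(stored, query):
    """Substring search with no normalization on either side."""
    return query in stored


def normalized_contains(stored, query):
    """Substring search after normalizing both sides to NFC."""
    return nfc(query) in nfc(stored)


# The unaccented prefix matches both forms; the accented one does not.
prefix_plain = "Annales du Service des Antiquit"
prefix_accent = "Annales du Service des Antiquit\u00e9"

print(naive_contains(decomposed, prefix_plain))        # unaccented prefix
print(naive_contains(decomposed, prefix_accent))       # accented prefix, raw
print(normalized_contains(decomposed, prefix_accent))  # accented prefix, normalized
```

This mirrors the behaviour reported earlier in the thread, where results drop away as soon as the first accented letter is typed. Normalizing both the stored value and the query before comparing removes the difference.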
  • Thanks, dstillman, for the explanation. This sounds like a problem that will only hit long-time Zotero users, but could hang around in a Library for a long time before it is spotted. If there is any way of making sure all data is normalized, or even a way of searching for problematic data so that I can manually fix it, please post it.
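As a rough idea of what "searching for problematic data" could look like, the Python sketch below flags any value that changes when normalized to NFC. The function name `find_unnormalized` and the sample data are hypothetical, and the sketch does not read from Zotero. Since Zotero reportedly normalizes strings on their way out of the database, this thread does not establish whether an export would even preserve the stored form.

```python
import unicodedata


def find_unnormalized(fields, form="NFC"):
    """Return {key: (stored, normalized)} for values not already in `form`."""
    flagged = {}
    for key, value in fields.items():
        normalized = unicodedata.normalize(form, value)
        if normalized != value:
            flagged[key] = (value, normalized)
    return flagged


# Hypothetical sample data; the keys and values are made up for illustration.
sample_titles = {
    "item-1": "Annales du Service des Antiquit\u00e9s de l'\u00c9gypte",      # composed
    "item-2": "Annales du Service des Antiquite\u0301s de l'E\u0301gypte",  # decomposed
    "item-3": "Journal of Egyptian Archaeology",                            # plain ASCII
}

flagged = find_unnormalized(sample_titles)
for key, (stored, normalized) in flagged.items():
    # The lengths differ even though both strings render the same.
    print(key, len(stored), len(normalized))
```

Comparing each value with its normalized form, rather than looking for accents, means composed accented titles and plain ASCII titles are left alone.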
  • edited December 14, 2017
    Same problem for me.
    I have noticed that the affected entries were all imported. There is no problem with the ones that were added manually.
    Stranger still, my office colleague, who has exactly the same Zotero version and uses the same group library, does not have this problem.
    For example, in "my" Zotero (standalone), there are no results for "village déserté", but with the same words she gets 2 records. To see those records, I have to type "village sert" without the "é". If I search without any accents, it works fine.
    I understand the point about character encoding, but why does it appear only in my session and not in hers? Syncing should homogenize all entries, so in the end each group member should have exactly the same entries with the same encoding. Am I wrong?
    I can't find anything in the options that could explain this.