French letters and signs: consequences for bibliography sorting

Hello,

I am French, and my sources are in the humanities.

I have encountered several problems with French accented letters.
I can enter them in the Zotero panel, where they display correctly.
But when I create the bibliography and paste it into Word, I discover all kinds of “funny” results, and I have no choice but to correct them in Word and copy them back into Zotero!

This had another, worse effect on the sorting of the bibliography.

In my document, the bibliography has to be sorted as author/title/…

It often happens that I have to cite many titles for one author.

Here is an excerpt of what I first got:

MOLIÈRE. L’Étourdi ou Les Contretemps. …
———. Le Bourgeois gentilhomme. …
———. Le Misanthrope. …
———. L'École des femmes. …

instead of:

MOLIÈRE. L’École des femmes. …
———. L’Étourdi ou Les Contretemps. …
———. Le Bourgeois gentilhomme. …
———. Le Misanthrope. …

I found that the trouble came from the ' sign, as in "L'École". This sign had been typed directly into the Zotero panel.

As for "L’Étourdi", the ’ sign had first been typed in Word and then copied into the Zotero panel. This sign gives the correct sort result.

So I had to modify, one by one, all the authors and titles containing this sign to obtain the correct result.

This method takes a lot of time, and it is still easy to miss one!

Is there another way to avoid these unpleasant situations?

Thank you for considering my question.
  • Not really, no. Sorry. Zotero doesn't currently have a search-and-replace function.
    If you feel audacious you might be able to work directly with the zotero.sqlite (after backing it up, of course). There are some threads with instructions for that on the forum, but it's not trivial.
  • Are you sure that you need the symbol ’ instead of ' ? Zotero really can't know that you mean for those to be the same symbol, since it doesn't try to do "smart" substitutions like Word does.

    You can create a saved search for items that contain the offending character (or article), then just periodically go through it to fix them.
  • Thanks for replying.

    I don't think I would dare to work directly with the zotero.sqlite.

    I'll go on checking the results in the obtained bibliography (Word).

    To ajlyon:

    I don't really care about the appearance of the symbol. There are lots in the bibliography. I just paid attention to the "sort" variables.

    As a matter of fact, the major problem with the ' symbol is its effect on the sorted list:

    The symbol for the apostrophe should sort before letters (<), whereas the ' symbol is treated as sorting after letters (>).

    So the result is incorrect (from a French reader's point of view).

    Furthermore, when, as in the example, there are both forms in the same list, the result is really confusing!

    The standard form of the apostrophe in French is ’.

    That's the one I get when typing French text in Word.

    I often copy from Word (sometimes in order to correct garbled accented French letters).

    Still hoping for future Zotero features!
  • I think that Word is actually at fault here -- according to my reading of the always-correct Wikipedia (http://en.wikipedia.org/wiki/Apostrophe#As_a_mark_of_elision), the correct symbol for elided letters in French is ' ; in converting L'Étourdi to L’Étourdi, Word is sacrificing input integrity for aesthetic improvement in the eyes of the Word designers. I'm pretty sure you should use ' everywhere-- it can be rendered to look nicer, but the underlying symbol should be the apostrophe/single quote symbol.

    Sort order is another issue... there are rules for these things. In French dictionaries/lists, does L'Étourdi appear after (or before) all of the L-words? In the middle, before the Le words? In the middle, after the Le words? Between the Les-words and the Leu-words? It would be great to improve the sorting that Zotero does, but I think that most of this should already be handled correctly -- please identify precisely what behavior doesn't match the standard, and exactly how it should behave.
  • The Wikipedia section ajlyon points to is unreliable for the question at hand (and "always-correct" was a joke, right?). One needs to scroll down on the same page to the section Typographic Form and the next several sections, which disambiguate and explain the types of apostrophes accommodated in Unicode. Unicode prefers the curled form, and Word has followed suit.

    The WP article omits (at this time) the logic behind Unicode's and Word's preference. The finest specimens of apostrophes in early bookmaking were curled. The straight form was an innovation driven by the needs of early code tables (from the telegraph onward).

    FWIW, the apostrophe is but one of a number of characters that have legitimately ambiguous Unicode points. The issue is likely to be raised by users time and again.
  • I'm actually interested in what French sorting practice is -- from my reading of collation charts (http://www.unicode.org/charts/uca/ , http://www.collation-charts.org/) it looks like the two apostrophes are usually treated the same, so I'm surprised to see that you are finding that they differ radically in sort behavior in Zotero.

    The important part here is how they should be sorted-- what precisely is the correct behavior in the French tradition?
  • So is this really a Zotero-related sort issue, or is it a platform- or FF-specific non-ASCII input issue? E.g. wouldn't there be similar issues if you wanted to input an em-dash?
  • JavaScript has a localeCompare() string method that is driven by the collation of the current locale. With a UTF-8 locale, sorts using that method should get these issues right for the most part. As far as I can tell, localeCompare() is not used for sorting in Zotero 2.0. It's used for bibliography sorting in citeproc-js, which will be deployed in Zotero 2.1.
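A minimal sketch of the locale-driven comparison mentioned in the reply above, assuming a JavaScript runtime that ships ICU collation data (a recent Node.js or browser). The sample title and variable names are illustrative only, and this is not the actual sorting code of Zotero or citeproc-js. With the `ignorePunctuation` option, `Intl.Collator` should treat the straight and curled apostrophes as equal for sorting purposes, even though the two strings differ character by character.

```javascript
// Two spellings of the same title: straight apostrophe (U+0027)
// and curled apostrophe (U+2019).
const straight = "L'École des femmes";
const curled = "L\u2019École des femmes";

// French collator that skips punctuation when comparing.
// Assumption: the runtime provides Intl with collation data.
const french = new Intl.Collator("fr", { ignorePunctuation: true });

// Plain equality sees two different strings...
console.log(straight === curled);

// ...while the collator is expected to rank them identically.
const sameForSorting = french.compare(straight, curled) === 0;
console.log(sameForSorting);
```

The same options can be passed to `localeCompare()` directly, as in `straight.localeCompare(curled, "fr", { ignorePunctuation: true })`. Note that `ignorePunctuation` also skips spaces, so it may not give the dictionary order for titles that begin with "Le ".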
  • Hello,

    Having read your interesting remarks, I tried to get more information about the official rules concerning the apostrophe sign.

    Before speaking of the “sort” question, I want to confirm that all the French printed publications (dictionaries included) use the curled symbol ’.

    As for the “sort” question, it turns out to be very complex.

    How should one sort in alphabetical order? The rules are explained in “NF Z44-001 Novembre 1995 Technologies de l'information - Classement alphabétique des denominations”, at:
    http://www.boutique.afnor.org/NEL5DetailNormeEnLigne.aspx?&nivCtx=NELZNELZ1A10A101A107&ts=3930101&CLE_ART=FA040767

    I did not buy it, so I can't say anything definite about it.

    However, according to a site from the Quebec government (in French), these rules are very complex, and French publishers reportedly follow unwritten rules. The Quebec government is therefore establishing its own rules:
    http://www-clips.imag.fr/geta/gilles.serasset/tri-du-francais.html

    Concerning apostrophes and other special characters, the author says that they must be ignored when sorting.

    So, as someone suggested, I checked the dictionary to see which rule it uses.
    I observed that the dictionary ignores the apostrophe when sorting.

    I then modified my example several times (adding an imaginary item) to check what happens in the different situations.

    The result is that the straight apostrophe (') is ignored, while the curled one (’) is treated as sorting before letters (<).

    So I was wrong when I said that the straight one sorted after letters (>). My example was simply not representative. Sorry for that mistake.

    So, using the straight apostrophe (') the result corresponds to the dictionary’s order, but the trouble is that the titles won’t appear the same way as in the Word text …

    MOLIÈRE. Le Bourgeois gentilhomme. …
    ———. Le Misanthrope. …
    ———. L'École des femmes. …
    ———. Légende. …
    ———. L'Étourdi ou Les Contretemps. …
    ———. Œuvres complètes. …

    I imagine I will have to cope with these differences between the two forms, and be very careful when making changes.

    Thank you for thinking about it and for drawing my attention to points I had not even thought of.
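The dictionary rule described in the post above (ignore the apostrophe when sorting) can be sketched roughly as follows. This assumes a JavaScript runtime with ICU collation data; `sortKey` and `sortTitles` are hypothetical helper names, not part of Zotero or citeproc-js. Both apostrophe forms are removed from a temporary sort key only, so the stored titles keep whichever form was entered.

```javascript
// Hypothetical helper: build a sort key by dropping both apostrophe
// forms, straight (U+0027) and curled (U+2019).
function sortKey(title) {
  return title.replace(/['\u2019]/g, "");
}

// French collator with default options (punctuation and spaces are
// still significant; only the apostrophes were removed above).
// Assumption: the runtime provides Intl with collation data.
const collator = new Intl.Collator("fr");

// Return a sorted copy; the original array and titles are untouched.
function sortTitles(titles) {
  return [...titles].sort((a, b) => collator.compare(sortKey(a), sortKey(b)));
}

// The titles from the example, deliberately mixing both apostrophe forms.
const titles = [
  "L\u2019Étourdi ou Les Contretemps",
  "Œuvres complètes",
  "Le Misanthrope",
  "L'École des femmes",
  "Légende",
  "Le Bourgeois gentilhomme",
];

console.log(sortTitles(titles));
```

Under these assumptions the output should follow the same order as the list in the post, whichever apostrophe form each title uses.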
  • I would certainly just try to keep to the ' form of the symbol; as I said before, you can do a saved search for titles and authors with the curled form and use that to keep things clean.
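For data that has been exported out of Zotero, the clean-up suggested above can be sketched as below. This works on a plain array of strings and does not use any Zotero API; `hasCurledApostrophe` and `straighten` are hypothetical helper names, and the sample titles come from the example earlier in the thread.

```javascript
const CURLED = "\u2019";   // ’ RIGHT SINGLE QUOTATION MARK
const STRAIGHT = "\u0027"; // ' APOSTROPHE

// Hypothetical helper: does this field still contain the curled form?
function hasCurledApostrophe(text) {
  return text.includes(CURLED);
}

// Hypothetical helper: replace every curled apostrophe with the straight one.
function straighten(text) {
  return text.split(CURLED).join(STRAIGHT);
}

const exportedTitles = [
  "L\u2019Étourdi ou Les Contretemps",
  "L'École des femmes",
  "Le Misanthrope",
];

// Entries that would show up in the "needs fixing" list.
const toFix = exportedTitles.filter(hasCurledApostrophe);
console.log(toFix);

// The same titles with a single, consistent apostrophe form.
const cleaned = exportedTitles.map(straighten);
console.log(cleaned);
```

Note that U+2019 is also used as a closing single quotation mark, so a blanket replacement may change characters that were not apostrophes; reviewing the `toFix` list first is the more cautious route.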
  • Hello Ajlyon,

    I suppose that would be the best solution. However, it's not that easy, because Word's search-and-replace function does not handle these characters directly; we have to use the internal codes...

    Anyway, the first thing to do now is change back the apostrophes in Zotero. I'll leave the Word part for later on.

    Let's go on!

    As the French writer Nicolas Boileau (1636-1711) wrote:

    Vingt fois sur le métier remettez votre ouvrage,
    Polissez-le sans cesse, et le repolissez ...