Non-breaking space issue.


I know it's an "issue" that has been discussed many times, but reading the forums I didn't find a suitable answer... When I generate a bibliography, some sort of non-breaking space is inserted before the ":" or the "?". I'm writing "some sort" because they do not appear as NBSP in Word, for instance. I'm talking about this:

Of course it is correct, as French expects an NBSP before “!”, “?” or “:”, but it causes display issues in the software I use the bibliographies in. Is there a workaround that would let me change those NBSPs into normal spaces? Something to switch off in Zotero or to modify in the style I use?

For info, I use the Service Médical de l'Assurance Maladie style (so a French style), and I'm on a French computer with Zotero in French.
  • That looks like a problem with the (monospace?) font you're using. Have you tried with another font?
  • Hi Dan,

    Changing the font can probably resolve the problem in Notepad++ (where the screenshot is taken from), but it won't change my issue in the software I use the bibliographies in. My problem is not how this sort of NBSP is displayed, but its presence.
    I want to get rid of it! :D
  • @fbennett could say more, but I believe the NBSPs are added deep within the citation processor and, since it's the correct character, there's no way to turn that off or change it.

    But you can just do a find-replace with a normal space before transferring the bibliography to whatever software you're referring to.

    (I'm a bit confused what you mean by "they do not appear as NBSP in Word". As far as I know they're just NBSPs, and unless you're using an extremely limited font (as you appear to be in Notepad++) Word shouldn't have any trouble displaying them.)
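
    A minimal sketch of that find-replace step in Python, for cases where the target software has no find/replace of its own. It assumes the unwanted characters are NO-BREAK SPACE (U+00A0) or NARROW NO-BREAK SPACE (U+202F); the helper name `to_plain_spaces` is hypothetical and not part of Zotero.

```python
# Hypothetical helper, not part of Zotero: swap no-break spaces for plain spaces.
NO_BREAK_SPACES = {
    "\u00a0": " ",  # NO-BREAK SPACE
    "\u202f": " ",  # NARROW NO-BREAK SPACE
}


def to_plain_spaces(text: str) -> str:
    """Return text with both no-break space characters replaced by U+0020."""
    return text.translate(str.maketrans(NO_BREAK_SPACES))


if __name__ == "__main__":
    # Sample bibliography fragment containing one of each character.
    sample = "Guide pratique\u202f: m\u00e9thodes\u00a0?"
    print(to_plain_spaces(sample))
```

    The bibliography text would be passed through this before being pasted into the other software.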
  • Here is a screenshot from word, with both Calibri and Times New Roman fonts:

    As you can see, the regular NBSP is a sort of °. The one Zotero generates is invisible when displaying the hidden characters (¶).

    For your answer regarding the depth of the processing, I understand what you mean. But I think it is incorrect or at least inconsistent. For instance, I generated a bibliography with a book chapter:

    Only one NBSP where, if you want to be correct, there should be 3 more: one before the ":" after "in", one before the ":" after "Bruxelles-Woluwe" and one before the ";" after "d'orthopédie".
  • If you paste into Notepad++ and upload the text file somewhere, we can see exactly what characters they are, but we'll probably have to wait for @fbennett to weigh in on this, since this is his department.
  • OK, the character after "pratique" is Unicode Narrow No-Break Space (U+202F). The others are just regular spaces. I assume the 202F is added by this line, but I don't know the details.
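
    For anyone who wants to check this kind of thing on their own output, here is a small sketch in Python that lists every space-like character in a string with its code point and Unicode name. The function name `describe_spaces` and the sample string are illustrative assumptions, not taken from the uploaded file.

```python
import unicodedata


def describe_spaces(text: str) -> list:
    """List (index, code point, Unicode name) for every space separator in text."""
    found = []
    for index, char in enumerate(text):
        # "Zs" is the Unicode general category for space separators.
        if unicodedata.category(char) == "Zs":
            found.append((index, f"U+{ord(char):04X}", unicodedata.name(char)))
    return found


if __name__ == "__main__":
    # Illustrative sample: a narrow no-break space before the colon.
    for entry in describe_spaces("pratique\u202f: guide"):
        print(entry)
```

    A regular space is reported as SPACE (U+0020), so anything else in the list is a special space character.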
  • Sorry for the delay in responding here. Writing.

    The NBSP for the French locale is hard-coded, and at present there isn't a way to turn it off in Zotero or via CSL. There are a few ways that the processor could pass control over it to the user:

    1. Processor option w/Propachi. There are some toggles that can be set on the processor code, to change behavior in various ways. I could add a toggle for the NBSP behavior. A toggle could be built into the Propachi Plugin, so that users could install that when they need control over this.

    2. Zotero option w/Zotero. Same as (1), but with a toggle or hidden option in Zotero itself to control the behavior. (I assume that this is not desired.)

    3. CSL locale. The CSL language could be extended to include either a style option or a locale term for the character that appears as NBSP in French locale mode. (Neither approach is likely to gain traction in CSL development, since the issue seems to be a rare one.)

    So ... shall I look at doing something around (1)?
  • Hi! Thanks for the reply!

    I'd prefer option 2, but if you think 1 is better, I'm OK with it.

    Just to be clear: you're about to add an option for users to turn off the current behaviour that turns regular spaces into nbsp. This option will take the form of a Zotero extension (.xpi) that users will have to enable to get only regular spaces, or disable to keep the current behaviour. Is that correct?
  • Up to @fbennett, but I would say the proper solution here is for you to use find/replace. The behavior here is (as I understand it) correct for French, and this character was added to Unicode in 1999, so this is pretty clearly the fault of the software or font you're using that's not displaying it properly.
  • I'm with Dan on this one. What software and font are you using?
  • I agree with both of you. But I wrote a post above about the cases where Zotero doesn't comply with typography rules.

    And some citation styles don't comply with typography rules, and Zotero (well, the CSL processor) lets you comply with those styles...

    What I mean is that letting the user decide could solve my issue and give more freedom when creating styles as well.
  • Right, so I think if there's going to be a solution it should be 3), since CSL handles style formatting. I don't ideologically oppose allowing "bad" typography, but as Frank says, I'd want to see more evidence of an actual need apart from broken software before implementing a dedicated option for something only 2 people have ever asked about.

    The missing nbsp you mention above are in locations where the style can already specify them if needed for other reasons.
  • Hey adam,

    I'm OK with the third solution from @fbennett as well ^^'
    I know the nbsp I mention can be modified in the style. But why leave the transformation of some spaces into nbsp at the style's discretion, and add others without giving the possibility to deactivate the behaviour?

    I want to be clear: nbsp before ":" and others is correct in French and the software I use is faulty. But well, as a Zotero user, having an option to adjust nbsp behaviour would make sense!

    Before starting to use Zotero every day, I read a lot of French guides written by librarians on the use of Zotero. In a lot of them there was this sentence: "after you have generated your bibliography, remove the squares and the funny symbols". The truth is few people know what an nbsp is, fewer give a damn about it, and a lot of people will prefer to make the modification by hand rather than ask on this forum :D

    THAT BEING SAID: I totally understand that Zotero development is made of other priorities. Please, just let me know your final decision :)
  • @fbennett I do wonder whether using the narrow no-break space is the best option here. There is spottier font support for it, compared to the full-width no-break space.
  • @bwiernik Totally agree to that. And the full-width nbsp is supported by the software I use (but that's not the real point :) ).
  • But according to the Wikipedia section I linked to above, the narrow no-break space — not the regular no-break space — is the correct character for French in the specific contexts in which it's used:
    It is also required for big punctuation in French, sometimes inaccurately referred to as “double punctuation” (before ;, ?, !, », › and after «, ‹; today often also before :)
    If software or fonts aren't properly supporting it, you should complain to the developers, since the character is almost 20 years old.
  • Really, it's far more likely that it's just the font here, so I'd look into just setting a different font in your software. I think it'd only be the problem of the software itself if a particular font was hard-coded or Unicode wasn't supported at all.

    Basically, Zotero shouldn't generate incorrect formatting for everyone just because some fonts are (two decades) outdated, and if you knew to find and use a special setting to change it, you could just as easily do the find/replace or pick a different font.
  • The software I use the bibliography in is some sort of a php web form. I cannot use find/replace in it. I'll have to use notepad or something. Anyway, as I previously said, I understand your point on the development, on the depth of the implementation, on typography rules, and I know that the software I use bibliographies in is faulty.

    1/ Zotero is the only software I know using this narrow space. I didn't even know the difference with a regular nbsp before this subject!
    2/ Microsoft Word and LibreOffice are using regular NBSP when typing in French.
    3/ It's not because I'm the only one complaining here that I'm the only one for whom this is an issue.
    4/ A lot of standards are twenty years old and still not implemented... IPv6 for example.
    5/ I'm not asking for Zotero to change its behaviour on NBSP for everyone, I'm just asking for a way to change this behaviour on my computer. A switch to turn off.
    6/ If it's a "no, we're not going to change that, we're not going to add an option for you", well, I'll be annoyed, but I'll understand!

    And I'll keep using Zotero, because it's a really fantastic software :P

  • Microsoft Word and LibreOffice are using regular NBSP when typing in French.
    Just to be clear, a regular space (SPACE in Unicode) isn't a non-breaking space (NO-BREAK SPACE), and a non-breaking space isn't a narrow non-breaking space (NARROW NO-BREAK SPACE). So it's not that Word/LibreOffice aren't using that character normally when you press space; they wouldn't. It's a special character you'd have to insert, like NBSP. There may be easier shortcuts for regular NBSP than for the narrow one on some OSes, though.

    Is your browser not displaying the narrow NBSP, or just the specific output of the application you're using? If you view this page, do you see the correct space?
  • Sorry, I wasn't clear, I meant with autocorrect. If I type anything followed by ":" in French in either LibreOffice or Word, both programs will automatically insert a NO-BREAK SPACE. Not a NARROW NO-BREAK SPACE.
    I'll add that the shortcut in Word is CTRL+SHIFT+SPACE for a nbsp, while you have to use ALT+ the numeric keypad for the narrow one: nbsp is well known and easy to insert, narrow nbsp isn't.

    No issue displaying both characters in my browser, it's only the output that is problematic.
  • Another up :P
  • edited September 3, 2018
    @dstillman @fbennett Wikipedia is ambiguous, it seems. The specified space character varies by punctuation mark. The Unicode guidelines for French punctuation specify a standard NBSP before the colon (but narrow NBSP before ; ! and ?):

    Canadian French also uses a standard NBSP:

    Here is the related discussion from LibreOffice:

    Generally, I certainly agree that support for Narrow NBSP should just exist at this point, but there are a lot of very popular typefaces that don’t have it and aren’t likely to change (e.g., Helvetica). But for the primary characters that Zotero/citeproc-js have to deal with ( : and » ), a NBSP would be correct in any event.
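
    As a rough illustration of the per-mark rule described in this post (standard NBSP before the colon, narrow NBSP before ; ! and ?), here is a Python sketch. It encodes only the reading of the guidelines given above; it is not how citeproc-js does it, and the function name `french_spacing` is hypothetical.

```python
import re

NBSP = "\u00a0"         # NO-BREAK SPACE
NARROW_NBSP = "\u202f"  # NARROW NO-BREAK SPACE


def french_spacing(text: str) -> str:
    """Normalize the single space before : ; ! ? per the rule described above."""
    # A plain, no-break, or narrow no-break space directly before a colon
    # becomes a standard no-break space.
    text = re.sub(r"[ \u00a0\u202f](?=:)", NBSP, text)
    # The same set of spaces before ; ! ? becomes a narrow no-break space.
    text = re.sub(r"[ \u00a0\u202f](?=[;!?])", NARROW_NBSP, text)
    return text


if __name__ == "__main__":
    # repr() makes the otherwise invisible characters visible as escapes.
    print(repr(french_spacing("Quoi ? Oui : non")))
```

    Punctuation with no space in front of it (as in English text or URLs) is left untouched, since the sketch only swaps an existing space character.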
  • Thanks @bwiernik ! This link to OpenOffice just sums it up.
  • Up.

    Is this issue being taken into consideration? I mean, if it's a no, it's a no ^^'
  • We're definitely not going to make this customizable -- that'd be a whole separate CSL command just for this edge case. I'd be open to using a simple NBSP or following the Unicode guidelines more precisely; what makes me hesitant is that Zotero is very popular in France and we get very few complaints about this. @Gracile @FHeimburger do you have any thoughts on this?
  • Je comprends. I'll wait for the French experts to confirm, but I'm pretty sure they won't care.
  • Somewhat peripheral to this thread, there is also the problem some publishers have with exporting metadata with fields that contain a non-breaking space.

    See, for example: