strip-periods not working yet?

Strip periods doesn't seem to be working...

I've used the test code below to confirm that the problem isn't specific to one style or another.

...
<citation>
<layout delimiter="; " suffix="." >
<text variable="author" suffix=", " />
<text variable="container-title" form="short" strip-periods="true" />
</layout>
</citation>
<bibliography>
<sort>
<key variable="author" />
</sort>
<layout>
<text macro="author-biblio" suffix=". " />
</layout>
</bibliography>
...
  • I had always thought of strip-periods as being restricted to short-form terms (it replaces add-periods from CSL 0.8.1, which only made sense in that context). But the schema does allow it on content fields as well, I see. Thanks for the report. I'll look into it, and should have it going soon.
  • edited November 13, 2010
    Actually, I can't seem to reproduce this failure in the test suite. There is a fixture on file, magic_StripPeriodsTrue.txt, which has been passing for the past five months or so (most recently a few minutes ago). I've added another, magic_StripPeriodsTrueShortForm.txt, to be sure that the form="short" attribute isn't interfering. But it seems okay.

    Not sure where to start. What's your environment, what is the style, and what exact input string is leaving periods in place on you?

    (Edit: As a side note, the cs:text node will not produce any output for variable "author", at least in citeproc-js/Zotero 2.1. Name variables must be rendered with cs:names.)
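
    A minimal sketch of that side note, assuming the test style from the first post: the author is rendered through cs:names rather than cs:text, and everything else is left as posted. The bare <name/> child is an illustrative assumption; a real style would normally set further name options.

    ```xml
    <citation>
      <layout delimiter="; " suffix=".">
        <!-- name variables go through cs:names, not cs:text -->
        <names variable="author" suffix=", ">
          <name/>
        </names>
        <text variable="container-title" form="short" strip-periods="true"/>
      </layout>
    </citation>
    ```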
  • Below is a sample excerpt which comes much closer to what I use in the style I am working on, McGillv7. The input strings I am using are "O. Kahn-Freund" and "Mod. L. Rev." The output in the test pane, using the code below, comes out as "O. Kahn-Freund <i>Mod. L. Rev.</i>" in all three cases (cite/multi/biblio).

    I am using firefox 3.6.12, Zotero 2.0.9, in Windows Vista, if that makes a difference.

    <macro name="container-title">
    <text strip-periods="true" variable="container-title" form="short" />
    </macro>
    <macro name="author">
    <names variable="author" >
    <name strip-periods="true" and="symbol" delimiter-precedes-last="never" />
    <substitute>
    <text macro="editor" />
    </substitute>
    </names>
    </macro>
    <citation et-al-min="4" et-al-use-first="1" >
    <layout delimiter="; " suffix="." >
    <text macro="author" suffix=" " />
    <text macro="container-title" font-style="italic" suffix=" " />
    </layout>
    </citation>
    <bibliography>
    <layout>
    <text macro="author" suffix=" " />
    <text macro="container-title" font-style="italic" suffix=" " />
    </layout>
    </bibliography>
  • Aha. As my note says, strip-periods was newly introduced in CSL 1.0. Zotero version 2.1 is the first to use CSL 1.0.

    Zotero 2.0.9 uses CSL 0.8.1, so strip-periods is not available there (and if you validate the style, validation will fail).
  • I've installed the 2.1 beta. strip-periods works, but seems to be wonky in combination with <suffix=".">. I would expect strip-periods to be applied before the suffix is added; is it possible that it is working the other way around?
  • edited November 29, 2010
    Thanks for the report. Will check this later today. (What is probably happening, off the top of my head, is that duplicate punctuation is being suppressed on the attribute, before strip-periods is applied. If that combination was missed in our tests, the code could be failing in that way. Shouldn't be hard to fix, but may take a little time to percolate through to a Zotero release.)
  • Have you looked into this? It appears that strip-periods (in Z2.1b7) continues to remove periods placed in the suffix of the text string...
  • edited February 19, 2011
    Thanks for the reminder; this had indeed slipped past. I've taken your sample code above, and refashioned it as a small valid CSL 1.0 style (the validator says that strip-periods is not allowed on the cs:name node, so I left that bit out).

    Running against some sample input, with bare periods inserted at various locations in the style file, I get this output:

    Jane Roe.Journ Title.Jane Roe.Journ Title.

    ... where a period has been stripped from "Journ.", and the other periods are supplied by delimiters or suffix elements.

    So the processor looks okay on this; and it's very unlikely that anything in Zotero could change the behavior. If it's not working in your installation, the most likely cause would seem to be invalid CSL, which can have unintended side effects. If you can post a style to gist/github, I'll be happy to give it a try in Zotero to see if the faulty output can be reproduced there.

    http://gist.github.com/

    (Edit: Just in case, note that the construct <suffix=".">, mentioned in your post above, is not valid, and will either produce an error, or no output at all.)
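
    For comparison, a sketch of the valid form, assuming the intent was to append a period after the short container title. In CSL, suffix is an attribute on a rendering element rather than an element of its own. Going by the test output described above, the period supplied by the suffix should survive while periods inside the field content are stripped.

    ```xml
    <!-- suffix is set as an attribute on the element being rendered -->
    <text variable="container-title" form="short" strip-periods="true" suffix="."/>
    ```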
  • I think that it's fixed; having played around a little, it seems to me that the problem was, indeed, in my own code.
  • Yay, that's great to hear -- and glad to know that it's working for you!
  • I've found one additional wonky behaviour in strip-periods that is frustrating. Essentially, strip-periods is inserting an extra space when used in combination with name-as-sort-order.

    Here's sample code:

    <macro name="container-title">
    <text strip-periods="true" variable="container-title" form="short" />
    </macro>
    <macro name="author">
    <names variable="author" >
    <name strip-periods="true" and="symbol" name-as-sort-order="first" delimiter-precedes-last="never" />
    </names>
    </macro>
    <citation et-al-min="4" et-al-use-first="1" >
    <layout delimiter="; " suffix="." >
    <text macro="author" suffix=" " />
    <text macro="container-title" font-style="italic" suffix=" " />
    </layout>
    </citation>
    <bibliography>
    <layout>
    <text macro="author" suffix=" " />
    <text macro="container-title" font-style="italic" suffix=" " />
    </layout>
    </bibliography>

    If the authors have periods after their middle initials (or if they have only first initials) this renders as

    Levy, Mark A , Oran R Young & Michael Zürn European Journal of International Relations .

    If the authors don't have periods after their middle initials, it renders as

    Levy, Mark A, Oran R Young & Michael Zürn European Journal of International Relations .

    I think that the second is the preferred rendering! Is this fixable?
  • I raised this issue with fbennett in early May. He mentioned he would correct this, but I don't know if the fix has already made it into citeproc-js and Zotero. You're running Zotero 2.1.7?
  • Yes. I am running 2.1.7.
  • edited June 14, 2011
    Just guessing, but the issue raised by Rintze was probably a similar issue with initialize-with=" ", which now works correctly. The strip-periods attribute is not allowed on the cs:name node in the CSL 1.0 schema.

    I can see that it's useful there (to preserve names but drop periods from initials set in the source of the entry). But the CSL specification will need to be amended to allow it.
  • Frank,

    Right! I see that the test code I gave isn't CSL 1.0 compliant.

    The problem is actually with the code in McGill v 7, which does meet the specs for CSL 1.0 (i.e. it doesn't pass strip-periods to <name> directly). I got around that limitation with code which is essentially like the following:

    <macro name="container-title">
    <text strip-periods="true" variable="container-title" form="short" />
    </macro>
    <macro name="author">
    <text strip-periods="true" macro="author-to-strip" />
    </macro>
    <macro name="author-to-strip">
    <names variable="author" >
    <name and="symbol" name-as-sort-order="first" delimiter-precedes-last="never" />
    </names>
    </macro>
    <citation et-al-min="4" et-al-use-first="1" >
    <layout delimiter="; " suffix="." >
    <text macro="author" suffix=" " />
    <text macro="container-title" font-style="italic" suffix=" " />
    </layout>
    </citation>
    <bibliography>
    <layout>
    <text macro="author" suffix=" " />
    <text macro="container-title" font-style="italic" suffix=" " />
    </layout>
    </bibliography>

    It's an ugly workaround, but for some reason, unlike all of the other periods in the string which gets passed back to the "author" macro, the first one in my test citation gets replaced with a space instead of just being deleted. Does that make the error reproducible?
  • edited June 15, 2011
    We probably shouldn't allow that one either. :)

    Seriously, though, it's a little more complicated under the hood than it looks on the surface. Here's a more-or-less full rundown.

    The most obvious problem is that, if the processor is extended to wrap content in hyperlinks, a clobber-periods function executed against the rendered string output of the macro (which is really the only feasible way for the processor to handle that construct) would clobber the periods in the URL as well. Ouch.

    Even if we assume we'll code around that when it becomes an issue, there is a problem with determining exactly how the periods should be removed. In the general-purpose strip-periods function, we currently preserve space -- and replace the period with a space if it is mid-string. So "Env. L.J." becomes "Env L J". This is where the extraneous space is coming from; the processor is seeing the abbreviation mid-string, so it is forcing in a space. Culling spaces that are followed by punctuation is not practicable when working against rendered string content -- the given-name initial could be followed immediately by a closing tag for small-caps, say, with the space falling after. Parsing around such markup is something that we just don't do.

    A cleaner approach would be for us to ban strip-periods on text nodes that call a macro, and allow it on name-part nodes (where it is currently banned). Then you would be legally entitled to do this:

    <names variable="author">
    <name>
    <name-part name="given" strip-periods="true"/>
    </name>
    </names>
    In fact this will actually work right now, producing the result you're after without any modifications to Zotero or the processor. The only problem is that it won't validate against the CSL 1.0 schema. :(
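
    Applied to the author macro posted earlier in this thread, that approach might look like the sketch below. It is an untested combination of attributes already shown above, and, as noted, it does not validate against the CSL 1.0 schema.

    ```xml
    <macro name="author">
      <names variable="author">
        <name and="symbol" name-as-sort-order="first" delimiter-precedes-last="never">
          <!-- strips periods from given names and initials only -->
          <name-part name="given" strip-periods="true"/>
        </name>
      </names>
    </macro>
    ```

    Because the attribute sits on the given-name part, the wrapper macro that applied strip-periods to the whole rendered author string would no longer be needed.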

    I'll raise this over in CSL-land and see where the discussion leads us.
  • Well, you've drawn me into an ugly corner now. I have to choose between valid code and predictable output on unfortunately framed inputs.

    Thanks for clearing up the source of these difficulties and why some 'simple' solutions won't work.

    Two things: I am not sure that I agree that it makes sense to replace the period with a space. A plain, unrelenting and unforgiving implementation of a 'strip periods' function which just stripped them out, no holds barred, would work fine for my purposes. Hein Online, e.g. (as I recall you're a lawyer) gives the abbreviations with the correct amount of white space after the relevant periods, and the most recent McGill guide wouldn't expect a space between the L and J in your example. Nonetheless, I suppose there must be cases where people rely on this way of implementing it; you don't have to prove it to me. Your discussion of URLs suggests to me that programming in exceptions to the 'just get rid of them' rule is only going to get harder.

    Finally, you don't have to take the time to explain it to me, but I am not clear on why it would be consistent to ban strip-periods from macros.

    Thanks again for all your hard work on both the Zotero and the CSL front. Great to have a voice at the other end when these things come up. I'll have to see about implementing the non-validating code you suggested <wince>.
  • At the moment, I reckon that the chances of it becoming valid in the reasonably near future are pretty good. There is always a risk of the person closest to the code being blinkered by implementation details, though, and I've certainly had that experience in the past. We'll have a discussion, and see what emerges.
  • edited June 15, 2011
    You might well be right about the preferred behavior of strip-periods. The advantage of the current behavior is that it produces the same output regardless of whether or not the period is followed by a space in the input. That's not much help if the uniform output is wrong for your style, of course. :)

    We could just say that for fine-grained control over the orthography of journal abbreviations (say), we need support for pluggable abbreviation lists.
  • @liam.mchugh.russell,

    I'm bundling up some changes for a new processor release. Looking at this strip-periods issue, and after running the tests in a couple of configurations, I've concluded that you're right: strip-periods should do just that, without attempting to insert any sort of placeholder.

    When the revised processor comes through in a new release, we'll see whether the change elicits any cries of dismay. If not, we'll be good with Hein Online, and your original coding would, I think, also work.
  • @fbennett,

    Has the processor release gone through? The problem still appears with the latest version of the style.
  • You need the next release (after Zotero 2.1.8).
  • Not in the regular 2.1.8 version of Zotero, which was released before Frank's post. You can test it on the branch XPI if you want to.
  • There's another problem which seems to be caused by strip-periods. Here's a summary of the offending code, taken from mcgill-guide-v7:

    <group delimiter=". ">
    <text macro="author-bib" strip-periods="true"/>
    <text macro="render-book"/>
    </group>
    <macro name="author-bib">
    <names variable="author">
    <name name-as-sort-order="first" and="symbol" sort-separator=", " delimiter-precedes-last="never"/>
    <et-al term="et-al"/>
    <label form="short" prefix=", "/>
    <substitute>
    <names variable="editor"/>
    </substitute>
    </names>
    </macro>

    When this is passed one or more authors, it works fine. When there are no authors, but only one or more editors, it fails to render the intermediary period. When strip-periods is removed, the intermediary period returns in both cases (though obviously with the unwanted periods in the authors' names left intact).

    I've added a test case to the CSL Test Submission group using fbennett's csl feedback plugin which replicates this problem from Zotero 2.1.8, which I am using. I know that certain changes to the processor which impact the implementation of strip-periods will be made in a future release, but I thought that it was important to identify this now, lest it should remain despite those changes.
  • Just to make clear where this problem is appearing, here's the code again, with the content of the "render-book" macro:

    <group delimiter=". ">
    <text macro="author-bib" strip-periods="true"/>
    <text macro="render-book"/>
    </group>
    <macro name="render-book">
    <group delimiter=", ">
    <text variable="title" font-style="italic" />
    <text macro="edition"/>
    <text macro="translator"/>
    <text macro="editor"/>
    </group>
    <text macro="publisher-place-year"/>
    </macro>
    <macro name="author-bib">
    <names variable="author">
    <name name-as-sort-order="first" and="symbol" sort-separator=", " delimiter-precedes-last="never"/>
    <et-al term="et-al"/>
    <label form="short" prefix=", "/>
    <substitute>
    <names variable="editor"/>
    </substitute>
    </names>
    </macro>
  • Liam,

    Thanks for raising this. I'll take a look in the next couple of days.
  • edited July 11, 2011
    This has been fixed in processor version 1.0.190, just checked in. With the fix, strip-periods should be safe to use anywhere it is allowed by the CSL 1.0 schema.

    (Edit: Simon has picked up the new version and merged it to the 2.1 branch. In the next Zotero release we'll do a better job with the McGill Guide.)
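
    A sketch of the kinds of places meant, assuming the schema permits the attribute on cs:text, cs:label and the month cs:date-part (worth confirming against the specification before relying on it). The variables chosen here are illustrative only.

    ```xml
    <!-- on a text node rendering a variable -->
    <text variable="container-title" form="short" strip-periods="true"/>
    <!-- on a label, e.g. to turn "pp." into "pp" -->
    <label variable="page" form="short" strip-periods="true"/>
    <!-- on an abbreviated month within a date -->
    <date variable="issued">
      <date-part name="month" form="short" strip-periods="true"/>
      <date-part name="year" prefix=" "/>
    </date>
    ```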
  • edited November 8, 2011
    I am getting some very odd behaviour as a result of strip-periods again. Here is some code adapted from mcgill-guide-v7.csl:

    <macro name="author-note">
    <names variable="author">
    <name and="symbol" delimiter-precedes-last="never"/>
    <et-al term="et-al"/>
    </names>
    </macro>
    <citation et-al-min="4" et-al-use-first="1">
    <layout delimiter="; " suffix=".">
    <text variable="URL" />
    <text macro="author-note" strip-periods="true"/>
    <text variable="title" quotes="true" suffix=","/>
    </layout>
    </citation>
    <bibliography et-al-min="4" et-al-use-first="1">
    <sort>
    <key variable="issued"/>
    </sort>
    <layout>
    <text variable="URL" />
    </layout>
    </bibliography>
    </style>

    For some reason, this strips periods from the URL and title as well as the author's name. When the call to the author-note macro is removed, the problem disappears. I am using the latest version of Zotero; I've reproduced this problem both in the test pane and in the Word plugin. I had originally thought that this was limited to the URL variable and had something to do with the "Include URLs..." option, but no such luck. It seems to affect all variables. And what is most odd is:

    1. It strips periods from the rendering of other variables whether the call to author-note macro comes before or after the rendering of other variables.
    2. It appears to strip periods from the rendering of variables in the bibliography, even if the call to author-note is only made in the citation portion of the style!
  • Liam,

    Thanks for reporting this. It's obviously a true bug.

    I've fixed it in the latest processor release (1.0.242). The problem will heal in Zotero when the new version makes its way into a release.