Possible Solution for Primo Bug

The Primo translator https://github.com/zotero/translators/blob/master/Primo.js does not work for articles because the PNX is not accessible for PrimoCentral records. Mehmet Celik from KU Leuven shows how libraries can make the PNX accessible for PrimoCentral data as well:

http://www.exlibrisgroup.org/display/PrimoCC/showPNX+Revisited

This solution works with a JSP file on the Primo server and a small bookmarklet. I think it should now also be possible to use this JSP file in the Primo translator for Zotero. What do you think?
  • For libraries that have the showPNX.jsp on their server, yes, we should be able to take advantage of that.
  • I tried to adapt the translator, see:

    https://dl.dropboxusercontent.com/u/59474281/Primo-with-jsp.js

    This is the "normal" Primo.js with some additional code for Primo installations that have the JSP file. The idea is to provide *one* translator for all Primo sites: if the translator is invoked on a Primo site with the JSP file, the additional code handles it. If it is invoked on a Primo site without the JSP file, it works as before.

    How does this look to you?
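The single-translator idea in the post above can be sketched as a small URL-selection step. This is only an illustration, not the code in Primo-with-jsp.js: the helper name `pnxUrlFor` and the `hasShowPnxJsp` flag are hypothetical, how the translator detects the JSP file is not covered, and the `showPNX.jsp?id=0` path and the `showPnx=true` parameter are taken from URLs quoted elsewhere in this thread.

```javascript
// Hypothetical helper (not from Primo.js): choose where to fetch the PNX from.
// hasShowPnxJsp is an assumed flag; detecting the JSP file is out of scope here.
function pnxUrlFor(pageUrl, hasShowPnxJsp) {
  const url = new URL(pageUrl);
  if (hasShowPnxJsp) {
    // Sites with the JSP file: ask showPNX.jsp on the same host.
    // id=0 is copied from the example URL in this thread; its meaning is assumed.
    return new URL("/primo_library/libweb/showPNX.jsp?id=0", url.origin).href;
  }
  // Sites without the JSP file: behave as before and request the PNX view.
  url.searchParams.set("showPnx", "true");
  return url.href;
}

const samplePage =
  "http://primo.bib.uni-mannheim.de/primo_library/libweb/action/search.do?vid=MAN_UB";
console.log(pnxUrlFor(samplePage, true));
console.log(pnxUrlFor(samplePage, false));
```

Building the JSP address from `url.origin` keeps any non-standard port, which matters for hosts that serve Primo on a port such as 1701.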
  • Just getting back from vacation, so it will take me a while to get to this, but it should be possible to wrap this into one translator. Do you have a couple of sample Primo sites with the JSP file installed?
  • edited August 18, 2013
    Primo sites with the JSP file installed that I have found so far:
    (1) http://purdue-primo-prod.hosted.exlibrisgroup.com/primo_library/libweb/action/search.do?vid=PURDUE
    (2) http://primo.bib.uni-mannheim.de/primo_library/libweb/action/search.do?vid=MAN_UB
    (3) http://limo.libis.be/primo_library/libweb/action/search.do?vid=LIBISnet&fromLogin=true
    (4.a) http://virtuose.uqam.ca/primo_library/libweb/action/search.do?vid=UQAM
    (4.b) http://eudoxe.bib.uqam.ca:1701/primo_library/libweb/action/search.do?vid=UQAM&institution=UQAM

    The first two are especially interesting because they include articles from the PrimoCentral database. The adapted Primo translator worked in all the tests I have tried so far (Primo sites with and without the JSP file). Please let me know if I can help somehow.
  • looks great to me. Did some clean-up, but took your code verbatim. Pull request is here:
    https://github.com/zotero/translators/pull/617
    It might take a bit until it gets accepted; the guy who does most of the reviewing just went on vacation.
  • oh - and let me know if you want credit - i.e. be added to the translator creators or mention in the code, happy to do that.
  • Thank you very much for the support! It is good to hear from you that it works. I can wait for some time to see the feature in the official version. If possible, I would take the credit, although I didn't do much. Maybe I can help to improve the translator further in the future.
  • Happy to give credit: what name should I use?
  • edited September 15, 2013
    [Thank you!]
  • There are still some problems.
    Go to the Oxford Primo at http://solo.bodleian.ox.ac.uk/
    and search for
    Test Valley street map 2006/2007
    That item won't import with your modification. Lines 150-152
    https://github.com/zotero/translators/pull/617/files#L0R150
    break the XML.

    I'll want to keep the CDATA and clean this up on import further down - it's just too fragile - but probably do want to remove the prim: part. Is there anything other than prim: you're removing with that?
  • My guess is that the showPNX.jsp in the Oxford Primo has some errors, because the XML it creates begins with:

    i<?xml version="1.0" encoding="UTF-8"?>

    The XML breaks because it begins with an "i" and not with the XML declaration.

    Here is an example of the XML produced by the JSP file (without errors):
    https://dl.dropboxusercontent.com/u/59474281/000871081.xml

    Possible namespaces are "prim" and maybe "sear" (I can't remember if there were more).

    The cleanup in lines 148, 149, 151, 152 and 153 is essential; without it the translator extracts nothing. If you feel that the general regular expression is too broad, we could replace it with more specific one(s).
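The stray "i" described in the post above is easy to detect before parsing, because an XML declaration is only valid at the very start of a document. The following is a minimal diagnostic sketch, not part of the translator: the function name `junkBeforeXmlDeclaration` is hypothetical and the `<record/>` element in the sample strings is a made-up placeholder.

```javascript
// Hypothetical diagnostic (not part of the translator): report anything that
// precedes the XML declaration in a showPNX.jsp response.
function junkBeforeXmlDeclaration(text) {
  const start = text.indexOf("<?xml");
  if (start === -1) return null; // no XML declaration found at all
  return text.slice(0, start);   // "" means the declaration comes first
}

// Sample responses; <record/> is a placeholder, not real PNX content.
const brokenResponse = 'i<?xml version="1.0" encoding="UTF-8"?><record/>';
const cleanResponse = '<?xml version="1.0" encoding="UTF-8"?><record/>';

console.log(JSON.stringify(junkBeforeXmlDeclaration(brokenResponse)));
console.log(JSON.stringify(junkBeforeXmlDeclaration(cleanResponse)));
```

A non-empty return value points at a problem in the server's JSP file rather than in the translator.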
  • Update: Oxford corrected two errors in their JSP file ("i" on line 1 and "%" on line 3). Still, there seem to be some problems. Maybe I can look at these a little later...
  • "The cleanup in lines 148, 149, 151, 152 and 153 is essential; without it the translator extracts nothing. If you feel that the general regular expression is too broad, we could replace it with more specific one(s)."
    That's what I mean. I'm particularly concerned about 151 and 152. I realize we need those, but I'd like them to be a lot stricter. Restricting them to just prim and sear would work. Alternatively, we could leave them out and include prim and sear as namespace declarations.
  • The more restricted version you describe above is fine with me, too!

    I found out what the problem with Oxford is. For example, the following line in the PNX causes problems:

    <prim:lln03><![CDATA[$$Uhttp://books.google.com/books?lr=&as_drrb_is=q&as_minm_is=0&as_miny_is=&as_maxm_is=0&as_maxy_is=&q=intitle:Thetis+&as_brr=0&as_pt=ALLTYPES&sa=N&start=0$$DSearch for this title in Google Book Search]]></prim:lln03>

    Inside the CDATA there are some characters that would violate the (strict?) XML specification outside of CDATA, and this is why the translator cannot extract anything. One would have to escape those characters. Suggestions?
  • So, for lines 151/152, could we simply use
    text = text.replace(/\<(prim|sear):([^\>]*)/g, "<$2");
    text = text.replace(/\<\/(prim|sear):([^\>]*)/g, "</$2");


    As for the CDATA: do we actually need to remove it for the XML to parse? I don't think so, right? If not, we could just clean this up further down, with something along the lines of
    item.title = item.title.replace(....)
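To see what the stricter pattern proposed in the post above does, here is a small self-contained sketch. It uses the same two `replace` calls, written without the redundant backslashes, wrapped in a hypothetical helper `stripPnxPrefixes`. The sample fragment is a shortened version of the Oxford line quoted earlier in the thread, and the `<sear:title>` element is made up for illustration.

```javascript
// Hypothetical wrapper around the two replace calls suggested in the post above.
// Only the prim: and sear: prefixes are removed; everything else is left alone.
function stripPnxPrefixes(text) {
  return text
    .replace(/<(prim|sear):([^>]*)/g, "<$2")      // opening tags
    .replace(/<\/(prim|sear):([^>]*)/g, "</$2");  // closing tags
}

// Shortened sample: a CDATA section containing a URL with ":" and "&",
// plus a made-up <sear:title> element.
const pnxFragment =
  "<prim:lln03><![CDATA[$$Uhttp://books.google.com/books?lr=&q=intitle:Thetis$$DSearch]]></prim:lln03>" +
  "<sear:title>Example</sear:title>";

const stripped = stripPnxPrefixes(pnxFragment);
console.log(stripped);
```

Because the pattern only matches a literal `<prim:` or `<sear:` (or the closing-tag form), the `<![CDATA[` opener and the colons and ampersands inside the URL are not touched.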
  • Practically all characters are allowed inside CDATA (only the closing sequence ]]> is not). We shouldn't be stripping off the CDATA part. XPath is smart enough to deal with it.
  • edited September 5, 2013
    textContent should return the text inside CDATA

    Edit: sorry, I meant text() in xpath
  • @adamsmith: Yes, I think we should use your suggestion for lines 151/152. An example where my general approach went wrong is actually the Google Books URL from the Oxford Primo (see 4 posts above).

    The CDATA is also present in the PNX (of library records) that one currently gets by adding '&showPnx=true', and the translator already handles that. So you are right: we don't have to clean this up before using XPath expressions.
  • OK, we'll try to get to this over the weekend
  • The new translator is now up. Your version of Zotero will automatically update within 24 hours, or you can update manually using the "Update Now" button in the "General" tab of the Zotero preferences.

    This will work with Primo installations equipped with the showPNX.jsp file, fix some other issues (mostly with Primo 4 installations), and frequently provide more and better data for articles.
  • Thank you very much! This looks really nice now!

    However, in my tests the journal title is not transferred to Zotero by the translator. Line 301:

    item.publication = ZU.xpathText(doc, '//addata/jtitle');
    is responsible for that. The XPath looks good, but should it be item.publication or item.publicationTitle?
  • good catch, it should be publicationTitle. I'll fix that later.
  • Fixed. Same update instructions as above.
  • Hi,
    any news about this issue? I still can't import Primo records via Firefox. However, the Chrome/Safari Connectors import Primo records into my Firefox Zotero library just fine. (The stand-alone version also works.)
  • This issue is fixed, and the Primo translator works very well (here) in my Firefox. Are you using special settings or other plugins in your Firefox? Can you give a concrete example (URL of the Primo instance and a search term) where you encountered problems?
  • Primo installation:
    http://search.obvsg.at/ACC

    Search for e.g. 'zotero' - take a record you like.
  • I can save all 7 records by one click with firefox ;-)
  • I've downloaded Zotero again, but it does not work (I've tested several Primo installations) --> 'translator error' with link to:
    http://www.zotero.org/support/known_translator_issues - 'Primo: Most Primo catalogs will fail when trying to import articles.'
    I've talked to my colleagues - they have the same problem.
  • Please search for a record in the OBVSG Primo and then go (in the same browser window) to this link: http://search.obvsg.at/primo_library/libweb/showPNX.jsp?id=0 . Do you see the data?
  • Yes, I see the PNX record (XML).