Import RIS from OPAC

Hi.

I have problems importing RIS from German OPAC servers. Some fields occur twice (like AU and TI), but only the last occurrence gets imported.

For example:
TY - BOOK
ID - FU003885748
AU - Bordwell, David
AU - Thompson, Kristin (only this one gets imported)
TI - Film art
TI - an introduction (only this one gets imported)
PY - 2001
SP - XX, 458 S.
CY - New York <<[u.a.]>>
PB - McGraw-Hill
(...)
ER -

Is this a Zotero bug or an invalid RIS format? If it's a Zotero bug, is there a workaround?

Thanks and greetings,
Max
  • As far as I can tell from the specification, RIS wants both the title and subtitle in a single TI or T1 or CT element (separated by a colon or whatever is usual in the language you are writing in). This would mean that your exporter has it wrong. If that is the case, you might report it to them, with a link to the spec and some proper examples exported from another source. You can fix this manually by combining the two title lines in a text editor to read:

    T1 - Film art: an introduction

    The author fields, however, do conform to the spec ("Each reference can contain unlimited author fields") and in fact import just fine for me.

    (Note that if you try to import the above RIS, you need to replace ' -' with '  -', since RIS needs two spaces before the dash and this forum removes one. Putting <code> tags around the pasted text avoids this.) The following example should import as expected:

    TY  - BOOK
    ID  - FU003885748
    AU  - Bordwell, David
    AU  - Thompson, Kristin
    T1  - Film art: an introduction
    PY  - 2001
    SP  - XX, 458 S.
    CY  - New York <<[u.a.]>>
    PB  - McGraw-Hill
    ER  -
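
The spacing repair mentioned above (two spaces between the tag and the dash) can also be scripted rather than done by hand. The following is only a minimal sketch: the function name `fixRisSpacing` is made up for illustration, and it assumes every tag is two characters long and starts at the beginning of its line.

```javascript
// Restore the "TAG  - value" separator (two spaces before the dash) on lines
// where a forum or editor collapsed it to a single space.
// Assumption: tags are two characters (a letter, then a letter or digit)
// and start in the first column; indented lines are left untouched.
function fixRisSpacing(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.replace(/^([A-Z][A-Z0-9]) +- ?/, "$1  - "))
    .join("\n");
}

// Example: repair a record pasted from a page that collapsed the spaces.
const pasted = ["TY - BOOK", "PB - McGraw-Hill", "ER -"].join("\n");
console.log(fixRisSpacing(pasted));
```

Only the separator at the start of each line is touched, so hyphens inside values such as "McGraw-Hill" stay as they are.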
  • Scot, thanks a lot for your comment.

    In the meantime I have phoned the exporter. They seem to be cooperative, but due to software restrictions they are unable to merge the two title fields into one.
    Is there a way to have two title fields in RIS that get properly imported by Zotero?

    The data is stored as US MARC and the RIS export is created by transforming the fields one by one:

    331 Film art
    -->
    TI - Film art

    335 an introduction
    -->
    TI - an introduction


    I tried to understand the RIS specs at refman.com. What I didn't get is the difference between TI, T1, BT, T2 and T3.
    Would it be correct to export the above example like this?

    331 Film art
    -->
    T1 - Film art

    335 an introduction
    -->
    T2 - an introduction

    I tried that, but Zotero ignores the T2 field.

    It seems to me that rewriting the import script for my personal needs is the only way.

    --
    cheers,
    Max
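
For comparison, if the export step could be changed, the field-by-field transformation described above would only need to treat the two title fields together. This is a hypothetical sketch, not the catalog's actual export code: the field numbers 331 and 335 come from the post, while the function name `titleFieldsToRis` and the shape of the input object are assumptions.

```javascript
// Hypothetical exporter-side mapping: catalog field 331 (main title) and
// field 335 (subtitle) are combined into a single RIS TI line instead of
// being emitted as two separate TI lines.
// Assumption: `fields` is a plain object keyed by field number.
function titleFieldsToRis(fields) {
  const main = fields["331"];
  const sub = fields["335"];
  if (!main) return []; // no main title, so nothing to emit
  const title = sub ? `${main}: ${sub}` : main;
  return [`TI  - ${title}`];
}

console.log(titleFieldsToRis({ "331": "Film art", "335": "an introduction" }));
```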
  • I'm no RIS expert, but my reading of the spec is that TI and T1 (and perhaps BT) are treated equivalently, and that T2 is used for a second title contained in the entry (the title of a containing work), and similarly T3. So T2 will do you no good here.

    Yes, you may have to try to rewrite the import script to merge the two values into one Zotero title.

    The alternatives are: (1) write a script or a text editor macro to merge the fields before you import them; (2) type the lost title piece in manually each time; (3) persuade your library to add COinS or unAPI tags to their pages, which Zotero can easily recognize and import (though if they are constrained by proprietary software or budget, that may not be possible); (4) import your records from another library or database.
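
Alternative (1) could look roughly like the sketch below, which merges repeated title lines in the RIS text before it is handed to Zotero. The function name `mergeRisTitles` and the choice of ": " as separator are assumptions, and the code expects the standard "TAG  - value" layout with two spaces before the dash.

```javascript
// Merge repeated title lines (TI or T1) within each RIS record into one
// line, joining the parts with a separator. All other lines pass through
// unchanged. Assumption: lines use the standard "TAG  - value" layout.
function mergeRisTitles(text, separator = ": ") {
  const out = [];
  let titleIndex = -1; // index in `out` of the current record's first title line
  for (const line of text.split(/\r?\n/)) {
    const m = /^([A-Z][A-Z0-9])  - ?(.*)$/.exec(line);
    const tag = m ? m[1] : null;
    if (tag === "TY" || tag === "ER") {
      titleIndex = -1; // record boundary: the next title starts fresh
    }
    if (tag === "TI" || tag === "T1") {
      if (titleIndex === -1) {
        titleIndex = out.length;
        out.push(line.trimEnd());
      } else {
        // Append this title part to the first title line of the record.
        out[titleIndex] += separator + m[2].trim();
      }
      continue;
    }
    out.push(line);
  }
  return out.join("\n");
}

const record = [
  "TY  - BOOK",
  "AU  - Bordwell, David",
  "AU  - Thompson, Kristin",
  "TI  - Film art",
  "TI  - an introduction",
  "PY  - 2001",
  "ER  - ",
].join("\n");
console.log(mergeRisTitles(record));
```

Repeated AU lines are deliberately left alone, since multiple author fields are valid RIS.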
  • Taking a step back: the original data is in MARC, which Zotero can import. Perhaps the best short term solution is to ask that the MARC records be made available for download & to check whether those successfully import?

    Assuming they worked, a slightly longer term solution would be to either write a Zotero translator for the site that uses the MARC, or to convince the site to implement unAPI for the MARC records.

    (I don't think it is as nice as the above suggestions, but there are several MARC->RIS conversion tools. If their current one isn't producing usable RIS, there are others they might be able to try...)
  • @noksagt
    To be precise, the data is stored in MAB2, and it is possible to connect to the DB and retrieve MARC-formatted data. I do not have any information about the retrieval procedure, except that you have to connect to the server on a different port (TCP 991)?! Zotero doesn't seem to have an interface for that, so I couldn't try it out.
    Anyhow, I am not very familiar with bib stuff, but I will try to talk to the bib admin.

    @scot
    I did rewrite the RIS translator and it now works on the fly.
    I had to connect to the sqlite database and change the "code" field in the table "translators". The appended diff should make it easier to reproduce; originally I made the changes using sqlitebrowser. The hack only works if I get exactly one or two TI fields in the RIS!

    diff -ru zotero.orig/ris.translator.js zotero.new/ris.translator.js
    --- zotero.orig/ris.translator.js 2007-09-22 11:42:17.000000000 +0200
    +++ zotero.new/ris.translator.js 2007-09-22 11:41:20.000000000 +0200
    @@ -75,7 +75,15 @@
     if(fieldMap[tag]) {
         item[fieldMap[tag]] = value;
     } else if(inputFieldMap[tag]) {
    -    item[inputFieldMap[tag]] = value;
    +    if (tag == "TI") {
    +        if (!item[inputFieldMap[tag]]) {
    +            item[inputFieldMap[tag]] = value;
    +        } else {
    +            item[inputFieldMap[tag]] = item[inputFieldMap[tag]] + ": " + value;
    +        }
    +    } else {
    +        item[inputFieldMap[tag]] = value;
    +    }
     } else if(tag == "TY") {
         // look for type

    I also had to add another content-type to the list of content types to capture. These changes have to be made inside the chrome/zotero.jar file:

    --- zotero.orig/content/zotero/xpcom/ingester.js 2007-06-20 17:53:25.000000000 +0200
    +++ zotero.new/content/zotero/xpcom/ingester.js 2007-09-21 12:56:37.000000000 +0200
    @@ -645,6 +645,7 @@
     // list of content types to capture
     // NOTE: must be from shortest to longest length
     this.desiredContentTypes = ["application/x-endnote-refer",
    +                            "application/x-end",
                                 "application/x-research-info-systems"];

     this.QueryInterface = QueryInterface;


    The result is that I am now able to import stuff from my preferred catalog (opac.fu-berlin.de) with only five more clicks. No more 'save as...' and 'import...'. Of course it's just a temporary hack, because any extension/translator update will undo the changes.

    I think I need to dig more deeply into that [COinS|unAPI|openURL|MAB2|MARC|.*] stuff to move things forward. Do you know of any docs that give a quick overview of this?
  • @max213, for info about some of the mentioned standards, please see e.g.:

    COinS: http://ocoins.info/
    unAPI: http://unapi.info/
    openURL: http://en.wikipedia.org/wiki/OpenURL
    MARC: http://www.loc.gov/marc/
  • I am having the same problems with opac.fu-berlin.de and I am looking for the best workaround.

    I noticed that you can also grab the citation information from the item display (Titelanzeige), but the imported data sucks regardless of the chosen view (Standardformat, Katalogkarte, Feldnamen, Feldnummern). So, for best results, I export selected items as RIS (Zotero grabs it automatically) and fix the title and some other fields afterwards.

    I was thinking of using other sites for grabbing citation information and using opac.fu-berlin.de and other Berlin catalogs only for getting the books. I am therefore still looking for libraries or other websites that provide well-formed citation data for German literature that imports flawlessly. Any suggestions?
  • Here's a Zotero-compatible OPAC:
    http://gso.gbv.de/DB=2.1/SET=1/TTL=1/

    I talked to the opac.fu-berlin.de admins some time ago, and they told me that the RIS export interface of Aleph is not 100% Zotero-compatible and that customization is hardly possible.
  • >>> http://gso.gbv.de

    Thanks for that link, max213 - very helpful indeed!