Problem importing a URL field from MARC

I sent this to Simon Kornblith, the author of the MARC translator, but haven't heard back, so I thought I'd ask for help here:
===============================
I'm trying to develop a MARC import translator for Zotero that is largely based on your generic one. One of the catalogs at my school library uses a few MARC tags that yours doesn't support, and it uses others in unusual ways that require some extra handling.

I'm having trouble parsing one line in one of our records:

856 42 |zBoston.com|uhttp://www.boston.com/bostonglobe/ideas/articles/2008/02/17/q_and_a_with_jim_wallis?mode=PF


I've been grabbing it with this code in my version of your MARC translator:
this._associateDBField(item, "856", "u", "url");

The problem is that the underscores in the URL come out replaced with spaces. Here's the output from Zotero Scaffold:
'url' => "http://www.boston.com/bostonglobe/ideas/articles/2008/02/17/q and a with jim wallis?mode=PF"


I've tried tracing through your _associateDBField and record.prototype.importBinary, but I haven't been able to pinpoint exactly where this occurs. The underscores seem to be read correctly within record.prototype.importBinary, but by the time I get to _associateDBField (or even record.prototype.getField), they have been replaced with spaces.
===============================
Any ideas how I could overcome this? I'm having trouble tracking down the source of the problem in the MARC translator.
  • I was able to resolve this myself, actually... if someone can close the discussion or wants to delete it that's fine with me, I don't think I can do that.
  • Did you find a bug, or is this issue unique to your school's catalog?
  • I haven't tested this on other catalogs, so I'm not totally certain, but I think I found a bug.

    The translate function of the translator calls the _associateDBField method which in turn fires a number of other member functions. As far as I could tell, the substitution occurs somewhere in between the function that reads the raw MARC data (called importBinary, I think) and the higher level member functions that pass the information to _associateDBField.

    I worked around this by simply doing a string replacement after my
    this._associateDBField(item, "856", "u", "url");
    call. I'm going off memory right now, but I basically just checked for spaces and replaced them with underscores. I'm assuming that underscores are the only characters that get replaced by spaces, which is probably a risky assumption, since any genuine space in a URL would be turned into an underscore too.

    If anyone can further test this or offer a fix that's better than mine that would be great!
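
For anyone hitting the same problem, the workaround described above might look roughly like the sketch below. It is written from the description only, not from the original code: restoreUnderscores is a hypothetical helper name, the plain item object stands in for the item the translator fills in, and the whole thing relies on the same risky assumption that every space in the imported URL was originally an underscore.

```javascript
// Hypothetical helper, not part of the Zotero MARC translator.
// Assumption: every space inside the imported URL was originally an underscore.
function restoreUnderscores(url) {
  if (typeof url !== "string") {
    return url; // leave missing or non-string values untouched
  }
  // Trim first so stray leading/trailing whitespace does not become underscores.
  return url.trim().replace(/ /g, "_");
}

// Stand-in for the item object that the translator populates.
var item = {
  url: "http://www.boston.com/bostonglobe/ideas/articles/2008/02/17/q and a with jim wallis?mode=PF"
};

// In the translator, this line would come right after:
//   this._associateDBField(item, "856", "u", "url");
item.url = restoreUnderscores(item.url);
```

This only patches the symptom after the fact. It does not identify where in the translator the underscores are lost.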