CalmView

I'm trying to put together a translator for archives using the CalmView OPAC, which is used by dozens of small to medium-sized archives in the UK. I've got something working for the archive I'm currently using (Parliamentary Archives, London), but it would make sense to extend it to cover all CalmView catalogues before submitting it, if possible.

However, I've hit one major snag: I can't find a way to reliably scrape the name of the archive. On all the catalogues I've looked at so far, the archive's name/logo is an image set as the background of an empty <div>, so there's not even the chance of alt text to go on.

Aside from this, I'm pretty sure the catalogues are standard enough that a single translator would work. If anyone has any ideas how to get around the archive name issue, I'd be super grateful. Some example archives include:

http://www.kentarchives.org.uk/CalmView/Default.aspx
http://www.calmview.bham.ac.uk/default.aspx (this one frustratingly has a non-standard URL, which could also be a problem)
http://archives.lambethpalacelibrary.org.uk/CalmView/Default.aspx?
http://www.calmview.eu/Goldsmiths/CalmView/default.aspx
http://www.portcullis.parliament.uk/calmview/

Thanks,
Richard
  • Given how much the URLs of these catalogs vary, you're probably going to have to list all of them explicitly in the target regexp. If you're doing that, then you can also map URLs to archive names. Feel free to put up a pull request on GitHub and we can discuss this further with more concrete examples.
  • Great news that someone's working on this - thanks, Richard.

    According to the CalmView devs at Axiell, CalmView is supposed to serve RDF metadata on item pages but appears not to be doing so as expected. They are working on this at the moment.

    Even if the RDF were being served correctly, the current CalmView RDF implementation omits some important fields, so a bespoke scraper translator may still be the best approach.

    Another possible obstacle to using the RDF is that the RDF which CalmView is supposed to serve does not seem to be recognised by the Zotero Embedded RDF Converter.

    The intended CalmView RDF can be viewed in isolation at e.g. http://www.calmview.eu/ShetlandArchive/CalmView/record.ashx/Record.ashx?src=CalmView.catalog&id=AD22/101/1906/24 . Perhaps someone more experienced can identify why the eRDF converter is not picking up this data. I wonder if it is simply because that URL string does not serve a full standard XML document.
  • @rtbell, did you ever finish writing a CalmView translator?

    Unfinished or finished, might you be willing to share the work you did on this (e.g. as a Gist)?
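
For anyone picking this up, here is a rough sketch of the suggestion made earlier in the thread: list the known catalogue URLs explicitly, build the target regexp from that list, and use the same list to look up the archive name. It is written in TypeScript for illustration and is not an actual Zotero translator. The names `SITES`, `buildTarget` and `archiveNameFor` are made up for this example. Only "Parliamentary Archives" comes from the original post; the other archive names are placeholders guessed from the URLs and would need checking. The sketch also assumes the target is matched case-insensitively.

```typescript
// Hypothetical table of known CalmView catalogues.
// prefix: host + path, lowercase, without scheme or leading "www."
// archive: name to record; entries marked "placeholder" are guesses from the URL.
interface CalmViewSite {
  prefix: string;
  archive: string;
}

const SITES: CalmViewSite[] = [
  { prefix: "portcullis.parliament.uk/calmview", archive: "Parliamentary Archives" },
  { prefix: "kentarchives.org.uk/calmview", archive: "Kent archives (placeholder)" },
  { prefix: "calmview.bham.ac.uk", archive: "Birmingham (placeholder)" },
  { prefix: "archives.lambethpalacelibrary.org.uk/calmview", archive: "Lambeth Palace Library (placeholder)" },
  { prefix: "calmview.eu/goldsmiths/calmview", archive: "Goldsmiths (placeholder)" },
];

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one target pattern that matches any listed catalogue.
function buildTarget(sites: CalmViewSite[]): string {
  const alternatives = sites.map((s) => escapeRegExp(s.prefix)).join("|");
  return "^https?://(www\\.)?(" + alternatives + ")([/?#]|$)";
}

// Strip the scheme and "www." and lowercase, so URLs compare against prefixes.
function normalise(url: string): string {
  return url.toLowerCase().replace(/^https?:\/\/(www\.)?/, "");
}

// Return the archive name for a page URL, or null if the site is not listed.
function archiveNameFor(url: string): string | null {
  const rest = normalise(url);
  for (const site of SITES) {
    if (rest === site.prefix || rest.indexOf(site.prefix + "/") === 0) {
      return site.archive;
    }
  }
  return null;
}

// Example use: test a URL against the generated target and look up its name.
const target = new RegExp(buildTarget(SITES), "i");
const example = "http://www.portcullis.parliament.uk/calmview/";
console.log(target.test(example), archiveNameFor(example));
```

Keeping the URL list and the archive names in one table means that adding a new catalogue is a one-line change, and the non-standard Birmingham URL is handled like any other entry.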