Sirsi translator doesn't quite work with Stanford's Sirsi catalog

jlavigne · January 14, 2008

Zotero can grab records from Stanford's Sirsi catalog, but only a few pieces of information go into the correct fields (e.g., author, title, ISBN). Other data is put into the Extra fields, though with identifiable field tags (e.g., "Physical Description: viii, 223 p. ; 23 cm."). I notice that other Sirsi catalogs seem to work better, for example the Emory catalog at http://www.library.emory.edu/. This is our HTML for the "Physical Description" field:
<tr>
<th class="viewmarctags" align="right" valign="top">
Physical Description:
</th>
<td class="viewmarctags">
viii, 223 p. ; 23 cm.
</td>
</tr>
And this is Emory's:
<TR>
<TD NOWRAP VALIGN="TOP" ALIGN="right">
Physical description:
</TD>
<TD VALIGN="TOP">
<B>viii, 223 p. ; 23 cm.</B>
</TD>
</TR>
What is the translator looking for that lets Emory's HTML be parsed correctly but not ours? Is there some way we could change our pages so that our data gets into the right Zotero fields?

Thanks,
Jonathan Lavigne
Stanford University Libraries

sean · January 14, 2008

Zotero works best with library catalogs and other resources that expose a structured record. Emory's catalog, for example, will display a formatted MARC record for any item in its catalog. With the version of Sirsi that Stanford is running, we don't yet know of any good way to get structured data, so instead we're forced to "screen scrape" data from the page. The other Sirsi catalogs we've encountered tend to place their metadata under headings that differ significantly from yours. In particular, Stanford displays publication information (location, publisher, date) under "Imprint" instead of in separate "Publisher", "Pub Date", and "Publication Info" lines. We can tweak our Sirsi translator or prepare a custom "scraper" just for Stanford, but ideally we would be working with a MARC record or something similar on your end. Any ideas on that front? Feel free to contact me at support@zot...org
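
To make the "screen scraping" problem concrete, here is a minimal sketch in Python of label-based scraping. It is not the actual Zotero Sirsi translator (which is not shown in this thread); the names LabelValueRows, IMPRINT, and scrape are hypothetical, and the Imprint pattern assumes the common "Place : Publisher, date." layout, which may not hold for every record. It reads two-cell table rows such as the Stanford and Emory snippets above, normalizes the label text, and splits an "Imprint" value into separate place, publisher, and date values.

```python
import re
from html.parser import HTMLParser


class LabelValueRows(HTMLParser):
    """Collect (label, value) pairs from table rows that have exactly two cells.

    Hypothetical helper for illustration only. HTMLParser lowercases tag
    names, so <TR>/<TD> and <tr>/<th> rows are handled the same way.
    """

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cells = None      # cell texts of the current row, None outside a row
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells = []
        elif tag in ("th", "td") and self._cells is not None:
            self._cells.append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._in_cell = False
        elif tag == "tr" and self._cells is not None:
            if len(self._cells) == 2:
                # Collapse whitespace, then normalize the label's case and colon.
                label, value = (" ".join(c.split()) for c in self._cells)
                self.rows.append((label.rstrip(":").lower(), value))
            self._cells = None

    def handle_data(self, data):
        # Text inside nested tags such as <B> is still appended to the open cell.
        if self._in_cell and self._cells:
            self._cells[-1] += data


# Assumed layout: "Place : Publisher, date." (an assumption, not a guarantee).
IMPRINT = re.compile(r"^(?P<place>.+?)\s*:\s*(?P<publisher>.+?),\s*(?P<date>[^,]+?)\.?$")


def scrape(html):
    """Return a dict of normalized label -> value, splitting Imprint when possible."""
    parser = LabelValueRows()
    parser.feed(html)
    parser.close()
    fields = {}
    for label, value in parser.rows:
        match = IMPRINT.match(value) if label == "imprint" else None
        if match:
            fields.update(match.groupdict())
        else:
            fields[label] = value
    return fields
```

With this approach the markup differences between the two catalogs (th versus TD, the extra B tag, the capitalization of the label) do not matter, but the label vocabulary does: a scraper that only knows "Publisher" and "Pub Date" will miss "Imprint" unless that label is handled explicitly, and any value that does not match the assumed pattern is left unsplit. That fragility is why a structured MARC record would be preferable.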