Suggested improvement to COinS importer

Within a COinS span, there may be an anchor (<a>) that links to the URL of the article referenced. See for example the reference at the end of this blog post:

It's straightforward to:

  1. read the hrefs of these anchors,
  2. add them to the URL field of the added item, and
  3. add snapshots of the URLs in question to the added item.

Currently the only snapshot attached is the one of the referring page; a snapshot of the article's own URL would be more useful. I think this would be a worthwhile improvement.

Sample implementation (only attaches last URL within the span):

in doWeb:

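// find every anchor with an href inside the COinS span
var theas = doc.evaluate('.//a[@href]', span, nsResolver, XPathResult.ANY_TYPE, null);
var thea;
while ((thea = theas.iterateNext())) {
	newItem.URL = thea.href; // each match overwrites the previous one, so the last wins
}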

in completeItems, in the else clause:

if (newItems[i].URL) // skip items whose span contained no anchor
	newItems[i].attachments.push({title: '[Original article] ' + newItems[i].title, url: newItems[i].URL, mimeType: 'text/html'});
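
Since the sample above keeps only the last URL in the span, here is a minimal sketch of how all distinct hrefs might be collected instead. The names collectHrefs and fakeResult are hypothetical and not part of the existing translator; fakeResult only stands in for a real XPathResult so the snippet can run outside a browser.

```javascript
// Hypothetical helper: gathers every distinct href from an XPath result
// instead of keeping only the last one.
function collectHrefs(xpathResult) {
  var urls = [];
  var anchor;
  while ((anchor = xpathResult.iterateNext())) {
    // skip anchors without an href and hrefs already seen
    if (anchor.href && urls.indexOf(anchor.href) === -1) {
      urls.push(anchor.href);
    }
  }
  return urls;
}

// Assumption: a stand-in for XPathResult, exposing only iterateNext().
function fakeResult(nodes) {
  var i = 0;
  return {
    iterateNext: function () {
      return i < nodes.length ? nodes[i++] : null;
    }
  };
}

var urls = collectHrefs(fakeResult([
  { href: 'http://example.org/a' },
  { href: 'http://example.org/b' },
  { href: 'http://example.org/a' }
]));
console.log(urls);
```

In the translator, the result of doc.evaluate would be passed in place of fakeResult(...), and each collected URL could then get its own attachment.
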
  • Most COinS do not feature a URL inside the span. LibX & others will replace the entire span. As such, the span is usually empty (or effectively empty, with only a non-breaking space) OR has elements that will appear when no COinS tool processes the page (and I'd imagine that some will therefore have a link to a COinS processor).
  • Well, how about fetching the linked webpage when there's a DOI in the COinS info? That avoids the issue of finding out whether an <a> tag is relevant or not; the URL associated with the DOI is always relevant.
  • I definitely agree with the spirit of patrick's suggestion - having the origin page as a snapshot is a bit silly for increasingly common COinS bibliographies.
    DOI sounds like a good idea to me. Any objections? One possible problem: the COinS (assuming it describes the article the user is actually looking at) could come from a non-gated version of the article, while the DOI resolves to a gated one. I'm not sure how likely that is.
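
To make the DOI idea concrete, here is a rough sketch of how a resolver URL might be derived from the COinS data. It assumes the DOI is carried as an rft_id=info:doi/... pair in the span's title attribute and that https://doi.org/ is used as the resolver; the function name doiUrlFromCoins is hypothetical and not part of the existing translator.

```javascript
// Hypothetical helper: returns a resolver URL for the first DOI found in a
// COinS title string, or null when none is present.
function doiUrlFromCoins(coinsTitle) {
  // the title attribute may still contain HTML-escaped ampersands
  var pairs = coinsTitle.replace(/&amp;/g, '&').split('&');
  for (var i = 0; i < pairs.length; i++) {
    var eq = pairs[i].indexOf('=');
    if (eq === -1 || pairs[i].slice(0, eq) !== 'rft_id') continue;
    var value;
    try {
      // '+' encodes a space in key/value encoding
      value = decodeURIComponent(pairs[i].slice(eq + 1).replace(/\+/g, ' '));
    } catch (e) {
      continue; // malformed percent-encoding, ignore this pair
    }
    var match = /^info:doi\/(10\..+)$/.exec(value);
    if (match) {
      // re-encode the DOI for use in a URL path, keeping its slashes
      return 'https://doi.org/' + encodeURIComponent(match[1]).replace(/%2F/g, '/');
    }
  }
  return null;
}

var sample = 'ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal' +
  '&rft_id=info%3Adoi%2F10.1000%2F182';
console.log(doiUrlFromCoins(sample));
```

The returned URL could then be attached as the snapshot in place of the referring page. This does not address the concern about the DOI resolving to a gated copy.
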