Metadata retrieval mismatch (SIGN guidelines)
Retrieving metadata from SIGN guideline 113 (available from http://www.sign.ac.uk/guidelines/fulltext/113/index.html, direct link http://www.sign.ac.uk/pdf/sign113.pdf) returns:
"A new system for grading recommendations in evidence based guidelines", an article from the BMJ from 2001. Similarly, a test of SIGN 96 returns "Guideline development process for the Health for Kids in the South East project". Testing the other SIGN guidelines I have to hand produces similar failures.
Many SIGN guidelines have a few pages of cover, title and a page explaining evidence grading before any identifier. For example, SIGN 113 has the ISBN on page 4, in the format "978 1 905813 54 4". I'm not sure how the metadata retrieval system works, but a few other threads about it suggest that it looks at the first couple of pages of the PDF for identifiers. Is that broadly correct?
(As it stands, it is possible to go to the WorldCat site and look up the ISBN so there are ways around it. WorldCat finds it fine without spaces (http://www.worldcat.org/title/diagnosis-and-pharmacological-management-of-parkinsons-disease-a-national-clinical-guideline/oclc/614591030&referer=brief_results), but not with spaces.)
Cheers
bertieb
PS: Forgive a silly question, but... I know nothing about writing translators, but if I were interested in writing one for the SIGN Guidelines website, is it possible to look up WorldCat entries and get metadata from them based on a detected ISBN?
As for translators - yes, I think you could write a translator that gets the ISBN for a page and then uses WorldCat to query that - but it's a bit more advanced as translators go. Unfortunately, the site is otherwise so unstructured that I don't see any reasonable alternative.
Advanced, you say? Well, where there's a will, there's a way! Sadly, the SIGN website is surprisingly unstructured for the level of organisation they otherwise have and the work they do. They may be amenable to change though.
http://www.zotero.org/support/dev/exposing_metadata
As for the translator - if you get to this, check the Institute of Physics translator which does the same for DOIs.
(And yes, if you look through a document, DOIs are easier to identify because they all start with "10.")
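To make the ISBN case concrete, here is a minimal Python sketch of how a spaced ISBN-13 such as the "978 1 905813 54 4" from SIGN 113 could still be picked out of page text. It is only an illustration, not how Zotero's metadata retrieval or any translator actually works: the function names are hypothetical, and it assumes the text has already been extracted from the PDF. The idea is to match a 978/979 prefix followed by ten more digits with optional spaces or hyphens, then use the ISBN-13 check digit to weed out false positives.

```python
import re

# Candidate ISBN-13: "978" or "979" followed by ten more digits,
# each optionally preceded by a single space or hyphen.
ISBN13_PATTERN = re.compile(r"\b97[89](?:[ -]?\d){10}\b")


def is_valid_isbn13(digits):
    """Check the ISBN-13 check digit (weights alternate 1, 3, 1, 3, ...)."""
    if len(digits) != 13 or not digits.isdigit():
        return False
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0


def find_isbn13s(text):
    """Return valid ISBN-13s found in text, with separators stripped."""
    found = []
    for match in ISBN13_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if is_valid_isbn13(digits) and digits not in found:
            found.append(digits)
    return found


if __name__ == "__main__":
    # Hypothetical snippet of text from page 4 of a guideline PDF.
    print(find_isbn13s("ISBN 978 1 905813 54 4 First published 2010"))
```

The result is the space-free form, which is the form WorldCat accepted in the lookup described earlier in the thread.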
I've had a look at the Institute of Physics translator; it seems they call CrossRef, which looks like it can handle ISBNs, which will hopefully make things a bit more straightforward! Now that I have my head around XPath (somewhat) I'll see if I can do something similar.
On the other hand, their site is so surprisingly heterogeneous that anything I hack up will be rather fragile, so getting them to do the work on their end might be the way to go.