Scaffold 2.0 fails to run examples from Crymble ch11?
summary: I get null results when executing the examples from http://niche-canada.org/member-projects/zotero-guide/chapter11.html with Scaffold 2.0, though the examples from previous pages work. How can I fix or debug this, and where should I report it if it is a bug?
details:
I've been using Zotero for over a year, but have not previously documented or developed for it. I'm currently using Zotero 2.0.3 on Firefox 3.5.9 on Ubuntu 9.10. I recently tried to pull in an article from a journal for which there is no translator. After some searching, I found Adam Crymble's "How to Write a Zotero Translator"
http://niche-canada.org/member-projects/zotero-guide/chapter1.html
(aka HWZT) to be generally acclaimed as the best guide to writing a simple screenscraping translator ... except that it hasn't been maintained, and the tools it references are either downlevel (Scaffold 1.0) or orphaned (Solvent 2.0). So I've started a wiki page
http://www.zotero.org/support/dev/how_to_write_a_zotero_translator_plusplus
(aka HWZT++) that aspires to update and wikify HWZT. For the moment, it is merely a list of deltas to HWZT, organized by HWZT page: crude, but quick and avoids rights issues. The main differences are
* Scaffold 1.0 -> Scaffold 2.0
* Solvent -> DOM Inspector + XPather
* saving the first sample page locally, and working on the local copy
(see HWZT++ for justifications). This has worked well for the first 10 HWZT chapters/webpages: i.e. for that material I can
* browse the local sample page with DOM Inspector + XPather
* execute HWZT's Javascript examples in Scaffold 2.0
However, execution of the examples in
http://niche-canada.org/member-projects/zotero-guide/chapter11.html
gives null output. I.e. if I
0 in uplevel Firefox, install add-ons
* Scaffold 2.0 from
http://bitbucket.org/rmzelle/scaffold/downloads
* latest DOM Inspector and XPather from
https://addons.mozilla.org/en-US/firefox/
and restart.
1 save http://niche-canada.org/member-projects/zotero-guide/sample1.html to /tmp/HWZTsamples/sample1.html
2 open file:///tmp/HWZTsamples/sample1.html in FF
3 open Scaffold 2.0 (Tools>Scaffold) on file:///tmp/HWZTsamples/sample1.html
4 in tab=Metadata, set
Label=foo
Creator=bar
Target=file:///tmp/HWZTsamples/
and hit button="Test Regex", I get result=
> 18:06:54 ===>true<===(boolean)
5 switch to tab=Code and enter in the input text/frame (on the left) this code (goto
http://niche-canada.org/member-projects/zotero-guide/chapter11.html
and search on text="Example 11.5", and combine that code with Example 11.6)
// start code
function detectWeb(doc, url) {
    var namespace = doc.documentElement.namespaceURI;
    var nsResolver = namespace ? function(prefix) {
        if (prefix == "x" ) return namespace; else return null;
    } : null;
    var myXPath = '//td[1]';
    var myXPathObject = doc.evaluate(myXPath, doc, nsResolver, XPathResult.ANY_TYPE, null);
}
Zotero.debug(myXPathObject);
// stop code
then hit icon=thunderbolt
Expected result: text in the output text/frame (on the right) indicating
http://niche-canada.org/member-projects/zotero-guide/chapter11.html
> myXPathObject is now equivalent to a Simple Variable holding, in this case, "Title: "
Observed result: nothing.
What must I do to make this work? Alternatively,
* should I be discussing/reporting this in another venue?
* if a bug, what should I do to report it?
There is a discussion here of a new helper framework for building translators, built by Erik Hetzner:
http://groups.google.com/group/zotero-dev/browse_thread/thread/2da920ae70b2ddf7
When the framework is incorporated into Zotero, a streamlined syntax will be available for extracting content using XPath expressions. Certainly one to watch.
http://niche-canada.org/member-projects/zotero-guide/sample1.html
It looks as though something is awry with the content of the page?
I've fixed the 3 sample pages that had been corrupted.
I hope that fixes your problem with chapter 11.
Adam Crymble
> I've fixed the 3 sample pages that had been corrupted.
Thanks, I've removed the sample-page-related workarounds from HWZT++.
> I hope that fixes your problem with chapter 11.
Unfortunately it does not. The use case is now
0 in uplevel Firefox, install add-ons
* Scaffold 2.0 from
http://bitbucket.org/rmzelle/scaffold/downloads
* latest DOM Inspector and XPather from
https://addons.mozilla.org/en-US/firefox/
and restart.
1 Open http://niche-canada.org/member-projects/zotero-guide/sample1.html in FF
2 Open Scaffold 2.0 (Tools>Scaffold) with that sample page open and focused.
3 in tab=Metadata, set
Label=foo
Creator=bar
Target=http://niche-canada.org/member-projects/zotero-guide/
and hit button="Test Regex". Expected and actual results similar to
> 18:06:54 ===>true<===(boolean)
4 Switch to tab=Code and enter in the input text/frame (on the left) this code from Example 11.5 with a bit appended from Example 11.6 in http://niche-canada.org/member-projects/zotero-guide/chapter11.html
// start code
function detectWeb(doc, url) {
    var namespace = doc.documentElement.namespaceURI;
    var nsResolver = namespace ? function(prefix) {
        if (prefix == "x" ) return namespace; else return null;
    } : null;
    var myXPath = '//td[1]';
    var myXPathObject = doc.evaluate(myXPath, doc, nsResolver, XPathResult.ANY_TYPE, null);
}
Zotero.debug(myXPathObject);
// stop code
then hit icon="Run doWeb" (the stylized thunderbolt). Expected result: text in the output text/frame (on the right) similar to
http://niche-canada.org/member-projects/zotero-guide/chapter11.html
> myXPathObject is now equivalent to a Simple Variable holding, in this case, "Title: "
Observed result: nothing.
Note that hitting icon="Run detectWeb" (the eye next to the thunderbolt) also produces no output.
http://bitbucket.org/rmzelle/scaffold/issue/8/exemplar-from-hwzt-ch11-produces-no-output#comment-207513
The code to use is
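function detectWeb(doc, url) {
    // resolve the "x" prefix to the page's namespace, if it has one
    var namespace = doc.documentElement.namespaceURI;
    var nsResolver = namespace ? function(prefix) {
        if (prefix == "x" ) return namespace; else return null;
    } : null;
    var myXPath = '//td[1]';
    // take the first matching node and read its text
    var myXPathObject =
        doc.evaluate(myXPath, doc, nsResolver, XPathResult.ANY_TYPE, null).iterateNext().textContent;
    // the debug call sits inside the function, where myXPathObject is in scope
    Zotero.debug(myXPathObject);
}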
which on clicking the icon="Run detectWeb" (eye) produces output like
11:51:57 Title:
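For anyone landing here later, my reading of why the corrected version prints something while the earlier one did not: doc.evaluate() returns an XPathResult rather than a string, so the text has to be pulled out with iterateNext().textContent, and the Zotero.debug() call has to sit inside detectWeb, because a var declared inside a function is not visible outside it. Below is a minimal sketch of that pattern wrapped in helpers. The names makeNsResolver, firstTextContent and allTextContent are hypothetical (they are not part of Zotero, Scaffold or HWZT), and the ANY_TYPE fallback to 0 is only an assumption to let the snippet load outside a browser.

```javascript
// Hypothetical helpers, not part of Zotero or Scaffold.
// XPathResult.ANY_TYPE is 0 in the DOM XPath spec; the fallback lets the
// snippet load where XPathResult is not defined (e.g. outside a browser).
var ANY_TYPE = (typeof XPathResult !== "undefined") ? XPathResult.ANY_TYPE : 0;

// Same resolver as in the HWZT examples: map prefix "x" to the document's
// namespace, if it has one; otherwise no resolver is needed.
function makeNsResolver(doc) {
    var namespace = doc.documentElement.namespaceURI;
    return namespace ? function(prefix) {
        return (prefix == "x") ? namespace : null;
    } : null;
}

// Text of the first node matching xpath, or null if nothing matches.
function firstTextContent(doc, xpath) {
    var result = doc.evaluate(xpath, doc, makeNsResolver(doc), ANY_TYPE, null);
    var node = result.iterateNext();
    return node ? node.textContent : null;
}

// Text of every matching node, in the order the iterator yields them.
function allTextContent(doc, xpath) {
    var result = doc.evaluate(xpath, doc, makeNsResolver(doc), ANY_TYPE, null);
    var texts = [];
    var node;
    while ((node = result.iterateNext())) {
        texts.push(node.textContent);
    }
    return texts;
}
```

Inside detectWeb this would be called as Zotero.debug(firstTextContent(doc, '//td[1]')); which, if I understand the fix correctly, should give the same "Title: " output as above. The null check also avoids an exception when the XPath matches nothing.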