FW.Scraper() major issue
How can I use document.evaluate in FW.Scraper() function?
var tVar = "title";
FW.Scraper({
title: FW.Xpath('...').text().trim(), // works
title: FW.Xpath('.../text()').trim(), // Why does this not work?
title: FW.Xpath('.../text()').text().trim(), // Why does this not work?
title: "The title", // works
title: tVar, // Why does this not work?
title: doc.evaluate(...), // Why does this not work?
});
Technical questions should probably go to the zotero-dev list.
By using Framework, you are creating a set of rules that will be executed later. But you cannot use variables that may only become available later to define the rules now (if that makes sense).
EDIT: If you care to understand the technical details behind Framework, you can take a look at the code (http://e6h.org/~egh/hg/zotero-transfw/raw-file/tip/framework.js). I'm not sure whether it's up to date with the Framework code currently used in Zotero, but the general idea is the same. And if you feel comfortable reading that code, it would probably be easier for you to write a translator without using Framework at all, as that gives you much more flexibility.
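To make the "rules defined now, executed later" point concrete, here is a minimal sketch in plain JavaScript. It is not the Framework code: `makeRule` and `makeScraper` are hypothetical names invented for this illustration, and the "pages" are plain objects rather than real documents.

```javascript
// Hypothetical stand-in for a Framework-style rule: it stores *how* to get
// a value and only runs when a document is supplied later.
function makeRule(extract) {
  return { evaluate: (doc) => extract(doc) };
}

// Hypothetical scraper: plain values are kept as-is, rules are run per page.
function makeScraper(fields) {
  return function scrape(doc) {
    const item = {};
    for (const [name, value] of Object.entries(fields)) {
      const isRule = value !== null && typeof value === "object" &&
        typeof value.evaluate === "function";
      item[name] = isRule ? value.evaluate(doc) : value;
    }
    return item;
  };
}

// Definition time: no page exists yet. Writing `doc.title` directly here
// would throw a ReferenceError, because no `doc` is in scope at this point.
const scrape = makeScraper({
  title: makeRule((doc) => doc.title.trim()), // deferred until a page exists
  source: "The title",                        // constant, fixed right now
});

// Later: the same rules are run against each page (fake pages used here).
const first = scrape({ title: "  First page  " });
const second = scrape({ title: "Second page" });
console.log(first.title, "|", second.title);
```

The only thing this is meant to show is the timing: anything that needs the page has to be wrapped in something that is evaluated later, which is the role `FW.Xpath(...)` plays in a Framework translator.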
date: FW.Xpath('/html/body//table[2]/tbody/tr[1]/td/text()').text(),
This returns an empty string for me. Do you get a non-empty result?
Assuming you are targeting a similar td element:
<td>
<span>A</span>
B
</td>
This contains two text nodes. The first is the whitespace-only (effectively empty) text between <td> and <span>. The second is "B".
FW.Xpath.text() only selects the first node matched by the xpath. (Technical note: I would have expected this behavior to match Zotero.Utilities.xpathText, which concatenates the nodes, but I don't use Framework that much.) In your case, you want to select the second node, so:
date: FW.Xpath('/html/body//table[2]/tbody/tr[1]/td/text()[2]').text()
Keep in mind that there is a text node (empty or not) before and after each tag.
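The difference between "first matched node only" and "concatenate all matched nodes" can be seen in a small model. This is only a sketch: the td is represented as a plain array of child objects (an assumed shape, not a real DOM), and `childTextNodes` and `nth` are hypothetical helpers that mimic the XPath step text() and a 1-based position predicate. The trimming is an assumption added to make the whitespace-only node show up as an empty string.

```javascript
// Plain-object model of the <td> above, written across several lines.
const tdChildren = [
  { type: "text", value: "\n" },                  // whitespace between <td> and <span>
  { type: "element", name: "span", value: "A" },
  { type: "text", value: "\nB\n" },               // the text after </span>
];

// Mimics the XPath step text(): direct text-node children only.
function childTextNodes(children) {
  return children.filter((n) => n.type === "text").map((n) => n.value);
}

// Mimics a 1-based XPath position predicate such as text()[2].
function nth(nodes, position) {
  return nodes[position - 1];
}

const nodes = childTextNodes(tdChildren);
const firstOnly = nth(nodes, 1).trim();      // first match only: ""
const second = nth(nodes, 2).trim();         // text()[2]: "B"
const concatenated = nodes.join("").trim();  // all matches joined: "B"
console.log(JSON.stringify([firstOnly, second, concatenated])); // ["","B","B"]
```

Note that "A" never appears in any of the three results, because it belongs to the span, not to the td's own text nodes.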
Also note that Framework is not bundled with Zotero per se; a minified version of it is embedded in each translator. The code is hidden by Scaffold. The version is therefore generally whatever is in the version of Scaffold used to develop the translator, which is usually up to date.
http://sid.ir/en/ViewPaper.asp?ID=247419&varStr=1;TABIBIAN%20MANOUCHEHR,SHOLEH%20MAHSA;ARMANSHAHR;SPRING-SUMMER%202010;3;4;1;16
I want to select the text "SPRING-SUMMER 2010; 3(4):1-16. " using the translator framework. Using the FirePath plugin, you can see that the pure XPath expression "/html/body/div/div[3]/table[2]/tbody/tr/td/text()" yields the desired result, but this XPath expression does not work in Framework. I tested the following without success:
articleTitle=doc.evaluate("/html/body/div/div[3]/table[2]/tbody/tr/td/font",doc,null, XPathResult.STRING_TYPE ,null).stringValue;
FW.Scraper({
...
date : FW.Xpath('/html/body//table[2]/tbody/tr[1]/td').text().remove(/\n/g).remove(/.*?\)/).remove(/;.*/),
date : FW.Xpath('/html/body//table[2]/tbody/tr[1]/td').text(),
date : FW.Xpath('/html/body//table[2]/tbody/tr[1]/td').text().remove(RegExp(articleTitle)).text(),
date : FW.Xpath('/html/body//table[2]/tbody/tr[1]/td').text().remove(RegExp(articleTitle)),
date: FW.Xpath("/html/body//table[2]/tbody/tr[1]/td/text()[2]").text()
date: FW.Xpath("/html/body//table[2]/tbody/tr[1]/td/text()").text()
...
});
and a lot more.
Can anybody write code that selects just "SPRING-SUMMER 2010; 3(4):1-16. " on that page?
<td>[empty text node]
<input ... />
[empty text node]
<b>...</b>
[empty text node]
<font color="#118811">ARMANSHAHR</font>
SPRING-SUMMER 2010; 3(4):1-16.
</td>
I added the [empty text node] parts to illustrate a point. As you can see, the text node you are trying to select is 4th on that list.
This works:
date: FW.Xpath("/html/body//table[2]/tbody/tr[1]/td/text()[4]").text()
Sometimes it is much more convenient to count from the end though. The text node is the last node in that block, so you can use "last()" instead of the "4". For the 3rd node, you could use "last()-1", etc.
But I would encourage you to use xpaths that depend less on the document structure. Instead of traversing down the entire HTML tree, I would suggest:
FW.Xpath("//table[@id='Table2']/tbody/tr[1]/td/text()[last()]").text()
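The position predicates used above ([4], last(), last()-1) can be checked against a simple model of that td. This is a sketch under assumptions: the four text nodes are written as plain strings, with whitespace-only strings standing in for the "[empty text node]" entries, and `byPosition` and `fromLast` are hypothetical helpers, not part of Framework or XPath.

```javascript
// Text-node children of the <td> from the page, in document order.
const tdTextNodes = [
  "\n",                                   // before <input>
  "\n",                                   // between <input> and <b>
  "\n",                                   // between <b> and <font>
  "\nSPRING-SUMMER 2010; 3(4):1-16.\n",   // after </font>
];

// Mimics text()[n]: XPath positions start at 1, not 0.
function byPosition(nodes, n) {
  return nodes[n - 1];
}

// Mimics text()[last()-k]: k = 0 is the last node, k = 1 the one before it.
function fromLast(nodes, k) {
  return nodes[nodes.length - 1 - k];
}

const fourth = byPosition(tdTextNodes, 4).trim();
const last = fromLast(tdTextNodes, 0).trim();
const lastMinusOne = fromLast(tdTextNodes, 1).trim();
console.log(fourth === last, JSON.stringify(lastMinusOne));
```

Counting from the end keeps working if the page gains another element earlier in the cell, which is why last() tends to be less brittle than a fixed index here.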
Also, pay attention to the number of nodes your xpaths select. In FirePath, you can see the number of nodes selected on the bottom bar. The xpath you posted that "works in FirePath" selects 12 nodes on the entire page.
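If FirePath is not at hand, the number of matches can also be checked with the standard document.evaluate call by wrapping the expression in the XPath count() function. A rough sketch, where `countMatches` is a hypothetical helper name and the example expression in the last lines is only a placeholder:

```javascript
// Numeric value of XPathResult.NUMBER_TYPE, written out so that the function
// can also be loaded where the XPathResult global is not defined.
const NUMBER_TYPE = 1;

// Returns how many nodes `xpath` selects in `doc`.
// `doc` is expected to provide the standard DOM evaluate() method.
function countMatches(doc, xpath) {
  const result = doc.evaluate("count(" + xpath + ")", doc, null, NUMBER_TYPE, null);
  return result.numberValue;
}

// Only runs where a real page is available (e.g. the browser console).
if (typeof document !== "undefined") {
  console.log(countMatches(document, "//td/text()"));
}
```

A count greater than 1 is a hint that the expression needs a position predicate or a more specific anchor before it is passed to FW.Xpath.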