ScienceDirect not downloading PDF for some pages
I tried saving this article:
http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6TCY-4V03JYS-3&_user=7300267&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000032378&_version=1&_urlVersion=0&_userid=7300267&md5=81aca4e98c3d0fbabd2a9cee71b9be43
And it didn't download the PDF as requested. When I looked at the translator code, I noticed that there are essentially two different translators and an "if" statement at the top directs it to one or the other. The first translator seems to do all the work necessary, and the second one is some kind of backup bare-bones version.
The article in question was directed to the second translator. When I simply replaced this if statement with "if(1)", it used the first translator and it worked just fine, downloaded the PDF, no errors, etc. Not sure what the if statement or the second set of code was intended for (it looks for an "Advertisement" div), but apparently you are keeping the translator from doing its job.
I think checking for the export button (the one that exports a RIS file) is a better way to tell whether the user is authorized or not.
The xpath is \\//*[contains(@src, "exportarticle_a.gif")]\\.
To download the full-text PDF, the xpath \\//div[img[contains(@src, "icon_pdf.gif")]]\\ should be used. If the textContent is "Purchase PDF", it means you can't download the PDF file; if there is no "Purchase" in the textContent, you can get the PDF file even if you are a non-authorized user.
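In case it helps, here is a rough sketch of how the two checks above could be written. It is not the actual translator code: the helper names (firstNode, isAuthorizedUser, canDownloadPdf) are made up for illustration, and it assumes "doc" is a DOM document that supports the standard document.evaluate call. Only the two xpaths come from this thread.

```javascript
// XPaths quoted from this thread.
const EXPORT_BUTTON_XPATH = '//*[contains(@src, "exportarticle_a.gif")]';
const PDF_DIV_XPATH = '//div[img[contains(@src, "icon_pdf.gif")]]';

// Numeric value of XPathResult.FIRST_ORDERED_NODE_TYPE, spelled out so the
// sketch also loads where the XPathResult global is not defined.
const FIRST_ORDERED_NODE_TYPE = 9;

// Hypothetical helper: return the first node matching `xpath`, or null.
function firstNode(doc, xpath) {
  const result = doc.evaluate(xpath, doc, null, FIRST_ORDERED_NODE_TYPE, null);
  return result.singleNodeValue;
}

// Hypothetical helper: treat the presence of the export button as the
// sign of an authorized user, as suggested above.
function isAuthorizedUser(doc) {
  return firstNode(doc, EXPORT_BUTTON_XPATH) !== null;
}

// Hypothetical helper: the PDF counts as downloadable when the PDF div
// exists and its text does not mention "Purchase".
function canDownloadPdf(doc) {
  const pdfDiv = firstNode(doc, PDF_DIV_XPATH);
  if (pdfDiv === null) {
    return false;
  }
  return !pdfDiv.textContent.includes("Purchase");
}
```

The two checks are kept separate on purpose, since (as described above) a non-authorized user may still be able to get the PDF.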
Thanks for the xpath checking for authorized access. Looks like sciencedirect has changed their advertising policy :)
Regarding the PDF xpaths, can you provide an example link of an authorized user not having access to the PDF (i.e., having a "Purchase PDF" textContent) and an example of a non-authorized user having access to the PDF? As far as I have been able to test it, the current PDF download xpaths work fine (using the new authorization xpath).
Anyway, thanks very much for fixing this!