ScienceDirect not downloading PDF for some pages

kajeling · January 18, 2009

I tried saving this article:

http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6TCY-4V03JYS-3&_user=7300267&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000032378&_version=1&_urlVersion=0&_userid=7300267&md5=81aca4e98c3d0fbabd2a9cee71b9be43

And it didn't download the PDF as requested. When I looked at the translator code, I noticed that there are essentially two different translators and an "if" statement at the top directs it to one or the other. The first translator seems to do all the work necessary, and the second one is some kind of backup bare-bones version.

The article in question was directed to the second translator. When I simply replaced this if statement with "if(1)", it used the first translator and it worked just fine, downloaded the PDF, no errors, etc. Not sure what the if statement or the second set of code was intended for (it looks for an "Advertisement" div), but apparently you are keeping the translator from doing its job.

santawort · January 18, 2009

If you have no right to download full-text PDFs from this site, you can't use the "export to" function because the button doesn't appear, so Zotero can't get the RIS file for that article. Instead, Zotero falls through to the second branch of the if statement and gets the metadata from the web page. Non-authorized users usually see an "Advertisement", so Zotero uses it as the test for whether a user is authorized. In practice, though, many authorized users also see this "Advertisement".

I think checking for the export button is a better way to tell whether a user is authorized (and can therefore export the RIS file).
The XPath is: //*[contains(@src, "exportarticle_a.gif")]

To download the full-text PDF, use this XPath: //div[img[contains(@src, "icon_pdf.gif")]]
If the textContent is "Purchase PDF", you can't download the PDF file; if there is no "Purchase" in the textContent, you can get the PDF even as a non-authorized user.
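
A rough sketch of how the two checks described above might look in JavaScript. The function names (firstNode, canExportRis, canDownloadPdf) are made up for illustration and are not taken from the actual ScienceDirect translator; the sketch also assumes the page is available as a DOM document that supports the standard evaluate() method.

```javascript
// Hypothetical helpers: names are illustrative only, not the translator's real code.

// XPath for the export button, used here as the "authorized user" test.
const EXPORT_BUTTON_XPATH = '//*[contains(@src, "exportarticle_a.gif")]';
// XPath for the div that wraps the PDF icon.
const PDF_BLOCK_XPATH = '//div[img[contains(@src, "icon_pdf.gif")]]';

// Numeric value of XPathResult.FIRST_ORDERED_NODE_TYPE in the DOM spec.
const FIRST_ORDERED_NODE_TYPE = 9;

// Return the first node matching the XPath, or null if nothing matches.
function firstNode(doc, xpath) {
  const result = doc.evaluate(xpath, doc, null, FIRST_ORDERED_NODE_TYPE, null);
  return result.singleNodeValue;
}

// The export button is present, so the RIS export should be usable.
function canExportRis(doc) {
  return firstNode(doc, EXPORT_BUTTON_XPATH) !== null;
}

// The PDF block exists and its text does not mention "Purchase".
function canDownloadPdf(doc) {
  const pdfBlock = firstNode(doc, PDF_BLOCK_XPATH);
  if (pdfBlock === null) {
    return false;
  }
  return !pdfBlock.textContent.includes("Purchase");
}
```

With helpers like these, the "if" at the top of the translator could branch on canExportRis(doc) instead of looking for the "Advertisement" div, and only attach the PDF when canDownloadPdf(doc) is true.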

kajeling · May 19, 2009

Just thought I would bump this up again since it hasn't been resolved and ScienceDirect is a large enough site to be worth fixing. I have it working on my machine just fine with the "if(1)" hack, but I can't get the suggested XPath solution to work (at least not without something like Scaffold for Z2.0). I assume most users whose universities connect to ScienceDirect the way mine does are having the same problem with automatically downloading the PDF, and it would be nice to make a fix available to them.

dstillman · May 28, 2009

I've created a ticket to test out santawort's suggestion.

mcburton · May 31, 2009

santawort,
Thanks for the XPath that checks for authorized access. Looks like ScienceDirect has changed their advertising policy :)
Regarding the PDF XPaths, can you provide an example link where an authorized user does not have access to the PDF (i.e., sees "Purchase PDF" as the textContent) and an example where a non-authorized user does have access to the PDF? As far as I have been able to test, the current PDF download XPaths work fine (using the new authorization XPath).

kajeling · June 4, 2009

Thanks so much, mcburton! Since the most recent translator update (presumably implementing santawort's suggestion from a while back), I haven't encountered this problem. I think my university's subscription package is pretty deep, so all I'm finding in my quick search are either articles with PDFs I can download or articles that don't provide a PDF at all.

Anyway, thanks very much for fixing this!