ScienceDirect not scraping [any more?]

Andrew Moylan · January 6, 2008

I'm pretty sure I used to be able to import articles from this site before:

http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6TYH-3Y2N9G4-1M&_user=10&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=8e68e4c1ffddc8454258e32f9b1811b2

Has something changed?

amc · January 7, 2008

I've never been able to import from ScienceDirect. Searching for the issue in these forums seems to indicate that you have to be subscribed to get the citation (although that doesn't seem to be confirmed). I'm not sure how that works. In any case, all the information is there on the screen, so I suppose a scraper could be written to parse it...

sean · January 7, 2008

Andrew and amc, are you subscribed and logged in users of ScienceDirect (e.g. via an institutional subscription)? If not, Zotero will not import from it. If you are logged in, Zotero works fine.

Andrew Moylan · January 7, 2008

Hi Sean,

I should clarify:

1. When I'm not on campus, I use a reverse proxy to access sites to which my university has a subscription: I replace sciencedirect.com in the URL with sciencedirect.com.virtual.anu.edu.au. When I do this, Zotero offers to scrape the article (the little icon appears in the address bar), but the scrape fails with the usual error message directing me to the Known Translator Issues page.

2. I assumed that Zotero could scrape the freely available information from ScienceDirect even if I'm not logged in.

Cheers,

Andrew

michaeltt · January 19, 2008

If the information is available to be seen by Guest users, how is it that the information cannot be scraped?

What difference does it make to be logged in?

Thanks

bioman666 · February 5, 2008

Yeah, I second that: ScienceDirect is poorly supported... that's a pity, because it's one of the biggest science libraries.

I could scrape info when logged in, but today I came back, logged in, and couldn't scrape anything...

...does anyone have an idea?

mikowitz · February 6, 2008

The difference being logged in makes is that it gives Zotero access to a more detailed set of metadata about an article, as well as allowing a user to download a full-text PDF version of the article.

We've just pushed a revised version of the translator that still only works for logged-in users, but should behave better. We've got a ticket up and in progress to support ScienceDirect for non-logged-in users, and I'll post here again when that's up and working.

--Michael

Tulapi · February 19, 2008

I second this feature. I often want to save the bibliographic information of papers even when I don't have access on ScienceDirect (I generally request the PDF from the author).
It would save me a lot of time.

wbl2745 · February 19, 2008

I'm having inconsistent results with ScienceDirect. This URL http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6WC7-4GP25JD-3&_user=456938&_coverDate=12%2F31%2F1984&_rdoc=3&_fmt=summary&_orig=browse&_srch=doc-info(%23toc%236731%231984%23999539995%23602453%23FLP%23display%23Volume)&_cdi=6731&_sort=d&_docanchor=&_ct=16&_acct=C000021830&_version=1&_urlVersion=0&_userid=456938&md5=5f524c4bce74210d329ee3383c9806b8
failed.

mikowitz · February 19, 2008

A guest access scraper for users who cannot log in to ScienceDirect is now available.

You can update your translators manually on the Preferences pane in Zotero, or they will update automatically within the next 24 hours.

Tulapi · February 20, 2008

It doesn't work at the moment.
Zotero's icon appears, but the box at the bottom right doesn't say anything about saving (it's empty), and no citation information is saved.

mikowitz · February 20, 2008

Tulapi-

Are you logged in to ScienceDirect or using the guest access scraper? Also, could you please provide an example of a URL which causes the result you reported above? That'll help me try to figure out what's going wrong.

Thanks,
Michael

Tulapi · February 20, 2008

For example:

http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6WH8-4RKVD37-1&_user=10&_coverDate=01%2F16%2F2008&_rdoc=15&_fmt=summary&_orig=browse&_srch=doc-info(%23toc%236844%239999%23999999999%2399999%23FLA%23display%23Articles)&_cdi=6844&_sort=d&_docanchor=&view=f&_ct=62&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=2d3b79dd23370a6cbb07994541bb59b9

At the top of the page it says: "You have Guest access to ScienceDirect"

Martin Nicolaus · February 21, 2008

Ditto on not being able to import from ScienceDirect with guest access. Tried it several times this evening, Feb. 21 '08, even after updating translators, and every time, Zotero hangs (endless little ringworm in Vista). Thank you for this otherwise wonderful tool, and the price is right!

Tulapi · February 25, 2008

Congratulations, it seems to work now!
Thank you very much to the Zotero team.

Tulapi · February 25, 2008

Just a problem with that one

http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6WJS-4RTTKNX-3&_user=10&_coverDate=02%2F13%2F2008&_rdoc=4&_fmt=summary&_orig=browse&_srch=doc-info(%23toc%236886%239999%23999999999%2399999%23FLA%23display%23Articles)&_cdi=6886&_sort=d&_docanchor=&view=f&_ct=22&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=256b15c7f1cb1b15a2137e86277a3c91

with no abstract

MSchneider · March 17, 2008

I'd like to continue this thread since I am facing problems similar to Martin Nicolaus's. I get the scrape icon, but when I click on it only an empty message box appears and Zotero hangs.
I have no subscription to the journal I am testing, and logging into my registered account (also without a subscription) does not help. I am using a proxy at my institution; could this be the problem? Is there a way to get information on what is happening halfway between ScienceDirect and Zotero? Otherwise it is a fantastic tool, and this would just be a nice-to-have.

marisabrandt · March 21, 2008

I am having the same problem as MSchneider (above), though I am logged in and able to download PDFs. However, when I click the scrape icon, Zotero hangs and does not import anything.

(here is one such URL: http://www.sciencedirect.com.floyd.lib.umn.edu/science?_ob=ArticleURL&_udi=B6V7Y-3Y0JN82-2&_user=616288&_coverDate=01%2F31%2F2000&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000032378&_version=1&_urlVersion=0&_userid=616288&md5=f41004b801b3059f44d99f12cac9d8f5 3.20.2008 - not sure how useful this is, as it is uni-login specific, but maybe it will help to track the issue down)

kajeling · March 23, 2008

The problem is that the translator ignores the university proxy when trying to access the citation. That is, when it should be accessing sciencedirect.com.yourproxystuff/science?_ob..., it's actually just hardcoded to try sciencedirect.com/science?_ob...

I fixed it in my own copy, but I don't know how to make that available, so in the meantime anybody who is up to the challenge of editing their translators (with Scaffold) can fix it themselves. There is a line about 1/4 of the way down, where it posts to the citation URL:

var post = "_ob=DownloadURL.....
Zotero.Utilities.HTTP.doPost("http://www.sciencedirect.com/science", post, function(text) {

Leave the first line as it is, but replace the second line with this:

var baseurl = url.match(/(https?:\/\/[^\/]+)\//)[1];
Zotero.Utilities.HTTP.doPost(baseurl + "/science", post, function(text) {

From now on, when you're logged in via a proxy, it will include that proxy when accessing the citation. I've noticed this problem on a few other translators, so it's worth keeping an eye out for it.
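
For anyone who wants to check the idea outside Zotero, here is a minimal standalone sketch of deriving the base URL from the page address, so that a proxied host is kept. The function name proxyAwareBase and the fallback host are assumptions made for illustration; they are not part of the translator or of the Zotero API. The sketch takes the regex's capture group (scheme plus host, without the trailing slash), so that appending "/science" does not produce a double slash.

```javascript
// Hypothetical helper for illustration only; not part of the Zotero translator API.
function proxyAwareBase(pageUrl) {
  // Capture the scheme and host, stopping before the first slash of the path.
  var m = pageUrl.match(/^(https?:\/\/[^\/]+)\//);
  // Assumed fallback: use the direct ScienceDirect host if the URL is not recognised.
  return m ? m[1] : "http://www.sciencedirect.com";
}

// A direct URL and a proxied URL (the proxy suffix is the one mentioned earlier in this thread).
var direct = proxyAwareBase("http://www.sciencedirect.com/science?_ob=ArticleURL");
var proxied = proxyAwareBase("http://www.sciencedirect.com.virtual.anu.edu.au/science?_ob=ArticleURL");

console.log(direct + "/science");
console.log(proxied + "/science");
```

In the translator itself, the resulting string would take the place of the hardcoded host in the doPost call quoted above.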

mikowitz · March 23, 2008

An updated version of the ScienceDirect translator that uses kajeling's code (thank you for the suggestion) is now available. Your translators will update automatically within 24 hours, or you can update them manually in Zotero's preference pane. Please let us know if this problem persists.