EBSCO Host
After selecting all articles, the little red-bordered grey box at the bottom right appeared as before, but without the article currently being processed displaying in the box (which it DOES do when, for instance, I do the same in Google Scholar). After a while, the boxes started 'moving up' more quickly than before (although I still didn't 'see' the actual articles in them). And in the end, all 20 were nicely downloaded, with no more duplicates.
When I tried again with 50, however, things still seemed to get stuck at the first box. When I clicked on that box, it just disappeared and nothing further happened.
Let me ask 3 questions:
* SHOULD this process be sensitive to external inputs, etc.? (I noticed before, too, that I couldn't use that computer for anything else, or the download would not work.)
* Is there a way to create a log for this script, so that it records exactly what it does and what succeeds or fails?
* Could the problems be related to the fact that I am putting the items in a sub-sub-folder?
I will keep trying and will post the messages from the Firefox error console again, in the hope that we can fully 'fix' this!
-Stephan
EBSCOhost is not the fastest server to respond, so downloading 50 items may take some time and the red box might disappear. The way the translators work, the items are not displayed in the red box until we have collected all of the metadata for an item (which involves going to the item page, retrieving the RIS file, and if there is an associated PDF, going to the PDF page to retrieve the PDF link = 2-3 page downloads from EBSCO). This may take quite a while, and is further delayed by going through a proxy.
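To picture what that per-item round trip looks like in translator code, here is a rough sketch of the flow described above (the helpers getRISUrl, getPDFUrl, and extractPDFLink are hypothetical placeholders, not the actual EBSCOhost translator code):

```javascript
// Rough sketch of the per-item flow: item page -> RIS file -> PDF page.
function scrapeItem(itemUrl) {
	// 1. Load the item page to find the RIS export link and a possible PDF page
	Zotero.Utilities.processDocuments(itemUrl, function (doc) {
		var risUrl = getRISUrl(doc); // placeholder: locate the RIS export URL
		// 2. Fetch the RIS file and build the item from it
		Zotero.Utilities.doGet(risUrl, function (risText) {
			var item = new Zotero.Item("journalArticle");
			// ... populate item fields from risText ...
			var pdfPageUrl = getPDFUrl(doc); // placeholder: locate the PDF viewer page
			if (pdfPageUrl) {
				// 3. Load the PDF page just to extract the actual PDF link
				Zotero.Utilities.processDocuments(pdfPageUrl, function (pdfDoc) {
					item.attachments.push({
						url: extractPDFLink(pdfDoc), // placeholder
						title: "EBSCOhost Full Text PDF",
						mimeType: "application/pdf"
					});
					item.complete();
				});
			} else {
				item.complete();
			}
		});
	});
}
```

Only once that whole chain finishes for an item does it show up in the red progress box, which is why a slow EBSCO server or proxy makes the display lag so far behind.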
Keep in mind that when you download 50 articles, there are 50 simultaneous connections being made to the EBSCO servers. This may be a problem on some networks. On Windows (XP SP2+ and Vista up to SP2), for instance, I believe this may run into the maximum limit on half-open connections (set to 10). Your router may also be configured to not allow that many connections (that number is typically much higher than 50, but on a shared network you might be reaching it). Furthermore, the proxy might be imposing similar limits (in which case, we could consider making these connections sequentially instead of concurrently; see the sketch below). The point is that performing certain other tasks could push the number of connections even further (e.g. running a BitTorrent client).
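For illustration, a fully sequential version of the request logic could look roughly like this (a sketch only; fetchOneItem is a hypothetical placeholder for the per-item scraping code, and this is not what the translator currently does):

```javascript
// Sketch of the sequential alternative: only one item's request chain is
// in flight at a time. fetchOneItem scrapes a single item and must call
// the supplied callback when it finishes (or fails).
function fetchSequentially(itemUrls) {
	var queue = itemUrls.slice(); // copy so we can shift() safely
	(function next() {
		if (!queue.length) return; // all items processed
		fetchOneItem(queue.shift(), next);
	})();
}
```

The trade-off, of course, is that a strictly sequential import of 50 items would take noticeably longer than the current concurrent approach.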
As far as using the computer for, let's say, writing in Word: this should not have any effect on Zotero's ability to scrape pages.

I've added a bunch of debug messages that could help you determine where the items choke. The updated translator is at the same location (link in previous post).

I don't believe the sub-sub-folder would matter. Do you mean a sub-sub-collection, or are you talking about folders on your computer? Sometimes there may be a problem with saving PDFs that have very long names; this would appear in the debug log.
Edit: Corrected "Vista up to SP1" to "Vista up to SP2"
Thanks much. I am running this on a Windows 7 home PC (and am not running a BitTorrent client). I am still trying now (with the new version) with 50 items, but, like yesterday, it doesn't download anything. I'll post more details (and the error log, if I actually get any errors) later, but I was just wondering: is there any way to tell that the script is actually finished? Normally, AFTER it shows all the items it has captured on the right of the screen in separate 'stacked' boxes, it STILL takes quite a while before I can actually start using Firefox again. That suggests to me that it may still be doing something, and I'm sometimes afraid that doing something else at that point might still mess up the process. So I wondered whether there is any way of telling that the whole process has run its course...
(* the Source file is always http://global.ebsco-content.com.library3.webster.edu/interfacefiles/12.41.7.0.1/css/ehost/master_bundle.css)
*Warning: Selector expected. Ruleset ignored due to bad selector.
* Warning: Expected ',' or '{' but found '/'. Ruleset ignored due to bad selector.
Line: 1
* Unknown pseudo-class or pseudo-element 'relative'. Ruleset ignored due to bad selector.
* Warning: Unknown pseudo-class or pseudo-element 'text'. Ruleset ignored due to bad selector.
And at the end - nothing ended up in my Zotero folder (My Library/Title of Project/Title of query). I also looked in the root folder (as it sometimes saves items there too), but there was nothing there either. Any further ideas?
1) Before we do much more debugging, let's try to simplify the system a bit. I'm sorry if I missed this in your posts and you are in fact already doing this, but could you try using Zotero as a Firefox extension and accessing EBSCO via Firefox?
If that does not make a difference and you are still not able to import items:
2) Update the translator to the latest version from https://raw.github.com/aurimasv/translators/EBSCOhost/EBSCOhost.js (same link as above, but I added some more debugging code)
3) And please generate a Debug ID by following the instructions in the _first_ section (Debug Output Logging) of this page: http://www.zotero.org/support/debug_output#debug_output_logging
When you get to step 5, hit Disable and then open the log by clicking View Output. Then select the entire log and copy and paste it to a website like pastebin.com or gist.github.com.
4) If using Firefox solves the problem, let us know and we'll go on to debugging the Chrome connector itself.
Here are some more results, from records 40-60 of a search where the first two pages (0-20 and 21-40) went fine: https://gist.github.com/3772358 . Debug ID: D533709939
In both of your reports, one item never completes (it never manages to retrieve the RIS data).
In the first one (https://gist.github.com/3770531), item (35) never completes and in the second (https://gist.github.com/3772358), item (10) fails to complete.
I'm guessing that the page requests time out. I need to think about how to fix this, but it will probably require an update to the Zotero client. I'll give you an update in a day or two, but if I don't, bump this thread.
Thanks for all of your reports.
Looking at the code, it seems that this should have generated a log message (utilities_translate.js#L415 -> translate.js#L1155). Since there is no log message, (A) was there no error, (B) was the error mishandled and never made it into the log, or (C) am I not following the code correctly?
If there was no error, there should have been a follow-up log message from the EBSCOhost translator (Last seen message is EBSCOhost.js#L332 which should then be followed by EBSCOhost.js#L32 if doGet calls back)
In either case, I am thinking that we should allow translators to handle errors from xmlhttprequest. Before I try to devise a detailed proposal, I was wondering if there is already something in the works, or maybe even already in the master branch (I suppose I could try to browse through the code, but I thought I'd ask anyway). Otherwise, I have some ideas about how this could be handled in a user-friendly way.
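Purely as an illustration of what letting translators handle request errors might look like, here is one possible shape for such an API (hypothetical; the extra onError argument shown here does not exist in Zotero.Utilities.doGet today):

```javascript
// Hypothetical sketch only: the onError argument does not exist in Zotero.
// The idea is that a failed or timed-out request would be reported to the
// translator instead of the item silently hanging.
Zotero.Utilities.doGet(risUrl, // risUrl: the per-item RIS export URL from the surrounding code
	function (text) {
		// parse the RIS text as usual
	},
	function () {
		// all requests finished
	},
	/* hypothetical */ function (error, url) {
		Zotero.debug("EBSCOhost: request to " + url + " failed: " + error);
		// e.g. mark this item as failed and let the rest of the import continue
	}
);
```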
I glanced through the debug log, and what stood out to me most was:
Translate: Unknown RIS item type: Book. Defaulting to journalArticle
Obviously that's an unrelated problem, but it's rather annoying that books are being captured as articles, which throws away metadata and makes it useless.
Anyway, I guess I'll have to try capturing fewer at a time. But why does Zotero try to capture all 50 at once? Why doesn't it queue them and capture a few at a time? This would also make debugging easier.
Ok, worse than that: this attempt to capture items in Zotero caused Firefox to allocate over 2 GB of memory, and Firefox won't release it. I tried to exit Firefox, and the GUI closed, but the process remained running and still didn't release memory. I had to kill Firefox. This is repeatable. Using Firefox 16.0.1.
Bottom line: Zotero seems to be just plain unreliable. It's a shame, because it is so useful when it works. But it seems like when I need it most, it lets me down. All I can use it for is capturing bibliographic data on single items, and then I just copy and paste into my editor.
Which EBSCO database are you searching/using?
Beyond that, see aurimas above.
I understand that EBSCO isn't returning data in compliance with the RIS spec, but capitalization is a very minor thing. Why isn't Zotero more robust? I can't help but think of Jon Postel's famous axiom: "Be conservative in what you do, be liberal in what you accept from others."
I wish that all resources had metadata in standard formats embedded in their HTML. And I wish that I didn't have to use EBSCO. But I'm afraid it's my only choice.
@aurimas - any thoughts on whether to do that in RIS or in the EBSCO translator?
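To make the two options concrete, a rough sketch of what either fix might look like (hypothetical code, not the actual translator source):

```javascript
// (a) In the RIS translator: look up the TY value case-insensitively,
//     so "Book", "book" and "BOOK" all map to the same Zotero item type.
var TYPE_MAP = { "BOOK": "book", "JOUR": "journalArticle" /* , ... */ };
function resolveType(ty) {
	return TYPE_MAP[ty.trim().toUpperCase()] || "journalArticle";
}

// (b) In the EBSCOhost translator: normalize the type line before handing
//     the RIS text to the RIS import translator.
risText = risText.replace(/^TY\s+-\s+(.+)$/m,
	function (match, ty) { return "TY  - " + ty.toUpperCase(); });
```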
Could you describe your set-up in a bit more detail? I'm not quite sure I understand the library search via EBSCO.
The closer we can replicate this, the better we can troubleshoot. I don't want to raise expectations too much; EBSCO will likely remain a bit finicky, but we can probably improve.
@Dan, Simon, etc.: As far as the number of connections to the server is concerned, I think we should impose these limits using the value in network.http.max-connections-per-server. IMO this should be done in Zotero.HTTP.*

See the updated translator linked by me a couple of posts above (https://raw.github.com/aurimasv/translators/EBSCOhost/EBSCOhost.js). I'll push this version out to everyone when I get a chance to remove all the debugging code.

This sounds serious. I will try to replicate it, but I have not seen this behavior yet. Are you pressing the folder icon multiple times (for instance, do you try to import again after a failed attempt)?
The best suggestion I can give you right now is to use the translator from the link above and import ~10 items or so at a time.
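Regarding the max-connections-per-server idea above, the gist would be something like the following (a sketch of chrome-level Zotero code, not translator code; doRequest is a placeholder for whatever Zotero.HTTP call is actually used):

```javascript
// Read Firefox's own per-server connection limit and keep at most that
// many EBSCO requests in flight at once.
Components.utils.import("resource://gre/modules/Services.jsm");
var maxPerServer = Services.prefs.getIntPref("network.http.max-connections-per-server");

var pending = [];  // URLs waiting to be requested
var inFlight = 0;  // requests currently running

function startNext() {
	while (inFlight < maxPerServer && pending.length) {
		inFlight++;
		// doRequest is a placeholder; it must invoke the callback when the
		// request completes or fails so the next one can start.
		doRequest(pending.shift(), function () {
			inFlight--;
			startNext();
		});
	}
}
```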
The books - those are books in the university library?
I don't think we've ever seen or dealt with that, so there's a good chance it wouldn't work well.
Is that search publicly accessible or do you need to be signed in?
How about if you go through EBSCO directly? Does your U allow that? Do you see any difference?
I'll leave it to Simon to comment on ways we could support queuing in the translator architecture, but I agree that it would be a good thing. We've always said that Zotero wasn't really designed with the download-huge-pages-of-results-at-a-time use case in mind, but since people are clearly going to do it, it'd be nice if we handled it a bit better. (Among other things, a queue would make it less likely that people would get banned for violating the TOS—which, essentially, they probably are.)