Eastview

If anyone has developed, or is willing to develop in the short run, a translator for Eastview Press (one of the best aggregators of Eastern European and Chinese online full-text resources), please let me know! Thanks

-Stephn
  • Overall - http://www.eastviewpress.com/Main/Home.aspx.
  • And please contact me offline if you want access to take a look.
  • Please provide links to some specific web pages that you expect to be able to import. If you can't provide a permalink, provide directions to navigate to such page.
  • Aurimas, the pages are behind a paywall. And so I can't really provide a live link. Would a saved webpage help? And if so, how can I attach it here?
  • A saved page would be helpful, but some of us may have access to the paid resource anyway, so provide whatever link you see in your URL bar.
  • Well I'd already given the main page. Any of those databases should be ok to check. The precise one I am using is http://dlib.eastview.com/search. But Sebastian has already taken a look and he said he didn't think it could be done.
  • Well, I wouldn't say it can't be done, but it's not simple. There's no structured metadata per se, but the metadata table at the top of each article can be parsed with some effort. I doubt any of us will get to this any time soon though.
  • edited August 29, 2014
    Hmmm. But how about converting the concatenated HTML file that they let you download in zipped format into BibTeX or something similar?
  • No, parsing the HTML isn't going to be any easier than parsing the site--we parse the HTML there, too.

    Aurimas is looking at a slightly different site structure--when I browse to articles such as:
    http://dlib.eastview.com/browse/doc/42165427
    I see a nicely structured table at the top. That's parseable. Not great, but doable.

    What I get for most articles when I search dlib, though, is just a couple of lines in an unstructured header, such as here:
    http://dlib.eastview.com/browse/doc/222481
    (unfortunately Aurimas doesn't have access there). That's not parseable.

    But I've written you an e-mail about parsing search results that could work.
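
To make the "structured table is parseable" point concrete, here is a minimal sketch of pulling label/value pairs out of a two-column metadata table and writing them out as a BibTeX entry. The real pages are behind a paywall, so the markup in SAMPLE, the field labels (title, author, journal, year), and the names MetadataTableParser, parse_metadata and to_bibtex are all assumptions made up for illustration, not the actual Eastview page structure or any existing translator code.

```python
from html.parser import HTMLParser


class MetadataTableParser(HTMLParser):
    """Collect (label, value) pairs from table rows that have exactly two cells."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._cells = None  # cells of the current row, or None outside a row
        self._text = None   # text chunks of the current cell, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells = []
        elif tag in ("td", "th") and self._cells is not None:
            self._text = []

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._text is not None:
            # Join the chunks and collapse runs of whitespace.
            self._cells.append(" ".join("".join(self._text).split()))
            self._text = None
        elif tag == "tr" and self._cells is not None:
            if len(self._cells) == 2 and self._cells[0]:
                label = self._cells[0].rstrip(":").strip().lower()
                self.fields[label] = self._cells[1]
            self._cells = None
            self._text = None


def parse_metadata(html):
    """Return a dict mapping lowercased labels to cell text."""
    parser = MetadataTableParser()
    parser.feed(html)
    parser.close()
    return parser.fields


def to_bibtex(fields, key="eastview_item"):
    """Format a few assumed fields as a BibTeX @article entry (no escaping done)."""
    lines = ["@article{%s," % key]
    for name in ("title", "author", "journal", "year"):  # assumed labels
        if name in fields:
            lines.append("  %s = {%s}," % (name, fields[name]))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical markup; the real table layout may differ.
SAMPLE = """
<table>
  <tr><td>Title:</td><td>An  example <b>article</b></td></tr>
  <tr><td>Author:</td><td>A. Author</td></tr>
  <tr><td>Journal:</td><td>Example Journal</td></tr>
  <tr><td>Year:</td><td>2014</td></tr>
</table>
"""

if __name__ == "__main__":
    print(to_bibtex(parse_metadata(SAMPLE)))
```

This only works for the table-style pages. The unstructured header on the other kind of page would need some other approach, which is why that case is the hard one.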