Does a web translator support calling JavaScript in the current page (or an external script)?

I am writing a translator for a certain database, and I can easily use .doPost() to obtain the information I want. But the information I get back is AES-encrypted (perhaps for safer transmission over the network).
After some Googling, I learned that decoding these encrypted strings requires a third-party JS library (crypto-js). Alternatively, I could directly call the functions in the JavaScript files the webpage itself already uses.
So, is there any way to call the page's own JS, or external JS, from a translator? If not, I may have to copy a lot of code from the page's JS to do the decoding.
  • I'm pretty sure the answer is no on both counts: you can't trigger JavaScript actions in the browser window, nor load external JS libraries, sorry.
  • Thank you for your reply. It's really regrettable; perhaps I'll have to reinvent the wheel.
  • Are you sure you can't just get the data from the page itself? If the page is decrypting this data, is it not rendering it somewhere?
  • In my case, getting data from page elements works for a single item, but becomes challenging when there are multiple items.
    I can easily obtain multiple item IDs (used to build the URLs) and titles from the search-results page. But when I use .doGet() or .requestDocument() to request those item pages, each returns an "empty" page containing two JS scripts. In the browser these scripts run automatically: one makes a POST request, and the other parses the POST result and fills the entry information into the page template. In a Zotero translator, however, even with await, the callback only ever sees the empty page without the key information (it does not wait for the page's JS to run).
    That's why I tried to simulate the POST requests to obtain the information I wanted.
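A hedged sketch of what "simulating the POST request" can look like in a translator's scrape(). The endpoint URL, payload shape, and response fields below are all hypothetical; requestJSON is the HTTP helper the Zotero translator framework provides.

```javascript
// Hedged sketch: mimicking a site's internal POST call from a translator's
// scrape(), instead of requesting the (empty) item page. The endpoint URL,
// payload shape, and response fields are hypothetical.
async function scrape(doc, url) {
	// Hypothetical: extract the item ID from the item-page URL
	let id = url.match(/[?&]id=(\d+)/)[1];
	// requestJSON is provided to Zotero translators; endpoint is assumed
	let data = await requestJSON('https://example.org/api/detail', {
		method: 'POST',
		headers: { 'Content-Type': 'application/json' },
		body: JSON.stringify({ id: id })
	});
	// In a real translator this would be `new Zotero.Item('journalArticle')`
	// followed by `item.complete()`; a plain object keeps the sketch simple.
	return {
		itemType: 'journalArticle',
		title: data.title,
		creators: []
	};
}
```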
  • @dstillman -- having some way for scrape (or doWeb) to wait a bit and do something like monitorDOMChanges would appear to solve this type of issue when importing multiples from dynamically built pages. The standard advice in these situations is to mimic the site's internal calls in the translator, as jiaojiaodubai23 is trying to do, but that can be quite tedious, and in the example above it would either be impossible or create massive overhead in a single translator.
  • By the way, even if we could call JS that already exists on the webpage, scripts written for a particular page tend to be tightly coupled, and some functions are hard to call in isolation.
    And if translators were allowed to call third-party libraries in the future, unvetted translators could pose security issues.
    I think the most elegant solution is a mechanism to wait for the page's own JS to finish running, as adamsmith said. For example, after doGet is issued, the callback could check the returned page at some interval for a specific element, to determine whether the page has finished loading. With such a mechanism we wouldn't need to call any extra JS, and the existing element-scraping approach could continue to be used.
    As for my translator: I spent a lot of time trying to port code from other libraries into the translator file, but even after writing over a thousand lines of code it still doesn't work well. I plan to shelve its support for multiple items for now.
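For illustration only, the "check at a certain frequency" idea could be expressed as an ordinary polling helper in plain JS. This is not a real Zotero API, and as the reply below explains, the Connector cannot execute a page's scripts, so there is currently nothing for a translator to wait on.

```javascript
// Illustrative only: a generic "wait until a condition holds" helper of the
// kind the suggestion above describes. Not a Zotero API.
function waitFor(predicate, { interval = 100, timeout = 5000 } = {}) {
	return new Promise((resolve, reject) => {
		const start = Date.now();
		(function poll() {
			const result = predicate();
			if (result) return resolve(result);
			if (Date.now() - start > timeout) return reject(new Error('Timed out'));
			setTimeout(poll, interval);
		})();
	});
}

// Against a live DOM, usage would look something like:
// let row = await waitFor(() => doc.querySelector('.result-item'));
```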
  • "I think the most elegant solution is to provide a mechanism to wait for the JS script of the webpage to run, as adamsmith said"
    No, that's not possible. The Zotero Connector doesn't and can't make full browser requests (i.e., that execute JavaScript) to other pages. It can only request the HTML that comes over the wire — that's what both do*() and request*() do. So if a site generates pages via JS, there's no way to scrape data from those rendered pages from a search-results page.

    So the standard advice holds — you'd have to mimic the site's internal calls. And you would indeed have to do whatever the site itself is doing to parse its API responses. It sounds like they're trying to make the API unusable on its own, so there may not be a great solution here.