Site Translator for Espacenet not working

This translator has multiple issues. It does not detect the site at all anymore - the match in detectWeb needs to change from textdoc to biblio for individual documents and from results to searchResults for multiple. I was able to change these with Scaffold to get the site to detect at all.

But after that, the code section throws an error which I cannot find.

Thanks.
  • The author, Gilles Poulain, advises below. Thank you, Gilles, for the speedy reply and fix.

    -------------------------------------------------
    Espacenet has changed its URL format. Simply replace the Detect code in Scaffold with the following:

    function detectWeb(doc, url) {
        if (doc.location.href.match("searchResults")) {
            return "multiple";
        } else if (doc.location.href.match("publicationDetails")) {
            return "patent";
        }
    }

    In addition, the title XPath has changed too. Replace the old one with:

    /html/body/table[2]/tbody/tr[1]/td[3]/h2
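
For anyone who wants to check the detection logic outside Firefox, here is a minimal sketch of the same idea. It is not the translator's actual code: it reads the `url` argument instead of `doc.location.href` so it can run without a browser, and the example URLs in the usage note are made up for illustration.

```javascript
// Sketch only: same substring checks as the fix above, but driven by the
// `url` argument so the function can be exercised without a page document.
function detectWeb(doc, url) {
  // `doc` is unused in this sketch; the decision is made from the URL alone.
  if (url.indexOf("searchResults") !== -1) {
    return "multiple"; // a result list with several patents
  } else if (url.indexOf("publicationDetails") !== -1) {
    return "patent"; // a single patent record
  }
  return false; // anything else: no match
}
```

With hypothetical addresses, `detectWeb(null, "http://espacenet.example/searchResults?query=x")` gives "multiple" and `detectWeb(null, "http://espacenet.example/publicationDetails/biblio?NR=1")` gives "patent".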
  • Is there a way to get this translator updated in the main Zotero translator repository? Google patents is now by far the best way to access US patents, but doesn't, AFAIK, cover international patents. The WIPO website stinks, so espacenet seems to be the best way to access EU and international patents. Z doesn't currently recognize espacenet, and most users will not be able to update the translator in the fashion described above, so I am hoping it can be updated in the main repository. The link to the translator code is below, but I am not sure whether the above fixes have been implemented (it doesn't look like it to me).

    http://zotero-dev.googlegroups.com/web/TradZotero_espacenet.txt
  • Amacom73, what you need to do is:

    1. Go into Z preferences (cogwheel menu in Z), and in the 'Advanced' tab, click on Data Directory;
    2. Click on the 'storage' folder and locate the file called 'ESpacenet.js';
    3. Open with a text editor like Notepad, or better Notepad Pro;
    4. CLOSE Firefox;
    5. Back in the text editor, replace the function detectWeb in accordance with tperki's second post above;
    6. Replace the xpath statement for the title section in accordance with tperki's second post above (this is the line starting 'var xpath' under the line '//Get title');
    7. Update the line that starts "lastUpdated" (near the top of the file) to the current date and time in the format YYYY-MM-DD hh:mm:00;
    8. Save and close the file.
    9. Re-start FF.

    You may be able to use Scaffold as an alternative but this add-on doesn't seem to work with Zotero 1.5b2.1.
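
If typing the "lastUpdated" value by hand in step 7 is error-prone, a small helper can produce it. This is only a sketch: the function names are made up here, and it assumes local time is acceptable and that the seconds are always written as 00, as in the format given above.

```javascript
// Zero-pad a number to two digits, e.g. 5 -> "05".
function pad2(n) {
  return String(n).padStart(2, "0");
}

// Format a Date as "YYYY-MM-DD hh:mm:00" using local time.
// Seconds are fixed at "00" to match the format described in step 7.
function formatLastUpdated(date) {
  return date.getFullYear() + "-" + pad2(date.getMonth() + 1) + "-" + pad2(date.getDate()) +
    " " + pad2(date.getHours()) + ":" + pad2(date.getMinutes()) + ":00";
}
```

Calling `formatLastUpdated(new Date())` in any JavaScript console returns a string that can be pasted into the file.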
  • Thanks, cavallad, that worked great! My first bit of Zotero coding.

    I did try Scaffold before posting here but it didn't seem to connect to the database. Not sure if that's an access/permissions issue or if it's not working. Anyway, thanks for the instructions.
  • Thanks to tperki and cavallad,
    I got the espacenet-translator working now.

    Does anybody have a solution for integrating automatic full-text PDF download into this Espacenet translator? That would be very helpful.
  • edited May 27, 2009
    I'm not sure if downloading the Espacenet PDFs can ever be automated, since each one requires passing a captcha test.

    I've been using the Zotfile plugin, which takes a little fiddling (well-documented in their instructions) to get to work, but which will pull your most recently downloaded file out of the download folder and into whichever item you have highlighted in Zotero. So I manually download the PDF and Zotfile it into the appropriate Z item. You can also download a bunch of PDFs and Zotfile them in reverse order of download, since Zotfile deletes the file from the download folder when it moves it to Zotero.

    http://www.columbia.edu/~jpl2136/zotfile.html

    Hey Z devs, any chance we can get the fixed espacenet translator in the repository?
  • Thanks amacom73,

    I tried to use the Zotfile plugin, but so far I have not got it working. I will take that problem up in another discussion.

    As for PDFs directly from Espacenet: the captcha could really be a problem.
  • Hi,

    The old espacenet translator is not working anymore since the espacenet search homepage has changed to http://t1.espacenet.com/ . Is there a chance of getting the old espacenet translator working for the new site, or has anyone written a translator for the new espacenet homepage?

    Thanks.
  • Not yet-- if no one works on it in the meantime, post here again in a month or two. If I have a bit more time, I'll take a look.
  • Hi
    It would indeed be very useful to have that working again.
    Would it be an idea to contact espacenet/epo to ask whether they could add some standard to their page that can be read by zotero? Embedded RDF?
    -Moritz
  • That would be great. Direct them to http://www.zotero.org/support/dev/make_your_site_zotero_ready and the development mailing list zotero-dev (http://groups.google.com/group/zotero-dev/).
  • I've been updating the XPath definitions for the new espacenet website (worldwide.espacenet.com):

    https://gist.github.com/951329

    It should work, but I'm not really an XPath specialist, so please report any errors.

    To make it work, find the ESpacenet.js translator in your Zotero/translator folder and replace the code.

    Edouard.
  • Further discussion of the translator specifics should be on the developer list, zotero-dev (http://groups.google.com/group/zotero-dev), but note that you should try to craft XPath expressions that are more flexible. The ones you have look like they were created automatically by Firebug or the like, and they are very fragile. Instead of //table[1]/tbody/tr[5]/td, use //table[@class="tableType3"]/tbody/tr[5]/td or //table[@class="tableType3"]/tbody/tr[contains(th/text(),"Applicant(s)")]/td. Navigating tables is messy, so enumeration is sometimes unavoidable, but enumeration outside of data tables is something to avoid; some of your expressions are fragile enough to be thrown off by the addition of a single DIV or TR anywhere in the page. If you have questions, send them to zotero-dev and we'll get this committed. Thanks for stepping up and making a great start at a translator!
  • Of course, any formulation that relies on page content will have to have all three languages, too (French, German, English).
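
To make the two points above concrete (select by class and row label, and cover all three languages), here is a rough sketch of a helper that builds such an expression. The helper name is invented for this example, and the German and French labels ("Anmelder", "Demandeur(s)") are guesses that would need checking against the real pages; only "Applicant(s)" and the tableType3 class come from this thread.

```javascript
// Build an XPath that selects the TD of the row whose TH contains any of
// the given labels, inside a table identified by its class attribute.
// Hypothetical helper for illustration; not part of the Zotero API.
function labelledRowXPath(tableClass, labels) {
  const tests = labels.map(function (label) {
    // XPath 1.0 string literals cannot escape quotes, so reject them here.
    if (label.indexOf('"') !== -1) {
      throw new Error("label must not contain a double quote: " + label);
    }
    return 'contains(th/text(),"' + label + '")';
  });
  return '//table[@class="' + tableClass + '"]/tbody/tr[' + tests.join(" or ") + "]/td";
}

// English label from the post above; German and French labels are assumptions.
const applicantXPath = labelledRowXPath("tableType3", ["Applicant(s)", "Anmelder", "Demandeur(s)"]);
```

The resulting string can be passed to whatever XPath evaluation the translator already uses. Matching on the label rather than on tr[5] means an extra row added above it no longer breaks the lookup.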
  • Hi edoleroy,

    Thanks for your work, but the modified translator is not working for me. I get no Zotero button in the Firefox address bar, the one that would indicate that the Espacenet site is Zotero-ready.

    (firefox 4.0.1, zotero 2.1.6, windows xp sp2)
  • As ajlyon points out, the translator is likely not super stable, but it should show a symbol. Do you have a sample URL?
  • Note that the new translator is designed specifically for the version at worldwide.espacenet.com
  • Hi,

    Yes, it should be working on worldwide.espacenet.com. Note that you need to search for something for the icon to show up.

    The only feature that is not working yet is the ECLA classification, the translator is not able to get the data.

    Thank you for your comments, I'll rework the xpaths and the ECLA support and post the code on the google group.
  • Hi,
    for me the translator does not work.

    Here a sample url:
    http://worldwide.espacenet.com/publicationDetails/biblio?DB=EPODOC&adjacent=true&locale=en_EP&FT=D&date=20071221&CC=KR&NR=20070120187A&KC=A

    I get no Zotero icon in the Firefox address bar.

    Edouard, by "replace the code", do you mean replacing the file Espacenet.js completely, or not?
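
As a quick way to see what a sample URL like the one above contains, the sketch below takes it apart with the standard `URL` class. The function name is made up, and reading CC, NR and KC as country code, publication number and kind code is an assumption about Espacenet's parameters, not something confirmed in this thread.

```javascript
// Split an Espacenet address into the parts a translator might care about.
// Hypothetical helper; the meaning given to CC/NR/KC is an assumption.
function describeEspacenetUrl(href) {
  const u = new URL(href);
  // First path segment, e.g. "publicationDetails" in "/publicationDetails/biblio".
  const firstSegment = u.pathname.split("/")[1];
  let type = null;
  if (firstSegment === "searchResults") {
    type = "multiple";
  } else if (firstSegment === "publicationDetails") {
    type = "patent";
  }
  return {
    host: u.hostname,
    type: type,
    country: u.searchParams.get("CC"), // assumed: country code
    number: u.searchParams.get("NR"),  // assumed: publication number
    kind: u.searchParams.get("KC")     // assumed: kind code
  };
}

const sample = describeEspacenetUrl(
  "http://worldwide.espacenet.com/publicationDetails/biblio?DB=EPODOC&adjacent=true&locale=en_EP&FT=D&date=20071221&CC=KR&NR=20070120187A&KC=A"
);
```

For the sample URL, `sample.type` is "patent" and `sample.host` is "worldwide.espacenet.com", so a URL-based detect step that looks for publicationDetails would be expected to recognise that page.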
  • edited May 5, 2011
    Go to http://github.com/ajlyon/zotero-bits/raw/master/ESpacenet.js and save the file to the translators directory of your Zotero data directory (http://www.zotero.org/support/zotero_data), replacing the existing file of the same name. Then restart Firefox and try again.
  • Now the translator is working.
    Many Thanks Ajlyon.
  • This is now in the main repository and will be in the next version of Zotero, if Dan doesn't push it to clients before then.
  • Unfortunately, the Espacenet translator seems to be broken again. The .js file from ajlyon is no longer available on github so I cannot try that path anymore.
    Could anyone provide assistance fixing the problem with the translator? This was such a great feature.

    TIA.
  • Yeah, it's completely shot. I won't have time to repair this anytime soon, and I'd guess neither will ajlyon (as noted above, the fix linked above has long since been merged into Zotero).
    I'd suggest seeing if one of the other patent sites works for this.
    The problem is that there is no systematic display of data on the site, so writing a translator involves an unfortunate amount of hacks which break easily.
  • Espacenet should now work.
This discussion has been closed.