Microsoft Academic as an Alternative to Google Scholar

Microsoft Academic is a new database with an API which could offer a solution to some of the data quantity throttling limitations of Google Scholar.

Has anyone considered adding the ability to recognize Microsoft Academic citations to Zotero?

https://microsoftacademic.uservoice.com/knowledgebase/articles/838965-microsoft-academic-faq
  • edited May 23, 2016
    Microsoft Academic is a new database with an API which could offer a solution to some of the data quantity throttling limitations of Google Scholar.
    Doubtful, at least for now. The free tier of their Academic Knowledge API is limited to 10,000 transactions a month (https://www.microsoft.com/cognitive-services/en-us/academic-knowledge-api), which would be way too low for Zotero. And it doesn't look like they currently offer full-text lookup, which is the primary reason Zotero uses Google Scholar.
  • edited May 23, 2016
    The query to Google Scholar currently appears to be made on the client side rather than the server side; I get my IP blocked or hit CAPTCHA verification on Google Scholar at far fewer than 10,000 transactions a month. Wouldn't Microsoft Academic avoid the IP blocking from Google?

    I'd rather have 10,000 transactions a month without full-text lookup than a much lower number per month with full-text lookup.
  • edited May 23, 2016
    You're misunderstanding — full-text search is the whole point of the Google Scholar lookup. That's how Zotero finds metadata from PDFs when there aren't DOIs or ISBNs (which use different services).

    As for rate limits, Google Scholar is just a website, so it rate limits by IP address and cookie. If there's an API limit for the MS service, that's presumably through an API key, which would be problematic to distribute, or we'd need people to authenticate to create their own keys, which we wouldn't particularly want to do.

    We're planning to address Google Scholar limits via searches using the Zotero servers.
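
To illustrate the difference between per-client limits and a single distributed API key, here is a minimal sketch. It is not how Google or Microsoft actually enforce limits; the `QuotaCounter` class, the bucket names, and the load figures (1,000 users making 50 lookups each) are hypothetical. The same limit is applied in both cases only to isolate the effect of how requests are bucketed.

```python
from collections import Counter

MONTHLY_QUOTA = 10_000  # free-tier figure quoted earlier in the thread


class QuotaCounter:
    """Counts requests per bucket and refuses once a bucket reaches the limit."""

    def __init__(self, limit):
        self.limit = limit
        self.used = Counter()

    def allow(self, bucket):
        if self.used[bucket] >= self.limit:
            return False
        self.used[bucket] += 1
        return True


def simulate(bucket_for_user, users, requests_per_user, limit):
    """Return how many requests are refused in total for the given bucketing."""
    counter = QuotaCounter(limit)
    refused = 0
    for user in range(users):
        for _ in range(requests_per_user):
            if not counter.allow(bucket_for_user(user)):
                refused += 1
    return refused


# Hypothetical load: 1,000 users making 50 lookups each in one month.
# Per-client bucketing (stand-in for IP address and cookie): one bucket per user.
per_client = simulate(lambda user: f"ip-{user}", 1000, 50, MONTHLY_QUOTA)
# One API key shipped to everyone: all users drain the same bucket.
shared_key = simulate(lambda user: "one-distributed-key", 1000, 50, MONTHLY_QUOTA)

print(per_client, shared_key)
```

Under these assumed numbers no individual client comes near the limit, while the shared key is exhausted after the first 10,000 of the 50,000 requests.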
  • edited May 23, 2016
    Forget the API limit, which is a bit of a red herring right now (although it's very unlikely that Zotero could find a way to assign every user their own AK API account without a lot of work on the user's part).

    The main issue is that Microsoft Academic can't replace Google Scholar unless it starts offering full-text search.

    edit: Beaten to it by Dan.
  • As for rate limits, Google Scholar is just a website, so it rate limits by IP address and cookie.
    So Zotero currently reads the webpage (DOM) that GS returns for a query? Is that really how it works? Seems like a lot of overhead.
  • So Zotero currently reads the webpage (DOM) that GS returns for a query?
    Yes. It just makes some queries and uses the Google Scholar translator.
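
As a rough illustration of what "reading the returned page" involves, here is a minimal sketch that pulls titles and links out of a results page. It is not Zotero's Google Scholar translator (Zotero translators are written in JavaScript), and the markup in `SAMPLE_PAGE`, including the `result-title` class, is invented for the example rather than taken from Google Scholar.

```python
from html.parser import HTMLParser


class ResultTitleParser(HTMLParser):
    """Collects the link and title text of each result in a results page."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._in_title = False
        self._in_link = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Hypothetical markup: each result title is <h3 class="result-title">.
        if tag == "h3" and attrs.get("class") == "result-title":
            self._in_title = True
        elif tag == "a" and self._in_title:
            self._in_link = True
            self.results.append({"url": attrs.get("href"), "title": ""})

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False
        elif tag == "h3":
            self._in_title = False

    def handle_data(self, data):
        # Text inside nested tags (e.g. <b>) still belongs to the title.
        if self._in_link:
            self.results[-1]["title"] += data


# Invented sample page standing in for a search results response.
SAMPLE_PAGE = """
<div class="result"><h3 class="result-title"><a href="https://example.org/a">First paper</a></h3></div>
<div class="result"><h3 class="result-title"><a href="https://example.org/b">Second <b>paper</b></a></h3></div>
"""

parser = ResultTitleParser()
parser.feed(SAMPLE_PAGE)
titles = [result["title"] for result in parser.results]
urls = [result["url"] for result in parser.results]
print(titles)
```

Because this approach depends on the page's markup, it breaks whenever the site is redesigned, which is one reason scraping-based translators need periodic rewrites.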
  • I apologize if I am misunderstanding here. We may be using Google Scholar for different purposes.

    If I am searching Google Scholar for a particular topic, Zotero makes it really easy for me to import metadata (and sometimes PDFs) using the "Save to Zotero" Firefox add-on.

    But if I go to Microsoft Academic and find articles of interest, "Save to Zotero" does not recognize the search results or let me import them with one click as I can with Google Scholar.

    Why not add this capability to Zotero? Wouldn't this be of substantial interest to Zotero users overall?
  • edited May 23, 2016
    Oh, you're just asking for a translator for MS Academic? That's totally different. The way your question was phrased made it sound like you were talking about PDF metadata recognition — that's the context in which we talk about throttling limitations on Google Scholar or requiring an "alternative" to GS, and the feature uses the word "recognize" internally. Sorry for the confusion.

    I'll let others comment on the feasibility of an MS Academic translator.
  • edited May 23, 2016
    Actually, we have an MS Academic translator already, dating to 2012:

    https://github.com/zotero/translators/blob/master/Microsoft%20Academic%20Search.js

    It needs to be redone (possibly from scratch) for the new version of the site.
  • Thanks

    But a related question then - why am I getting throttled or IP-blocked by Google Scholar if all I am doing is using Zotero as a translator? Might I be sending lots of other queries to Google Scholar through Zotero without realizing it?

    Is there somewhere in the documentation I can read more about the PDF metadata recognition feature of Zotero?
  • edited May 23, 2016
    That might happen if you save page after page of results (e.g., if you're doing a systematic review and saving everything to Zotero first, in which case you're essentially using Zotero as a bot that they might reasonably block). It definitely doesn't happen to people during normal usage of Google Scholar — performing a search in GS, finding the result, saving it to Zotero.

    If you're using "Retrieve Metadata for PDF" on a lot of PDFs or using a third-party plugin that makes Google Scholar lookups (e.g., Google Scholar Citations), you could be throttled. PDF lookup is the case that we're planning to address by using the Zotero servers instead.