Auto-Update Collection Using Microsoft Academic

edited February 12, 2019
I find that Publish or Perish is an excellent source for locating highly cited and otherwise seminal articles on a given topic. But Publish or Perish does not include the abstract, so when importing a CSV from Publish or Perish into Zotero, the items have very limited detail.

Articles imported from Google Scholar with the Zotero Google Scholar translator also often lack the abstract.

On the other hand, Microsoft Academic Search does a terrific job at listing abstracts and other metadata which are imported with the Zotero Microsoft Academic Search translator.

So my question: Is it possible for me to identify a Collection in Zotero and in an automated way search for each item by title in Microsoft Academic Search, then use the Microsoft Academic translator to update each entry in the Collection?
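To make the request concrete, the workflow could be sketched roughly as below. This is only an illustration: the items are plain dictionaries rather than real Zotero objects, `abstractNote` is used as the abstract field name, and `lookup_abstract` is a hypothetical stand-in for whatever title search would actually be run against Microsoft Academic. The `fake_lookup` table contains invented data.

```python
from typing import Callable, Dict, List, Optional

Item = Dict[str, str]


def fill_missing_abstracts(
    items: List[Item],
    lookup_abstract: Callable[[str], Optional[str]],
) -> int:
    """Fill empty abstracts in place and return how many items were updated."""
    updated = 0
    for item in items:
        title = (item.get("title") or "").strip()
        has_abstract = bool((item.get("abstractNote") or "").strip())
        if not title or has_abstract:
            continue  # nothing to search for, or the abstract is already there
        abstract = lookup_abstract(title)
        if abstract:
            item["abstractNote"] = abstract
            updated += 1
    return updated


# Hypothetical stand-in for a real title search (invented data).
_FAKE_INDEX = {"an example title": "An example abstract."}


def fake_lookup(title: str) -> Optional[str]:
    """Return an abstract for an exact, case-insensitive title match, else None."""
    return _FAKE_INDEX.get(title.lower())
```

Items that already have an abstract are skipped, so hand-edited entries would not be overwritten by whatever the search returns.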
  • No, not possible. Auto-updating of metadata is likely going to happen in some way relatively(!) soon.

    I don't think MAS (which I generally quite like) has overall high enough quality of metadata to rely on it as a source for auto-updating, though. I think the approach will be to use Crossref and publishers' own pages, potentially also PubMed, so specifically what you're asking is unlikely to happen any time soon. Could likely be coded as a plugin if someone wanted to, though.
  • Thanks for the feedback. I suggested Microsoft Academic Search because I have found that, particularly for older publications, it often has abstracts when other sources do not. Crossref or publisher sites often do not work well for older articles with no DOI.

    Is anyone interested in writing this as a plugin project for hire?
  • The great thing about MAS is, as you say, that the site often has abstracts even when the publishers' sites do not. However, beware of using unedited MAS metadata. It is very frequently very wrong. It is very useful for identifying that literature exists.
  • Its API also imposes a once-per-second rate limit (not an issue for single items, but bulk refreshes will be s-l-o-o-o-o-w using MAS), and using the web front-end for search turns out to be pretty tricky (as in, I haven't yet managed to get a search result using XHR).
  • Well... after no lack of trying on Emiliano's part, it is apparent that there are immense problems with the Microsoft Academic API. The human browser interface works wonderfully; its "semantic search" design is quite useful and quite arguably a powerful competitor to Google Scholar, PubMed, and other search engines. But in many cases - even a simple search by exact title - a query that works perfectly in the browser fails to return anything in the API. The API is definitely not ready for prime time - a major disappointment by Microsoft.
  • Update: the final version works very well, thanks Emiliano. With a new translator, Microsoft Academic can be very useful for adding abstracts to virtually any CSV source, including but not limited to Publish or Perish.
  • and @emilianoeheyns
    What is this "new translator" you're talking about in the last post here, which adds abstracts "to virtually any CSV source"?

    I would like to be able to update abstracts using MSA.

    And because PoP and CSV were mentioned (and I should probably make a separate post about this): I'm doing large cross-discipline literature reviews (1k-10k articles before eliminations) using Publish or Perish and having major pain with the workflow. If I export a CSV from PoP, I can use that to review and filter articles, but then I can't get those articles into Zotero or back into PoP. If I export RIS from PoP, I can get it into Zotero, but then I don't have the powerful ways of filtering through the data that I have with a CSV in Excel.
  • It's been a while since I worked on this, but broadly it means that MSA search is pretty good, and that is easily scripted to merge with other structured sources - such as CSV.

    What interaction/workflow would you prefer to see between PoP, Zotero and MSA?
  • @emilianoeheyns Thanks very much for asking. Before I replied I wanted to get a deeper understanding of how everything would work, and try some things. However, it is a lot to write out. As you probably know, different article databases have different issues: for example, Google Scholar truncates journal names and has no abstracts, and the Scopus API returns only first names. It really is quite a mess. How can it all be so bad in 2021? So trying to define the interaction I would like is a little hard. I feel like there are several paths and interactions that could lead to the same desired end.

    But the bottom line is I would like CSVs which have full-text and abstract text for Zotero items (where possible), which may mean that text needs to be retrieved ("auto-updated" by Zotero) from somewhere like MSA. I'm trying to develop some methods for doing computer-aided quantitative literature reviews of key terms which are used across many disciplines. Basically no article database can generate the necessary output. By using Zotero to import the RIS files from searches in those databases, if I can auto-update/retrieve full-text and abstracts where possible, I could build a usable CSV file for the lit review.

    The issue I've had with MSA is that it has a semantic query layer which is "smart" but doesn't take exact phrases. But by doing that it outsmarts attempts to do reliable research and the results it returns are terrible. But perhaps a more targeted search for longer title strings from single articles gives much better results. I do have an email out to another researcher who is rumored to have code which might help me get what I need from MSA, but it's sort of a shot in the dark. Currently I am not accessing MSA via its API.

    If you're interested in this, perhaps we can have an email conversation? I have a gmail with just stanleyrhodes at the start. I am very interested in both building the methodological process for other scholars to use this method, and helping in building the toolchain for doing so (although I am not yet a strong coder, and I don't know JavaScript well yet).
  • @emilianoeheyns Whoops, I do have sort of a workflow that I posted about previously and should have included:
  • I've answered on zotero-dev. I'm OK with private email if you think that's a good idea, but while I know a fair bit about how Zotero works, I am definitely not the authoritative source on Zotero, so I think you'd benefit if the conversation is had in public, where the Zotero devs can correct misconceptions about Zotero I might hold.
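
Two points from the thread, the once-per-second rate limit and merging search results into a CSV by title, can be illustrated with a minimal sketch. It is not the translator or script discussed above. The `Title` and `Abstract` column names, the `Throttle` helper, and the `lookup_abstract` callback are all assumptions made for the example; a real lookup would have to call whatever search service is available.

```python
import csv
import io
import re
import time
from typing import Callable, Optional


def normalize_title(title: str) -> str:
    """Lowercase and collapse punctuation/whitespace so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


class Throttle:
    """Space calls at least `min_interval` seconds apart (hypothetical helper)."""

    def __init__(self, min_interval: float = 1.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def add_abstracts_to_csv(csv_text: str,
                         lookup_abstract: Callable[[str], Optional[str]],
                         throttle: Throttle,
                         title_column: str = "Title") -> str:
    """Return the CSV text with an 'Abstract' column filled in where a lookup succeeds."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if "Abstract" not in fieldnames:
        fieldnames.append("Abstract")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        existing = (row.get("Abstract") or "").strip()
        title = (row.get(title_column) or "").strip()
        if not existing and title:
            throttle.wait()  # respect the one-request-per-second limit
            existing = lookup_abstract(normalize_title(title)) or ""
        row["Abstract"] = existing
        writer.writerow(row)
    return out.getvalue()
```

At one request per second, a review of 1k-10k articles would take somewhere between roughly 17 minutes and nearly 3 hours of lookups alone, which is why bulk refreshes feel slow.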