[MLZ] MLZ not picking up citation from Google Scholar case law
A temporary problem?
More soon ...
First the bad news. Google Scholar still provides no structured metadata on case pages: what you see on the screen is all the translator has to work with. Unfortunately, there isn't much structure in there, and I think this item shows that we've reached the limit of what can be done.
So if working from Google Scholar, you'll need to enter the details of this and similarly formatted items by hand.
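To make the limitation concrete, here is a minimal Python sketch of what scraping a citation from bare on-screen text involves. It is not the MLZ translator's code; the pattern `CITATION_RE`, the function `parse_citation`, and the second and third sample strings are all made up for illustration. It assumes the common US shape "volume reporter page (court year)" and nothing else.

```python
import re
from typing import Optional

# Hypothetical pattern for ONE common US citation shape:
#   "<volume> <reporter> <first page> (<optional court> <year>)"
# e.g. "410 U.S. 113 (1973)". Anything outside this shape is not matched.
CITATION_RE = re.compile(
    r"(?P<volume>\d+)\s+"
    r"(?P<reporter>[A-Z][A-Za-z0-9. ]*?)\s+"
    r"(?P<page>\d+)\s*"
    r"\((?P<court>[^()]*?)\s*(?P<year>\d{4})\)"
)


def parse_citation(text: str) -> Optional[dict]:
    """Return the citation parts found in text, or None if the shape differs."""
    match = CITATION_RE.search(text)
    if match is None:
        return None
    return {
        "volume": match.group("volume"),
        "reporter": match.group("reporter"),
        "page": match.group("page"),
        # An empty court (as in Supreme Court citations) is reported as None.
        "court": match.group("court").strip() or None,
        "year": match.group("year"),
    }


if __name__ == "__main__":
    samples = [
        "Roe v. Wade, 410 U.S. 113 (1973)",
        # Invented examples, used only to exercise the pattern:
        "Example v. Sample, 123 F. Supp. 2d 100 (S.D.N.Y. 2001)",
        "Example v. Sample, No. 12-3456 (2d Cir. Mar. 4, 2014)",
    ]
    for sample in samples:
        print(sample, "->", parse_citation(sample))
```

The first two samples fit the assumed shape and parse; the third, a docket-number citation with a full date, does not, and the function returns `None`. Each new layout needs another hand-written rule, which is why items like this one end up being entered by hand.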
To extend that gloomy picture a little, there are similar issues with WestLaw, Lexis, and Bloomberg Law (US), and with BAILII (UK). For the most part this is intentional. For the commercial services, it is part of a lock-in strategy. For Google Scholar (which, to be fair, has done a lot to open up access to US court judgments), the aim is presumably to drive traffic to their site. BAILII is a special case: under their arrangements with UK courts, they are at pains to emphasize that their reports are only an unofficial record, and their policy is therefore to do nothing that would support third-party referencing tools like Zotero.
This is all very annoying. As I wrote of our efforts to screen-scrape legal case citations from bare text (several years ago, in another context):
The good news is that services that do provide structured metadata are beginning to emerge. The two that I am familiar with are CourtListener and FastCase.
API: 1; Screen-scraping: 0.
CourtListener is a free-access project driven by contributions and grant funds. The service has quite broad coverage, and the team aim to cover all of US law. CourtListener offers an excellent API that is accessible with a (free) account on the service. The MLZ translator for CourtListener relies on the API, and it produces clean metadata. The current limitations are: some holes in coverage (the case linked above is not yet in the service); and a lack of official citations (to West reporters etc) for cases below the US Supreme Court level. The team are making progress on both of those issues: CL is a service worth checking out, and definitely one to watch.
FastCase is bundled with bar association membership in many states, and subscriptions to the service are available at much lower cost than the other commercial services. FC offers an API that might be useful for building more reliable translators. Maybe; you would have to check. I built a translator for FastCase under trial access some time back, and ended up doing screen-scraping. My access expired, and plans to subscribe to the service locally have not gone anywhere, so I'm not sure whether the translator I built still works, or whether there might now be a better approach to the site available. But as the existence of the API shows, FastCase is a modern service that "gets" the importance of interoperability. They too are worth exploring.
It's a shame that Google Scholar has stopped providing metadata. It's odd that Google has chosen to do so to DRIVE traffic to their site! I've been visiting their site exactly because it was compatible with the MLZ translator, unlike WestLaw or LexisNexis. Google's new policy is actually driving me AWAY from their site and towards the alternatives.
Is there a chance that the Google Scholar developers are not aware of how legal cases are cited? My sense is that Google would be willing to build structured metadata for legal cases if they realized that legal scholars need it. Maybe we should alert them to the issue.