I’m at one of the parallel sessions, themed ‘E-Governance’, and here are my notes on Anton Geist’s paper.
Geist’s work is in a very interesting area. Chair Rachel Craufurd Smith (last seen on Lex Ferenda here) alludes to the mixed blessing of legal information on the Internet – information overload is a persistent issue. Geist has been studying how to obtain relevant information from this huge stack of knowledge. My colleague at UEA, Mathias Siems, has been engaged in work using related techniques, on how courts cite each other (see for example this paper on citation patterns, presented at a staff seminar earlier this year).
Geist’s jumping-off point is the world since Google: he is interested in ‘relevance ranking’, the way in which a search engine orders its results, which is particularly important when there are millions of them. He explains how PageRank works (in brief!); but for something like Westlaw, things work quite differently. The basic assumption has been that we can’t use Web-style algorithms for computer-assisted legal research systems; the research hypothesis, then, is that citation analysis could identify the court cases that are more relevant.
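For readers who haven’t met PageRank, here is a rough sketch of the general idea as a simplified textbook version, with citations between cases standing in for links between pages. This is my own illustration, not Geist’s code: the case names, the damping factor of 0.85 and the fixed iteration count are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `links` maps each node to the list of nodes it points to
    (here: each case to the cases it cites). All names are hypothetical.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every node starts each round with the small "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A node passes its rank on, split equally among what it cites.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node that cites nothing spreads its rank over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Toy data (invented): A and B cite C, and C cites D.
toy_citations = {
    "case_A": ["case_C"],
    "case_B": ["case_C"],
    "case_C": ["case_D"],
    "case_D": [],
}
ranks = pagerank(toy_citations)
for case, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(case, round(score, 3))
```

The point to take away is that a case gains rank by being cited by cases that are themselves well cited, so the ordering reflects the shape of the whole network rather than a raw count.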
The first step in testing this is looking at the network structures: can you build a network based on court cases that looks like the WWW? Drawing on typical network models (including scale-free networks; a topic that has been exercising the mind of fellow attendee and blogger Andres Guadamuz of late), Geist notes that the contemporary web (and thus web searching) is recognised as scale-free. Constructing a network (using Python!) of freely-available Austrian Supreme Court cases (both headnotes and full text are available), he finds that the vast majority of decisions have few headnotes and a small number have many, i.e. a power law distribution. This suggests some similarity between web pages and the Austrian cases, meaning that citation analysis might be just as efficient in the latter case as it is in the former.
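To make the ‘few have many, most have few’ point concrete, here is a minimal sketch of the kind of tally involved. It is not Geist’s code: the decision identifiers are invented, and I am assuming that what gets counted per decision is incoming citations, whereas the talk spoke of headnotes per decision.

```python
from collections import Counter


def in_degree_distribution(citations):
    """Return a Counter mapping 'times cited' to 'number of decisions'.

    `citations` maps each decision to the decisions it cites
    (hypothetical identifiers, for illustration only).
    """
    in_degree = Counter()
    for source, targets in citations.items():
        in_degree[source] += 0  # keep decisions that nobody cites in the tally
        for target in set(targets):  # count each citing decision once
            in_degree[target] += 1
    return Counter(in_degree.values())


# Toy data (invented): one heavily cited decision, the rest barely cited.
toy = {
    "d1": ["d5"],
    "d2": ["d5"],
    "d3": ["d5", "d4"],
    "d4": ["d5"],
    "d5": [],
}
distribution = in_degree_distribution(toy)
for times_cited, count in sorted(distribution.items()):
    print(times_cited, count)
```

On a real data set, a distribution like this that falls roughly along a straight line when both axes are plotted on a log scale is the usual informal sign of a power law, though a proper statistical fit would be needed to claim one.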
The second step, then, is looking at official reports (a selection of Supreme Court cases), to see if they are unevenly distributed. However, it seems that there are more headnotes for the reported cases; this could mean, in searching, that cases without headnotes would be ‘ranked down’. Again, this suggests that network data could be used.
Q: Jon Bing asks a very interesting question about evaluating different types of citations, i.e. looking at legislation etc, which is certainly (says Geist) an area worth researching. Geist also mentions the purported differences between reports in civil and common law jurisdictions.
Q: who writes the headnotes? Normally they are written by assistants/clerks after the judgement is completed.
Q: how do you deal with gaming/self-fulfilling prophecies/etc? Geist acknowledges the problem, but is focused here on publicly-available documents; so far, this is the best call on relevance that can be made.