IIITH Study Makes Case For Leveraging Anchor Text In Improving Legal Precedent Retrieval

Chennai: A good legal argument rests on citing priors, or precedents. With AI (and advances in NLP and ML) entering the field of law, legal research is increasingly automated by algorithms that search for and retrieve relevant precedents. While research efforts are ongoing to exploit features such as catchphrases, sentences and paragraphs, an experimental study by an IIIT-Hyderabad team led by researcher Gaurang Patil, under the guidance of Prof. PK Reddy, has focused on the text surrounding citations in legal arguments for better precedent retrieval.

The study, titled “Citation Anchor Text For Improving Precedent Retrieval: An Experimental Study On Indian Legal Documents”, was supported by iHub Anubhuti-IIITD Foundation. It was presented at the 37th International Conference on Legal Knowledge and Information Systems (JURIX 2024), held from December 11-13, 2024 in the Czech Republic, where it won the ‘Best Paper Award’.

The Idea

“An anchor text is the clickable text in a hyperlink. So, instead of the entire URL, you have text that describes the linked page’s content,” explains Gaurang, a second-year MS by Research student who has been exploring information retrieval in the legal domain. His explorations, he adds, revealed that the use of anchor text in web documents in the 1990s and early 2000s was very helpful in improving the quality of web search. “The anchor text was found to provide a concise representation of the linked webpage. We wanted to extend this concept (of the anchor text) to the legal domain and test its effectiveness,” he remarks.

The Experiment

Using publicly available Indian Supreme Court judgements downloaded from the Indian Kanoon website, the researchers extracted the anchor text preceding citations to prior cases that were part of two existing datasets. “Compared with existing approaches, our approach of leveraging the text around the citation shows that the document representation of the referenced judgement (prior case) improved, thereby improving the overall retrieval performance too,” explains Gaurang. Document representation refers to the ways in which a document can be encoded or structured for processing by machines, especially in information storage and retrieval. For the IIITH team, this is just the beginning. “We plan on using improved document representations to summarise judgements too in the future,” states Gaurang.
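
For readers who want a feel for the idea, the extraction step described above can be illustrated with a minimal Python sketch. This is not the researchers' implementation, whose details are not given here. The `[CITE:...]` marker format, the function names `preceding_anchor_text` and `augment`, and the five-word window are all assumptions made purely for illustration.

```python
import re
from collections import defaultdict

# Hypothetical citation marker; real judgements cite prior cases in many styles.
CITATION = re.compile(r"\[CITE:(\w+)\]")


def preceding_anchor_text(judgment: str, window: int = 5) -> dict[str, list[str]]:
    """Map each cited case id to the `window` words that precede each citation."""
    anchors = defaultdict(list)
    for match in CITATION.finditer(judgment):
        # Drop earlier citation markers so they do not leak into the anchor text.
        before = CITATION.sub(" ", judgment[:match.start()])
        words = before.split()
        anchors[match.group(1)].append(" ".join(words[-window:]))
    return dict(anchors)


def augment(prior_cases: dict[str, str],
            citing_judgments: list[str],
            window: int = 5) -> dict[str, str]:
    """Return a copy of prior_cases with anchor text appended to each cited case."""
    augmented = dict(prior_cases)
    for judgment in citing_judgments:
        for case_id, snippets in preceding_anchor_text(judgment, window).items():
            if case_id in augmented:  # ignore citations to cases outside the collection
                augmented[case_id] += " " + " ".join(snippets)
    return augmented
```

The augmented text of each prior case could then be passed to any retrieval model in place of the original text. How the anchor text is actually weighted or combined with the judgement in the study is not specified in this article.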
