The Promise and Peril of AI in the Courtroom
With 50 million cases pending, India’s courts face crippling delays. Specialist AI systems show promise, but only sustained progress and accountability can secure their role in judicial administration
Mohit Bhatnagar, OP Jindal Global University.
Shivaraj Huchhanavar, OP Jindal Global University.
SDG 9: Industry, Innovation and Infrastructure | SDG 16: Peace, Justice and Strong Institutions
Institutions: Ministry of Law and Justice | Ministry of Electronics and Information Technology
Justice in India remains a distant promise for many. Court cases stretch over years, lawyers are costly, and the system is often impenetrable for those who need it most. With more than 50 million cases pending, the machinery of justice is paralysed by delay.
Artificial intelligence (AI) can offer something citizens rarely receive in today’s courts: timely guidance about their rights. New tools that answer questions in plain language could turn the law from a source of intimidation into a source of support. AI will not, of course, solve the crisis overnight. And the risks are real: poorly designed systems may spread confusion and erode the very confidence on which justice depends.
From Chatbots to Real Guidance
The promise of AI in law goes beyond chatbots. Large Language Models (LLMs) are being tested in summarising judgments, predicting outcomes and even emulating statutory reasoning. These will not replace lawyers or judges, but they can speed up research, streamline routine tasks, and bring more consistency to decisions. Among these applications, legal question-answering stands out for its immediacy. A tenant facing eviction or a worker denied wages may not need a full legal brief; they need first-level clarity in plain words. Systems that deliver this could narrow the gulf between citizens and the courts in a way few reforms have managed.
But the vision carries risks. Unlike general knowledge, law does not tolerate approximation. A wrong answer can have serious consequences for a person’s home, livelihood or freedom. Current AI systems are prone to “hallucination” – producing answers that sound authoritative but are inaccurate or misleading. Many widely used models misfire more often than not and can even contradict themselves when asked the same question twice. Such behaviour undermines reliability and raises hard questions of liability. If a citizen acts on bad advice, who is responsible – the developer, the deploying institution, or the state? Regulators cannot leave this unresolved.
Training Trust into Machines
The solution lies in building systems tailored for law. General-purpose models are like encyclopaedias: broad but shallow. Justice requires specialists – models trained on real legal questions, fine-tuned with examples of sound reasoning, and corrected through human feedback. When aligned in this way, their responses become clearer, more consistent and closer to how a lawyer might guide a client. Crucially, they must be anchored in verifiable sources. A legal assistant that cites the relevant statute or precedent not only improves accuracy but also reinforces confidence.
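For technically minded readers, the “human feedback” step can be sketched in a few lines. The snippet below is a minimal, illustrative example of a direct-preference-optimisation (DPO) style loss in PyTorch; the underlying study does not prescribe this exact recipe, and every name and number in it is hypothetical.

```python
import torch
import torch.nn.functional as F

def dpo_preference_loss(policy_chosen_logp, policy_rejected_logp,
                        ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO-style loss: nudge the tuned model to prefer the answer human
    reviewers ranked higher, measured relative to the base (reference) model."""
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    # Reward gap between the preferred and dispreferred legal answer
    logits = beta * (chosen_margin - rejected_margin)
    return -F.logsigmoid(logits).mean()

# Illustrative log-probabilities for a batch of two legal Q&A pairs
policy_chosen = torch.tensor([-12.3, -9.8])     # tuned model, preferred answers
policy_rejected = torch.tensor([-11.9, -10.4])  # tuned model, dispreferred answers
ref_chosen = torch.tensor([-13.0, -10.1])       # base model, preferred answers
ref_rejected = torch.tensor([-11.5, -10.0])     # base model, dispreferred answers

print(dpo_preference_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected))
```

The loss falls when the tuned model assigns relatively more probability to the answers human reviewers preferred, which is what “corrected through human feedback” means in practice.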
The real test is whether such systems can deliver answers people can rely on. In one evaluation of 100 legal questions, three versions of the same language model were compared: a basic version, a supervised fine-tuned version trained with human-checked examples, and a preference-optimised version tuned to favour the answers people find most useful.
The progression was clear. The preference-optimised system scored 6.9 out of 10 – nearly 90 percent higher than the base model and 40 percent above the supervised one. Reliability improved further when this system was paired with a mainstream model like GPT-3.5, pushing average ratings to 8.7 out of 10.
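Readers who want to unpack those percentages can back-calculate the implied scores. The snippet below is only arithmetic on the figures quoted above; the derived base and supervised scores are approximations, not separately reported results.

```python
# Back-of-the-envelope check of the reported comparison (approximate).
preference_optimised = 6.9                     # reported score out of 10
base_model = preference_optimised / 1.9        # "nearly 90 percent higher" -> ~3.6
supervised = preference_optimised / 1.4        # "40 percent above"         -> ~4.9
hybrid_with_gpt35 = 8.7                        # reported score when paired with GPT-3.5

print(f"base ≈ {base_model:.1f}, supervised ≈ {supervised:.1f}, "
      f"preference-optimised = {preference_optimised}, hybrid = {hybrid_with_gpt35}")
```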
A separate review by lawyers and subject experts reached similar conclusions. In assessing 46 real cases, they found the hybrid approach – combining user intent, preference-optimised output, and ChatGPT – more accurate, clearer, and more confident than GPT-3.5 alone. It also showed a better grasp of legal nuance, though it still struggled to cite the precise statute or judgment.
That gap highlights the next step: grounding answers more firmly in legal texts. Reliability can be steadily improved, but unless models are trained to reference the law itself, they risk sounding persuasive while missing what matters most.
A Better Way to Train Legal AI
Progress also depends on rethinking how systems are built. Much legal AI today relies on “retrieval”: pulling passages from databases and feeding them to the model. This reduces hallucination but risks quoting out of context. Alternatives that combine domain-specific training with human preference optimisation perform better, aligning answers with how people actually frame their concerns.
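A minimal sketch makes the retrieval idea concrete. The example below, which assumes scikit-learn and an invented three-passage corpus, ranks statute excerpts against a question and quotes the best match in the prompt; a real deployment would rely on verified legal sources and far larger indexes.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy corpus of statute excerpts (illustrative wording, not real provisions)
passages = [
    "A landlord must give the tenant written notice before seeking eviction.",
    "Wages must be paid within seven days of the end of the wage period.",
    "A consumer may file a complaint about a defective product or service.",
]
question = "My employer has not paid my wages for two months. What can I do?"

# Rank passages by similarity to the question and keep the best match
vectoriser = TfidfVectorizer()
matrix = vectoriser.fit_transform(passages + [question])
question_vec, passage_vecs = matrix[len(passages)], matrix[: len(passages)]
scores = cosine_similarity(question_vec, passage_vecs).ravel()
best_passage = passages[scores.argmax()]

# Quoting the retrieved passage anchors the model's answer to a verifiable
# source instead of free-floating text
prompt = (
    f"Context: {best_passage}\n"
    f"Question: {question}\n"
    "Answer in plain language, citing the context where relevant."
)
print(prompt)
```

The weakness the article notes is visible even here: the retrieved passage is only as good as the index behind it, and a passage quoted out of context can mislead as easily as it informs.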
Another advance is the creation of public datasets drawn from the kinds of legal questions people post on everyday online forums. Made freely available, these allow different groups to improve systems together and reduce dependence on proprietary “black box” models controlled by a few firms. For policymakers, the principle matters as much as the technology: open, transparent data is the foundation for trustworthy innovation.
Getting the Rules Right
Harnessing AI’s potential requires deliberate action. Rigorous testing against human benchmarks must become the norm, and error rates that endanger users should be treated as disqualifying. Governments should encourage controlled pilots – sandbox trials where tools are tested in limited settings under close supervision – so innovation can advance without putting citizens at risk. The legal profession too must build AI literacy. Law schools and bar councils should make it part of their curricula, preparing the next generation of lawyers for a digital era.
Data quality and ethics must sit at the core. If models rely only on English or urban-centric sources, they will exclude millions. Building multilingual, culturally diverse legal datasets is not just a technical task but a matter of equity. Privacy must also be safeguarded. Many training datasets draw on public forums where people share sensitive disputes. Strict rules on anonymisation, consent and ethical handling are essential if citizens are to accept AI as a partner in justice.
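To illustrate what even a first pass at anonymisation involves, the sketch below masks a few common identifier patterns before a forum post could enter a training set. The patterns are simplified assumptions; genuine pipelines need named-entity recognition, manual review and documented consent, not regex rules alone.

```python
import re

def mask_basic_identifiers(text: str) -> str:
    """Redact a few obvious identifier patterns (illustrative, not exhaustive)."""
    text = re.sub(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b", "[EMAIL]", text)     # email addresses
    text = re.sub(r"\b(?:\+91[- ]?)?\d{10}\b", "[PHONE]", text)        # 10-digit phone numbers
    text = re.sub(r"\b\d{4}\s?\d{4}\s?\d{4}\b", "[ID NUMBER]", text)   # 12-digit ID-style numbers
    return text

post = ("My landlord keeps calling me on 9876543210 and emailing rent.demand@example.com "
        "threatening eviction. My ID is 1234 5678 9012. What are my rights?")
print(mask_basic_identifiers(post))
```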
Institutions Hold the Key
Sceptics may be right that AI cannot fix the structural problems of India’s courts. Backlogs, procedural delays and lack of resources demand institutional reform, not just digital tools. But dismissing AI would be a mistake. Properly designed and carefully governed, it can act as a force multiplier, making the law more accessible. The deeper question is whether institutions are ready. Who should regulate these systems – technology regulators, bar councils or the judiciary? Will governments invest in the multilingual data needed to make them relevant? And will liability rules evolve so accountability is not outsourced to algorithms?
AI in law and courts is not about futuristic robots dispensing justice. It is about whether a citizen facing eviction, harassment, or denial of wages can get prompt, reliable guidance at the tap of a phone. It is about narrowing the distance between people and the law, and reinforcing the principle that justice must be accessible to all. Whether that promise is realised will depend less on technology than on the institutions that govern it.
India has a particular reason to embrace these innovations. Its courts are overburdened, and professional help is largely inaccessible. Used as a first line of support, AI could empower citizens while easing the strain on formal institutions.
The lesson extends beyond India. From the United States to Africa, justice systems are struggling with similar backlogs and barriers to access. Legal AI could play a role akin to that of mobile phones, widening access in places where traditional infrastructure is weak. With its scale, diversity and digital public infrastructure, India is well placed to lead in building systems that are robust, multilingual and rights-based.
The discussion in this article is based on the authors’ research published in Artificial Intelligence and Law (2025). Views are personal.