
Generative AI in legal research: Opportunities and risks

By understanding the risks and benefits of generative AI, you can minimize unintended consequences that have troubled legal professionals.

Lately, there have been lots of discussions about the risks and benefits of lawyers using generative AI (GenAI), particularly for legal research.

Many legal tech vendors are touting the legal research capabilities of their GenAI products. It’s even claimed that GenAI, which is a subset of Artificial Intelligence (AI) characterized by the ability to generate new content, will significantly transform the landscape of legal research. 

This article will examine AI and GenAI, explain their workings in layperson’s terms, and examine the risks and benefits of employing GenAI for legal research.

Additionally, I will draw on my experience as a tech lawyer to propose five practical strategies you can use to mitigate the risks associated with using generative AI in legal work.

Background

We have all heard horror stories about the use of GenAI by lawyers.

There was the lawyer in New York who cited non-existent cases provided by the well-known GenAI platform, ChatGPT.

Then there was the Colorado lawyer whose license was suspended after he cited fictitious caselaw that he found with ChatGPT to draft a motion.

More recently, Michael Cohen was implicated by a judge in similarly using non-existent case citations. 

There are also judges entering orders requiring that the use of GenAI or even AI be revealed. Many of these orders also require verification that any cases cited by the tool are real and stand for the proposition cited. 

These events are bound to make lawyers and legal professionals skittish about using GenAI, particularly for legal research.

But the benefits of GenAI are real. Using GenAI legitimately for legal research depends on understanding its risks and benefits, which, by the way, is our ethical duty. By understanding them, you can reduce the risks of GenAI, if not eliminate them altogether.

Some definitions

To understand the risks of generative AI in the legal field and how to mitigate them, we first need to clarify our terminology with some definitions. 

Artificial intelligence

AI, in its broadest sense, refers to machines or software that can perform tasks that emulate those that typically are thought to require human intelligence.

These tasks include doing things that appear to require reasoning, learning from experience, understanding language, and problem-solving.

In the legal context, computers and software frequently employ AI in the form of machine learning to facilitate automation.  

Generative AI

GenAI is a subset of AI. It can create new and original content, ranging from textual material to images and music, based on the data it has been trained on.

As the name implies, it generates new content instead of just regurgitating existing materials, such as those revealed by a Google search.

It is conversational and simulates talking to a colleague or associate. There are many types of GenAI models that can produce content based on different inquiries or prompts. 

Large language models

LLMs are AI models trained on very large amounts of text. They are the engines behind text-based GenAI tools.

In the legal profession, generative AI models need to be trained on a large number of legal documents, case law, statutes, and scholarly articles, which enables them to assist in legal research.

LLMs can recognize, predict, translate, summarize, and generate language, including software code. 

ChatGPT

ChatGPT, created by OpenAI, is perhaps the best-known GenAI model. It is not the only model, however.

Importantly, ChatGPT is what this article calls an open model, meaning it was trained on publicly available data as of a specific date and draws on that training for its responses.

It may also use data entered into it to inform responses to future inquiries.

Other GenAI models use closed or defined data sets for responses. For various reasons, which are discussed below, these closed systems are better for legal uses. 

How does it work?

To understand the risks and benefits of GenAI, you first need some knowledge of how GenAI works.

While you don’t necessarily need technical computer science knowledge, you do need a basic conceptual understanding of what the models are doing when they generate answers to your inquiries. 

Very simplistically, the models work by predicting content based on patterns found throughout the large data sets upon which they are trained. It’s similar in concept to how your smartphone can predict the next word you may want to use in a text.

It’s essential to understand that the models are not necessarily trained to be accurate in the sense humans use the word.

The models are not trained to give the “right” answer, only the most predictable or useful answer based on the data. They are trying to predict what’s needed or should come next without any sense of whether what they are providing is true or false.
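For readers who like to see an idea in action, here is a minimal sketch of next-word prediction in Python. It is a toy and not how ChatGPT or any vendor product is actually built: the training text is invented, and the function names `train_bigrams` and `predict_next` are hypothetical. It simply counts which word most often follows another and returns that word, with no notion of whether the result is true.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Toy training text, invented purely for illustration.
corpus = (
    "the court granted the motion . "
    "the court denied the motion . "
    "the court granted the appeal . "
)

model = train_bigrams(corpus)
print(predict_next(model, "court"))
```

Run as written, this prints `granted`, simply because "granted" follows "court" more often than "denied" does in the toy text. The program has no idea what any court actually did, which is the point of the analogy.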

For open AI models like ChatGPT, the data used is drawn largely from what is publicly available on the internet, which can lead to accuracy problems. Not everything you see on the internet is true, as we all know.

The closed GenAI models, such as those offered by many legal tech vendors, are slightly different. These models use a closed set of data such as relevant legal precedents, case law summaries, regulations and statutes, and other existing materials.

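Grounding responses in this more limited data set is often done with a technique that goes by the nerdy-sounding term “retrieval-augmented generation.” The system first retrieves the most relevant documents from the closed collection and then has the model generate its answer from them. Like the open systems, the process used in these systems is complex in its underlying technology.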

However, these systems can still create a simple query-and-response interface for lawyers, making them accessible tools for legal research. 
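The retrieval idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor's implementation: the three "documents" are invented placeholders, the helper names `tokenize`, `retrieve`, and `build_prompt` are hypothetical, and the crude word-overlap ranking stands in for the far more sophisticated search that real products use. The sketch stops at assembling the prompt and does not call a model.

```python
def tokenize(text):
    """Lowercase a string, strip basic punctuation, and return its set of words."""
    for mark in ",.?":
        text = text.replace(mark, " ")
    return set(text.lower().split())


def retrieve(question, documents, top_k=2):
    """Rank documents by how many words they share with the question."""
    q_words = tokenize(question)
    scored = []
    for title, body in documents.items():
        overlap = len(q_words & tokenize(body))
        if overlap > 0:
            scored.append((overlap, title))
    # Highest overlap first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_k]]


def build_prompt(question, documents, top_k=2):
    """Assemble the text that would be sent to the model, limited to retrieved sources."""
    sources = retrieve(question, documents, top_k)
    context = "\n".join(f"[{title}] {documents[title]}" for title in sources)
    return (
        "Answer using only the sources below and cite them by title.\n"
        f"{context}\n"
        f"Question: {question}"
    )


# Invented placeholder documents standing in for a vendor's closed collection.
library = {
    "Statute A": "A notice of appeal must be filed within thirty days of the judgment.",
    "Regulation B": "Filing fees may be waived for parties who show financial hardship.",
    "Case summary C": "The court held that a late notice of appeal deprives the court of jurisdiction.",
}

question = "What is the deadline to file a notice of appeal?"
print(build_prompt(question, library))
```

Here the prompt includes Statute A and Case summary C and leaves out Regulation B, which shares no words with the question. Because the model is told to answer only from retrieved material, any citation it gives is far more likely to point to something that exists.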

The risks

Because of how GenAI models work, some inherent risks exist in using them where accuracy is essential, such as legal research. These risks are as follows: 

Hallucinations

Hallucinations occur when a response includes content, references, or citations that simply don’t exist.

Hallucinations are a particular problem with open systems since the data being used is unlimited. Again, GenAI models are trained to give the most plausible or predictable result, not the most accurate.

So, it might think the predictable response requires a citation even if it doesn’t exist. The closed systems offered by legal tech vendors reduce the risks of hallucination.

The data in these systems, upon which the responses are based, includes only true references, real citations, and trusted informational sources.

Hence, when a citation is provided, there is a much greater chance that the citation is to a real case, statute, or regulation.  

Accuracy

All GenAI models, open or closed, carry the risk of inaccuracy. In other words, the model may cite a real case, but the proposition for which it is cited may be incorrect.

Of course, existing research tools like headnotes or case summaries present the same risks. We have all read a summary or headnote of a case only to find, when we read the opinion itself, that the case really stands for a somewhat different proposition.

Part of the problem is that different courts and jurisdictions use different terms for the same concept, and models are not sophisticated enough, or have not been sufficiently trained, to pick up nuance, which leads to the next risk.

Difficulty with nuance

Present GenAI models all struggle with understanding nuance or undertaking complex reasoning.

It’s not to say the responses are simplistic; they often appear quite sophisticated, but the underlying rationale may not support what’s being said. 

Overreliance

Despite accuracy and reasoning shortcomings, because the responses often sound so good, it’s easy to become overly reliant on them (this may have been what happened to the sanctioned New York and Colorado lawyers).

As we have seen, overreliance is a problem for lawyers since accuracy and understanding the reasoning behind legal concepts are critical. Overreliance on GenAI models can also lead to an erosion in research skills and critical thinking abilities.

Privacy and confidentiality

Open AI systems like ChatGPT do not necessarily protect the privacy or confidentiality of either the queries made of them or the responses that are generated.

Therefore, if you reveal client confidences or other private information, that information is not protected from future public disclosure or use.

This is unsurprising; no one should think their Google searches and search results are privileged and confidential.

On the other hand, many legal tech vendors offering GenAI models assert that the data and responses are protected, will not be revealed, and will not be used in future training. 

Bias and inappropriate content

All GenAI models have biases that can sometimes appear unexpectedly in the responses. All models present risks of inappropriate content since they merely predict what should come next in a response. The model often doesn’t know what it says is wrong or inappropriate.  

The benefits

Of course, GenAI offers significant benefits against which the risks must be weighed.  

The most obvious benefit of using GenAI for things like legal research is simple: GenAI can save a vast amount of time and make you more efficient.

The models can process and analyze large volumes of legal texts much faster than humans. GenAI thus allows lawyers and legal professionals to better spend their time on things for which they are uniquely qualified and which GenAI can’t do (see my recent article from a CES Keynote discussing these efficiencies).

GenAI can do such things as confirm background information. It can also offer up ideas and approaches you had not considered. For example, you may have thought of three ways to analyze a legal issue; GenAI may offer a fourth or fifth.

GenAI can also identify patterns and connections that human researchers might miss. It can analyze vast amounts of data quickly, while human analysis to do the same work would often take too long and cost too much to be practical.

Finally, generative AI broadens access to legal information, making it more accessible to more legal professionals, not just those in large law firms. It levels the playing field.

Can you use ChatGPT for legal research?

Whether to use ChatGPT, or another generative AI platform, for legal research is hotly debated at present.

Before we consider how to mitigate the risks of using generative AI to do legal research, it’s important to remember that legal research is not just one thing.

There are, in fact, several tasks we do as lawyers and legal professionals that fall under the rubric of legal research. The distinction between these tasks is important because GenAI does not do all of them equally well.

Legal research sometimes involves getting background on some element of the law. What is the rule against perpetuities, for example?

Sometimes, it involves summarizing statutes and regulations. Sometimes, it involves getting citations to support or refute a point. Sometimes, and at the most sophisticated end, it involves crafting arguments from case holdings and nuances.  

In thinking about using GenAI, you need to consider the risks on a sliding scale. So, in addition to the risk level of the type of GenAI system you are using (open vs closed systems), you also need to consider the risks associated with the kind of legal research for which you will be using GenAI.

If you want GenAI to summarize regulations or a statute or give you general background about a legal concept, the risk is much lower than if you want it to fashion your legal argument from case law. The latter carries a fair amount of risk of either hallucinations, inaccuracies, or both. 

Five strategies to mitigate risk

Given the benefits, then, are there ways to minimize the risks? Here are five things you can do to reduce and even eliminate much of the risk.

Verify

In all cases, verify what the model tells you. The horror stories reported in the media would not have occurred if the lawyers had looked up and read the citations GenAI provided.

The failure in every case was not with the model. The problem was that the lawyers didn’t satisfy their ethical duty to understand the risks and benefits of GenAI and read the cited cases.

When a bright first-year associate provides you with a draft legal memorandum for the first time, you don’t just copy and paste it into your brief. You read the cases (or should). The same is true with GenAI. Trust but verify! 
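If you want a mechanical reminder to do that reading, the following Python sketch flags every citation in a draft that a person has not yet confirmed. It rests on several assumptions: the citation pattern is deliberately narrow and will miss many real formats, the citations in the example are made-up placeholders, and the `verified` set is a hypothetical list you would maintain by hand after actually pulling and reading each case. The code tracks whether the reading happened; it does not do the reading for you.

```python
import re

# Matches a few reporter-style citations such as "123 F.3d 456".
# Deliberately narrow: an illustration, not a complete citation parser.
CITATION_PATTERN = re.compile(r"\b\d{1,4} (?:U\.S\.|F\.2d|F\.3d|F\.4th) \d{1,5}\b")


def find_citations(text):
    """Return every citation-like string found in the text, in order."""
    return [match.group(0) for match in CITATION_PATTERN.finditer(text)]


def unverified_citations(draft, verified):
    """Return citations in the draft that no one has yet read and confirmed."""
    return [cite for cite in find_citations(draft) if cite not in verified]


# Placeholder citations, invented for illustration only.
draft = "Plaintiff relies on 111 F.3d 222 and 999 U.S. 999 for this point."

# Hypothetical hand-maintained set of citations a person has pulled and read.
verified = {"111 F.3d 222"}

print(unverified_citations(draft, verified))
```

Run as written, this prints `['999 U.S. 999']`, the one citation nobody has checked. A citation leaves that list only when a person confirms the case is real and stands for the proposition cited.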

Be careful what you ask

Be careful what you ask GenAI. When you ask it something, keep in mind how it works and what it does well.

For example, GenAI is good at things like confirming something you think is correct or checking legal reasoning that you believe to be sound. It’s also good at providing general information and summarizing lengthy materials.

And while GenAI can provide you with a great start on research, it can miss nuances. It also helps to think through the problems and issues before starting your inquiry.

The more direction you give the model, the better and more accurate the response. 

Protect confidences

Be careful what you put into your inquiries, especially with public or open models. Read the terms of service of the model provider to know what it can do with the materials and responses provided.
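One way to build the habit is to scrub obvious identifiers from a prompt before it leaves your machine. The Python sketch below is a rough illustration only: the patterns catch just a few common formats (email addresses, US-style phone and Social Security numbers), the `redact` helper and the client name are hypothetical, and passing text through it does not make an inquiry confidential. Facts alone can identify a client, so judgment is still required.

```python
import re

# A few simple patterns; real identifying information takes many more forms.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def redact(text, client_names=()):
    """Replace matched identifiers and listed client names with bracketed labels."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    for name in client_names:
        text = re.sub(re.escape(name), "[CLIENT]", text, flags=re.IGNORECASE)
    return text


# Invented example prompt with a fictional client.
prompt = "Jane Roe (jane.roe@example.com, 555-123-4567) asks whether her lease is enforceable."
print(redact(prompt, client_names=["Jane Roe"]))
```

Run as written, this prints `[CLIENT] ([EMAIL], [PHONE]) asks whether her lease is enforceable.` The legal question survives, while the details that point to a particular person do not.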

Investigate

Investigate and vet legal tech vendors offering GenAI tools to be sure you understand the terms of the service.

How will the vendor use the data provided? What privacy protections are being offered? What happens if something goes wrong? (For a good starting point on what to ask about, consider this ABA article on how to vet a cloud computing service. The inquiries are similar.)

Stay involved

Finally, it bears repeating: don’t accept what GenAI tells you as gospel. It may be wrong. Never rely on it for pure legal advice. Use GenAI as a supplementary tool, just like you would a bright but inexperienced assistant.

GenAI is not a replacement for human judgment and expertise. 

Conclusion

If you follow these rules, you can receive the benefits of generative AI in your legal work without subjecting yourself to undue risk. It boils down to common sense. Understand what you are working with and exercise the same level of care you would bring to any client matter.
