BFV Perspectives, Corporate Matters, Georgia Business Disputes | May 31, 2024

Eleventh Circuit Judge Experiments With ChatGPT

Judge Kevin Newsom is one of my favorite judges on the Eleventh Circuit. He makes great writing look effortless. It is always accessible, jargon-free, and enjoyable to read. I might not always agree with him, but when he goes out of his way to write a concurrence, I know it will be interesting. And he will thoughtfully raise a new perspective on a legal issue.

Judge Newsom’s recent concurrence in Snell v. United Specialty Insurance Co. did not disappoint. He describes his personal experience with ChatGPT. He also offers thoughts on how AI-powered large language models (LLMs) can aid in interpreting contracts, statutes, and other legal texts.

The case itself had nothing to do with AI or even technology. It was a simple insurance case. It addressed whether an insurance policy that covered claims arising from “landscaping” work covered a landscaper sued for negligently installing a ground-level trampoline. In other words, does “landscaping” include installing a ground-level trampoline in a customer’s backyard?

The Court answered “no,” although its decision rested on a quirk of Alabama insurance law that incorporates parts of the insurance application into the policy. The Court therefore expressly did not resolve the original question.

Judge Newsom wrote a concurrence “simply to pull back the curtain on the process by which [he] thought through one of the issues in this case” and “to make a modest proposal regarding courts’ interpretations of the words and phrases used in legal instruments.”

Providing a glimpse behind the scenes of how he approaches and decides cases, Judge Newsom details his thought process, his hesitations about using generative AI, and his experiments with it. He also suggests how judges might use generative AI in future cases. His proposal: that judges who interpret legal texts based on their “ordinary meaning” should consider whether and how LLMs might inform the analysis.

Judge Newsom explained that, before the panel agreed to take the “off-ramp” of the quirk in Alabama law that allowed it to avoid deciding the question, he “spent hours and hours (and hours) laboring over the question whether Snell’s trampoline-installation project qualified as ‘landscaping’ as that term is ordinarily understood.”

While analyzing that issue, Judge Newsom had a “disconcerting” thought about whether ChatGPT could shed light on the ordinary meaning of “landscaping.” He initially discarded that thought as “absurd,” but the more he considered it “the less absurd it seemed.”

As the various dictionary definitions did not provide a conclusive answer, out of curiosity, he directed one of his clerks to ask ChatGPT, “What is the ordinary meaning of ‘landscaping’?” ChatGPT’s response:

“Landscaping” refers to the process of altering the visible features of an area of land, typically a yard, garden or outdoor space, for aesthetic or practical purposes.  This can include activities such as planting trees, shrubs, flowers, or grass, as well as installing paths, fences, water features, and other elements to enhance the appearance and functionality of the outdoor space.

This response struck Judge Newsom as “more sensible” than he thought it might and “definitely less nutty” than he feared. Relevant to the analysis, he noted that ChatGPT’s definition included more than “natural” improvements and covered both aesthetic and “functional” objectives. He then asked ChatGPT the ultimate question: “Is installing an in-ground trampoline ‘landscaping?’” ChatGPT responded in the affirmative:

Yes, installing an in-ground trampoline can be considered a part of landscaping.  Landscaping involves altering the visible features of an outdoor area for aesthetic or practical purposes, and adding an in-ground trampoline would modify the appearance and function of the space. It’s a deliberate change to the outdoor environment, often aimed at enhancing the overall landscape and usability of the area.

Judge Newsom posed the same questions to Google’s Bard, which “was more equivocal than ChatGPT” but also indicated that the trampoline-related work that the appellant performed, such as excavating a pit, constructing a retaining wall, installing the mat, and adding a decorative wooden cap, might be landscaping. (Truly giving the reader a unique behind-the-scenes look at his thought process as a judge, Judge Newsom includes an appendix of all his queries and the AI models’ responses.)

While he never reached a definitive conclusion due to the panel’s “off-ramp” approach, Judge Newsom described his process of experimenting with generative AI as valuable because it led him to consider that LLMs might be useful when interpreting legal texts. As Judge Newsom put it, his view of that idea changed from “positively ludicrous” to “a pretty firm ‘maybe.’”

Judge Newsom listed several benefits that using LLMs could provide—as one of several tools—to inform the ordinary meaning of words and phrases in contracts, statutes, and other legal instruments.

Most importantly, since LLMs draw on a “seemingly bottomless reservoir of linguistic data” (“from Hemingway novels and Ph.D. dissertations to gossip rags and comment threads”), they can provide useful information about how words are used in common parlance.

So, for example, LLMs can provide useful information about the ordinary meaning of “landscaping” because “the internet data on which they train contain so many uses of that term, from so many different sources—e.g., professional webpages, DIY sites, news stories, advertisements, government records, blog posts, and general online chatter about the topic.”

LLMs also understand context. For example, they can distinguish when the word “bat” refers to the animal versus a baseball bat. This makes them, in Judge Newsom’s words, “high-octane language-prediction machines capable of probabilistically mapping, among other things, how ordinary people use words and phrases in context.”

Judge Newsom proceeded to discuss several other benefits before turning to some words of caution. LLMs are not perfect. They do not capture what Judge Newsom calls “pure offline” usages of words, meaning usages that neither originated online nor originated offline and were later uploaded. As a result, LLMs may not fully account for how underrepresented populations use words and phrases.

LLMs also may “hallucinate,” meaning they may give incorrect answers. There is also the risk that lawyers or judges may manipulate queries to lead to a preferred response. And, though difficult and unlikely, prospective litigants may try to “corrupt the inputs” by adding data that would lead to preferred interpretations. He raises several other questions and issues to consider, and the whole concurrence is worth a read.

Overall, Judge Newsom concludes that, while “[p]lenty of questions remain” and of course judges should not plug a legal question into ChatGPT and then blindly apply that answer to the case at hand, it is worth considering LLMs alongside the traditional interpretative aids to determine the ordinary meaning of legal texts.

In my practice, I have not thought about using generative AI in this way. But the next time a case involves an issue of statutory or contractual interpretation, I will definitely check out what ChatGPT and other AI programs have to say. I know at least one Eleventh Circuit judge might find the results persuasive.

Jeremy L. Kahn

Jeremy L. Kahn is a thoughtful and strategic litigator, with a creative approach. He enjoys crafting strategies to resolve difficult and legally challenging problems, always seeking to achieve his clients’ desired results in an efficient manner.