As generative AI tools become embedded in everyday life, copyright owners are increasingly asking a critical question: what happens when creative works are copied to train artificial intelligence without consent, compensation, or attribution?
That question is now squarely before the courts in a series of lawsuits brought by authors and other creators against OpenAI and similar companies. One of the most closely watched is In re: OpenAI, Inc. Copyright Infringement Litigation, a multidistrict litigation consolidating 12 cases brought by news media organizations, authors, and others against OpenAI (the maker of ChatGPT) and Microsoft (a key investor) for copyright infringement. This case may set the stage for how copyright law will apply to AI training practices.
For authors, publishers, and other content owners, these cases offer important insight into how courts may evaluate infringement claims involving AI and what enforcement options may be available going forward.
The Authors Guild Lawsuit
In 2023, the Authors Guild and several well-known authors, including George R.R. Martin, John Grisham, and Jodi Picoult, filed a suit against OpenAI alleging that its large language models were trained using entire copyrighted works without authorization or proper licensing. As mentioned, several related cases have been consolidated in hopes of streamlining judicial consideration of similar copyright issues and AI’s impact on copyright claims.
Although AI is a new frontier for copyright-based claims, the complaint advances familiar copyright allegations:
- Direct infringement through unauthorized copying
Plaintiffs allege that OpenAI copied complete books to ingest into its training datasets.
- Derivative works and loss of exclusivity
The authors allege that AI outputs can generate summaries, character descriptions, plot structures, and stylistic emulations that encroach on their rights. They allege harms such as the lost opportunity to license their own works and a diminished standing in the market as a result of derivative works.
- Commercial exploitation without compensation
Plaintiffs also argue that their works were fed into the system without any compensation to the authors.
The Authors Guild has indicated it is not opposed to AI itself. Indeed, the authors recognize the value of training AI models on large volumes of text, such as novels, to improve their functionality. However, these authors have brought claims to enforce the basic premises of copyright law: creators should retain meaningful control over their works and should be compensated fairly for their use.
Why the Case Matters for Copyright Owners Beyond Book Authors
Although the plaintiffs are fiction writers, the implications extend well beyond the publishing industry. The same AI training practices alleged in the Authors Guild case are used across other industries, including journalism, digital media, photography, and academic publishing.
For copyright owners, the case highlights several litigation-relevant issues:
- Training data is copying
Courts will be asked to decide whether ingesting copyrighted works into AI models constitutes actionable copying even if the end user never sees the original text verbatim.
- Scale does not excuse infringement
The fact that AI systems ingest millions of works simultaneously does not eliminate the need for authorization. If anything, it may amplify potential damages.
- Output similarity is a fact-intensive inquiry
Whether AI outputs are “substantially similar” to protected works will likely require expert analysis and discovery—opening the door to traditional infringement frameworks in a new technological context.
The Fair Use Defense
AI developers have leaned heavily on fair use, arguing that training models on copyrighted material may be sufficiently “transformative” or “innovative” such that it should be a permissible use.
While some courts have been receptive to fair-use arguments in AI cases, others have:
- Allowed infringement claims to survive early dismissal
- Distinguished between lawfully obtained works (e.g., licensed works) and pirated content
- Emphasized the importance of market harm and substitution
Fair use remains a fact-specific defense, meaning many of these cases may turn on discovery into how training data was sourced, stored, and used.
Lessons from AI Copyright Cases
Although the Authors Guild case is still in the early stages of discovery following its consolidation into multidistrict litigation, there are certain key takeaways:
- Given the fact-intensive nature of such disputes, resolution by motion (including motions to dismiss and motions for summary judgment) is unlikely
- Early investigation into data sourcing that the AI models rely upon can be critical to potential claims
- Licensing negotiations and litigation are not mutually exclusive strategies and can occur simultaneously
- Although litigation is ongoing, changes have already been made: while ChatGPT once provided direct quotes or excerpts from copyrighted works, it no longer does so
- This is a new frontier, and courts are still shaping the rules; copyright holders have an opportunity to help develop this area of law if they are willing to weather the unpredictable and lengthy nature of litigation
As generative AI continues to evolve, so too will the legal frameworks governing its use of creative works. For copyright owners, the current moment represents both uncertainty and opportunity, particularly for those prepared to enforce their rights through strategic litigation.
Law is driven by compelling storytelling. Ashley Bowcott grew up as a writer in a family of legal professionals.