Dear Doc:
I hear that using copyrighted materials to train artificial intelligence systems is legal. Can that be true?
Signed,
Every Artist, Ever
Is Using Copyrighted Materials to Train AI Legal?
Dear EAE:
Training artificial intelligence (AI) systems, particularly large language models (LLMs), relies on vast amounts of information that is broken down and statistically analyzed. Whether that information is text (books, web pages, or computer code), audio (music, recorded speech, or birdsong), or images (digitized photographs, artworks, or surveillance camera feeds), it is broken down into a sequence of “tokens,” and the statistical patterns in those sequences become the basis for the AI’s responses to questions (called “prompts”).
Some of the material used in this process is subject to copyright, and copyright owners have brought various lawsuits against the companies developing the AI systems, claiming that the process infringes the rights of authors and copyright holders.
One response of the AI companies is that, although the training process uses a computer and makes a form of copy of the materials used in training, it is like telling a child to learn how to write well by reading many well-written books. They also claim that their use of the copyrighted works is “transformative” because the result is a statistical database (the “AI Model”) that contains information about the training data but is not a copy of that data. Finally, the AI builders have responded to the lawsuits by claiming that their use of copyrighted materials is “fair use” under United States copyright law and is therefore not an infringement at all.
What Do the Courts and the Copyright Office Say About AI Training?
On May 9, 2025, the United States Copyright Office released a 108-page report on whether the use of copyrighted materials to train AI systems is defensible as a fair use. The view of the Copyright Office is that such uses cannot always be defended as fair use, and that the answer depends on the facts of each case. One recent court decision also rejected a fair use defense. In Thomson Reuters v. ROSS Intelligence, 765 F. Supp. 3d 382, the court held that fair use did not protect ROSS’s use of copyrighted material to train its AI; that case is on appeal to the Third Circuit.
Two more recent court decisions granting summary judgment have sided with the AI companies and against the authors, declaring that, in the view of at least two federal judges, the use of copyrighted materials in AI training is fair use. In Bartz v. Anthropic PBC, Judge Alsup ruled, “In short, the purpose and character of using copyrighted works to train LLMs to generate new text was quintessentially transformative.” In Kadrey v. Meta Platforms, Inc., Judge Chhabria ruled that, because the authors offered no meaningful evidence of market dilution, Meta’s copying of their works to train its AI was fair use.
Both cases will continue, because each involves other issues that the courts have not yet ruled on. But these decisions mark a significant victory for AI companies on the question of what materials they may use in training. Still, many other high-profile cases making similar claims are moving through the courts.
Are you training an AI model? Are you artificially intelligent? The lawyers at LW&H are genuinely intelligent about these and other intellectual property legal issues. Give them a shout.
Until next month,
The “Doc”
— Lawrence A. Husick, Esq.



