A federal court has ruled that Anthropic’s use of copyrighted books to train its AI models qualifies as fair use, marking a significant legal milestone in the debate over generative AI and intellectual property. However, the court also found that the company must stand trial over its past use of pirated book copies, leaving open the possibility of billions of dollars in damages.
Fair Use Ruling Marks a First in Generative AI
In a decision released Monday evening, Judge William Alsup of the U.S. District Court for the Northern District of California ruled that Anthropic’s training of large language models (LLMs) on copyrighted books constitutes fair use. The judge emphasized that the training process was “transformative,” a key factor in the fair use analysis, which determines whether copyrighted material may legally be used without permission.
“This is the first major ruling in a generative AI copyright case to address fair use in detail,” said Chris Mammen, a managing partner at Womble Bond Dickinson, who specializes in intellectual property law. He noted that the court explicitly rejected the argument that human learning is meaningfully different from machine learning in this context. The judge’s opinion stated that LLM training was “among the most transformative many of us will see in our lifetimes.”
The lawsuit, filed in August 2024 by a group of authors, alleged that Anthropic used their works without authorization to build its AI systems. The company moved for summary judgment in February, asking the court to resolve the fair use issue without a full trial. With this ruling, Anthropic becomes the first AI firm to receive a favorable summary judgment on fair use in the current wave of copyright litigation.
Court Allows Piracy Claims to Proceed to Trial
Despite the fair use ruling, Judge Alsup concluded that the company must face trial over allegations that it downloaded millions of pirated books during its data collection process. According to the court’s findings, Anthropic initially built a central library of over 7 million unauthorized book copies, sourced from pirate databases such as Books3, Library Genesis (LibGen), and the Pirate Library Mirror (PiLiMi).
The summary judgment notes that cofounder Ben Mann personally downloaded the Books3 dataset in winter 2021, followed by at least five million books from LibGen in June 2021 and two million more from PiLiMi in July 2022. While Anthropic later shifted to using legally purchased copies for training, the court found that the earlier acquisition and retention of pirated materials could not be justified under fair use.
“The downloaded pirated copies used to build a central library were not justified by a fair use,” Alsup wrote. “Every factor points against fair use.” The plaintiffs argue that Anthropic should have paid for these copies, even if they were not ultimately used in training. The court agreed, setting the stage for a trial on damages.
Industry Impact and What Comes Next
This split ruling has implications for dozens of similar copyright cases currently moving through the U.S. legal system. While it sets a precedent in favor of AI developers claiming fair use for training data, it also reinforces the limits of that defense when the underlying copies are obtained unlawfully. Experts believe the decision could influence upcoming rulings, particularly in cases where piracy is not at issue.
Prior to this case, only one other AI copyright case had reached summary judgment. In Thomson Reuters v. Ross, a federal court ruled against the AI company Ross Intelligence, determining that its use of Westlaw materials was not protected by fair use. That case is now under appeal. In contrast, Judge Alsup’s ruling in Bartz v. Anthropic is already being cited by fair use advocates as a significant win for innovation.
“Judge Alsup’s ruling should be a model for other courts assessing whether Gen AI training on copyrighted material is fair use,” said Adam Eisgrau, senior director of AI, Creativity, and Copyright Policy at the Chamber of Progress. “He found it is clearly transformative and affirmed that the purpose of copyright is to promote competition and creativity.”
Anthropic is not alone in facing piracy claims. Other lawsuits, such as Kadrey v. Meta, also allege that AI firms downloaded copyrighted books from pirate sources like LibGen. The statutory minimum penalty for copyright infringement is $750 per work, and a finding of willful infringement can raise the ceiling to $150,000 per work. With Anthropic’s library consisting of at least 7 million books, the company could face billions in damages. No trial date has been set.