Anthropic, the company behind the AI chatbot Claude, has reached a proposed settlement of $1.5 billion with a group of book authors who claimed the company used pirated copies of their books to train Claude.
The lawsuit was brought by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson. They alleged that millions of copyrighted books — many downloaded without permission from pirate sites like LibGen and Pirate Library Mirror — were used by Anthropic to build and train its AI model.
As part of the settlement, Anthropic will:
- Destroy the pirated books in its possession.
- Pay authors roughly $3,000 per eligible work, covering approximately 500,000 books.
- Continue using only properly licensed or legally obtained training data going forward.
## Legal Status & Challenges
While this is a landmark agreement, potentially the largest copyright settlement in U.S. history involving AI training data, U.S. District Judge William Alsup has paused final approval. The court is seeking more clarity and transparency around the claims process and how authors will be notified.
## Why It Matters for AVGC & AI/Creative Industries
| Key Insight | Implications |
|---|---|
| Copyright risk acknowledged | AI development can no longer disregard how training data is sourced. Studios and AI creators will need clean, licensed content. |
| Author compensation model | Establishes a financial precedent for compensating creators when their work is used without permission. |
| Fair use is not a catch-all | Even if training usage is "transformative," the method of acquiring content (pirated vs. legal) matters deeply. |
| Legal precedent | Could shape future settlements and lawsuits involving other AI companies, such as OpenAI and Meta. |
| Operational changes likely | Expect tighter data-sourcing policies, increased due diligence, and more licensing deals in AI training pipelines. |