Authors file copyright lawsuit against Anthropic for AI model training

The latest in a series of cases concerning copyright and AI looks at the sources Anthropic used to train its Claude large language models.

Three authors have filed a class-action suit against artificial intelligence company Anthropic for copyright infringement. The plaintiffs claim the company used data sets containing pirated versions of their works to train its Claude family of large language models (LLMs).

Plaintiffs Andrea Bartz, Charles Graeber and Kirk Wallace Johnson are journalists and authors of popular fiction and nonfiction. They allege that Anthropic knowingly used data sets that “were comprised of a trove of copyrighted content sourced from pirate websites.”

Copyright violations were a choice, suit alleges

Anthropic could have obtained licenses to use the material it took, the suit alleges, but instead “made the deliberate decision to cut corners and rely on stolen materials to train their models.” The suit states:

“Throughout its existence, Anthropic has cloaked itself in the rhetoric of ‘AI safety’ and ‘responsibility.’ Its actions, however, have made a mockery of its lofty goals.”

The suit, filed in the Northern California District Court, also attempts to frame its allegations in a larger context:

“Goldman Sach’s estimates that generative AI could replace 300 million full-time jobs in the near future. […] Already, writers report losing income from copywriting, journalism, and online content writing, which are important sources of income for book authors.”

The plaintiffs ask the court to rule on whether Anthropic infringed on authors’ copyrights, whether the company’s actions constitute fair use, whether the members of the class action were harmed and are entitled to damages and whether Anthropic acted willfully in its alleged infringement.

The plaintiffs ask for statutory or compensatory damages and a permanent enjoinder against the company for future copyright violations.

*The authors’ suit against Anthropic. Source: Pacer*

Related: Amazon takes minority share in ChatGPT rival Anthropic AI

AI use issues being sorted out

The authors’ suit is part of a wave of complaints that have arisen alongside the AI industry. They noted:

“In the last two years, a thriving licensing market for copyrighted training data has developed. A number of AI companies, including OpenAI, Google, and Meta, have paid hundreds of millions of dollars to obtain licenses to reproduce copyrighted material for LLM training.”

Anthropic competitor OpenAI is facing a class-action copyright infringement suit by the Author’s Guild filed in the Southern District Court of New York in September. Comedian Sarah Silverman and authors Richard Kadrey and Christopher Golden sued OpenAI and Meta in the same California court in July 2023. Those suits also concerned LLM training.

A group of music publishers filed suit against Anthropic in the same court for using lyrics for training Claude in October. On Aug. 15, Anthropic asked that some of the allegations in that suit be dismissed.

Anthropic did not reply immediately to a request from Cointelegraph for comment on the case.

Copyright violations were a choice, suit alleges

AI use issues being sorted out

Related Articles

Responses