US judge refuses OpenAI’s motion to dismiss New York Times copyright infringement claims

Part of a wave of litigation engulfing tech firms over using material to train AI models

A New York district judge has issued an opinion denying ChatGPT creator OpenAI’s motion to dismiss copyright infringement claims brought by the New York Times (NYT) and the Daily News publishers over using their content to train AI models.

On 4 April, however, Judge Sidney Stein did grant some of OpenAI’s requests, including the dismissal of several of the claims brought under the US Digital Millennium Copyright Act (DMCA).

The consolidated lawsuits are part of a wave of litigation engulfing tech companies brought by authors, publishers and artists over how their copyrighted material is being used without their permission to train AI models.

The New York Times had accused OpenAI and its backer Microsoft of copying and using millions of NYT copyrighted articles to train automated chatbots that now compete with the NYT as a news source. The other plaintiff group, the Daily News, encompasses eight news organisations, including the New York Daily News and the Chicago Tribune. It asserted similar copyright infringement claims.

OpenAI sought to dismiss the copyright infringement claims arising more than three years before the NYT and the Daily News plaintiffs filed their complaints (that is, before 27 December 2020 for the NYT and 30 April 2021 for the Daily News).

But Stein said that OpenAI had not met its burden of establishing that NYT and the Daily News should have discovered the alleged infringement more than three years before the filing of the complaint.

OpenAI had pointed to a few publicly available articles from 2019 and 2020 to argue that it was “common knowledge” by 2020 that some of its training datasets included the plaintiffs’ works.

The judge said: “OpenAI fails to explain why the articles, even if their existence had been known to plaintiffs at the time of their publishing, are sufficient to put plaintiffs on notice of the particular infringing conduct by defendants that provides the basis for plaintiffs’ claims.”

The defendants’ motions to dismiss the contributory copyright infringement claims also failed.

The court found that the plaintiffs had “plausibly alleged” the existence of third-party end-user infringement and that the defendants knew or had reason to know of that infringement.

The judge pointed to the more than 100 pages of examples of allegedly infringing outputs in the NYT’s complaint, and the dozens of examples in the Daily News complaints, as sufficient at the pleading stage. He added that these examples, combined with the plaintiffs’ allegations of “widely publicised” instances of copyright infringement by end users of OpenAI’s products, “give rise to a plausible inference of copyright infringement by third parties”.

The Daily News is also being allowed to pursue its trademark dilution claims.

OpenAI had moved to dismiss the federal and state trademark dilution claims brought by several of the Daily News titles, including the New York Daily News, Chicago Tribune, Mercury News and Denver Post.

The plaintiffs had alleged that OpenAI had diluted the quality of their trademarks by using them “on lower quality and inaccurate writing”. OpenAI moved to dismiss the count, contending that the complaint failed to allege that the diluted trademarks were “famous” under US trademark law.

Stein said fame is the “key ingredient” in a federal trademark dilution claim, and that a famous mark is one that “is widely recognised by the general consuming public of the United States”. He found, however, that the plaintiffs had adequately alleged that the trademarks were famous.

The court therefore denied OpenAI’s motion to dismiss the federal trademark dilution claim in the Daily News action. It also allowed the state trademark dilution claim to continue. 

Dismissed claims

The ruling did not go entirely the plaintiffs’ way, as OpenAI succeeded in getting some of the claims dismissed.

Under the DMCA claims, the plaintiffs had contended that “regurgitations” generated by the defendants’ large language models constituted “distributions” of copies of their work. 

However, Stein ruled that allowing the DMCA claims to survive when the distributed work was not “close to identical” to the original would risk “boundless DMCA liability”.

Accordingly, he ruled that because the outputs were merely excerpts of the plaintiffs’ works and not “copies” of those works, the plaintiffs had “failed to establish that defendants ‘distributed’ ‘copies’ of their works” in violation of the DMCA. Those claims were dismissed.

Responding to a request for comment, an OpenAI spokesperson said: “Our models empower innovation, and are trained on publicly available data and grounded in fair use.” 

The New York Times said in a statement to Reuters: “All of our copyright claims will continue against Microsoft and OpenAI for their widespread theft of millions of The Times’s works, and we look forward to continuing to pursue them.”
