Media Outlets Sue OpenAI for Concealing Copyright Infringement

Three digital media outlets — Raw Story, Alternet and The Intercept — filed lawsuits against OpenAI for copyright infringement on Wednesday, adding to industry pushback against the technology company’s method of training chatbots.

Both lawsuits, one filed by The Intercept and the other filed collectively by Raw Story and Alternet, accuse the artificial intelligence company of using copyrighted works by journalists without proper attribution to train large language models. The suits argue that OpenAI has taken steps to conceal its copyright infringement by removing certain information, such as the byline or headline of the original article.

The digital media companies are seeking damages of at least $2,500 per violation, as well as demanding that OpenAI remove all copyrighted material from data training sets.

“When they populated their training sets with works of journalism, Defendants had a choice: they could train ChatGPT using works of journalism with the copyright management information protected by the DMCA intact, or they could strip it away,” the Raw Story and Alternet suit reads. “Defendants chose the latter, and, in the process, trained ChatGPT not to acknowledge or respect copyright, not to notify ChatGPT users when the responses they received were protected by journalists’ copyrights, and not to provide attribution when using the works of human journalists.”

The Intercept additionally sued Microsoft, alleging that it developed its own chatbot using the same training data, including copyrighted content.

Publishers have recently been faced with a difficult choice: Fight back against OpenAI and the technological encroachment on the media industry, or allow tech companies to use copyrighted materials for a hefty fee.

The New York Times filed an ambitious lawsuit in December 2023 against Microsoft and OpenAI, accusing the tech giants of copyright infringement. The suit argues that the generative AI tools that Microsoft and OpenAI have created rely on large language models (LLMs) “that were built by copying and using millions of The Times’s copyrighted news articles, in-depth investigations, opinion pieces, reviews, how-to guides and more.”

The Times’ lawsuit marks the first blockbuster case from news publishers over generative AI capabilities and how chatbots were trained, as the technology begins to embed itself in the media industry.

Meanwhile, organizations like News Corp., which owns the Wall Street Journal and the New York Post, have partnered with OpenAI as News Corp. intends to be a “core content provider for generative AI companies who need the highest quality timely content to ensure the relevance of their products,” according to CEO Robert Thomson.

Thomson also torched the New York Times’ lawsuit against Microsoft and OpenAI, saying, “Courtship is preferable to courtrooms.”

Pamela Chelin contributed to this report.
