Major Daily Newspapers Join Legal Fight Against OpenAI as Battle Lines Are Being Drawn

A coalition of eight daily newspapers has sued OpenAI and Microsoft for copyright infringement, expanding a growing front in the legal battle over the unauthorized use of articles to power artificial intelligence technology.

The lawsuit, filed in New York federal court on Tuesday, is at least the fourth complaint brought against the Sam Altman-led firm over copyright issues associated with training the automated chatbots that have vaulted the company to a multibillion-dollar valuation and spurred rivals to pour troves of cash into competing technology. It argues that thousands of the publishers’ articles were used to train the AI systems that power ChatGPT, Microsoft Copilot and other products that now effectively compete against the newspapers.

The publishers — The New York Daily News, Chicago Tribune, Orlando Sentinel, South Florida Sun Sentinel, San Jose Mercury News, Denver Post, Orange County Register and St. Paul Pioneer Press — are all owned by investment firm Alden Global Capital. They seek unspecified monetary damages, a court order prohibiting further copyright infringement and the destruction of AI systems that contain their articles in training data sets.

Microsoft declined to comment. OpenAI didn’t respond to a request for comment.

In a statement, Frank Pine, the newspapers’ executive editor, said OpenAI and Microsoft’s “misappropriation of news content” is “not fair use.” Whether courts find that the fair use doctrine applies will be a crucial factor in the majority of lawsuits filed against OpenAI challenging the foundation of its business model. The doctrine allows copyrighted works to be used to create new works as long as the new works are sufficiently transformative.

“We’ve spent billions of dollars gathering information and reporting news at our publications, and we can’t allow OpenAI and Microsoft to expand the Big Tech playbook of stealing our work to build their own businesses at our expense,” Pine added. “They pay their engineers and programmers, they pay for servers and processors, they pay for electricity, and they definitely get paid from their astronomical valuations, but they don’t want to pay for the content without which they would have no product at all.”

The complaint argues the chatbots offer entire articles to users, in some cases verbatim from the publishers’ paywalled websites. These responses, it says, go far beyond the snippets of text typically shown with ordinary search results. One example: ChatGPT returned word-for-word the first excerpt in the Chicago Tribune’s 2017 article “What to do with a broken Illinois: Dissolve the Land of Lincoln.” The prompt asked for the “actual text” and summary of the piece.

The publishers present dozens of other instances in which ChatGPT gave users entire excerpts of articles. They argue OpenAI and Microsoft now directly compete against them, depriving them of subscription revenue by offering their articles elsewhere.

Tuesday’s lawsuit buttresses arguments in The New York Times’ lawsuit that users now look to ChatGPT and other AI offerings as replacements for traditional news services. The potential transformation in news consumption has far-reaching implications: What happens to media in a landscape in which readers can bypass direct sources in favor of results generated by AI tools?

In response to the Times, OpenAI said that the newspaper’s lawyers “intentionally manipulated” prompts to make it appear as if ChatGPT generated word-for-word excerpts of articles.

“Even when using such prompts, our models don’t typically behave the way The New York Times insinuates, which suggests they either instructed the model to regurgitate or cherry-picked their examples from many attempts,” OpenAI stated in a blog post.

The Times sued after talks with OpenAI over a licensing deal for use of its articles broke down. Other publishers, including the Financial Times, The Associated Press and Axel Springer, have reached such agreements with the company, but the landscape is divided. Two other lawsuits, one brought by The Intercept and another by Raw Story and Alternet, have been filed against OpenAI over copyright issues associated with the technology.

The publishers also advance arguments related to synthetic search products powered by the AI systems, including Copilot and Browse With Bing for ChatGPT. These tools use user prompts to search the internet for publishers’ content and “output several paragraphs or the entirety” of their works. When prompted, Copilot returned the entire text, verbatim, of a 2024 article published by The Denver Post, “A Lunar Eclipse Visits Denver Sunday, but It May Not Be Noticeable.”

“The synthetic output displays significantly more expressive content from the original article than what would traditionally be displayed in a Bing search result for the same article,” the complaint stated. “Unlike a traditional search result, the synthetic output also does not include a prominent hyperlink that sends users to the Denver Post’s website.”

A common defense from AI companies in response to allegations of copyright infringement has been to point to their terms of service, maintaining that end users are liable when their products are used in improper ways. The publishers claim that OpenAI and Microsoft “directly and materially aided in such infringement” because they know that their technology is used to reproduce copyrighted content. They point to OpenAI’s GPT Store, where users can share their customized chatbots, offering numerous tools “specifically designed to circumvent” paywalls. The store includes a “news summarizer” customization that encourages users to “save on subscription costs” and “skip paywalls just using the link text or URL.”

OpenAI is expected to announce a revenue sharing program with GPT creators based on user engagement with their modified chatbots.

The complaint brings claims for copyright infringement, vicarious copyright infringement, contributory copyright infringement, unfair competition, trademark dilution and violations of the Digital Millennium Copyright Act for removal of copyright management information.

One issue the publishers may face in court is that facts aren’t copyrightable. It’s among the reasons fiction authors suing OpenAI are believed to have a better shot in court than their nonfiction counterparts.
