Major publishers sue AI startup Cohere, alleging copyright infringement of over 4,000 articles
18 February 2025

A group of leading publishers, including Condé Nast, The Atlantic, Forbes and The Guardian, has filed a lawsuit against artificial intelligence (AI) startup Cohere, accusing it of using copyrighted content from more than 4,000 articles without authorization. The lawsuit alleges that Cohere systematically infringed copyrights and trademarks by scraping, reproducing and misusing thousands of articles without permission or compensation, intensifying the news industry’s legal battle over AI technology.
The complaint outlines several ways in which Cohere allegedly misused publisher content. According to the plaintiffs, Cohere scraped vast amounts of copyrighted material from news websites, magazines and other media outlets to train its AI models. This included verbatim copying of articles, which were then used to generate outputs that directly compete with the original publishers. Additionally, Cohere’s systems are accused of accessing and reproducing real-time content, such as breaking news stories, without proper licensing or attribution.

“Without permission or compensation, Cohere uses scraped copies of our articles through training, real-time use and in outputs to power its artificial intelligence service, which in turn competes with publisher offerings and the emerging market for AI licensing,” according to the lawsuit filed in the U.S. District Court for the Southern District of New York. It further alleges: “Not content with just stealing our works, Cohere also blatantly manufactures fake pieces and attributes them to us, misleading the public and tarnishing our brands.”
The plaintiffs argue that this practice not only undermines publishers’ ability to monetize their work but also risks spreading misinformation, because the AI-generated outputs often distort or misrepresent the original content, or fabricate content outright.
In Exhibit A, the plaintiffs listed over 4,000 articles in what they described as an “illustrative and non-exhaustive list of works that Cohere has infringed.” Additional exhibits include responses to queries and “hallucinations” that the publishers claim violate their copyrights and trademarks. The lawsuit states that Cohere “passes off its own hallucinated articles as articles from publishers.”

“Our brands are built on exceptional standards of quality and trust. Allowing our content to be stolen, distorted or misused undermines everything we stand for. We will defend our rights vigorously wherever they are threatened,” Roger Lynch, CEO of Condé Nast, one of the plaintiffs, stated in a press release about the complaint.
In February 2024, Cohere announced that it would offer legal protection against intellectual property claims to its paying enterprise customers. This encompasses “full indemnification for any third-party claims that the outputs generated by [their] models infringe on a third party’s intellectual property rights.” The protection is available to Cohere customers who comply with its guidelines and do not intentionally attempt to create infringing content.
The lawsuit requests statutory damages of up to US$150,000 per infringed work under the Copyright Act. It also seeks actual damages based on Cohere’s profits, as well as statutory damages up to the maximum allowed by law for trademark infringement and false designations of origin.
Condé Nast and other news publishers involved in the lawsuit have licensed their content to other AI companies, such as OpenAI. But OpenAI also stands accused of using news articles without permission in a lawsuit filed by The New York Times. The case is ongoing.
- Cathy Li