DeepSeek’s cost-effective AI model challenges industry giants

11 February 2025

On January 20, 2025, Chinese artificial intelligence (AI) startup DeepSeek launched DeepSeek-R1, a reasoning model that performs comparably to OpenAI’s latest o1 model. The new model quickly drew significant attention because it offers similar capabilities at a lower price.

Following R1’s release, Nvidia’s stock dropped 17 percent, wiping out nearly US$600 billion in market capitalization, the largest one-day loss in value ever recorded for a single company. Analysts attributed the drop to concerns over shifts in the AI market, with DeepSeek seen as a potential disruptor. Although Nvidia’s stock partially rebounded, the rise of Chinese AI firms has intensified discussions about competition in AI infrastructure and cost efficiency.

DeepSeek has also faced speculation that it used model distillation, a technique for transferring knowledge from a large pre-trained model to a smaller one, which some providers prohibit in the terms of use for their large language models (LLMs). Despite this, distillation remains a common practice in the industry because it can produce more efficient and cost-effective models.
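
For readers unfamiliar with the technique, the sketch below shows the general idea of knowledge distillation: a smaller student model is trained to match the softened output distribution of a larger teacher model alongside the usual hard-label loss. It is a generic illustration written with PyTorch, not DeepSeek’s or OpenAI’s actual pipeline; the temperature, loss weighting and toy tensors are illustrative assumptions.

```python
# Illustrative sketch of knowledge distillation (not any company's actual training code).
# A small "student" model learns to match the softened predictions of a larger "teacher"
# in addition to fitting the ground-truth labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Blend cross-entropy on hard labels with a KL term toward the teacher's soft targets."""
    # Soft targets: teacher probabilities at a higher temperature (flatter distribution).
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence scaled by T^2 so gradients stay comparable across temperatures.
    kd_loss = F.kl_div(soft_student, soft_targets, reduction="batchmean") * temperature**2
    # Ordinary cross-entropy against the ground-truth labels.
    ce_loss = F.cross_entropy(student_logits, labels)
    return alpha * kd_loss + (1 - alpha) * ce_loss

# Toy usage: random logits for a batch of 4 examples and 10 classes.
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```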

In a statement emailed to the New York Times, OpenAI representative Liz Bourgeois indicated that OpenAI is reviewing the potential misuse of its models by DeepSeek. “We take aggressive, proactive countermeasures to protect our technology and will continue working closely with the U.S. government to protect the most capable models being built here,” she said.

Meanwhile, OpenAI CEO Sam Altman welcomed DeepSeek’s launch, saying it was “invigorating to have a new competitor.” In a social media post, he described DeepSeek’s R1 model as “impressive, particularly around what they’re able to deliver for the price.” Asked about potential legal action, Altman told reporters in Tokyo that OpenAI has “no plans to sue DeepSeek right now.”

In December 2024, DeepSeek made its models, including DeepSeek-V3, available for free use and modification. On January 10, 2025, the company released a free chatbot app powered by the V3 model, which rapidly gained popularity and topped the Apple and Google download charts, overtaking ChatGPT in the rankings within days.

DeepSeek claimed the free, open-source LLM took only two months and US$5.6 million to develop, using around 2,000 Nvidia H800 chips. However, analysts suggest the cost may be understated, as the figure may exclude other expenses from the calculation. Even so, they noted that the amount is still small compared with the hundreds of millions to billions of dollars that U.S. firms such as Google, Microsoft and OpenAI have spent on their models.

A key element of DeepSeek’s success lies in its use of a mixture-of-experts (MoE) approach. This method allows DeepSeek to selectively activate only the most relevant segments of its neural network for each query, optimizing performance and reducing computational costs. In contrast, ChatGPT is known for its broad-based conversational abilities and ease of use across diverse topics. While ChatGPT excels in handling complex and nuanced queries, DeepSeek responds faster in technical and niche tasks.
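
The sketch below illustrates the general MoE pattern rather than DeepSeek’s specific architecture: a learned gate scores a set of expert feed-forward networks and routes each token only to the top-scoring few, so most parameters stay idle on any given query. The layer sizes, expert count and top-k value are arbitrary assumptions, written with PyTorch.

```python
# Minimal sketch of a mixture-of-experts (MoE) layer with top-k gating.
# Generic illustration only; not DeepSeek's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        # Each "expert" is a small feed-forward network; only top_k experts
        # run for any given token, so most parameters are inactive per query.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        ])
        self.gate = nn.Linear(dim, num_experts)  # router that scores each expert
        self.top_k = top_k

    def forward(self, x):                        # x: (num_tokens, dim)
        scores = self.gate(x)                                # (tokens, experts)
        weights, indices = scores.topk(self.top_k, dim=-1)   # keep the best top_k experts
        weights = F.softmax(weights, dim=-1)                 # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = indices[:, slot] == e                 # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

# Toy usage: 16 tokens with 64-dimensional embeddings.
layer = MoELayer()
y = layer(torch.randn(16, 64))
print(y.shape)  # torch.Size([16, 64])
```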

Whether DeepSeek can maintain its rapid growth remains uncertain, but its disruptive influence is already reshaping the market.

- Cathy Li

