DeepSeek R1 and the New Efficiency Wars

On January 20, 2025, a Chinese startup called DeepSeek released a model that shook the foundation of the AI industry. DeepSeek R1, an open-weight reasoning model with 671 billion parameters, matched or exceeded OpenAI's o1 on competitive mathematics benchmarks. The training cost was reportedly $6 million.

R1 achieved 79.8% accuracy on the American Invitational Mathematics Examination (AIME) and 97.3% on the MATH-500 benchmark. OpenAI's o1, the previous state of the art, had set the standard for reasoning models. DeepSeek matched it at a fraction of the cost and made the result freely available.

The market reaction was swift. By January 27, DeepSeek had surpassed ChatGPT as the most-downloaded free application on the US iOS App Store, and Nvidia's stock price dropped nearly 17% in a single day.

DeepSeek R1 represents something larger than a single model release. It challenges the prevailing theory of how AI capabilities advance. For years, the industry has operated on scaling laws: the observation that capability improves predictably as compute, data, and parameter counts grow. Labs poured billions into building ever-larger GPU clusters to train ever-larger models.

DeepSeek R1 suggests an alternative path. The model uses a Mixture-of-Experts architecture with 671 billion total parameters but activates only 37 billion per token. The result is a model that costs $2.19 per million output tokens via API, compared to $60 for OpenAI's o1.
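To make the total-versus-active parameter distinction concrete, here is a minimal sketch of sparse top-k expert routing in PyTorch. The class name, expert count, and layer sizes are illustrative placeholders, not DeepSeek's actual configuration; the point is only that a router selects a small subset of experts per token, so most of the model's weights sit idle on any given forward pass.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Toy Mixture-of-Experts layer with top-k routing.

    Illustrative only: num_experts, d_model, d_ff, and top_k are
    placeholder values, not DeepSeek R1's real architecture.
    """
    def __init__(self, d_model=512, d_ff=1024, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # The router scores each token against every expert.
        self.router = nn.Linear(d_model, num_experts)
        # Each expert is an independent feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (num_tokens, d_model)
        logits = self.router(x)                         # (tokens, experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # keep top-k per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts run for each token; the rest stay idle,
        # which is how total parameters can far exceed active parameters.
        for e, expert in enumerate(self.experts):
            tokens, slots = (idx == e).nonzero(as_tuple=True)
            if tokens.numel() == 0:
                continue
            out[tokens] += weights[tokens, slots].unsqueeze(-1) * expert(x[tokens])
        return out

layer = SparseMoELayer()
tokens = torch.randn(4, 512)
print(layer(tokens).shape)  # torch.Size([4, 512])
```

In this toy setup, each token touches 2 of 8 experts, so roughly a quarter of the expert weights are active per token; R1's ratio is far more aggressive, with 37 billion of 671 billion parameters active, which is what drives its low per-token inference cost.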
