DeepSeek Topples OpenAI's O1: The Chinese Model Rewriting AI Economics at 1/70th the Cost

Jan 22, 2025 • 2 minutes to read

This article is adapted from an interview given in July 2024, when DeepSeek V2 had just impressed many at its debut. With DeepSeek R1 now out, we are seeing even more heated discussion on X: “So the Chinese have open sourced a model that can outthink any PhD I've ever met.”

The original interview was conducted in Chinese by Sina Finance; the excerpts below are adapted from its English translation.

Words from DeepSeek Founder
“As the architect behind China's most disruptive AI force, I'm compelled to address the seismic shifts occurring in global AI. When we released DeepSeek V2 – a model outperforming OpenAI's O1 while slashing costs to 1/70th of GPT-4 Turbo's – it wasn't about starting a price war. It was about proving that China's technical innovators can lead, not follow. What follows isn't corporate posturing, but a manifesto for the next era of AI development.”


DeepSeek: China's Silent AI Juggernaut Redefining the Global Order
How a Quant Fund Spinoff Became Silicon Valley's New Obsession

While Silicon Valley fixates on OpenAI and Anthropic, an unassuming Chinese AI lab has rewritten the rulebook. DeepSeek, the brainchild of quant trading giant High-Flyer (Huanfang), now commands the AI discourse on X after its V2 model outperformed OpenAI's O1 on benchmarks. The twist? It achieves this while charging roughly $0.14 per million tokens – sparking an industry-wide pricing earthquake and earning its “Temu in AI” moniker.

The Cost Revolution
DeepSeek's May 2024 offensive began with technical fireworks:

  • MLA Architecture: Their Multi-Head Latent Attention mechanism slashes KV-cache memory usage to 5-13% of what conventional attention requires (a back-of-the-envelope sketch follows this list)
  • DeepSeekMoESparse: A home-grown mixture-of-experts design achieving unprecedented compute efficiency
  • ¥1 (≈$0.14) per million input tokens: API pricing that undercuts Llama3-70B by 7x and GPT-4 Turbo by 70x
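
To make the memory claim concrete, here is a back-of-the-envelope sketch in Rust that compares per-token KV-cache sizes for conventional multi-head attention (MHA) against an MLA-style compressed latent cache. The dimensions are illustrative assumptions loosely based on the DeepSeek-V2 paper (60 layers, 128 heads of width 128, a 512-wide latent plus a 64-wide decoupled RoPE key), not the exact model configuration.

```rust
// Back-of-the-envelope KV-cache comparison: conventional multi-head
// attention (MHA) vs an MLA-style compressed latent cache.
// All dimensions are illustrative assumptions, not the exact model config.

fn main() {
    let layers = 60;    // transformer layers (assumed)
    let heads = 128;    // attention heads (assumed)
    let d_head = 128;   // per-head dimension (assumed)
    let d_latent = 512; // MLA compressed KV latent width (assumed)
    let d_rope = 64;    // decoupled RoPE key width cached per layer (assumed)
    let bytes = 2;      // fp16/bf16 element size

    // MHA caches a full K and V vector for every head in every layer.
    let mha_per_token = 2 * layers * heads * d_head * bytes;
    // MLA caches one compressed latent (plus a small RoPE key) per layer,
    // reconstructing per-head K/V from it at attention time.
    let mla_per_token = layers * (d_latent + d_rope) * bytes;

    println!("MHA KV cache: {} KiB/token", mha_per_token / 1024);
    println!("MLA KV cache: {} KiB/token", mla_per_token / 1024);
    println!(
        "MLA cache is {:.1}% the size of MHA's",
        100.0 * mla_per_token as f64 / mha_per_token as f64
    );
}
```

Under these assumed dimensions the latent cache works out to under 2% of the full MHA cache; the 5-13% range quoted above likely reflects comparisons against baselines that already economize on KV storage, e.g. via grouped-query attention.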

“The cost reductions weren't gimmicks – they're profit-positive,” reveals founder Liang Wenfeng in his first Western-facing interview. “We simply stopped accepting that China must lag in foundational innovation.”

Silicon Valley's Reality Check
The response from AI's traditional power center has been equal parts awe and anxiety:

  • SemiAnalysis declares V2's whitepaper “2024's most consequential”
  • Ex-OpenAI engineer Andrew Carr adopts DeepSeek's training protocols
  • Anthropic co-founder Jack Clark warns: “They've assembled an army of technical wizards. China's AI will mirror its EV/drone dominance.”

Liang remains characteristically grounded: “The real shock wasn't our tech – it's that a Chinese team dared challenge architectural dogma. Attention mechanisms hadn't been meaningfully updated since 2017. Most labs considered that heresy.”

The Innovator's Dilemma, Chinese-Style
DeepSeek's playbook defies conventional wisdom about Chinese tech:

  • Zero Consumer Apps: The only one of China's “Big 7” AI startups focused solely on core model research
  • Open Source Zeal: Full model weights released despite no external funding
  • Anti-“Follow Culture”: Rejecting the “copy Llama, monetize fast” path embraced by peers

“Scaling Law isn't manna from heaven – it's built through relentless iteration,” Liang argues. “We're tired of hearing about China's ‘1-2 year gap’. The true divide is between originators and imitators.”

AGI's New Battleground
In a radical departure from China's typical application-first approach, DeepSeek bets everything on paradigm-shifting research:

  • Three AGI Pathways: Mathematical reasoning, multimodal integration, and pure linguistic mastery
  • Self-Evolving Systems: Treating coding/math as “Go-like sandboxes” for autonomous improvement
  • Resource Fluidity: Engineers freely access compute clusters without approvals

When asked about commercialization pressures, Liang counters: “Why discuss Coca-Cola's supply chain when you're building the next electricity? Obsessing over internet-era monetization models is like 1990s investors doubting Amazon because ‘bookstores exist’.”

The Talent Crucible
Contrary to assumptions about China's brain drain, DeepSeek's breakthrough team comprises:

  • 90% mainland-educated researchers (zero overseas returnees on the V2 team)
  • PhD candidates tackling architectural redesigns as side projects
  • A flat hierarchy where interns debate directly with founders

“True innovation isn't about poaching stars – it's about creating constellations,” Liang notes. “Our ‘mysterious wizards’ are just kids who believed they could rebuild Attention from first principles.”

The New Calculus
As ByteDance, Alibaba, and Tencent scramble to match DeepSeek's pricing, Liang remains focused on longer horizons:

  • Hardware Realities: “Our constraint isn't capital but NVIDIA restrictions”
  • Ecosystem Vision: “Let others build apps – we'll power China's AI infrastructure”
  • Cultural Shift: “When technical innovators become society's heroes, everything changes”

For global observers, DeepSeek represents more than a cost disruptor – it's proof that China's technical vanguard now sets the pace rather than follows it. As Liang concludes: “AGI won't wait for geopolitical debates. Either you help build it, or become roadkill on its path.”

That concludes the interview excerpt. The full interview reveals why Liang considers current AI competition “irrelevant noise” and how DeepSeek plans to leverage quant trading insights for AGI development. Follow Second State for more discussion.

Run DeepSeek Models with WasmEdge

In our next article, we will cover:

  • How to run open source DeepSeek models on your own device
  • How to create an OpenAI-compatible API service with the newest DeepSeek models

We will use the Rust + Wasm stack to develop and deploy applications for these models. There are no complex Python packages or C++ toolchains to install! See why we chose this tech stack.
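
As a taste of what that article will cover, here is a minimal, std-only Rust sketch that sends a chat request to a locally running OpenAI-compatible endpoint. The host and port (127.0.0.1:8080), the /v1/chat/completions path, and the model name are assumptions based on the usual OpenAI-compatible conventions; adjust them to match your own server.

```rust
// Minimal std-only client for a locally running OpenAI-compatible
// chat endpoint. The host/port, path, and model name are assumptions;
// change them to match your own server setup.
use std::io::{Read, Write};
use std::net::TcpStream;

fn main() -> std::io::Result<()> {
    // A standard OpenAI-style chat completion request body.
    let body = r#"{"model":"deepseek-chat","messages":[{"role":"user","content":"Hello!"}]}"#;

    // Build a raw HTTP/1.1 POST request by hand to avoid any dependencies.
    let request = format!(
        "POST /v1/chat/completions HTTP/1.1\r\n\
         Host: localhost\r\n\
         Content-Type: application/json\r\n\
         Content-Length: {}\r\n\
         Connection: close\r\n\
         \r\n\
         {}",
        body.len(),
        body
    );

    let mut stream = TcpStream::connect("127.0.0.1:8080")?;
    stream.write_all(request.as_bytes())?;

    // Read the full HTTP response (headers + JSON body) and print it.
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    println!("{response}");
    Ok(())
}
```

Because the request shape is the standard OpenAI one, the same client code works unchanged whether the backend is a DeepSeek model served locally or any other OpenAI-compatible service.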

LLM · AI inference · Rust · WebAssembly