(Preview) 72 Hours of DeepSeek Hysteria, What DeepSeek Means for Big Tech, Lessons on the Efficacy of Chip Controls

2025/1/27
Sharp Tech with Ben Thompson

Topics
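Ben: DeepSeek's low training cost is not faked; it is the result of genuine breakthroughs in model training techniques. DeepSeek used a mixture-of-experts approach and made low-level optimizations to the communication layer, which reduced its memory bandwidth requirements. It also developed a compression technique that shrinks the key-value store and improves efficiency. DeepSeek may have more chips than it has declared, but that does not undermine the reality of its technical breakthroughs. The efficiency gain relative to OpenAI and Anthropic may be overstated, because we cannot know those companies' true costs. DeepSeek's low-priced inference service is further evidence of its cost-efficiency advantage. The service may be subsidized, but DeepSeek claims it is profitable, and that is not implausible. In short, DeepSeek's low training cost reflects a combination of technical breakthroughs and cost optimization, not simple fabrication.
Jeremy: The $5.5 million training cost that DeepSeek published looks suspicious. DeepSeek may have concealed its actual training cost, or may have trained its models by circumventing the chip ban.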

Deep Dive

Unpacking several days of dizzying reactions to DeepSeek, including a closer look at the costs of model development, why the heightened scrutiny looks like a coping mechanism, DeepSeek’s efficiency breakthroughs, the implications for big tech, and the future of export controls on semiconductors.

To email the show: [email protected]

@SharpTechPodcast Channel - YouTube

@Stratechery Channel - YouTube

DeepSeek FAQ - Stratechery

DeepSeek-R1, DeepSeek Implications - Stratechery Update

Get all episodes of Sharp Tech, Sharp China, Stratechery Updates and Interviews, Greatest of All Talk, Asianometry and the Dithering Podcast as part of Stratechery Plus for $15/month or $150/year.