
860: DeepSeek R1: SOTA Reasoning at 1% of the Cost

2025/2/7

Super Data Science: ML & AI Podcast with Jon Krohn

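Jon Krohn: The DeepSeek R1 reasoning model performs on par with OpenAI's GPT-4 and Google's Gemini 2.0 Flash, yet its training cost was far lower, only about 1% of theirs. DeepSeek is a Chinese company, and the success of its R1 model has affected the global economy and called U.S. technology sanctions into question. The model builds on existing concepts such as mixture-of-experts and combines them with innovations such as DualPipe, a GPU communication accelerator, to train efficiently. DeepSeek has also open-sourced the source code and model weights of its V3 and R1 models, a major contribution to the AI community. Although DeepSeek's iOS app raises privacy concerns, users can run DeepSeek models privately through platforms such as Ollama. The arrival of DeepSeek R1 makes developing, training, and running AI models more economical and eases AI-related environmental concerns, so that AI applications can be used more widely and benefit more people.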


DeepSeek-curious? This Five-Minute Friday is for you! Jon Krohn investigates the overwhelming overnight success of this new LLM, the product of a Chinese hedge fund. DeepSeek is a market newcomer, and yet it runs shoulder to shoulder with behemoths from OpenAI, Anthropic and Google like it’s all in a day’s work.

Additional materials: www.superdatascience.com/860

Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

Duration: 13:08
