#204 - OpenAI Audio, Rubin GPUs, MCP, Zochi

2025/3/24

Last Week in AI

People

Andrey Kurenkov

Jeremie Harris

Topics

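Andrey Kurenkov: I don't think we know much about the tooling landscape in China. As chatbot users, we go to ChatGPT or Claude to ask our questions, and Ernie seems to be filling that role there. The arrival of new models and how competitive they are on price is a significant factor. The number one downloaded app in China just switched to a new AI chatbot, and it isn't DeepSeek, so things are definitely changing. The biggest advantage of this release seems to be cost; at least, that's what they emphasize in the discussion around it. Baidu is of course the Google of China, right? They have the search engine there. Their goal is to gradually integrate Ernie 4.5 and the X1 reasoning model into their whole product ecosystem, including Baidu Search, which is interesting. So we'll see generative AI capabilities rolled out in that context.

Jeremie Harris: Overall, it comes down to price. There's a really handy table comparing the per-token cost of GPT-4.5, DeepSeek V3, and Ernie 4.5. For input tokens, GPT-4.5 costs $75 per million tokens. DeepSeek V3 drops that to basically $0.30. Ernie 4.5 is around $0.60 per million tokens. So you're talking about orders-of-magnitude differences. Of course, these models are less capable, so that's the trade-off. But, yeah, to give some perspective: DeepSeek V3 is more comparable to GPT-4o or o3-mini in OpenAI's lineup, where the pricing isn't that crazy. It's maybe, I forget, about $1 per million tokens. So they're comparable. GPT-4.5's pricing is just crazy compared with everything else.

Andrey Kurenkov: I think we talked about this a few episodes ago, but it's a base model, not one meant for mass production, right? Those are high-quality tokens, probably best suited to creating synthetic datasets or answering very specific questions. But you wouldn't think of it as something you'd productize, because you're right: it's two orders of magnitude more expensive than other base models.

Jeremie Harris: Where you see the lift here, especially with Ernie X1, the reasoning model, is on the reasoning side, right? OpenAI's o1 is about 50 times more expensive than Ernie X1. For input tokens, Ernie X1 costs about half what R1 does, and the same goes for output tokens. So that's very significant, especially relative to o1, and it shows you one of two things: either Chinese engineering is really, really good, or there's some state subsidy going on in the background.

Andrey Kurenkov: I think the latter is less likely right now, though I wouldn't rule it out. Certainly some amazing engineering is making these margins possible. It's a pretty remarkable thing, right? The cost of reasoning has plummeted. That implies some reasoning-specific engineering behind it, and you should expect the same to carry over to training and inference going forward. Yeah, and interestingly, there's a parallel between Baidu and Google: Google's pricing is also quite competitive, especially for Gemini Flash Thinking. So I could also see this as a company strategy. Baidu is huge and makes enormous profits from search, so it can afford to eat the extra cost to undercut DeepSeek, a startup, and lock in the market. Either way, it's exciting news. If you're in China, I believe you can't use ChatGPT. So if nothing else, it's good for people to have comparable tools to use and

Jeremie Harris: not miss out on the fun of advanced LLMs. I'd say I don't think Baidu is subsidizing, at least not their base model, because Ernie 4.5 is actually more expensive than DeepSeek V3. Where you see the shift is on the reasoning model, which is itself interesting, right? At least to me, that suggests something on the reasoning side: engineering in the compute architecture behind inference, or greater token efficiency and therefore greater compute efficiency at inference time. But you're right. Once you start thinking about the economics of these things, everything starts to muddy the waters, because they account for a growing share of the P&L, even for big companies like Baidu and Google, right?

Andrey Kurenkov: These companies will be forced to show us their hand, right? They'll have to sell these tokens for profit right away, and we'll eventually learn their real margins. It's not clear that we're learning that yet. Yeah, I don't think we are. It's still a mystery what tricks people are playing. But I'd also bet a little that the margins aren't high. One thing we do know is that DeepSeek,

Jeremie Harris: at least, claims they're profitable and that their margins on the model are positive. I can imagine that's not the case for, say, OpenAI, which has billions in revenue, but the real question is whether they're actually profitable. One last thought on the economics: when we think about what it means for DeepSeek to claim positive returns, there's an important question of whether that accounts for opex or capex, right? In their paper, they claim they trained V3 on a $6 million compute budget. In hindsight, that looks like the actual operating expense of running the compute, not the capital expense of the tens of millions of dollars of compute hardware. So it's hard to know what you should amortize. How do you compare apples to apples? Yeah, it's hard to say whether DeepSeek is profitable, but on a per-token basis, for inference alone, I believe their claim is that they're making money, and that this holds across their whole back end. Yeah. Interesting. Yeah.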
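The price gap discussed above can be checked with a little arithmetic. The sketch below is only an illustration: it uses the approximate per-million input-token prices quoted by the hosts (not verified list prices), and the dictionary and function names are hypothetical.

```python
import math

# Input-token prices in USD per million tokens, as quoted in the discussion.
# These are the hosts' approximate figures, not verified list prices.
PRICE_PER_MILLION_INPUT_TOKENS = {
    "GPT-4.5": 75.00,
    "DeepSeek V3": 0.30,
    "Ernie 4.5": 0.60,
}


def price_ratio(expensive: str, cheap: str) -> float:
    """Return how many times more one model costs than another."""
    prices = PRICE_PER_MILLION_INPUT_TOKENS
    return prices[expensive] / prices[cheap]


def orders_of_magnitude(ratio: float) -> float:
    """Express a price ratio as a base-10 logarithm."""
    return math.log10(ratio)


def input_cost(model: str, tokens: int) -> float:
    """Cost in USD of sending `tokens` input tokens to `model`."""
    return PRICE_PER_MILLION_INPUT_TOKENS[model] * tokens / 1_000_000


if __name__ == "__main__":
    for cheap in ("DeepSeek V3", "Ernie 4.5"):
        ratio = price_ratio("GPT-4.5", cheap)
        magnitude = orders_of_magnitude(ratio)
        print(f"GPT-4.5 vs {cheap}: {ratio:.0f}x, about {magnitude:.1f} orders of magnitude")
```

On these figures GPT-4.5 comes out roughly 250 times the price of DeepSeek V3 and 125 times the price of Ernie 4.5 for input tokens, which is the "two orders of magnitude" gap mentioned in the conversation.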

Deep Dive

Chapters

This chapter discusses the release of Baidu's new Ernie models, Ernie 4.5 and Ernie X1, highlighting their competitive pricing and capabilities compared to rivals such as OpenAI's GPT-4.5 and DeepSeek R1. The discussion also covers how cost-effective these models are and their potential impact on the market.

Baidu launched Ernie 4.5 and Ernie X1, multimodal models competitive with GPT-4.5 and DeepSeek R1 but at lower costs.
Ernie 4.5 is described as emotionally intelligent, understanding memes and satire.
Cost is a significant advantage for Baidu's models, with pricing orders of magnitude lower than GPT-4.5.

Shownotes

Our 204th episode with a summary and discussion of last week's big AI news! Recorded on 03/21/2025

Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and feedback at [email protected] and/or [email protected]

Read our text newsletter and comment on the podcast at https://lastweekin.ai/.

Join our Discord here! https://discord.gg/nTyezGSKwP

In this episode:

Baidu launched two new multimodal models, Ernie 4.5 and Ernie X1, boasting competitive pricing and capabilities compared to rivals such as OpenAI's GPT-4.5 and DeepSeek R1.
OpenAI introduced new audio models, including impressive speech-to-text and text-to-speech systems, and added o1-pro to its developer API at a high price, reflecting a push toward profitability.
Nvidia and Apple announced significant hardware advancements, including Nvidia's future GPU plans and Apple's new Mac Studio offering that can run DeepSeek R1.
DeepSeek employees are facing travel restrictions, suggesting China is treating its AI development with increased secrecy and urgency, emphasizing a wartime footing in AI competition.

Timestamps + Links:

(00:00:00) Intro / Banter

(00:01:36) News Preview

Tools & Apps

(00:02:50) Baidu launches two new versions of its AI model Ernie

(00:10:46) OpenAI Unveils New Audio Models to Make AI Agents Sound More Human Than Ever

(00:16:41) OpenAI’s o1-pro is the company’s most expensive AI model yet

(00:20:53) Google brings a ‘canvas’ feature to Gemini, plus Audio Overview

(00:22:18) Anthropic adds web search to its Claude chatbot

(00:23:55) xAI launches an API for generating images

Applications & Business

(00:26:28) Nvidia announces Rubin GPUs in 2026, Rubin Ultra in 2027, Feynman also added to roadmap

(00:36:25) M3 Ultra Runs DeepSeek R1 With 671 Billion Parameters Using 448GB Of Unified Memory, Delivering High Bandwidth Performance At Under 200W Power Consumption, With No Need For A Multi-GPU Setup

(00:40:07) Intel reaches 'exciting milestone' for 18A 1.8nm-class wafers with first run at Arizona fab

(00:42:45) Elon Musk’s AI company, xAI, acquires a generative AI video startup

(00:44:44) Tencent Reportedly Makes Massive NVIDIA H20 Chip Purchase for WeChat’s DeepSeek Integration

Projects & Open Source

(00:46:32) Anthropic’s Not-So-Secret Weapon That’s Giving Agents a Boost

(00:50:50) Mistral AI drops new open-source model that outperforms GPT-4o Mini with fraction of parameters

(00:53:30) EXAONE Deep: Reasoning Enhanced Language Models

Research & Advancements

(00:55:58) Sample, Scrutinize and Scale: Effective Inference-Time Search by Scaling Verification

(01:07:44) Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

(01:12:27) Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo

(01:18:46) Transformers without Normalization

(01:19:52) Measuring AI Ability to Complete Long Tasks

(01:26:12) HCAST: Human-Calibrated Autonomy Software Tasks

Policy & Safety

(01:26:45) Announcing Zochi, an Intology Project

(01:32:46) DeepSeek, a National Treasure in China, is Now Being Closely Guarded

(01:37:02) Claude Sonnet 3.7 (often) knows when it’s in alignment evaluations

Synthetic Media & Art

(01:42:27) US appeals court rejects copyrights for AI-generated art lacking 'human' creator

(01:45:10) Trump urged by Ben Stiller, Paul McCartney and hundreds of stars to protect AI copyright rules

#204 - OpenAI Audio, Rubin GPUs, MCP, Zochi 01:49:03