People
Ben Thompson
Founded and runs the subscription newsletter Stratechery, focused on business and strategy analysis of the technology industry.
Chamath Palihapitiya
Venture capitalist and entrepreneur known for incisive investment insights and his Social Capital investing philosophy.
Deedy Das
Ethan Mollick
Frank DeGods
Garry Tan
Henry
A professional active in real-estate investing and analysis who has taken part in multiple discussions forecasting and analyzing the real-estate market.
Jared Friedman
Jeffrey Emanuel
Jim Fan
Marc Andreessen
Neal Khosla
Nic Carter
Satya Nadella
In nearly a decade as CEO, he has successfully transformed the company and driven substantial growth in its value through innovation and collaboration.
Signal
The Superhuman Newsletter
Yann LeCun
A French-American computer scientist with major influence in machine learning and computer vision, currently Chief AI Scientist at Meta and a professor at New York University.
Anonymous Meta engineer
Topics
Frank DeGods: I think DeepSeek is the best thing in AI since ChatGPT. It made a deep impression on me in just 20 minutes.
Signal: I've been running DeepSeek locally for a few days, and its performance is absolutely on par with O1 or Sonnet. I've been using it for coding and other tasks; things that used to cost a fortune through APIs are now completely free. This feels like a total paradigm shift.
Nic Carter: DeepSeek just accelerated AGI timelines by five years. So focus on the gym; knowledge work is obsolete, and muscles are all that's left.
The Superhuman Newsletter: DeepSeek's R1 model stunned Silicon Valley, and many long-held assumptions about Chinese innovation and AI evaporated overnight. Some are calling it a hoax, others a gift to humanity.
Anonymous Meta engineer: Meta's Gen AI team is in panic mode, with engineers frantically dissecting DeepSeek and doing their best to copy anything they can from it. Management is worried about justifying the Gen AI team's massive costs. How will they face leadership when every single leader of the Gen AI org is paid more than the training cost of DeepSeek V3? We have dozens of such leaders. DeepSeek R1 made things even scarier.
NLW: DeepSeek's training costs are far lower than other companies', which is why the AI industry is panicking. DeepSeek claims their V3 LLM was trained in three months for $5.6 million. Frontier model training at US labs runs closer to half a billion dollars for O1-class models, and likely billions for next-generation training runs. We don't have solid estimates of the post-training cost of creating R1, but it's reasonable to assume the budget was similarly tight.
Alexandr Wang: My understanding is that DeepSeek has a cluster of 50,000 top-of-the-line NVIDIA H100 chips, in violation of export controls. The V3 paper claims the model was trained on a cluster of just 2,000 NVIDIA H800s, the downgraded version of the chip that's allowed to be exported.
Jeffrey Emanuel: DeepSeek used innovative training methods, such as 8-bit floating-point numbers and multi-token prediction, to lower training costs.
Jared Friedman: DeepSeek used techniques like 8-bit floats, compressed key-value indices, and multi-token prediction to cut costs and improve performance.
Henry: DeepSeek's API access is extremely cheap, even at high request volumes. Over the past few hours I've made more than 200,000 requests to the DeepSeek API. With zero rate limiting, the whole thing cost about 50 cents.
Marc Andreessen: DeepSeek R1 is one of the most amazing and impressive breakthroughs I've ever seen, and as open source, a profound gift to the world. DeepSeek's R1 is AI's "Sputnik moment."
Garry Tan: DeepSeek's search feels stickier even after just a few queries, because seeing the reasoning, even how earnest it is about what it knows and might not know, greatly increases user trust.
Neal Khosla: DeepSeek is a CCP state psyop waging economic warfare. They faked the low-cost claims to justify low prices, hoping everyone switches to it and damages American AI competitiveness. Don't take the bait.
Satya Nadella: Jevons paradox strikes again. As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we can't get enough of.
Chamath Palihapitiya: We need to pivot to inference and aggressively export chips to allies; venture firms need to improve their capital discipline.
Jordi Hays: The most patriotic thing you can do right now is build software that uses so much DeepSeek inference that you bankrupt the CCP.
Yann LeCun: To those who see DeepSeek's performance and conclude that China is surpassing the US in AI: you're reading this wrong. The correct reading is that open-source models are surpassing proprietary ones. DeepSeek has benefited from open research and open source. They came up with new ideas and built them on top of other people's work. Because their work is published open source, everyone can profit from it. That is the power of open research and open source.
Deedy Das: DeepSeek R1's performance may be better than OpenAI's O3 model.
Ethan Mollick: I think the market will adapt quickly to the cost decreases DeepSeek brings.
Jim Fan: DeepSeek's open-source nature will accelerate AI development.
NLW: DeepSeek's impact may be even bigger than the market reaction suggests.

Chapters
The release of DeepSeek's R1 model has sent shockwaves through Silicon Valley. Early reactions suggest it rivals OpenAI's O1 and Google's Gemini 2.0 in performance, but at a drastically lower cost, raising questions about future AI development and market dynamics. The model's accessibility and open-source nature add to its disruptive potential.
  • DeepSeek R1's performance is comparable to OpenAI's O1 and Google's Gemini 2.0.
  • DeepSeek claims significantly lower training costs compared to US labs.
  • R1 is accessible via API at a fraction of the cost of competitors.
  • The model's open-source nature and efficient design allow it to run on various consumer devices.

Transcript


Hello, friends. Quick note before we dive in. I had originally planned to do a normal episode divided between the headlines and the news, but everything today is so much about this big R1 DeepSeek news that the episode ended up being much longer than normal. I decided to just focus on that. We will be back with our normal format tomorrow, presumably. But for now, let's just dig into what everyone is talking about, which is DeepSeek and just how big a deal it actually is.

Welcome back to the AI Daily Brief. If you have spent any time online over the last few days, and I guess if you follow any AI sources, you might have seen some sentiment like this one. Frank DeGods writes, I've been using DeepSeek for 20 minutes, but I'm pretty sure this is the best thing in AI since the original ChatGPT.

Signal writes, I've been running DeepSeek locally for a few days and it's absolutely on par with O1 or Sonnet. I've been using it nonstop for coding and other tasks, and what would have cost me a fortune through APIs is now completely free. This feels like a total paradigm shift. Investor Nic Carter writes, DeepSeek just accelerated AGI timelines by five years. So focus on the gym. Knowledge work is obsolete. Muscles are all that's left.

The Superhuman Newsletter writes, DeepSeek's R1 stunned Silicon Valley. China's new DeepSeek R1 model has shocked Silicon Valley and many long-held assumptions about Chinese innovation and AI have evaporated overnight. Some are calling it a hoax while others are calling it a gift to humanity.

So what the heck are we talking about? Well, if you've been listening closely, you've probably heard me talk about DeepSeek before. In December, we started hearing about their models, which were performing really well at apparently a fraction of the training cost of big models from companies like OpenAI. Last Monday, the lab released their reasoning model R1. And while it was immediately obvious that the model was good, benchmarking at a similar standard to OpenAI's O1 and Google's Gemini 2.0, as the week progressed, it started to become clear that something bigger was happening.

A post on Blind, an anonymous professional social media network, circulated on Thursday. It was entitled, Meta Gen AI Org in Panic Mode. It read, engineers are moving frantically to dissect DeepSeek and copy anything and everything we can from it. I'm not even exaggerating. Management is worried about justifying the massive cost of Gen AI Org. How would they face the leadership when every single leader of the Gen AI Org is making more than what it costs to train DeepSeek v3? We have dozens of such leaders. DeepSeek R1 made things even scarier.

So the big thing going on here, and the reason that the AI industry is so freaked out, is cost. DeepSeek claims that their V3 LLM was trained for $5.6 million over three months. Frontier model training at US labs is closer to half a billion dollars for O1-class models, and likely in the billions for the next generation of training runs. We don't have solid estimates on the post-training cost to create the R1 model, but it seems reasonable to think that the budget was similarly tight.

Some tech executives are openly dismissive of these claims. In an interview last week, Scale AI CEO Alexandr Wang said that his understanding is that DeepSeek has a cluster of 50,000 top-of-the-line NVIDIA H100 chips in breach of export controls. The V3 paper claims the model was trained on a cluster of just 2,000 NVIDIA H800s, the downrated version of the chip that's allowed to be exported.

Earlier this month, the South China Morning Post reported that DeepSeek has 10,000 NVIDIA GPUs, but didn't go into specifics about the chips. And for as unbelievable as it is, there are some reasons to believe the claims about their rock-bottom training costs. Quant trader Jeffrey Emanuel broke down the innovations in their training methods in a blog post. Here's a part of that explanation, although it's worth reading in its entirety.

Jeffrey writes, explaining DeepSeek's use of 8-bit floating point (FP8) numbers in place of the standard 32-bit format (FP32):

It's not just limited to 256 different equal-sized magnitudes like you get with regular integers, but instead uses clever math tricks to store both very small and very large numbers, though naturally with a lot less precision than you'd get with 32 bits. The main trade-off is that while FP32 can store numbers with incredible precision across an enormous range, FP8 sacrifices some of that precision to save memory and boost performance while still maintaining enough accuracy for many AI workloads.
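
To see that trade-off concretely, here is a minimal Python sketch; it is my own illustration of an E4M3-style 8-bit float, not anything from DeepSeek's codebase:

```python
# A simplified E4M3 decode: 1 sign bit, 4 exponent bits, 3 mantissa bits.
# NaN handling is omitted, so the top of the range reads 480 rather than
# the spec's 448; the point is the uneven spacing, not spec fidelity.

def decode_fp8_e4m3(byte: int) -> float:
    """Decode one byte as a simplified E4M3 float."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exponent = (byte >> 3) & 0b1111   # 4 exponent bits, bias 7
    mantissa = byte & 0b111           # 3 mantissa bits
    if exponent == 0:                 # subnormals: dense values near zero
        return sign * (mantissa / 8) * 2 ** -6
    return sign * (1 + mantissa / 8) * 2 ** (exponent - 7)

values = sorted({decode_fp8_e4m3(b) for b in range(256)})
positives = [v for v in values if v > 0]
print(f"smallest positive: {positives[0]}")   # 0.001953125 (2**-9)
print(f"largest: {positives[-1]}")            # 480.0 in this simplified decode
# Unlike int8's 256 equally spaced steps, these values cluster near zero
# and stretch toward the extremes: range traded against precision.
```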

And if that was Greek to you, don't worry. Y Combinator partner Jared Friedman writes,

Rough summary: Use 8-bit instead of 32-bit floating-point numbers, which gives massive memory savings. Compress the key value indices, which eat up much of the VRAM. Do multi-token prediction instead of single-token prediction, which effectively doubles inference speed. Mixture of experts model decomposes a big model into small models that can run on consumer-grade GPUs.
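
Of those techniques, mixture of experts is the easiest to picture in code. Here is a toy sketch of the core routing idea; it is my own illustration, not DeepSeek's actual architecture, which uses far more experts and a more sophisticated router:

```python
# A toy mixture-of-experts layer in PyTorch: each token activates only
# top_k small experts, so most parameters sit idle on any given token.
import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    def __init__(self, dim: int = 64, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)  # per-token expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        # Gate weights for the top_k experts per token (unnormalized for brevity).
        weights, idx = self.router(x).softmax(dim=-1).topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e             # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

print(ToyMoE()(torch.randn(16, 64)).shape)  # torch.Size([16, 64])
```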

Point being, that it's not like this is a black box where we have no idea why this is going on. There's some amount of explanation of how this actually could be. Still, whatever the truth about their training cluster, DeepSeek is serving the model at rock bottom prices. API access for R1 is priced at around 3% of OpenAI's O1. Over the weekend, X was filled with examples of people accessing the model at high volumes for fractions of a cent per query. Henry writes, I've made over 200,000 requests to the DeepSeek API in the last few hours.

Zero rate limiting and the whole thing cost me like 50 cents.
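
For a sense of what that access looks like in practice, here is a hedged sketch of calling the DeepSeek API through its OpenAI-compatible endpoint; the base URL and model names follow DeepSeek's documentation at the time, but treat them as assumptions:

```python
# Calling DeepSeek R1 via the OpenAI-compatible API (a sketch, not
# official sample code; model names and endpoint may change).
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",   # placeholder, not a real key
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-reasoner",         # R1; "deepseek-chat" serves V3
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)

message = response.choices[0].message
# R1 returns its chain of thought separately from the final answer;
# this visible reasoning is the stickiness Garry Tan describes below.
print(getattr(message, "reasoning_content", None))
print(message.content)
```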

Over the weekend, the mindshare really broke through. DeepSeek's phone assistant reached number one in the app store, and the model has racked up around 150,000 downloads from Hugging Face and tops the trending list. What's more, because the model is open source and has a novel design for efficient inference, it can be run on a wide range of consumer devices. AI researcher Harrison Kinsley was able to run the full model on his admittedly beefy workstation with one terabyte of RAM. Others were running smaller distilled versions of the model on phones and on laptops.
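
Running one of the smaller distills yourself is only a few lines with Hugging Face transformers. A minimal sketch, assuming the published model ID below and enough memory to hold it:

```python
# Loading a distilled R1 checkpoint locally (assumes transformers, torch,
# and accelerate are installed; the model ID is one of DeepSeek's distills).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    device_map="auto",   # spread across GPU/CPU as available
)
result = generator("Explain, step by step, why 17 is prime.", max_new_tokens=128)
print(result[0]["generated_text"])
```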

Now, at this point, seemingly everyone in Silicon Valley has a take about what DeepSeek has achieved and what it means for the AI industry. Marc Andreessen of Andreessen Horowitz writes, DeepSeek R1 is one of the most amazing and impressive breakthroughs I've ever seen, and as open source, a profound gift to the world. He returned later in the weekend to declare, DeepSeek's R1 is AI's Sputnik moment.

I imagine pretty much everyone here is familiar with the reference, but Sputnik was of course the first ever satellite. Its 1957 launch signaled that the Soviet Union was leading the U.S. in the space race, which came as a shock to the United States and kickstarted the U.S. space program. In short, it was the wake-up call during the Cold War that the U.S. couldn't be complacent in the technological arms race. Y Combinator President Garry Tan wrote, DeepSeek's search feels more sticky even after a few queries because seeing the reasoning, even how earnest it is about what it knows and what it might not know, increases user trust by quite a lot.

Indeed, the ability to view chain of thought reasoning seemed to be a pretty resonant moment for a lot of users, especially for those who have never paid to access O1. Caspian on X writes, the normies think DeepSeek is cute because it shares its thought process, sharing a conversation where other folks are talking about how DeepSeek is quote, so cute because it shares its thought process and talks to itself.

As you might imagine, some think something nefarious is going on. Neal Khosla, CEO of Curai and son of Vinod Khosla, wrote, DeepSeek is a CCP state psyop in economic warfare to make American AI unprofitable. They are faking the cost was low to justify setting price low and hoping everyone switches to it to damage AI competitiveness in the US. Don't take the bait.

Now, while many might find that a plausible theory, the one small piece of evidence that you might point to is that this is an introductory price. R1 is currently being served at an introductory rate about one-tenth the cost of O1, but next month the cost will almost triple to one-quarter of O1's. Still, 4x cheaper is no joke.

Regardless of which story you believe on DeepSeek, we've clearly entered a new era of competition in AI. There are now multiple models that are basically on par across US and Chinese labs. The biggest difference is now cost of inference, and DeepSeek is serving up the cheapest on the market.

While the market reaction has been, let's say, terrified, and we'll get to that in just a moment, the reaction from big tech has not been fear, but the thrill of opportunity. Microsoft CEO Satya Nadella writes, Jevons paradox strikes again. As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we can't get enough of.

Jevons paradox, which is a term you're going to hear a lot more about in the next couple of days, refers to the phenomenon where technological progress leads to efficiency gains and cost reductions, but rather than reducing demand, it actually leads to a massive increase in demand. As an example, think about what happened to the demand for cloud storage as the cost became negligible.
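
A quick back-of-the-envelope version of the argument, with purely illustrative numbers:

```python
# Jevons paradox with a constant-elasticity demand curve. All numbers
# are illustrative assumptions, not market data.
old_price, new_price = 60.0, 2.19  # $ per 1M output tokens, roughly O1-era vs. R1-era
elasticity = 1.5                   # assumed; > 1 means demand grows faster than price falls

usage_multiplier = (old_price / new_price) ** elasticity
old_spend = old_price * 1.0        # baseline: 1M tokens per day
new_spend = new_price * usage_multiplier
print(f"usage grows ~{usage_multiplier:.0f}x; daily spend goes from "
      f"${old_spend:.0f} to ${new_spend:.0f}")
# With elastic demand, cheaper tokens mean MORE total spend, not less.
```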

As an open source project, DeepSeek has fully described their method and provided their data set, so theoretically there's nothing to hide. Hugging Face is currently replicating the model in its own training run, so we'll know soon enough. If successful, the methods will quickly be copied by every big tech firm and hundreds of startups. The implication of cost-efficient training and extremely good models is likely that the next era of AI is all about inference. In other words, companies are no longer just competing on the quality of their models. They're competing to deliver them as cheaply as possible. Today's episode is brought to you by Vanta.

Trust isn't just earned, it's demanded. Whether you're a startup founder navigating your first audit or a seasoned security professional scaling your GRC program, proving your commitment to security has never been more critical or more complex. That's where Vanta comes in. Businesses use Vanta to establish trust by automating compliance needs across over 35 frameworks like SOC 2 and ISO 27001.

Centralized security workflows complete questionnaires up to 5x faster and proactively manage vendor risk. Vanta can help you start or scale up your security program by connecting you with auditors and experts to conduct your audit and set up your security program quickly. Plus, with automation and AI throughout the platform, Vanta gives you time back so you can focus on building your company. Join over 9,000 global companies like Atlassian, Quora, and Factory who use Vanta to manage risk and prove security in real time.

For a limited time, this audience gets $1,000 off Vanta at vanta.com slash nlw. That's v-a-n-t-a dot com slash nlw for $1,000 off.

If there is one thing that's clear about AI in 2025, it's that the agents are coming. Vertical agents by industry, horizontal agent platforms, agents per function. If you are running a large enterprise, you will be experimenting with agents next year. And given how new this is, all of us are going to be back in pilot mode.

That's why Superintelligent is offering a new product for the beginning of this year. It's an agent readiness and opportunity audit. Over the course of a couple quick weeks, we dig in with your team to understand what type of agents make sense for you to test, what type of infrastructure support you need to be ready, and to ultimately come away with a set of actionable recommendations that get you prepared to figure out how agents can transform your business.

If you are interested in the agent readiness and opportunity audit, reach out directly to me, nlw at bsuper.ai. Put the word agent in the subject line so I know what you're talking about. And let's have you be a leader in the most dynamic part of the AI market. Hello, AI Daily Brief listeners. Taking a quick break to share some very interesting findings from KPMG's latest AI Quarterly Pulse Survey.

Did you know that 67% of business leaders expect AI to fundamentally transform their businesses within the next two years? And yet, it's not all smooth sailing. The biggest challenges that they face include things like data quality, risk management, and employee adoption. KPMG is at the forefront of helping organizations navigate these hurdles. They're not just talking about AI, they're leading the charge with practical solutions and real-world applications.

For instance, over half of the organizations surveyed are exploring AI agents to handle tasks like administrative duties and call center operations. So if you're looking to stay ahead in the AI game, keep an eye on KPMG. They're not just a part of the conversation, they're helping shape it. Learn more about how KPMG is driving AI innovation at kpmg.com slash US.

Now, when it comes to market reactions, there has definitely been a response. The S&P 500 futures market was down more than 3% in overnight trading, and some amount of panic is absolutely settling in. The concern, of course, is that big tech has sunk hundreds of billions of dollars into AI infrastructure over the past few years and seems likely to spend a trillion dollars this year.

One argument is that DeepSeek has rendered all of those US GPUs worthless, as Chinese AI proves you can do it in a totally different way without all of that expensive capex. Going back to that piece by Jeffrey Emanuel, he broke down the bear case for NVIDIA in that extensive post. He tackles a ton of areas where NVIDIA has excelled over recent years, from software to chip networking to raw performance. The logic is that competing chip makers are catching up quickly across multiple vectors. Couple this with a massive decrease in training costs and AI chips quickly become a commodity.

NVIDIA is the leader in producing top-of-the-line chips for training clusters, but if the focus shifts to being about delivering cheap inference, there are other companies that are much more competitive with NVIDIA in that space, and soon to be many more.

Investor Nic Carter writes,

And yet Nic also gets at the counterpoint. In his next tweet, he says, all of that said, I don't worry too much about equity value in NVIDIA and AI data center companies. Although he points out he does have a massive bag bias here. Why, he writes, when a commodity gets cheaper, the use of that commodity increases. So inference overnight becomes vastly more abundant.

DeepSeek's innovations will be rapidly incorporated by other model companies so AI can be embedded cheaply everywhere. This probably shifts the ratio of training to inference and AI capex in favor of the latter, but I don't believe it undermines equity value in the firms that produce the inputs for inference, GPUs, data centers, etc. Just accelerates the transition from pre-AI world to fully embedded world. All of that said, the investor premise that the model companies, OpenAI, Anthropic, etc., are where equity value will accrete has a massive hole in it now.

I've always felt and have said that I thought model companies would be capital incinerators due to high quality open source models and a race to the bottom. And I think that is more true now. But overall, I don't worry about the rest of the stack, whether it's the producers or the firms that are actually bundling up compute and selling it to the end user in the form of better consumer experiences. TLDR, for most of you, no need to panic. Although I think it will take the market some time to digest and the ride will be bumpy in the near term.

Y Combinator's Garry Tan took this on as well, responding to a market analyst who said China's DeepSeek could represent the biggest threat to U.S. equity markets, calling into question the utility of the hundreds of billions worth of CapEx being poured into the industry. Garry writes, Do people really believe this? If training models get cheaper, faster, and easier, the demand for inference, actual real-world use of AI, will grow and accelerate even faster, which assures the supply of compute will be used.

Ben Thompson of Stratechery also makes this point. He writes, "...in the long run, model commoditization and cheaper inference, which DeepSeek has demonstrated, is great for big tech. A world where Microsoft gets to provide inference to its customers for a fraction of the cost means that Microsoft has to spend less on data centers and GPUs, or, just as likely, sees dramatically higher usage given that inference is so much cheaper." Another big winner is Amazon. AWS has by and large failed to make their own quality model, but that doesn't matter if there are very high-quality open-source models that they can serve at far lower costs than expected.

Apple, and this is an interesting one to me, is also a big winner, Ben writes. Dramatically decreased memory requirements for inference make edge inference much more viable, and Apple has the best hardware for exactly that. Apple Silicon uses unified memory, which means that the CPU, GPU, and NPU have access to a shared pool of memory. This means that Apple's high-end hardware actually has the best consumer chip for inference. Meta, meanwhile, is the biggest winner of all. I already laid out how every aspect of Meta's business benefits from AI. A big barrier to realizing that vision is the cost of inference,

which means that dramatically cheaper inference, and dramatically cheaper training, given the need for Meta to stay on the cutting edge, makes that vision much more achievable. Google, he does say, is probably in worse shape. A world of decreased hardware requirements lessens the relative advantage they have from TPUs. More importantly, a world of zero-cost inference increases the viability and likelihood of products that displace search. Granted, Google gets lower costs as well, but any change from the status quo is probably a net negative.

Still, as Ben points out, the reason the stocks are down is that, quote, it seems likely the market is working through the shock of R1's existence.

The moment has unquestionable geopolitical ramifications as well. It's not the first time a Chinese lab has demonstrated cutting-edge capabilities, but it is the first time a Chinese model has grabbed this kind of mindshare. Importantly, R1 is competing on price in the same way that Chinese industries have outcompeted their U.S. counterparts for several decades. This moment runs right up against the Trump administration's goal of US AI dominance and will kickstart a new chapter in the rivalry.

Investor Chamath Palihapitiya has a long thread explaining how the chessboard has changed in his opinion. He covered the need to pivot to inference and export those chips aggressively to allies, and also warned that VCs have been asleep at the wheel and need to improve their capital discipline.

He writes,

More spending, more meetings, more oversight, more weekly reports, and the like does not equate to more innovation. Unburden our technical stars to do their magic. A more joking take came from Jordi Hays, who said, The most patriotic thing you could do right now is develop software to use so much DeepSeek inference that you bankrupt the CCP.

Some say that the battle here is not really about China versus the U.S., but about open source versus closed source. Meta's chief AI scientist, Yann LeCun, writes, To people who see the performance of DeepSeek and think China is surpassing the U.S. in AI, you're reading this wrong. The correct reading is, open source models are surpassing proprietary ones. DeepSeek has profited from open research and open source. They came up with new ideas and built them on top of other people's work. Because their work is published in open source, everyone can profit from it.

That is the power of open research and open source. Menlo Ventures' Deedy Das had a contrarian take after running comparisons all weekend. He plotted R1's performance against OpenAI's O3 model and suggested it's probably better. Then again, this performance is extrapolated on massively increased inference, so who really knows? He did point out, however, quote, "...the China is crushing U.S. rhetoric totally forgets about Gemini 2.0 Flash Thinking. Likely cheaper, longer context, and better on reasoning."

Overall, what I think is unmistakable from this, hold aside the geopolitical implications, hold aside the stock market implications, intelligence has just gotten massively cheaper. There is no way that this doesn't drive prices down.

Professor Ethan Mollick writes, I think the market will adjust to any per-token cost decrease brought on by DeepSeek quite quickly. Costs for GPT-4-level intelligence dropped by 1,000x in the last 18 months. A 95% price drop in reasoning models seems not to be something that will break the labs.
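
That 1,000x figure is worth a quick sanity check; it implies prices halving roughly every two months:

```python
# Sanity-checking Mollick's figure: a 1,000x cost drop over 18 months.
import math

monthly_factor = 1000 ** (1 / 18)                        # ~1.47x cheaper per month
halving_months = math.log(2) / math.log(monthly_factor)  # ~1.8 months per halving
print(f"{monthly_factor:.2f}x per month, halving every {halving_months:.1f} months")
```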

Indeed, some, many, in fact, are reminding us that this is exactly the type of situation where you want to get away from mainstream media and look more deeply with people who are closer to the news. NVIDIA's Dr. Jim Fan writes, An obvious we-are-so-back moment in the AI circle somehow turned into it's-so-over in the mainstream. Unbelievable short-sightedness. The power of O1 in the palm of every coder's hand to study, explore, and iterate upon. Ideas compound. The rate of compounding accelerates with open source. The pie just got much bigger faster.

We as One Humanity are marching towards universal AGI sooner. Zero-sum game is for losers. It is very tempting in AI to dismiss big headlines as hyperbole. So many of the thread boys and the YouTubers are just looking for the next dopamine hit of this changes everything. But in this case, DeepSeek, I think, might be as big a moment as people are feeling. My guess, though, is that it's not the moment that the market is reacting to, but the one that Dr. Jim Fan is pointing out here.

No matter what, 2025 just got a heck of a lot more interesting. So come on back as we wait for the next crazy shoe to drop. For now, that's going to do it for today's AI Daily Brief. Until next time, peace.