(Preview) Xi Jinping and China's Tech Companies, The Long-Run Implications of the Chip Ban, and a Pessimistic Outlook for Taiwan

2025/2/19

Sharp Tech with Ben Thompson

People
Ben Thompson
Founded and runs the subscription newsletter Stratechery, which focuses on business and strategy analysis of the technology industry.
Bill Bishop
Topics
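Ben Thompson: I think DeepSeek's emergence will have a significant and hard-to-predict impact on the AI field. Its decision to open source, together with its strong performance, not only changed perceptions of China's AI capabilities but also shook the U.S. stock market and AI-related stocks. Even under the constraints of the chip ban, DeepSeek found creative ways to build an internationally competitive model on the hardware it had, and open sourcing rapidly raised the global influence of Chinese AI models. DeepSeek's success also hit U.S. AI companies, forcing them to compete more aggressively, for example by cutting prices and improving their models. DeepSeek's success was not entirely unexpected and market competition remains fierce, but it also showed that there is a gap between the reality of AI and people's expectations, and that the reality is impressive in its own right. U.S. AI companies can learn from DeepSeek's experience to make their models more efficient. Going forward, differences in chip supply may widen the gap between Chinese and U.S. AI development, and the effect of the chip ban on China's AI development is worth watching.
Bill Bishop: A breakthrough in AI for China takes more than money and incentives; it takes a belief that "we can do this." DeepSeek's decision to open source was mainly its own, not the result of direct intervention by the Chinese government. DeepSeek's open sourcing and performance raised the global influence of Chinese AI and hit other AI companies. DeepSeek's success was no accident; it rests on years of technical work and model iteration. The cost figure DeepSeek published was misread, and its actual costs were far higher than the published number. DeepSeek's R1 model became popular because of its reasoning ability and user interface, and its success is tied to expectations about Chinese AI and worries about U.S. AI investment. DeepSeek's success resulted from several factors working together, including its technical strength, the market environment, and public sentiment.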

Chapters
This chapter explores the emergence of DeepSeek, a Chinese AI model that unexpectedly disrupted the global AI market. Its open-source nature and surprising capabilities sparked debate about its origins, impact, and the future of AI competition.
  • DeepSeek's open-source model challenged established players like OpenAI.
  • Its release caused significant fluctuations in global stock markets.
  • The chapter discusses the debate around DeepSeek's organic vs. inorganic social media surge.

Hello and welcome to a free preview of Sharp Tech. Well, speaking of DeepSeek, one of my sort of thoughts that I sort of mentioned in passing when writing about it is that DeepSeek would provide a... There's going to be some impact that's going to be hard to know what it is now, but probably significant impact.

China and just almost more in the psyche slash belief perspective where you go back to these are very hard problems that are needed to solve. It's money's not enough. Incentive is also not enough. You sort of need the belief that we can do this. And

Is that a good read? Is there a bit where DeepSeek, again, which is a very good model. It's not the leading model, but it is in the class of the leading models, both V3 and sort of R1 beyond that. Has there been sort of this positive or do you anticipate this sense that, look, yeah, even the stuff that the West is supposed to be the best at, we're just as good?

Oh, well, DeepSeek really created that view. I mean, there are other models. Alibaba has a model. Apple apparently is going to use it for Apple Intelligence in China. You know, Baidu has a model. But DeepSeek kind of came out of nowhere and they open sourced it. And then, of course, they, you know, the DeepSeek story, after it percolated for a few days, crashed U.S. stock markets, crashed NVIDIA,

and caused quite a real sort of melt-up in some of the AI and tech-related stocks that trade in Hong Kong. And it was very much a turning point, I think, from a psychological perspective in the sense that, yeah, we can do this. And even though we're struggling under this chip blockade, DeepSeek showed that they could find ways to

you know, very creative ways to maximize the hardware that they had and build,

an internationally competitive model. And then they made it open source. So now everyone's using it. Baidu's integrated it. I think Tencent has integrated it. Are you running it yet on your local machine? Oh, yeah. I downloaded it. Or a smaller version. I don't have beefy enough hardware to do the full model. But yeah. Why is it open source? Is there any sense in the...

central government. I mean, again, people overestimate the extent to which the central government knows or cares about. I think DeepSeek has made their own decisions all along. Is there a sense that, oh, this is actually really valuable? Should we be open sourcing it? So I think what you said, I think they made their own decisions. You know, they originally were a hedge fund. They actually got in a little bit of trouble around there's a crackdown on quant trading. They were a quant fund. But they had bought all this hardware, all these... And Xi Jinping is now, this is my quant fund.

And well, yeah, right. And it's amazing how quickly he's risen. Liang Wenfeng, the CEO, was actually he was at this meeting with Xi on Monday. He met with the premier like a week, two weeks ago or three weeks ago. But no, I think I think they they just did it. They open sourced it. And now, though, I think there's a realization that actually this is an incredibly powerful thing for China because it's a, you know,

It's a very good model. It's open sourced. And so anyone, any country, anyone around the world can download it and have this Chinese model running instead of having to pay up for Claude or for OpenAI. It's a really fascinating way of very quickly the Chinese, at least one Chinese AI model can go global.

Yeah, I mean, the reaction to it's been really interesting because, I mean, most people, their encounter with it is not downloading it to their local machine and running it. It's using the DeepSeek app, which – but it speaks, I mean, just from like a business perspective. I think that OpenAI, number one, I said sort of from the very beginning that ChatGPT was just an accident in many respects, but they're –

They had achieved the most valuable and difficult thing in tech, which is a consumer brand with like meaningful market share. Part of that is your inevitable end state is advertising and they need to get there fast so that they can give free users the best possible models. And people got DeepSeek and were like, wow, this is amazing. It's so much better. Well, yeah, because they weren't paying up for the better sort of OpenAI models. It wasn't the best, but to a lot of people, it felt like it was. Right.

And just the, I don't know, like was the propaganda effect of DeepSeek, was it greater in China or on people in the U.S. and the West? That's a great question. I think in China, what's interesting is how it's really so quickly changed the market because now all these other companies that were trying to charge for their models, now they have to go free too. So it's not at all clear what the business model is around these models in China now. Oh, that's a question in the U.S. too. Don't worry.

Well, at least in the US, OpenAI has revenue, right? Anthropic has revenue with subscriptions. Not enough to pay for, not enough to cover costs. But I do think it was interesting. I'm a fairly skeptical person. I'm curious. The sudden surge on social media of DeepSeek, like on X and in the App Store,

I do wonder how much of that was really totally authentic and how much of that was inorganic. Yeah, I know you mentioned that. I feel like it was pretty authentic. I think the meta bit about, because the reality is V3 came out over Christmas, which was actually a lot. They documented a lot. So they've been publishing papers and models for several years.

So this wasn't out of nowhere by any means. And then I think V3 had some of those cost estimates, which again were totally twisted and warped by everyone driving their own agenda. They were very clear in the paper. The cost they published was for the specific training run. It wasn't for all the experimentation and all the R and D and all those sorts of things. And they never said otherwise, like people are trying to paint it. They're trying to trick people. It's like, did you read the paper? They like,

The paper is very clear and lists all the things that that cost did not include. And so V3 comes out. That's the one that actually had that dollar figure, the $6 million or whatever it was. $6 million or so, yeah. And it was a very, very good model that was very, very cheap. And then R1 comes out, and I think it was a combination of

People hadn't used reasoning models yet because they were paywalled. So number one, it was people's first access to this reasoning model. Number two, the UI for DeepSeek was better, or the UX, I should say, because it actually laid out its thinking. And if that was the first time you used a reasoning model and you see the model like talking to itself and trying to figure out the answer,

It's kind of charming. It's like, oh, look, my little AI friend is trying to help me out and figure this out, which OpenAI did not expose for competitive reasons. And they're like, we're not going to list what we're doing. So you had a double whammy of it was behind a paywall and it was behind a competition wall or whatever you might want to be. Then you layer on the general sort of

angst about China, that China, at least we have AI. That's our great hope. And then the bit that

We're spending billions and billions of dollars. The stock market is resting on this investments of billions and billions of dollars. Is this all sort of kaput? And I think all those just created a perfect storm. It just became a current thing for a weekend. We've seen that happen before. I think it, I think just that there were so many factors that made sense for this to explode that I'm inclined to give the benefit of the doubt to it being organic as opposed to inorganic.

Okay, well, I think it's some mix, but I will defer to you on that. I think you made a pretty compelling case. I will say what's interesting, right, is DeepSeek, they disrupted, obviously, stock prices here. And to be clear, they're almost back up to where they were. So, I mean, it was very much a current thing, but...

But the disruption was... They also disrupted the Chinese AI market, which is really interesting, right? So this is where they went. They disrupted globally. And frankly, I think... Good for them. I think the US AI companies needed to be disrupted. They were really fat and happy. Oh, yeah. No, I mean...

People were comparing to OpenAI pricing. That's because their margins were super large. Like, the pricing's already come down. They've already gotten more aggressive, I think, in releasing things. The 4o update over the weekend appears to be significantly reduced in terms of sort of the HR voice in, like, scolding you for... It's just... It's more open. I think...

we're seeing actually a pretty compelling competitive response. And by the way, like Google has models out there that are even cheaper and arguably just as good or better. So it's not, again, it was just this perfect. Everyone's perception got, it was a bubble that got pricked, but it wasn't, if you were paying attention, it wasn't totally shocking. Now,

I hesitate, I almost feel bad saying that because DeepSeek deserves so much credit. And the engineering they did was amazing. And all their work, if you go back two years and read their papers, I haven't read all of them, but I've read three or four of them. It's really good stuff and some genuine breakthroughs that are going to be or have been adopted globally. But that almost sort of makes the point. It's like the myth of AI

has always been a bit different than the reality, but the reality is also fairly spectacular and not fully appreciated either. So there's just this crazy mishmash. No, it's interesting. I mean, again, I think that, you know, the Silicon Valley firms should thank DeepSeek for a lot of what they did, right? Because ultimately, even though the, you know, the OpenAI, Anthropic, xAI, they could buy as many NVIDIA chips as NVIDIA can make, right?

won't they be able to make their models run much more efficiently and much better if they learn from DeepSeek? Well, so this is the interesting thing with Grok. Grok 3 just came out this week. It appears to be the state-of-the-art model, at least the only...

o3 may be better, but o3 is this very distinct sort of thinking sort of model that OpenAI, I think, is not going to ever release directly. It is in Deep Research, which is incredible and has very clear flaws, to be clear, but is a very – this is a – at least for someone like me, this is a – I know the people who program have felt this way for a while because AI has –

made such a difference there, but a very visceral, like, yeah, there's a lot of jobs that are really screwed looking forward, you know? And so, so it's a state-of-the-art or state-of-the-art adjacent model. And what's incredible about xAI is it was founded 19 months ago.

And now they have a state-of-the-art model. And it's almost the inverse. It's the flip side of the DeepSeek story, which is it's incredible these optimizations DeepSeek did. They completely rethought how you do sort of a mixture of experts architecture, which is definitely –

better for inference, but it had all this training overhead and they just sort of changed how you do the training to be able to scale that much more gracefully because of their bandwidth limitations, and they couldn't handle too much overhead. And I also believe, by the way, they were using H800s, they weren't using H100s, because they did so many things with how they designed the model that,

That speaks of, this is a company struggling with bandwidth limitations, which was the exact... And they said that. I mean, they've said, the CEO has said, other employees have said, their biggest constraint is chips. Right. Which I think is totally... It totally lines up with the way the model is designed. So I actually think DeepSeek has been...

Again, with China, with everyone in general, you should be skeptical. But this is another case where I actually, I believe them. Everything around this story sort of lines up with that. But xAI comes in and they deliver the state-of-the-art model in 19 months.

And a big part of that is they have, they've raised $16 billion or $12 billion and they bought a whole bunch of Nvidia chips and they wired them all together. And how could they do that? Because they had access to the chips and also Nvidia, one of their big differentiators is all the networking stuff they do, where they make it easy and plausible to tie a ton of chips together to get this sort of performance. And so you can look at American AI companies and say, well,

wow, why didn't you do this optimization? On the other hand, if you look at it from a comparative advantage perspective, it's like,

I always mock big companies like trying to copy a startup, like a startup invents something and they're like, oh, we can do that too. And then you get like Facebook releasing like the Poke application. It's like, why are you, why are you inventing? Something's really hard. You're almost capturing lightning in a bottle when you're small and a startup, you do it because that's the only way to do it. And by the way, most startups fail.

If you're a big company, you have large amounts of cash. You can de-risk by just going and buying the startup. Go and buy the people inventing it. Bring it in-house. Or in the case of Facebook, the way that Poke was a response to Snapchat, what they actually did is, okay, we'll just rip off Stories and put it on Instagram and basically stop Snapchat in its tracks. And it's not very glamorous, but it's actually recognizing your advantage.

And I think that's what we saw with xAI. Did they do the grunt work of a DeepSeek to heavily optimize around a limited number of chips with low bandwidth? No, they just bought a bunch of chips because they had a bunch of money, but it also got them where they wanted to go. Right. And so xAI and DeepSeek have

totally different approaches, but both of those approaches are rational given their circumstances. And that in itself, I think, is an interesting takeaway. And then one of the questions, right, is going forward, you push out a year or two years. If DeepSeek continues to not have access to the best NVIDIA chips and effectively can only buy Huawei's Ascend chips, whereas xAI or OpenAI can keep buying the better NVIDIA chips, do you start seeing a real separation?

I mean, that is the big question. There's a couple of concerns that I have about this. And I think we've talked a bit about this online. So let's buckle up and sort of get into it.

All right, and that is the end of the free preview. If you'd like to hear more from Ben and I, there are links to subscribe in the show notes, or you can also go to sharptech.fm. Either option will get you access to a personalized feed that has all the shows we do every week, plus lots more great content from Stratechery and the Stratechery Plus bundle. Check it out, and if you've got feedback, please email us at email@sharptech.fm.