Hello and welcome to a free preview of Sharp Tech. Well, speaking of DeepSeek, one of the thoughts I mentioned in passing when writing about it is that DeepSeek was going to have some impact that's hard to pin down now, but probably a significant impact, in
China, and almost more from a psyche-slash-belief perspective, where you go back to: these are very hard problems that need to be solved. Money's not enough. Incentive is also not enough. You sort of need the belief that we can do this. And
Is that a good read? Is there a bit where DeepSeek, which again is a very good model, not the leading model, but in the class of the leading models, both V3 and R1 beyond that, has there been, or do you anticipate, this positive sense that, look, yeah, even the stuff that the West is supposed to be the best at, we're just as good?
Oh, well, DeepSeek really created that view. I mean, there are other models. Alibaba has a model; Apple apparently is going to use it for Apple Intelligence in China. Baidu has a model. But DeepSeek kind of came out of nowhere and they open sourced it. And then, of course, the DeepSeek story, after it percolated for a few days, crashed U.S. stock markets, crashed NVIDIA,
and caused quite a melt-up in some of the AI and tech-related stocks that trade in Hong Kong. And it was very much a turning point, I think, from a psychological perspective, in the sense that, yeah, we can do this. And even though we're struggling under this chip blockade, DeepSeek showed that they could find ways, very creative ways, to maximize the hardware that they had and build
an internationally competitive model. And then they made it open source, so now everyone's using it. Baidu's integrated it; I think Tencent has integrated it. Are you running it yet on your local machine? Oh, yeah, I downloaded it. Or a smaller version; I don't have beefy enough hardware to run the full model. But yeah.
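For listeners curious what running "a smaller version" locally looks like in practice, here is a minimal sketch using the Hugging Face transformers library to load one of the distilled R1 checkpoints; the repo id and generation settings are illustrative assumptions, not a description of exactly what either of us ran:

```python
# Minimal sketch: loading a distilled DeepSeek-R1 checkpoint locally with the
# Hugging Face transformers library. The repo id below is an assumption for
# illustration; the full 671B-parameter model needs far more memory than a
# typical workstation, which is why the smaller distilled variants exist.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"  # assumed distilled checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # spread across available GPU/CPU memory (needs `accelerate`)
)

prompt = "Why did DeepSeek's R1 release get so much attention?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```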
Why is it open source? Is there any sense in the central government... I mean, again, people overestimate the extent to which the central government knows or cares about this. I think DeepSeek has made their own decisions all along. Is there a sense that, oh, this is actually really valuable, should we be open sourcing it? So I think, as you said, they made their own decisions. You know, they originally were a hedge fund. They actually got in a little bit of trouble around the crackdown on quant trading; they were a quant fund. But they had bought all this hardware, all these... And Xi Jinping is now like, this is my quant fund.
And well, yeah, right. And it's amazing how quickly he's risen. Liang Wenfeng, the CEO, was actually at this meeting with Xi on Monday. He met with the premier like a week, two weeks ago, maybe three weeks ago. But no, I think they just did it. They open sourced it. And now, though, I think there's a realization that actually this is an incredibly powerful thing for China, because, you know,
it's a very good model, it's open sourced, and so anyone, any country, anyone around the world can download it and have this Chinese model running instead of having to pay up for Claude or for OpenAI. It's a really fascinating way for at least one Chinese AI model to go global very quickly.
Yeah, I mean, the reaction to it has been really interesting, because most people's encounter with it is not downloading it to their local machine and running it; it's using the DeepSeek app. But it speaks, just from a business perspective... I think that OpenAI, number one, I've said from the very beginning that ChatGPT was an accident in many respects, but they're...
They had achieved the most valuable and difficult thing in tech, which is a consumer brand with meaningful market share. Part of that is that your inevitable end state is advertising, and they need to get there fast so that they can give free users the best possible models. And people got DeepSeek and were like, wow, this is amazing, it's so much better. Well, yeah, because they weren't paying up for the better OpenAI models. It wasn't the best, but to a lot of people, it felt like it was. Right.
And just, I don't know, was the propaganda effect of DeepSeek greater in China or on people in the U.S. and the West? That's a great question. I think in China, what's interesting is how quickly it's changed the market, because all these other companies that were trying to charge for their models now have to go free too. So it's not at all clear what the business model is around these models in China now. Oh, that's a question in the U.S. too. Don't worry.
Well, at least in the US, OpenAI has revenue, right? Anthropic has revenue with subscriptions. Not enough to cover costs, but still. I do think it was interesting, though. I'm a fairly skeptical person, so I'm curious: the sudden surge of DeepSeek on social media, like on X, and in the App Store,
I do wonder how much of that was really authentic and how much of it was inorganic. Yeah, I know you mentioned that. I feel like it was pretty authentic. I think the meta bit about... because the reality is V3 came out over Christmas, and it was actually well documented. They've been publishing papers and models for several years.
So this wasn't out of nowhere by any means. And then I think V3 had some of those cost estimates, which were totally twisted and warped by everyone driving their own agenda. They were very clear in the paper: the cost they published was for the specific training run. It wasn't for all the experimentation and all the R&D and all those sorts of things, and they never said otherwise, the way people are trying to paint it, as if they were trying to trick people. It's like, did you read the paper?
The paper is very clear and lists all the things that that cost did not include. And so V3 comes out; that's the one that actually had that dollar figure, the $6 million or whatever it was. $6 million or so, yeah. And it was a very, very good model that was very, very cheap.
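For reference, that headline number is easy to reconstruct from what the V3 technical report actually states, which is the GPU-hours of the final training run priced at an assumed rental rate; treat the figures below as approximate:

```latex
% Rough reconstruction of the published V3 training-run cost
% (GPU-hours and the $2/hour H800 rental rate are the paper's own assumptions)
2.788\text{ million H800 GPU-hours} \times \$2\text{ per GPU-hour} \approx \$5.6\text{ million} \approx \$6\text{ million}
```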
And then R1 comes out, and I think it was a combination of factors. People hadn't used reasoning models yet because they were paywalled, so number one, it was people's first access to a reasoning model. Number two, the UI for DeepSeek was better, or the UX, I should say, because it actually laid out its thinking. And if that was the first time you used a reasoning model, and you see the model talking to itself and trying to figure out the answer,
it's kind of charming. It's like, oh, look, my little AI friend is trying to help me out and figure this out. OpenAI did not expose that for competitive reasons; they're like, we're not going to list what we're doing. So you had a double whammy: it was behind a paywall, and it was behind a competition wall, or whatever you want to call it. Then you layer on the general sort of
angst about China, the sense that, well, at least we have AI, that's our great hope. And then the bit that
we're spending billions and billions of dollars, and the stock market is resting on these investments of billions and billions of dollars. Is this all sort of kaput? And I think all of those just created a perfect storm. It just became the current thing for a weekend; we've seen that happen before. I think there were so many factors that made sense for this to explode that I'm inclined to give the benefit of the doubt to it being organic as opposed to inorganic.
Okay, well, I think it's some mix, but I will defer to you on that. I think you made a pretty compelling case. I will say what's interesting, right, is DeepSeek obviously disrupted stock prices here, and, to be clear, those have almost come back to where they were. So, I mean, it was very much a current thing, but...
But the disruption was... They also disrupted the Chinese AI market, which is really interesting, right? So that's where they went: they disrupted globally. And frankly, I think, good for them. I think the U.S. AI companies needed to be disrupted. They were really fat and happy. Oh, yeah. No, I mean...
People were comparing to OpenAI pricing, and that's because their margins were super large. Like, the pricing's already come down. They've already gotten more aggressive, I think, in releasing things. The 4o update over the weekend appears to be significantly reduced in terms of the HR voice, like scolding you for things... It's just more open. I think
we're seeing actually a pretty compelling competitive response. And by the way, Google has models out there that are even cheaper and arguably just as good or better. So again, it was just this perfect storm; everyone's perception was a bubble that got pricked, but if you were paying attention, it wasn't totally shocking. Now,
I hesitate, I almost feel bad saying that, because DeepSeek deserves so much credit, and the engineering they did was amazing. All their work, if you go back two years and read their papers, and I haven't read all of them, but I've read three or four of them, it's really good stuff, and there are some genuine breakthroughs that are going to be, or have been, adopted globally. But that almost sort of makes the point. It's like, the myth of AI
has always been a bit different than the reality, but the reality is also fairly spectacular and not fully appreciated either. So there's just this crazy mishmash. No, it's interesting. I mean, again, I think that, you know, the Silicon Valley firms should thank DeepSeek for a lot of what they did, right? Because ultimately, even though OpenAI, Anthropic, xAI, they could buy as many NVIDIA chips as NVIDIA can make, right?
won't they be able to make their models run much more efficiently and much better if they learn from DeepSeek? Well, so this is the interesting thing with Grok. Grok 3 just came out this week, and it appears to be the state-of-the-art model, at least the only...
o3 may be better, but o3 is this very distinct sort of thinking model that OpenAI, I think, is never going to release directly. It is in Deep Research, which is incredible, and has very clear flaws, to be clear, but it is, at least for someone like me... I know the people who program have felt this way for a while, because AI has
made such a difference there, but it's a very visceral, like, yeah, there are a lot of jobs that are really screwed looking forward, you know? And so it's a state-of-the-art or state-of-the-art-adjacent model. And what's incredible about xAI is that it was founded 19 months ago.
And now they have a state-of-the-art model. And it's almost the inverse, the flip side of the DeepSeek story, which is: it's incredible, these optimizations DeepSeek did. They completely rethought how you do a mixture-of-experts architecture, which is definitely
better for inference, but it had all this training overhead, and they changed how you do the training to be able to scale much more gracefully, because of their bandwidth limitations; they couldn't handle too much overhead. And I also believe, by the way, they were using H800s, they weren't using H100s, because they did so many things with how they designed the model.
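To make the mixture-of-experts point concrete, here is a minimal, generic sketch of top-k expert routing in PyTorch; it is not DeepSeek's actual implementation, just an illustration of why only a few experts run per token at inference, while training at scale turns routing into heavy cross-device communication, which is exactly where limited interconnect bandwidth hurts.

```python
# Generic top-k mixture-of-experts layer (illustrative sketch, not DeepSeek's code).
# Each token is routed to only k of n_experts, which keeps inference FLOPs low;
# in a multi-GPU setup the routing becomes all-to-all traffic, which is where
# constrained interconnect bandwidth (e.g. on H800s) becomes the bottleneck.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # scores each token per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                      # x: (num_tokens, d_model)
        scores = self.router(x)                # (num_tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over the k chosen experts
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            # Gather only the tokens routed to this expert and run them through it.
            token_pos, slot = (idx == e).nonzero(as_tuple=True)
            if token_pos.numel():
                out[token_pos] += weights[token_pos, slot].unsqueeze(-1) * expert(x[token_pos])
        return out


moe = TopKMoE()
print(moe(torch.randn(16, 512)).shape)  # torch.Size([16, 512])
```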
Those choices speak of a company struggling with bandwidth limitations, which was the exact... And they said that. I mean, the CEO has said it, other employees have said it: their biggest constraint is chips. Right. Which totally lines up with the way the model is designed. So I actually think DeepSeek has been straightforward here.
Again, with China, as with everyone in general, you should be skeptical, but this is another case where I actually believe them. Everything around this story lines up with that. But xAI comes in and they deliver a state-of-the-art model in 19 months.
And a big part of that is they've raised $16 billion or $12 billion, and they bought a whole bunch of NVIDIA chips and wired them all together. And how could they do that? Because they had access to the chips, and also because one of NVIDIA's big differentiators is all the networking stuff they do, where they make it easy and plausible to tie a ton of chips together to get this sort of performance. And so you can look at American AI companies and say, well,
wow, why didn't you do this optimization? On the other hand, if you look at it from a comparative advantage perspective, it's like,
I always mock big companies for trying to copy a startup. Like, a startup invents something and they're like, oh, we can do that too, and then you get Facebook releasing the Poke application. It's like, why are you inventing? Inventing something is really hard; you're almost capturing lightning in a bottle. When you're small and a startup, you do it because that's the only way to do it, and by the way, most startups fail.
If you're a big company, you have large amounts of cash. You can de-risk by just going and buying the startup: go and buy the people inventing it, bring it in-house. Or, in the case of Facebook, while Poke was the response to Snapchat, what they actually did was say, okay, we'll just rip off Stories and put it on Instagram, and that basically stopped Snapchat in its tracks. And it's not very glamorous, but it's actually recognizing your advantage.
And I think that's what we saw with xAI. Did they do the grunt work of a DeepSeek to heavily optimize around a limited number of chips with low bandwidth? No, they just bought a bunch of chips because they had a bunch of money, but it also got them where they wanted to go. Right. And so xAI and DeepSeek have
totally different approaches, but both of those approaches are rational given their circumstances. And that in itself, I think, is an interesting takeaway. And then one of the questions, right, going forward, if you push out a year or two years: if DeepSeek continues to not have access to the best NVIDIA chips and effectively can only buy Huawei's Ascend chips, whereas xAI or OpenAI can keep buying the better NVIDIA chips, do you start seeing a real separation?
I mean, that is the big question. There's a couple of concerns that I have about this. And I think we've talked a bit about this online. So let's buckle up and sort of get into it.
All right, and that is the end of the free preview. If you'd like to hear more from Ben and me, there are links to subscribe in the show notes, or you can also go to sharptech.fm. Either option will get you access to a personalized feed that has all the shows we do every week, plus lots more great content from Stratechery and the Stratechery Plus bundle. Check it out, and if you've got feedback, please email us at email at sharptech.fm.