
Monologue: What Happened With DeepSeek?

2025/2/6

Better Offline

People
Ed Zitron
A podcast host and creator focused on the tech industry's influence and manipulation.
Topics
Ed Zitron: At the end of January, the release of DeepSeek's R1 model upended the AI industry's status quo and called into question the dominance of the American tech industry. DeepSeek's R1 is a reasoning model: it works through problems step by step, which differs from OpenAI's GPT models. DeepSeek's technology not only rivals OpenAI's in capability, it is also reportedly cheaper to train and operate. More importantly, DeepSeek released its models under an open source license, meaning any company can use and modify its technology for free. DeepSeek's emergence challenges the business model of OpenAI and its backers, who have long argued that building powerful AI models requires enormous funding and cutting-edge hardware. Investors may now question OpenAI's funding needs, since DeepSeek has shown that building more efficient, more economical AI models is possible. Personally, I think DeepSeek's open source strategy accelerates the spread of AI technology and may break OpenAI's dominant position in the field. As OpenAI's CEO, Sam Altman may need to re-evaluate his company's strategy to meet the challenge from emerging competitors like DeepSeek. Ed Zitron: I believe DeepSeek's success comes from its ability to build AI models at lower cost and with greater efficiency. They squeezed extra performance out of their GPUs through optimizations that companies like OpenAI never considered. DeepSeek's open source strategy also lets it quickly attract more developers and users, accelerating its technology's development. I once assumed OpenAI held an absolute advantage in AI, but DeepSeek's emergence made me realize that innovation and efficiency are what actually drive AI forward. I hope OpenAI learns from DeepSeek's success and rethinks its strategy to stay competitive. Overall, DeepSeek's emergence is a positive signal for the AI industry: it shows that building powerful AI models does not necessarily require enormous funding and the most advanced hardware. That gives more companies and developers a chance to participate in AI development and may accelerate the technology's spread.


Transcript


Do you want to see into the future? Do you want to understand an invisible force that's shaping your life? Do you want to experience the frontiers of what makes us human? On Tech Stuff, we travel from the mines of Congo to the surface of Mars, from conversations with Nobel Prize winners to the depths of TikTok, to ask burning questions about technology, from high tech to low culture and everywhere in between. Join us. Listen to Tech Stuff on the iHeartRadio app, Apple Podcasts, or wherever you get your podcasts.

Hi, and welcome to the very first Better Offline monologue. This is going to be a short weekly episode where I take a quick look at something going on in the tech industry that doesn't quite warrant a full episode. One might say they're like quick bites of content, Quibis, if you will, and this is a business model that's proven successful time and time again. This week, I'm going to give you a distilled rundown of a recent situation that rocked both the economy and the AI world, for those of you who either need a refresher or rejected the notion of a two-part podcast.

At the end of January, something happened that radically overturned not just the AI industry's status quo, but also called into question the dominance of the American tech industry.

Our story starts on January 20th, when a little-known Chinese company called DeepSeek released its R1 AI model, terrifying the Western tech behemoths that plowed over $200 billion combined into data centers and industrial-grade graphics processing units, GPUs for short, to power generative AI models like those behind ChatGPT and Anthropic's Claude.

Like OpenAI's o1 model, DeepSeek's R1 model is a reasoning model, which is to say that it works through problems step by step, showing users the steps it took to reach its conclusion.

Generally, when you make a request of a generative model, it generates an answer probabilistically, meaning it's guessing at each next bit based on the request you've made. In the case of OpenAI's o1 model, and indeed DeepSeek's R1 model, the model thinks. And I use that term loosely. These models do not know anything. They're not thinking. They have no consciousness. But they think through each step by generating it piece by piece and reviewing it piece by piece with separate parts of the model.
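The "guessing at each next bit" idea can be caricatured in a few lines of Python. This is a toy sketch, not anything like a real transformer: the hard-coded `NEXT_TOKEN_PROBS` table is an invented stand-in for the probability distribution a real model computes from its weights and the full context, but the one-token-at-a-time sampling loop is the shape of the thing.

```python
import random

# Toy next-token distribution. In a real model this comes from a neural
# network conditioned on everything generated so far; here it's hard-coded.
NEXT_TOKEN_PROBS = {
    "the": {"cat": 0.6, "dog": 0.3, "end": 0.1},
    "cat": {"sat": 0.7, "ran": 0.2, "end": 0.1},
    "dog": {"ran": 0.8, "end": 0.2},
    "sat": {"end": 1.0},
    "ran": {"end": 1.0},
}

def generate(start: str, rng: random.Random) -> list[str]:
    """Generate tokens one at a time, sampling each from the distribution
    over possible next tokens: 'guessing at each next bit'."""
    tokens = [start]
    while tokens[-1] != "end":
        dist = NEXT_TOKEN_PROBS[tokens[-1]]
        choices, weights = zip(*dist.items())
        tokens.append(rng.choices(choices, weights=weights)[0])
    return tokens

if __name__ == "__main__":
    print(generate("the", random.Random(0)))
```

Nothing here "knows" what a cat is: each step is just a weighted dice roll over what plausibly comes next, which is exactly why the output can be fluent and still wrong.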

In theory, this ability to reason means it's well suited for tasks where there's a definitive right and wrong answer, like logic and maths. It's also what makes it different from the standard ChatGPT model, GPT-4o, which is considerably faster as it doesn't undertake this step-by-step thinking, and thus is better suited for more open-ended questions such as "What would it be like if Garfield had a gun?" To be clear, this doesn't mean the answers are any good.
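A crude way to see why verifiable tasks suit this style: when a problem has a definitive right answer, a cheap checker can catch a bad generation and ask for another attempt. In this Python toy, every name (`propose_answer`, `check`, `solve`) is invented for illustration, and the "model" is just noisy arithmetic, but the propose-check-retry loop is the point. There is no equivalent checker for the Garfield question.

```python
import random

def propose_answer(a: int, b: int, rng: random.Random) -> int:
    # Stand-in for a model proposing an answer to "a + b = ?".
    # A real model samples text; this toy just adds occasional noise.
    return a + b + rng.choice([-1, 0, 0, 0, 1])

def check(a: int, b: int, answer: int) -> bool:
    # Maths has a definitive right answer, so verification is cheap.
    return answer == a + b

def solve(a: int, b: int, rng: random.Random, max_tries: int = 20):
    # Propose, verify, retry: the generate-and-check shape that makes
    # reasoning-style models a good fit for checkable problems.
    for _ in range(max_tries):
        candidate = propose_answer(a, b, rng)
        if check(a, b, candidate):
            return candidate
    return None
```

Open-ended prose has no `check` function, which is why step-by-step reasoning buys you much less there.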

Now, just a few weeks earlier, DeepSeek had released another model, albeit to far less fanfare, likely due to it being launched the day after Christmas, of course. But nevertheless, it was called V3, and it was still pretty impressive.

V3 competes with the same model that powers ChatGPT, as I just mentioned, which at the time of recording is called GPT-4o. And that's a more general-purpose kind of product. It can write code and solve maths problems, but it's better suited for tasks that are rooted in language: writing that term paper, summarizing a document, whatever it is you do with this. And it's also important to know that this is the most commonly used style of model. You're not really getting reasoning in everything, at least not yet. And I don't know how prevalent it'll ever be.

Now, DeepSeek's tech didn't just match OpenAI in capabilities. It was also purportedly cheaper to train and to operate. Whereas OpenAI's GPT-4 model reportedly cost $100 million to train, some experts estimate that DeepSeek's reasoning model, called R1, cost a lot less than that, and their V3 model actually cost less than $6 million to train. This figure is open to some debate. But the big thing about these models is they're dramatically cheaper.

They can be run on your own computer, though much more slowly, or they can be run on other cloud infrastructure. And in the case of the V3 model, the one that competes with ChatGPT, it was actually about 50 times cheaper, and the reasoning model R1 about 30 times, which is crazy. Now, these are the prices to run them on DeepSeek's own servers, but we're very quickly going to see, as other people host them, exactly how much cheaper they are. And they're more efficient too, which is crazy. They're so much more efficient.
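The cost claims are easy to sanity-check as arithmetic. The figures below are the ones quoted in this episode (reported estimates, not audited costs), and the $1,000-a-month workload is a made-up example just to show what a 50x or 30x price cut means in practice.

```python
# Back-of-envelope arithmetic on the figures quoted in the episode.
# All numbers are reported estimates, not audited costs.
GPT4_TRAINING_COST = 100_000_000  # ~$100M to train GPT-4, reportedly
V3_TRAINING_COST = 6_000_000      # "less than $6 million" for V3

training_ratio = GPT4_TRAINING_COST / V3_TRAINING_COST
print(f"V3 training was roughly {training_ratio:.0f}x cheaper")

def cheaper_cost(baseline_dollars: float, factor: float) -> float:
    """What a workload costs after an N-times price cut."""
    return baseline_dollars / factor

# Hypothetical workload: $1,000/month at OpenAI-style pricing.
print(cheaper_cost(1000, 50))  # V3-class, ~50x cheaper: $20
print(cheaper_cost(1000, 30))  # R1-class, ~30x cheaper: ~$33
```

Even if the training estimate is off by a large factor, the inference-price gap alone is what makes investors start asking questions.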

And it's also important to note that they trained these models using older-generation Nvidia chips, because US export restrictions limited what could be sold to China. They got some of the newer ones too through weird resellers, but nevertheless, this made it much harder for them to get GPUs in general.

And thus, they were able to kind of squeeze more power out of them. They had to come up with really interesting, kind of assembly-language-level stuff, where they did extra things with the GPUs that, well, the fat and happy tech executives never thought of. And Sam Altman and his ilk at OpenAI never really thought of, because, well, why would they have to? They had the unlimited money cheat from the hyperscalers: in the case of OpenAI, funded by Microsoft; in the case of Anthropic, funded by Amazon and Google.

And this is where the narrative has begun to kind of fall apart, because all of this has made it much harder to justify these companies building new data centers and buying new NVIDIA GPUs. This entire AI boom has been based on the assumption that the only way to build powerful models was to get the biggest, most hugest chips from NVIDIA each year, and that there was just no way to make these models cheaper. Now, as an aside, OpenAI lost $5 billion in 2024, and all of their products are unprofitable, even their $200-a-month ChatGPT Pro subscription. I hate these terms, by the way. They're all different. Nevertheless, everyone assumed that there was never going to be a more efficient model. And I personally made the mistake of saying, well, if it was going to be more efficient, surely they would want it to be. Or they could do that, right? Right? Maybe they just have to do this stuff, even though it's stupid. That was never the case. And DeepSeek proved it.

Crucially, DeepSeek released its models under an open source license, meaning any company can reuse and repurpose its tech without paying license fees or asking anyone for permission. OpenAI, by contrast, keeps its technology under lock and key. Despite the name, OpenAI is a deeply secretive organization, open in name only.

In summary, DeepSeek has created a viable alternative to OpenAI's tech, and indeed Anthropic's, that's equally capable, vastly cheaper, and open source, and it's proven that you don't need the most expensive and powerful chips to do so.

And they kind of came out of nowhere. Well, DeepSeek isn't exactly a tiny little startup. They're also not a Silicon Valley giant with billions of dollars of venture capital, or someone backed by one of the many different companies with a $3 trillion market cap. They started off as a side project from a Chinese hedge fund. No, I'm not kidding. Now, still, a hedge fund with $8 billion under management. They're not small at all.

It's so strange. It's a kind of cynical version of David versus Goliath, where David is a hedge fund baby and Goliath is several different hyperscalers taped together with a bad idea.

But anyway, put yourself in the shoes of OpenAI's CEO and co-founder Sam Altman. You've crafted this public perception of yourself as a visionary that isn't just bringing generative AI to the masses, but you're on the path that will bring about artificial general intelligence, which is to say, an AI that's as capable as a human being.

You've crafted this myth, not just about yourself, but about your company and what you'll do. And this has allowed you to, in essence, defy the laws of physics when it comes to business. You can burn money at a rate unlike any tech company in history, with no hope of making a profit, or at least not in the short to medium term, and no real expectation that you'll do so, as investors will still line up to give you more money with your company valued at even more ludicrous numbers seemingly every other month.

You can say these outlandish things like you need $7 trillion to build the infrastructure and chip manufacturing capacity to bring your plans to life, and you don't get laughed out of the room. If I said this shit, they'd ask me if I had a concussion. You can say stuff like, I want to build $500 billion worth of data centers, and instead of people rolling their eyes, the world's largest tech companies and investors will say, damn, man, that's sick. And then it turns out that you were wrong.

You'd always assumed that AI must be expensive, that the models used to power your apps like ChatGPT and DALL-E, their image generator, would always cost more to build, always cost more to run, always require more powerful hardware.

Or maybe you just never thought about it too hard, because you never have to worry about money. And to grow, to build more capable AI models, you assumed that you would always need more money, and so much more money than anyone's ever had. And then here comes this Chinese company. It didn't just replicate the functionality of your model. And on top of that, by the way, o1 is OpenAI's one moat. It was their one thing that people liked. It was their most sophisticated AI model.

But this company came along and did it on a shoestring budget, at least when it came to actually training it, even if the estimates are off by factors of 10. And these things are more efficient too. And this company didn't even have access to the most capable GPUs. They didn't have the server architecture provided by Microsoft or Amazon or Google. And wow. And what did they do next with this thing they built, the thing that's competitive with your only real moat? They gave it away. Oh, goodness me, Sammy, things aren't looking good at all.

And this is where Sam Altman's at. This is where OpenAI and the companies that backed it and their competitors, this is where they're all at. The decisive lead they once enjoyed has, like a puddle on a hot day, evaporated. And you'd see that happen a lot here in beautiful Las Vegas, Nevada. Now, don't get me wrong. OpenAI still burns money. But now, when Sam Altman dusts off his begging bowl, investors will ask, perhaps for the first time, one very simple question. Why? Why?


Welcome. My name is Paola Pedrosa, a medium and the host of the Ghost Therapy Podcast, where it's not just about connecting with deceased loved ones. It's about learning through them and their new perspective. I think God sent me this gift so I can show it to the world. And most of all, I help people every single day. Listen to the Ghost Therapy Podcast on the iHeartRadio app, Apple Podcasts, or wherever you get your podcasts.

You are cordially invited to...

Welcome to the Party with Tisha Allen is an iHeart women's sports production in partnership with Deep Blue Sports and Entertainment.

Listen to Welcome to the Party, that's P-A-R-T-E-E, on the iHeartRadio app, Apple Podcasts, or wherever you get your podcasts.

We want to speak out and we want this to stop. Wow, very powerful. I'm Ellie Flynn, an investigative journalist, and this is my journey deep into the adult entertainment industry. I really wanted to be a player boy in my adult. He was like, I'll take you to the top, I'll make you a star. To expose an alleged predator and the rotten industry he works in. It's honestly so much worse than I had anticipated. We're an army in comparison to him.

From novel, listen to The Bunny Trap on the iHeartRadio app, Apple Podcasts, or wherever you get your podcasts.