People
Zeyi Yang
Host
A podcast host and content creator focused on electric vehicles and energy.
Topics
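Zeyi Yang: I think the market's reaction to DeepSeek was somewhat overblown. As someone who has followed the development of Chinese AI models for a year, I find DeepSeek's emergence surprising, but not enough to trigger a market-wide panic. There is a gap between the US and China in the AI race, and Chinese companies are currently playing catch-up; DeepSeek's emergence is not a game-changing event but the result of China's sustained efforts in AI. The goal of DeepSeek founder Liang Wenfeng is to push humanity toward artificial general intelligence (AGI). As for the significance of DeepSeek being open source, I think the rivalry between closed source and open source is intense not only in AI but in many technology fields. Open source means more people can use and improve a product, which speeds up development. DeepSeek's low cost and low energy use also offer an alternative path for AI development, which is a reason to rethink where AI development is headed. US chip export controls on China have pushed China to look for alternative approaches to AI development, and DeepSeek is an example of that effort. As for the future of US-China AI competition, I think people should pay more attention to China's progress in AI and avoid underestimating Chinese companies. DeepSeek's emergence is a reminder that China is not uncompetitive in AI; we should follow its development closely and learn from it.

Host: DeepSeek's advantages are that it is low-cost, low-energy, and open source, which makes it stand out against the American AI giants. DeepSeek's data may flow to China, which could worry US lawmakers, but its open-source nature reduces the data security risk.

Zeyi Yang and host: DeepSeek's user experience is similar to that of other chatbots and easy to pick up, which helps it spread globally.

Zeyi Yang: OpenAI's accusation that DeepSeek stole data parallels the accusations that OpenAI itself trained its models on internet data, which shows how complicated copyright is in the AI era. Public demand for AI products may be higher than expected, but whether people are willing to pay for AI products remains an open question. DeepSeek's success is not a challenge to the AGI narrative but a rethinking of the path of AI development. The Stargate project reflects the determination of the US government and the private sector to work together against Chinese AI competition.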

Deep Dive

Chapters
The Chinese AI company DeepSeek released a new AI model, causing market fluctuations. The model, DeepSeek R1, is a chatbot similar to existing models but is significantly cheaper and more energy-efficient. This raises questions about the US's approach to AI development and its response, potentially signifying a shift in the AI landscape.
  • DeepSeek R1, a new AI chatbot, was released by a Chinese startup.
  • The model is significantly cheaper and more energy-efficient than US counterparts.
  • The market reacted strongly, with some viewing it as an overreaction.

Shownotes Transcript


All right, well, the market's lost in deep thought over DeepSeek's claims that it built an AI model in just a couple of months for less than $6 million. Not a B, million dollars. It's a tiny fraction of what U.S. companies are spending on...

Earlier this week, the stock market went on one of those pit-in-your-stomach rollercoaster rides. The Nasdaq, led by chipmaker Nvidia, lost a trillion dollars in value before calming down a little. All because of a Chinese AI company called DeepSeek and a new app it rolled out on January 20th.

Zeyi Yang was watching the market chaos with a little skepticism. I mean, for me, I would say it's a little bit of an overreaction. As someone who has known this company for a year, who has been kind of tracking the development of AI models in China, this is, like, a little surprising to me, but not enough to cause a huge panic, right? Like, of the whole market.

Zeyi covers Chinese tech for Wired. And the thing about DeepSeek R1, the name of this new model, is that it's an AI chatbot. If you've tried ChatGPT or Claude or Gemini, you won't find it all that different.

But it does show you what the model is "thinking" as it works to find an answer to your question. And the information it spits back can be kind of charmingly phrased. When I asked it for clarification on something, it said, "You're keeping me on my toes. Thank you." But what really sets DeepSeek apart is that it was pretty cheap to make, uses less power than American AI chatbots, and it's Chinese.

Yeah, for a while now, there has been this sort of quiet AI arms race between the US and China. And I wonder if the DeepSeek release really turns up the volume on that. I think so. I think there's a reason why it has been quiet so far. It's because the race isn't that kind of, like, level.

American AI models like ChatGPT and American chips have simply been better and faster. And then Chinese companies have sort of just been, like, trailing behind that. Like, six months after ChatGPT was released, they released a similar chatbot that has reached a similar level of performance. So there's a race going on down there, but because there's, like, a clear gap between the levels in the US and China, people are just, like, kind of

Today on the show, DeepSeek may not be the AI version of Sputnik, but it exposes a potentially huge flaw in the American approach to AI. I'm Lizzie O'Leary, and you're listening to What Next TBD, a show about technology, power, and how the future will be determined. Stick around.

I wonder if we could back up and you could tell me a little bit of the story behind this DeepSeek model. The company was founded by an engineer, Liang Wenfeng, who made a lot of money in the investment world. What was his goal with this app?

He is sort of, like, secluded. He doesn't really talk to media. So it's really hard to get a story from him. But just according to his resume or to the only two interviews he has done with Chinese media, he has this belief that humans can achieve AGI, artificial general intelligence, at some point. And he wants to be the one kind of pushing human beings towards that goal.

On Monday, DeepSeek became the most downloaded free app in the US. And I wonder if you could describe what's unique about it. What's different from chatbots that American users might be familiar with?

Well, if you have used the app, I feel like the experience feels very similar to chat-- (Yeah, it does.) --right? And I think I've also seen reporting from my colleagues this week that even just the design of the app, like the background code of the app, really resembles OpenAI products. And I think that's a good thing for American people to try out DeepSeek, because there's no learning curve.

If you are someone who has used chatbot apps for a year now, you'll find it very simple to use the DeepSeek app. And I think that helps the product kind of become more popular, more viral outside of China.

But there are some things that set it apart, namely in the cost and the compute involved. Can you describe that for me?

Yeah, so I think for one of the models last year, V3, which is sort of their latest-generation model but not fine-tuned for reasoning tasks, they say that they only used about $5.8 million USD to train that model. Wow.

Which is a very small number compared to what I guess the American AI companies have been saying. Well, I will caution that there's some nuance there. They're basically just saying that from their last generation model to this generation model, it costs only $5.8 million. Well, you also have to spend to train the last generation model.

And I don't think that includes the salary they pay for their scientists to do research on this. So the actual number has to be a lot bigger than this. But still, just this number itself is surprising to a lot of people in the industry, because they would assume that to achieve this kind of task, you need maybe tens or hundreds of millions of dollars. My understanding is that DeepSeek is open source. Why is that significant? Is it?

I think so. I mean, this is not just in AI, but in a lot of technology fields, there's this kind of rivalry or competition between the closed source development and the open source development.

When you're closed source, you are talking about most of the tech products we know of right now, like Microsoft, like Google. If you want to use their product, you have to pay them. You have to just use the product the way they offer it. You cannot go into the code and see how you can tweak this and that, because that's how they want you to use it.

Open source is a completely different field. You're just releasing the product out there on the internet, releasing all the code with it, and anyone who wants to use it for their own commercial uses can just freely adapt it. This is what DeepSeek has done. If you are, for example, an American company that wants to adapt an AI model for your very specific use, like for your business purpose, this is basically a free model for you to use, compared to ChatGPT, where you have to pay them for every question you ask.

So the fact that they're releasing it as open source just guarantees there are more people who will be willing to try out DeepSeek's products. When there's, like, a bigger user group, there's a bigger research collaboration going on with our products, then we're going to get feedback. We're going to get suggestions on how we can improve it faster. And that helps them catch up with the closed source companies.

I'm also curious about the energy footprint because the big American models that we've seen so far demand a tremendous amount of computing power, which then on the back end means more data servers, more power, you know, even potentially restarting nuclear power plants. So how does the energy use compare? I think what DeepSeek has achieved, maybe not intentionally, is to propose an alternative pathway for the development of the AI industry.

So far, because OpenAI has been kind of this shining star of success, everyone's just like keep following their pathway, which is to acquire more GPUs, use them for longer times of training, and believe that the scaling effect will lead us to like the most powerful AI models ever. And I think most of the people in Silicon Valley and in the government circles have like bought in on that so far.

But with DeepSeek, it's an opportunity to reconsider all of this calculation. It's like previously we were thinking, oh, we are sacrificing a lot on the environmental effects because maybe our only way to achieve a better AI is to kind of, like, sacrifice the environment.

But now people are thinking that actually, if there's a way to put our resources into just making our models more efficient, making them cheaper, making them easier to train, then that has like a larger marginal return than just keep buying more chips. So I do think...

I would really love to see more conversations inspired by the DeepSeek success to just go out there. Maybe we don't have to focus on getting more chips and data centers to increase our AI capability. There are a lot of ideas that may not have been even explored at this point because people have just thought that there's only one way to go. I want to put this in a little geopolitical context. You have been covering Chinese technology for some time.

Is there a way to sort of characterize the Chinese government and Chinese companies' approach to AI development? I think it's really interesting, and it also kind of changes. Right now, I think we're in a period where both the Chinese government and the Chinese companies want to catch up, especially after ChatGPT was released. I think a lot of the people in China, both in government and in private companies, were shocked by how much advancement has happened in AI.

So from then, I believe 2022, to now, there's always been this model: "Oh, we have to catch up with ChatGPT. How do we do that?" You can say shit, it's okay. How do we do that? And the government is always feeling like they have been tough on their tech sector for a while, but maybe we need to relax a little bit just to make sure our companies stay competitive.

They're saying that, well, do whatever you want. Just try to not mess up. But then within the kind of the scope there, do whatever you want.

In 2022, the U.S. put export controls in place that were designed to restrict China's access to AI semiconductors, manufacturing equipment, really to try to hoard, if I can say that, the best stuff. How did China respond?

So they didn't realize that chips are so important in the making of an AI product. And also that America has such a big control over the global chip supply chain.

What has happened over the past two years is that the US government has been building and really reinforcing this chip export control regime to make sure that it's really hard for a Chinese company to buy the most advanced chips in China. And that is a challenge for basically every single AI company there in China because they want the best chips. They want to use them to maybe follow the same path that OpenAI has gone down.

But it's just not a possibility right now. That's part of why DeepSeek has gone on this adventure of finding out the most efficient way to train their models, because they already know that for the foreseeable future, it's not going to be very easy for them to get an advanced GPU. So you might as well focus on something else. The other consequence of the export controls, I would say, is that there is this push for China to develop its own GPU.

Right. But it's going to take a really long time and a lot of resources, because, you know, chips are such sophisticated hardware to make. And they're making some progress, but people are still saying that it may take them even, like, a decade to catch up. And AI is going to advance a lot in a decade. So I think a lot of the Chinese companies or governments are still freaking out about whether they can really catch up with regard to chips. After the break, what the U.S. plans to do if China does catch up.


When I look at how the U.S. government is responding this week, there's a whole lot of stuff happening. In his confirmation hearing, the incoming Commerce Secretary Howard Lutnick said he took a, I'm quoting here, a very jaundiced view of China. And then when you use DeepSeek, your query data presumably is going to China. Do you think we will see lawmakers react in the same way they did to TikTok and ByteDance?

I think that is very likely, because we have seen in the past few years data security is one of the biggest concerns in the US-China kind of tech tensions. And it really manifests in this way that the US government is worried about US nationals using Chinese apps and unwillingly sending their data to China.

This is one of the major justifications for banning TikTok in the US. And the way that specific law banning TikTok was written makes sure that it can cover other Chinese companies as well.

So I would say if the government really wants to do it, it would be very easy to apply that towards DeepSeek and make sure that most Americans cannot easily access DeepSeek. But again, coming back to the fact that DeepSeek is an open source model, that actually also means that if you want to use its model, there's a way for you to not send any data to DeepSeek, to its Chinese servers, at all, but still enjoy the capability of the model.

So in that sense, it's less of a security concern because you don't have to go through Chinese companies, unlike TikTok. In the middle of all of this, we get this announcement that OpenAI, Oracle and SoftBank are coming together to form this thing, Stargate. President Trump announces it. And I look at that.

And I wonder, like, okay, if we're trying to place that announcement in the context of this kind of cold AI war, where does it fit?

I think it really is an example of how the American government and the private sector are coming together and recognizing this importance to compete with China or maybe other kind of geopolitical forces out there. Because previously you were mostly seeing that maybe the American government is concerned about the security or national interest, and some of the businesses will be like, well, we don't want that many regulations. We just want to do whatever we want. But when I saw this Stargate project announcement, I was thinking, like, oh, they're really combining their forces. They're saying that our priority right now is really to make sure the US stays ahead and China stays behind. And because of that, we're willing to maybe forget some of the discrepancies we have, but just focus on this one really costly project.

There's this other thing that happened that I find kind of ironic in all of this, where OpenAI came out and said that DeepSeek had inappropriately taken data from its model. But of course, that is almost identical to the charge that many writers, journalists, people who have just made stuff and put it on the internet have leveled at OpenAI, that it's training its LLMs on our data.

Am I wrong to see a little irony here? I see the same as you. I think...

The thing is that copyright in the AI age is just so messy, right? And because most of these companies don't really open up about how they gather data and use their data, you're just doing the guessing work here. The reason why they inspire kind of this looking into whether DeepSeek copied OpenAI or used their data is because sometimes when you ask DeepSeek's bot, like, who are you? It will answer that, oh, I'm ChatGPT or I'm GPT-3.5 or something. And people see that as proof that, oh, you must have copied from GPT. That's the only reason why. Actually, I don't think that's the only reason why. That's one of the explanations. The other explanation is that people have just been putting out so many answers they get from ChatGPT on the internet now. And maybe DeepSeek just crawled all of those and learned from all of those texts. And in the end, it was trained to believe that, oh, I am ChatGPT. When I answer a question, I'm going to answer in the voice of ChatGPT. Are they just crawling the outputs of the model, or are they just, like, copying the model structure and everything from there? I think OpenAI is probably trying to figure that out right now.

So it could actually be a question of some of the inputs in a model are the outputs of a previous model. Exactly. It's so messy. And it also speaks to the problem that people have exhausted all the training data online and just trying to grab whatever they have. They don't care if it's high quality, if this comes from another model. If you are a coherent paragraph of text, I want you and I want you to be trained into my model.

I had a funny experience earlier this week where I put the new DeepSeek model on my phone and I was comparing the answers that I was getting from it to ChatGPT. And so I decided to see what it could tell me about myself.

And both models incorrectly insisted that I went to different colleges. Neither of them got the college that I went to right. But they kept coming back to me telling me that it was wrong. And so it does make me wonder that, like, while there are these amazing technological advances going on, the stuff that comes out of an LLM isn't necessarily right still.

I will say that some people look at DeepSeek as a better alternative to OpenAI in terms of the performance or the output. I wouldn't necessarily think so. I think they provide kind of, like, comparable or very similar products. The mistakes that ChatGPT makes, DeepSeek could make those mistakes too.

And we're also pretty early into, like, testing a product to really know how capable it is. I wouldn't really see the success of DeepSeek as, like, a big resistance to kind of the AGI narrative, or, like, this dominant narrative that we really should focus on AI, because it still has a lot of the problems that this narrative has. That's kind of where I was going here.

Because for all the heat and light that this has gotten over the past week, I still have questions about consumer demand. Yes, individual consumers, but also enterprise consumers. And clearly, Silicon Valley and many Chinese tech companies think that AI is the way to go. Have you seen a corresponding demand that says the consumer out there really wants AI in everything?

If we're talking about just individuals, everyday people like us, I actually feel like there is a larger appetite than I have previously recognized, because I've been just talking to my friends, like, traveling with them. And I realized a lot of them have started having this habit of using ChatGPT as a search engine. Like, they do not use Google at all. So I really think those experiences made me rethink kind of how people want to adopt an AI product like that.

The other thing I've noticed is that DeepSeek's latest model, R1, was released as this model that's really good at doing math and reasoning. And that's what I guess most people are testing it for. But within China, there are also a lot of people who are asking the model to write poetry. And they were pretty amazed by the output of it. And I think people have some genuine joy there, just enjoying, like, oh, this chatbot just wrote this beautiful poetry for me and I really like it. So I think there are ways people can kind of, like, make use of these AI models.

But are they willing to pay for it? Maybe not. DeepSeek right now is completely free. I think a lot of assumptions on the business side are built on people being willing to pay for this kind of either enjoyment or knowledge they get from their models. I don't really trust that assumption. Do you think DeepSeek has fundamentally altered something about the way we think of this industry?

quiet AI competition between China and the US? I think I would like people to pay more attention to what's happening in China regarding AI. A lot of times people just very easily discredit them. Oh, they're still less powerful. They're still censored. So there's no reason for us to talk about them at all. And I think that kind of ignorance really creates this, I guess, lack of attention and research and just trying to understand more of what's happening in China. I mean, I'm someone who's keeping my eye on them all the time, but I also feel like I wish more people, maybe at least in the AI industry here, would at least acknowledge that sometimes when they're releasing it for the public to use freely, we can take advantage of that and we can make use of it in our models too. And sometimes when they do have really fast breakthroughs, we need to be alerted to that. I just hope there's more dialogue going on and more people just casually paying more attention to what's happening there on the other side of the planet.

Thank you so much for coming on and for talking with me. Of course, this has been a really fun conversation. Zeyi Yang is a senior tech writer for Wired Magazine.

And that is it for our show today. What Next TBD is produced by Evan Campbell, Patrick Fort, Shaina Roth, and Paige Osborne. Our show is edited by Elena Schwartz. TBD is part of the larger What Next family. And if you like what you heard, the number one best thing you can do to support us is to join Slate Plus. You get all your Slate podcasts like this one ad-free, and you'll never hit a paywall on the Slate site. All right, we'll be back next week with more shows. I'm Lizzie O'Leary. Thanks for listening.