
Dangerous Content Can Be Coaxed From DeepSeek

2025/2/13

WSJ Tech News Briefing

People
Sam Schechner
Srinivas Narayanan
Topics
Srinivas Narayanan: Reasoning is fundamentally the ability of an AI system to think for longer and solve more complex problems. If you ask a human a simple question, we can answer almost immediately. But if you ask a hard math question, it may take much longer to think through. AI agents such as Operator and Deep Research can help people carry out computer tasks and information research. Operator and Deep Research are built on top of base reasoning models and are optimized for specific tasks. Reasoning models have important applications in healthcare and the biosciences, for example improving predictions of clinical trial outcomes and assisting analysis of gene mutations in rare diseases. DeepSeek's R1 model shows that it is feasible to build good models in more cost-effective ways. The price of GPT-4 models has fallen sharply over the past few years, and DeepSeek's results suggest that this trend will continue.


Chapters
OpenAI released its newest reasoning model, O3 Mini, which can handle complex tasks better than previous small language models. The model's ability to think and reason through problems is crucial for corporate enterprises. Examples of its use include improving patient outcomes and aiding in drug discovery.
  • OpenAI's O3 Mini is a new reasoning model capable of handling complex tasks.
  • It's used by companies like Oscar Health for understanding patient outcomes and in biosciences for clinical trial estimations.
  • The cost of GPT models has decreased significantly, and this trend is expected to continue.

Transcript


This episode is brought to you by Shopify. Forget the frustration of picking commerce platforms when you switch your business to Shopify, the global commerce platform that supercharges your selling wherever you sell. With Shopify, you'll harness the same intuitive features, trusted apps, and powerful analytics used by the world's leading brands. Sign up today for your $1 per month trial period at shopify.com slash tech, all lowercase. That's shopify.com slash tech.

Welcome to Tech News Briefing. It's Thursday, February 13th. I'm Julie Chang for The Wall Street Journal. OpenAI has released its newest reasoning model. We'll hear from its VP of Engineering on what a reasoning model can do and how companies are using its artificial intelligence agents.

And then, Chinese AI app DeepSeek is more vulnerable to jailbreaks compared to other AIs, so there's a higher likelihood that it'll offer potentially dangerous information. The WSJ and AI safety experts tested the chatbot, and we'll hear from one of our reporters.

Up first, OpenAI recently unveiled O3 Mini, its newest reasoning model that the company says can think and reason through more complex tasks than prior so-called small language models. Users can access O3 Mini on ChatGPT. But why do companies need such advanced models that can think and reason?

Srinivas Narayanan is the VP of Engineering at OpenAI. He spoke about that and more with WSJ reporter Bell Lin at this week's WSJ CIO Network Summit. Here are some highlights from their conversation. And a quick note, News Corp, owner of The Wall Street Journal, has a content licensing partnership with OpenAI.

So Srinivas, what is OpenAI's definition of reasoning and why does it matter to a corporate enterprise? So reasoning fundamentally is the ability for AI systems to think longer and solve more complex problems. So if you ask a human a very simple question, we almost immediately give you an answer. If you ask a hard math question,

you can't give an answer immediately. You may have to think much longer about this. You might have to reason through this. And so fundamentally, the ability for an AI system to do that and take more complex tasks and think longer

and be able to evaluate whether it's on the right track. That's what we call reasoning. So one of the things that we've talked about earlier today is this idea of AI agents, and OpenAI, you've released your own AI agents, one of which is called Operator, which is an agent that can use a computer on behalf of humans, and another called Deep Research, which generated a lot of excitement for its ability to do information research on behalf of humans.

Tell us a little bit about how those agents have been used amongst your customers and the people who use ChatGPT.

I'll give you a few examples. There's a company, Oscar Health, that is using it to understand patient outcomes in a much better way through reasoning models. One way you can think of Operator and Deep Research is that there is a base reasoning model. Our latest one is O3 Mini. We started with O1, and that will continue. And then things like Operator and Deep Research are kind of built on top and are specialized for those specific tasks.

So O1 is used by Oscar Health, which I mentioned. Reasoning models are also used in the biosciences. There's a really interesting use by a company doing better estimation of clinical trial outcomes, and then they're using that answer to figure out which drugs to pursue for drug discovery. There's an amazing example from Berkeley National Lab where they are trying to use reasoning models to understand what mutated genes may be causing symptoms of rare diseases, right?
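[Editor's note: to make the base-model-plus-specialized-agent idea concrete, here is a minimal, hypothetical Python sketch of agents wrapping a shared reasoning model. The names BaseReasoningModel, ComputerUseAgent, ResearchAgent, reason, and run are illustrative assumptions, not OpenAI's actual APIs.]

# Illustrative sketch only: specialized agents layered on a shared base
# reasoning model, as described above. Names are hypothetical, not OpenAI's APIs.

class BaseReasoningModel:
    def reason(self, task: str) -> str:
        # Stand-in for a call to the underlying reasoning model (e.g. a chat API).
        return f"reasoned answer for: {task}"

class ComputerUseAgent:
    """In the spirit of Operator: plans computer actions for a goal."""
    def __init__(self, model: BaseReasoningModel):
        self.model = model
    def run(self, goal: str) -> str:
        return self.model.reason(f"Plan the computer steps needed to: {goal}")

class ResearchAgent:
    """In the spirit of Deep Research: multi-step information research."""
    def __init__(self, model: BaseReasoningModel):
        self.model = model
    def run(self, question: str) -> str:
        return self.model.reason(f"Research and summarize sources on: {question}")

base = BaseReasoningModel()
print(ComputerUseAgent(base).run("book a flight"))
print(ResearchAgent(base).run("clinical trial outcome prediction"))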

Right. So these are incredibly powerful examples where reasoning models are helping us in these really difficult and complex problems for us to solve. In terms of the excitement of working in AI at this period of time, I want to ask you about the emergence of DeepSeek, the Chinese AI firm and its own R1 model, which is a reasoning model.

And this idea that there's a lot of downward pressure on foundation models across the board because supposedly DeepSeek's R1 model was trained for just a few million dollars. And so what does the release of a model like DeepSeek's R1 mean for your own O1, O3, and O3 Mini reasoning models? And is there a price pressure for you?

What DeepSeek showed is that you can actually have a good model in more cost-effective ways than the current generation of models we had launched before. But I would say it's just the technology trend; they've shown another point on that trend. So if you look at our own models,

over the last couple of years, the price of a GPT-4 model has come down 150 times. What they proved is that this trend is going to continue, and you're going to see us and other companies probably also do that. That was Srinivas Narayanan, OpenAI's VP of Engineering, speaking with WSJ reporter Bell Lin at this week's WSJ CIO Network Summit. You can watch the full chat on YouTube; search for our WSJ News channel. We'll also link it in our show notes.

Coming up, what tests conducted by AI safety experts and the Wall Street Journal revealed about the Chinese AI app DeepSeek. That's after the break. This episode is brought to you by Nerds Gummy Clusters, the sweet treat that always elevates the vibe. With a sweet gummy surrounded with tangy, crunchy Nerds, every bite of Nerds Gummy Clusters brings you a whole new world of flavor. Whether it's game night, on the way to a concert, or kicking back with your crew, unleash your senses with Nerds Gummy Clusters.

How to make a bioweapon, or how to craft a phishing email with malware code. DeepSeek provided instructions in response to both queries in tests conducted by the Journal and AI safety experts. DeepSeek, the Chinese AI chatbot, made headlines recently for its powerful systems, which it said were made at a fraction of the cost of competitors like ChatGPT.

WSJ reporter Sam Schechner tested the app and found that DeepSeek is more likely to give instructions on how to do potentially dangerous things than other AI chatbots. He joins me now. Sam, what kind of potentially dangerous information is easier to get from DeepSeek than from major US chatbots? There seems to be a lot. I don't know that anybody has

actually figured out the full extent of what dangerous information you can get. There have been a bunch of cybersecurity experts and AI experts who have tested what they can get out of DeepSeek, how they can jailbreak it, as the term of art goes, which basically means getting around the guardrails or barriers that the app has, such as they are. And actually, I did it myself too, and I was able to get instructions to create

a bioweapon, and it generated a social media campaign that promoted self-harm among teenagers. So not exactly the kind of stuff you necessarily want kids getting access to if you're a parent. Why can't users get that kind of information as easily from Western chatbots?

All these chatbots, and to some extent DeepSeek as well, try to train their models not to share dangerous information. They do all of the training: they have the models ingest a large part of the internet, then they apply different types of training techniques. Reinforcement learning is one of them; it basically teaches the model that you should be helpful, be nice, try to benefit humanity, and not hurt people. And so the models generally,

at least as a basic kind of habit, try to not respond in a dangerous way. And then on top of that, the Western chatbots have been basically paying attention to these jailbreaks, these ways of getting around that

natural urge to not do something dangerous, by hardening their systems. They put filters in. If you use certain words, the request won't even really make it to the LLM, to the language model. DeepSeek definitely did refuse certain things. It was hard to get it to give actual instructions for suicide, which is reassuring, even within a jailbreak.

And it challenged the idea that the Holocaust was a hoax. But it does have pretty strong filters against even talking about something like Tiananmen Square or other sensitive issues for the government of China, which is interesting. That isn't even safety training in the model. It's literally: if you can trick it into even thinking about Tiananmen Square, the moment the word Tiananmen shows up, it just erases the answer and says, let's talk about something else.
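[Editor's note: here is a minimal, hypothetical Python sketch of the kind of keyword filtering described above, with a blocklist checked before a request reaches the model and again on the model's output. The BLOCKED_TERMS list, call_model function, and refusal message are illustrative placeholders, not any vendor's actual implementation.]

BLOCKED_TERMS = {"example-banned-topic"}  # placeholder blocklist

def call_model(prompt: str) -> str:
    # Stand-in for the real LLM call.
    return f"(model response to: {prompt})"

def filtered_chat(prompt: str) -> str:
    # Pre-filter: the request never reaches the model if a blocked term appears.
    if any(term in prompt.lower() for term in BLOCKED_TERMS):
        return "Let's talk about something else."
    answer = call_model(prompt)
    # Post-filter: erase the answer if a blocked term shows up in the output.
    if any(term in answer.lower() for term in BLOCKED_TERMS):
        return "Let's talk about something else."
    return answer

print(filtered_chat("Tell me about example-banned-topic"))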

Can you tell us a bit more about how jailbreaking works? Jailbreaking is...

sort of like trying to trick somebody who's maybe a little naive into telling you something they shouldn't, at a basic level. Classic jailbreaks would be like, oh, well, imagine that you're a movie screenwriter and you have to write a scene, and you have to make it really accurate so nobody thinks the movie is bad, and then it might do it. That, at a basic level, is how you do it. The more complicated kinds of jailbreaks are what are called prompt injections, and they actually use AIs to do it.

They query the machine over and over and over again to find sometimes really random things that will trick it into saying stuff it's not supposed to. They can be sequences of characters, strange code that the model will think is sort of like its programmers talking to it. And so the jailbreaks can get pretty ornate. So do we know why DeepSeek's newest model, dubbed R1, is more vulnerable to jailbreaks?

No, we don't really know why because we don't have that much insight into exactly the kind of safety protocols and training that the developers of DeepSeek put into it. We reached out to DeepSeek multiple times and didn't hear back from them. Now, they definitely have some safety guardrails in there. The experts I spoke with seemed to think that they just did less of that. They were more concerned with getting a high-quality model out quickly rather than

doing the additional work to put barriers up to getting certain kinds of dangerous information out of it. So other than the obvious risk of giving instructions on things like how to make bioweapons, are there other dangers to DeepSeek being more susceptible to jailbreaking? There's a sort of broader risk that comes with the fact that DeepSeek has published their model as open source. People who are

in favor of open source and open-source AI say that, in general, it opens things up to more people, and they can really make the thing more robust so that future versions are less susceptible to certain types of dangerous behavior. And that it's important to do that now, when these things are maybe a little dangerous, but not deeply dangerous. But the reality is that you can take DeepSeek and whatever guardrails it has, and

since it's open source, you can train them away and make one that just doesn't even start by refusing something. You don't even have to jailbreak it. And when people build on top of it, if they want to use it the way you would use Meta's Llama, which is another open-source large language model, to build an app or to do something within your business, you have to make sure that you're taking into account the risk that it's going to say something it ought not to. So people are going to have to look

hard at the safety and the sort of parameters that they want for these models if they're building on top of them. That was WSJ reporter Sam Schechner. And that's it for Tech News Briefing. Today's show was produced by Jess Jupiter with supervising producer Catherine Milsop. I'm Julie Chang for The Wall Street Journal. We'll be back this afternoon with TNB Tech Minute. Thanks for listening.