Today on the AI Daily Brief, Sam Altman and OpenAI think that their new deep research product is so powerful that it can do a single-digit percentage of all economically valuable tasks in the world. Before that in the headlines, DeepSeek was all the rage this week, but now it's being banned by hundreds of companies. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes.
Welcome back to the AI Daily Brief Headlines Edition, all the daily AI news you need in around five minutes. We kick off today with the latest in the DeepSeek saga, where less than a week after the R1 model took the world by storm, hundreds of companies are now scrambling to block their employees from accessing it.
According to cybersecurity firm Armis, 70% of their customers have requested website blocks. Competitor Netskope has seen 52% of clients block the site. Nadir Izrael, the CTO of Armis, said, "...the biggest concern is the AI model's potential data leakage to the Chinese government. You don't know where your information goes." It was, of course, extremely well publicized that the model came from a Chinese company and that the data security left a lot to be desired.
The Terms of Service clearly stated that DeepSeek had access to keystroke data and would share it with the Chinese government on request. Still, it seems that some U.S. government employees were logging on from their work computers. Bloomberg reported that the Pentagon blocked DeepSeek late on Tuesday, meaning that employees had access to it for two days.
They wrote, "The Pentagon's IT experts are still determining the extent to which employees directly use DeepSeek's system through a web browser." The article also noted that U.S. military personnel had downloaded earlier versions of DeepSeek models to their workstations in the fall of 2024. Bloomberg paraphrased their source, stating that at the time, the downloads didn't raise concern with the Defense Department security teams, as the connection to China wasn't clear to them.
Now, of course, this isn't necessarily a problem if the source is referring to downloading and running the model locally. The concern is instead about running DeepSeek models on cloud services that are hosted in China. Still, it seems like military personnel are actively using the model. Nicolas Chaillan, the CEO of Ask Sage, a government-authorized software platform, said that thousands of Pentagon employees are using DeepSeek through them. Separately, even more security issues with DeepSeek have emerged.
Wiz researchers also noted the exposure allowed for "potential privilege escalation within the DeepSeek environment, which could have allowed control of internal systems." The vulnerability was responsibly disclosed and quickly secured by the DeepSeek team. It's not clear whether anyone else accessed the database, but the researchers told Wired, "It wouldn't be surprising given how simple it was to discover." And yet, if the security environment surrounding DeepSeek continues to be troubling, what's going on underneath continues to show its value.
Cash, an engineer at X, wrote, "What I'm seeing: the DeepSeek R1 algorithm basically works and is being reproduced by a lot of people. I didn't expect that. I think it's basically over now." Marc Andreessen pointed out that DeepSeek was now at 23% of ChatGPT daily active users, with far more daily app downloads.
Cash actually freaked out a little bit, basically saying that this was the best example of why there is truly no moat in the frontier models. They wrote, "Don't invest in OpenAI. Do not sign any agreements with them. Do not do any business with them. I've been busy all weekend and just got plugged in. I've seen enough things across 4chan, my X feed, my own experiments that confirm it's over. There is no moat. There's no data labeling moat. There is no data moat. There barely is a compute moat. It's over. I don't think people actually get it."
Now, speaking of DeepSeek and compute, Mistral board member and Andreessen Horowitz partner Anjney Midha says that DeepSeek's innovations will only increase GPU demand. He's been tuned into releases from the Chinese lab for the last six months following the release of Coder V2. At the time, it rivaled OpenAI's GPT-4 Turbo for coding-specific tasks, right at the top of the leaderboard. His logic is that chip demand won't slow down as a result of more efficient models. Companies will simply do more with the compute they can obtain.
He said, "When people are like, OK, Anj, Mistral has raised a billion dollars. Does DeepSeek mean that all that billion dollars is completely unnecessary? No, actually, it's extraordinarily valuable for them to be able to look at DeepSeek's efficiency improvements, internalize them, and then throw a billion dollars at it. Now we can get 10 times more output from the same compute." He also noted that open source projects have an edge through free technical labor from people who want to use the products. Closed source rivals have to pay for all the labor as well as the compute. More on OpenAI's recent comments around open source in the main part of the episode.
The major change, he argues, from DeepSeek is that nations are beginning to wake up to AI as the next foundational infrastructure, akin to electricity and the internet. He believes that countries should start to consider infrastructure independence, meaning that each country should think carefully about whether they want to rely on Chinese models or Chinese-hosted data, or if they want to use Western models that follow Western laws and ethics. Still, at the moment, his main problem is getting enough inference. He had a message for companies thinking about ditching their data center plans, requesting, if you have extra GPUs, please send them to Anj.
Lastly today, the EU AI Act ramps up with its first major compliance deadline. As of Sunday, regulators can now ban the use of any AI system they deem to pose an unacceptable risk. The definition is aimed at AI deployments that interact with citizens in an assortment of contexts. A non-exhaustive list includes AI used for social scoring, AI that manipulates a person's decisions subliminally or deceptively, AI that exploits vulnerabilities like age, disability, or socioeconomic status, and AI that attempts to predict people committing crimes based on their appearance.
Companies involved in these use cases can now be subject to fines regardless of where they are headquartered. The maximum fine is 35 million euros or 7% of annual revenue from the prior fiscal year, whichever is higher. Rob Sumroy, the head of technology at the British law firm Slaughter and May, said that the fines won't be imposed for some time. Organizations are expected to be fully compliant by February 2nd, but the next big deadline that companies need to be aware of is in August. By then, we'll know who the competent authorities are, and the fines and enforcement provisions will take effect.
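To make that fine structure concrete, here is a minimal sketch of the arithmetic, assuming the whichever-is-greater rule for the prohibited-practices penalty tier; the function name is just an illustration, not anything from the Act itself.

```python
# EU AI Act prohibited-practices penalty ceiling:
# the greater of EUR 35 million or 7% of prior-year annual revenue.

def max_fine_eur(annual_revenue_eur: float) -> float:
    """Return the fine ceiling for a given prior-year revenue in euros."""
    return max(35_000_000, 0.07 * annual_revenue_eur)

# A company with EUR 1 billion in revenue faces a ceiling of EUR 70 million;
# below EUR 500 million in revenue, the flat EUR 35 million floor applies.
print(max_fine_eur(1_000_000_000))  # 70000000.0
print(max_fine_eur(100_000_000))    # 35000000
```

In other words, the flat 35 million euro figure only binds for companies whose prior-year revenue is under 500 million euros; above that, the 7% term dominates.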
That's going to do it for today's headlines. Next up, the main episode. Today's episode is brought to you by Vanta. Trust isn't just earned, it's demanded. Whether you're a startup founder navigating your first audit or a seasoned security professional scaling your GRC program, proving your commitment to security has never been more critical or more complex. That's where Vanta comes in.
Businesses use Vanta to establish trust by automating compliance needs across over 35 frameworks like SOC 2 and ISO 27001. Centralized security workflows complete questionnaires up to 5x faster and proactively manage vendor risk. Vanta can help you start or scale up your security program by connecting you with auditors and experts to conduct your audit and set up your security program quickly. Plus, with automation and AI throughout the platform, Vanta gives you time back so you can focus on building your company.
Join over 9,000 global companies like Atlassian, Quora, and Factory who use Vanta to manage risk and improve security in real time.
If there is one thing that's clear about AI in 2025, it's that the agents are coming. Vertical agents by industry, horizontal agent platforms, agent-based platforms,
agents per function. If you are running a large enterprise, you will be experimenting with agents next year. And given how new this is, all of us are going to be back in pilot mode.
That's why Super Intelligent is offering a new product for the beginning of this year. It's an agent readiness and opportunity audit. Over the course of a couple quick weeks, we dig in with your team to understand what type of agents make sense for you to test, what type of infrastructure support you need to be ready, and to ultimately come away with a set of actionable recommendations that get you prepared to figure out how agents can transform your business.
If you are interested in the agent readiness and opportunity audit, reach out directly to me, nlw at bsuper.ai. Put the word agent in the subject line so I know what you're talking about. And let's have you be a leader in the most dynamic part of the AI market. Hello, AI Daily Brief listeners. Taking a quick break to share some very interesting findings from KPMG's latest AI Quarterly Pulse Survey.
Did you know that 67% of business leaders expect AI to fundamentally transform their businesses within the next two years? And yet, it's not all smooth sailing. The biggest challenges that they face include things like data quality, risk management, and employee adoption. KPMG is at the forefront of helping organizations navigate these hurdles. They're not just talking about AI, they're leading the charge with practical solutions and real-world applications.
For instance, over half of the organizations surveyed are exploring AI agents to handle tasks like administrative duties and call center operations. So if you're looking to stay ahead in the AI game, keep an eye on KPMG. They're not just a part of the conversation, they're helping shape it. Learn more about how KPMG is driving AI innovation at kpmg.com slash US.
Welcome back to the AI Daily Brief. As we discussed quite a bit, last week all of the discussion was around DeepSeek: how powerful it was, what the geostrategic implications were, how it was likely to impact the AI industry, what it meant for the stock market. This was the conversation, and it's quite clear that OpenAI did not love being second fiddle.
Today, we're going to discuss OpenAI's latest reasoning model release, some interesting comments from Sam Altman on open source, and why they think their new deep research product is an agent that can actually do a percent or more of all economically valuable work on the planet. First off, though, let's start with the basic news. On Friday, OpenAI released O3 Mini, the latest in their line of reasoning models. The model promises similar performance to the O1 family of models, but with increased speed and reduced cost.
OpenAI claims that external testers preferred answers from O3 Mini over O1 Mini more than half the time in A/B testing. They also observed a 39% reduction in major errors on difficult real-world questions. The new model includes three different settings for reasoning effort: low, medium, and high. These settings determine how much compute is used and allow the model more time to come up with a response. At the highest setting, O3 Mini is capable of beating the full version of O1 on some benchmarks related to coding, science, and mathematics questions.
OpenAI is also making this model extremely developer-friendly from day one. It's already available through APIs and is the first reasoning model to support function calling, structured outputs, and developer messages. OpenAI says this will make it production-ready out of the gate.
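For developers curious what those features look like in practice, here is a minimal sketch of a Chat Completions-style request for O3 Mini using the reasoning effort setting and a developer message. It only assembles the request payload, so it runs without an API key; parameter names follow OpenAI's documented API at the time, but treat this as an illustration rather than a definitive reference.

```python
# Sketch of an o3-mini request payload, showing the reasoning_effort
# setting and a developer message. Actually sending it would require
# the official OpenAI client and an API key; here we only build
# and inspect the payload.

def build_o3_mini_request(question: str, effort: str = "medium") -> dict:
    """Assemble a Chat Completions-style payload for o3-mini."""
    if effort not in ("low", "medium", "high"):
        raise ValueError("reasoning_effort must be low, medium, or high")
    return {
        "model": "o3-mini",
        "reasoning_effort": effort,  # trades latency and cost for more thinking time
        "messages": [
            # Developer messages take the place of system messages
            # for OpenAI's reasoning models.
            {"role": "developer", "content": "Answer concisely."},
            {"role": "user", "content": question},
        ],
    }

payload = build_o3_mini_request("Summarize this week's AI news.", effort="high")
print(payload["model"], payload["reasoning_effort"])
```

With the real client, these fields would be passed as keyword arguments to the chat completions call; function calling and structured outputs are additional parameters layered on top of the same request shape.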
And indeed, it definitely feels like this release was informed by the release of DeepSeek. Specifically, breaking from the pattern they've had recently of announcing certain models but then only making them available to paid or even pro tiers, this model is available for free users as well. That makes it the first reasoning model accessible in the free tier and a break with OpenAI's usual staged rollout.
Pricing is also much more competitive than we're used to seeing from OpenAI. API access is 63% cheaper than O1 Mini and roughly twice the cost of DeepSeek R1. Paid-tier customers will now have rate limits of 150 queries per day, which is three times as many as O1 Mini.
Reflecting on just how big a change we've seen in a very short period of time, Professor Ethan Mollick commented on the shift, adding that Gemini Flash Thinking is available as well, but you have to use Google AI Studio.
Yet to some, this release felt quiet compared to the hype that normally surrounds a new model from OpenAI. Benjamin De Kraker, who, grain of salt, is a member of the xAI data team, wrote, "Vibe check: O3 Mini release reaction feels very muted, underwhelming. Not seeing as much excitement on the timeline." Teortaxes writes, "I don't know what they could have released short of a GPT-5-tier shocker to regain narrative momentum. Small, very strong model is a good move."
And indeed, while there wasn't stop-the-presses kind of energy around this, lots of folks were impressed. Coffee Vectors prompted O3 Mini to create a 3D water simulation that would run in Blender. The model created a full Python script compatible with the rendering software, although it did take a few tries.
Mike Bespalov created a fully functioning image-to-ASCII-art conversion app. He wrote, "Okay, OpenAI's O3 is insane. Spent an hour messing with it and built an image-to-ASCII-art converter, the exact tool I've always wanted. And it works so well. Yeah, older models could do this, but with O3 I didn't rewrite a thing. No debugging, no retries. Just a few prompts and boom, it worked. Like, perfectly."
Adonis Singh, a contributor to the Minecraft Bench project, showed how dramatic the difference between O1 and O3 Mini was on creative tasks. When prompted to build an amazing, large, organic, and epic floating island city in Minecraft, the improvement was very noticeable.
Some people even found ways for O3 Mini to compete head-on with DeepSeek's R1. O3 Mini created a much better version of a realistic physics demo of a ball bouncing around a hexagon. It also outcompeted R1 in a simple Snake game. Marc Adala Maria writes, "ChatGPT just released O3 and it's by far the best AI coding model. It can one-shot full apps instantly and people are doing some amazing things." Still, I think it's fair to say that at least initially the hype was subdued. However, that wasn't the only thing that OpenAI had in store for the weekend.
On Sunday, they released a new agent called Deep Research. The agent can access the internet to conduct multi-step research and compile a report. OpenAI wrote, it accomplishes in tens of minutes what would take a human many hours. Powered by a version of the full O3 model, the agent can ingest a huge amount of data from text, images, and PDFs, and also has the ability to redirect its research based on the information it gathers.
OpenAI says this is built for people who do, quote, intensive knowledge work. They also say it's particularly effective at finding niche, non-intuitive information that would require browsing numerous websites.
While it's only been out for a very short period of time, some people have had early access and were quick to jump in and share their thoughts. Professor Ethan Mollick again writes, "OpenAI's deep research is very good. Unlike Google's version, which is a summarizer of many sources, OpenAI's is more like engaging an opinionated, often almost PhD-level researcher who follows a lead. More of an agentic solution than Google's approach, which is much less exploratory but examines far more sources."
If you want an overview, Google's version is really good. If you want a researcher to go digging through a few sources, getting into the details but being very opinionated, you want OpenAI's. Neither has access to paywalled research and publications, which limits them for now. Kevin Bryan, an associate professor of strategic management at the University of Toronto, put the feature through its paces. He asked it to analyze the McKinley Tariff of 1890 through the lens of modern trade theory. It produced an 18-page academic-style paper, complete with citations, in 10 minutes.
Bryan added, "...how good can it do literally one shot? I mean, not bad. Honestly, I've gotten papers to referee that are worse than this. The path from here to steps where you can massively speed up the pace of research is really clear." He also believes this has some big implications for universities, adding, "...I think the research uses are obvious here. I would say, for academia, the amount of AI slop you're about to get is insane. In 2022, I pointed out that undergrads could AI their way to a B. I am sure, for B-level journals, you can publish papers you quote-unquote wrote in a day."
Many institutions will need to change to handle tech like this, and it's only getting better by the month.
Still, I think for many, their minds were not on the academic uses, but on the economic potential. Sam Altman commented, "...this is like a superpower: experts on demand. It can go use the internet, do complex research and reasoning, and give you back a report. It's really good and can do tasks that would take hours or days and cost hundreds of dollars." He even added, in almost a throwaway line, "...my very approximate vibe is that it can do a single-digit percentage of all economically valuable tasks in the world, which is a wild milestone."
Now, yes, grains of salt, perhaps even bags and bags full of salt, when it comes to the fact that, A, this is the CEO of a company who is currently reportedly raising more money, and, B, he is only saying "my very approximate vibe." But still, the fact that he is willing to say that this new agent can do a single-digit percentage of all economically valuable tasks in the world, which would represent over a trillion dollars of value, is fundamentally nuts.
Derya Unutmaz, a professor at the Jackson Laboratory, wrote, "I can finally reveal that I've had access to OpenAI's deep research since Friday, and I've been using it nonstop. It's an absolute game changer for scientific research, publishing, legal documents, medicine, education, from my tests, but likely many others. I'm just blown away."
Every's Dan Shipper had an even more bombastic take, tweeting, "It absolutely blew my mind. First AI product to do that in a while. Here's what it felt like to me. It is a chauffeured stretch limo for the information superhighway. It is a double-decker tour bus, but you're the only passenger and the city you're touring is the sum total of human knowledge. It's C-3PO but less neurotic. It's Samwell Tarly but not as bumbling. It's Hermione if she never got tired. In other words, it is a bazooka for the curious mind."
Now, I have just started to play around with it. I've got something running right now. Later in the week, I am definitely going to do a use cases type episode. But these are big words and big claims. And I, for one, am excited to see what actually is possible.
Now, lastly today, I wanted to hit these comments from Altman around open source. From a sheer access standpoint, DeepSeek and OpenAI have put extremely powerful reasoners in the hands of a ton of people. Noam Brown wrote, O1 was released less than two months ago. O3 Mini was released two days ago. Deep Research was released today. It's a powerful tool, and I can't wait to see what the world does with it. But AI will continue to progress rapidly from here.
Sam Altman can clearly feel the acceleration. He's been publicly discussing the rapid approach of AGI for months, and during a Reddit AMA over the weekend, one person asked whether recursive self-improvement of AI models would be a gradual process or a hard takeoff. Altman responded, I personally think a fast takeoff is more plausible than I thought a couple of years ago. Probably time to write something about this.
Another Redditor asked whether OpenAI would consider releasing model weights and publishing research. Altman said, Yes, we are discussing. I personally think we've been on the wrong side of history here and need to figure out a different open source strategy. Now, what exactly that means, we don't know, but pretty interesting to see the tune shift there.
A couple more things to get you excited before we get out of here. Chief Product Officer Kevin Weil says that they're still working on the 4o image generator and that it's going to be worth the wait. And apparently a full O3 version is coming in, quote, more than a few weeks, less than a few months. So that is the story here. For those of you with Pro access, let me know what you're doing with deep research, whether it's working or not. And like I said, I will be back later this week with an update from my own explorations as well. Appreciate you listening or watching as always. And until next time, peace.