
Will AI Surpass Human Intelligence in 2 Years?

2025/1/24

The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis


People

AI Daily Brief host

Topics

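AI Daily Brief host: OpenAI is about to release an AI agent that can automate simple web tasks, and it plans to build agents that can handle complex programming tasks, with the aim of developing artificial general intelligence that outperforms people at most economically valuable work. Current AI coding assistants are inefficient; using them feels like managing a group of underqualified interns. Apple is behind other companies on AI strategy: its progress is slow, its use cases are limited, and it places too much weight on privacy. Large enterprises will begin experimenting with AI agents next year. Most business leaders expect AI to fundamentally transform their businesses within the next two years, but data quality, risk management, and employee adoption are the main challenges. Anthropic CEO Dario Amodei said Claude will not get image and video generation in the near term, because Anthropic is mainly focused on enterprise users and prioritizes enterprise features. Anthropic will release a more advanced model within the next six months and is working to address compute constraints. The heads of leading AI labs predict a massive structural shift in the economy within the next few years, and now is the time to start thinking about it and preparing.

Jason Liu: Current AI coding assistants are inefficient; using them feels like managing a group of underqualified interns.

Ethan Mollick: Apple is behind other companies on AI strategy: its progress is slow, its use cases are limited, and it places too much weight on privacy.

Dario Amodei: AGI will arrive in 2027 or slightly after, which will require a reorganization of the economy, but he thinks the situation will be better if everyone faces the same challenge. The arrival of AGI means that the way humans produce economic value will no longer hold, and the social contract will need to be renegotiated. He believes AI systems will surpass humans at every task within two to three years. The United States needs to keep its lead in AI to counter competition from China and to manage the risks of its own models effectively.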


Chapters

This chapter discusses OpenAI's new agent, Operator, designed to automate simple tasks. It also explores the development of a coding agent aimed at senior software engineers, potentially impacting the workforce significantly.

OpenAI's Operator agent is designed for simple tasks like booking flights.
The company is developing a coding agent for senior software engineers to handle complex programming tasks.
This coding agent aims to replicate a level 6 or senior staff engineer's capabilities.

Transcript



Welcome back to the AI Daily Brief Headlines Edition, all the daily AI news you need in around five minutes. Well, friends, rumors about OpenAI's agent have reached fever pitch as the company gears up for release. The company's Operator agent is slated for release this week, according to The Information. They wrote that Operator is designed to help with tasks like making dinner reservations or booking flights. Essentially, it sounds like it's able to automate simple tasks within a web browser.

And while these use cases might be among those that I constantly rail against people actually caring about, although I have had some of you say that you do want help booking flights, maybe the more interesting news is what OpenAI has planned for future agentic releases. The Information again reported that, quote, "...the company is working on AI to help senior software engineers handle more complex programming tasks, a key step in the company's attempt to develop artificial general intelligence that outperforms people at most economically valuable work."

The goal is apparently to create a coding agent capable of handling coding problems that involve multiple steps. This would be in line with the type of agent Sam Altman was talking about when he wrote at the beginning of January that, quote, "we may see the first AI agents join the workforce and materially change the output of companies."

OpenAI has reportedly been preparing to test an early version of their coding agent with select customers as they try to make the product as useful as possible. Whereas many coding assistants up till now have been aimed at helping junior and mid-level engineers, The Information writes, "...the new coding agent OpenAI is developing by contrast is geared towards senior software engineers..."

It would likely connect to their code repository so that it could handle complex tasks like code refactoring, the process of simplifying code or making it more understandable to human programmers, so they can modify it more easily or prevent the introduction of glitches into the codebase. It could also help identify and reduce duplicate code across a codebase. Sources say the goal is to replicate a level 6 or senior staff engineer, a level of programmer that's expected to be able to take broad guidance from managers and then independently design new apps, features, and systems.

Jason Liu, a developer and AI consultant, said that this level of agent would have clear demonstrable ROI and could boost OpenAI's ability to win enterprise business. Liu reflected on working with the current crop of coding assistants, stating, "Everything I do takes seven hours and it feels like I just end up becoming a manager to 10 dumb interns." For now, I think many are just waiting with bated breath to see what Operator actually looks like. Then again, knowing AI, by the time you're listening to this, it could have already come out.

Next up in the headlines, Databricks have officially closed that latest fundraising round we've talked about before, raising $10 billion in equity and a further $5 billion in debt.

The new announcement revealed that Meta had participated as a strategic investor. In an interview last week, Databricks CEO Ali Ghodsi said that his company has been working closely with Meta's Llama team. It's not clear whether Databricks is a customer of Meta or the other way around, or both. Either way, this is another big investment for Meta in the AI space following their investment in Scale AI last year. Rehar Jark writes, "This move might signal Meta is starting to think about more B2B use cases outside of advertising for their LLM efforts."

Moving over into phone land, Samsung have released the latest version of their flagship mobile handset, the Galaxy S25, complete with an integrated version of Google's Gemini Assistant. The updated Assistant is now natively multimodal and capable of completing complex tasks across multiple apps. You can hold a conversation with the Assistant and also add pictures or videos for reference. In the coming months, Samsung says they'll add screen sharing and live video streaming capabilities.

Giving an example of the functionality, Google said users could ask Gemini to search for high-protein lunch ideas and then save them to the Notes app. The key point here is that the Assistant can carry out these multi-stage tasks across up to two apps from a single prompt. Basic on one level, but still a big step towards AI Assistants becoming a new type of interface rather than just an information tool.

For many, it's just another big question mark around Apple's AI strategy. Professor Ethan Mollick writes, "The last couple weeks seems to really challenge Apple's strategy on AI. The labs are demonstrating advanced agentic models that can run on a phone, some locally, while Apple seems locked into a long-term plan of releasing very limited on-device AI features that get obsoleted fast. Apple seems to bet heavily on slow AI development, narrow use cases, and privacy being a paramount concern to users. So far, those don't seem to be the direction things are heading."

I am never one to count Apple fully out, but boy, do they have some ground to catch up on. For now, though, that is going to do it for today's AI Daily Brief Headlines Edition. Next up, the main episode.

Today's episode is brought to you by Vanta. Trust isn't just earned, it's demanded. Whether you're a startup founder navigating your first audit or a seasoned security professional scaling your GRC program, proving your commitment to security has never been more critical or more complex. That's where Vanta comes in.

Businesses use Vanta to establish trust by automating compliance needs across over 35 frameworks like SOC 2 and ISO 27001. Centralize security workflows, complete questionnaires up to 5x faster, and proactively manage vendor risk. Vanta can help you start or scale up your security program by connecting you with auditors and experts to conduct your audit and set up your security program quickly. Plus, with automation and AI throughout the platform, Vanta gives you time back so you can focus on building your company.

Join over 9,000 global companies like Atlassian, Quora, and Factory who use Vanta to manage risk and improve security in real time.

For a limited time, this audience gets $1,000 off Vanta at vanta.com slash nlw. That's v-a-n-t-a dot com slash nlw for $1,000 off.

If there is one thing that's clear about AI in 2025, it's that the agents are coming. Vertical agents by industry, horizontal agent platforms, agents per function. If you are running a large enterprise, you will be experimenting with agents next year. And given how new this is, all of us are going to be back in pilot mode.

That's why Superintelligent is offering a new product for the beginning of this year. It's an agent readiness and opportunity audit. Over the course of a couple quick weeks, we dig in with your team to understand what type of agents make sense for you to test, what type of infrastructure support you need to be ready, and to ultimately come away with a set of actionable recommendations that get you prepared to figure out how agents can transform your business.

If you are interested in the agent readiness and opportunity audit, reach out directly to me, nlw at bsuper.ai. Put the word agent in the subject line so I know what you're talking about. And let's have you be a leader in the most dynamic part of the AI market.

Hello, AI Daily Brief listeners. Taking a quick break to share some very interesting findings from KPMG's latest AI Quarterly Pulse Survey.

Did you know that 67% of business leaders expect AI to fundamentally transform their businesses within the next two years? And yet, it's not all smooth sailing. The biggest challenges that they face include things like data quality, risk management, and employee adoption. KPMG is at the forefront of helping organizations navigate these hurdles. They're not just talking about AI, they're leading the charge with practical solutions and real-world applications.

Well, welcome back to the AI Daily Brief. There has been a lot of chatter and conversation recently around AGI and superintelligence and the rate of change and how everything is speeding up. And of course, right now, the annual World Economic Forum at Davos is happening, which, if nothing else, is a good chance to see how many leading figures from the business and political world think about the world in this moment.

Earlier this week, the Wall Street Journal had a conversation with Anthropic CEO Dario Amodei. It was a wide-ranging conversation that included both some nuggets about where Anthropic was and where they were headed, as well as some questions about these larger issues. For this episode, we're going to dig into some of the most interesting comments from that conversation and how they're being perceived.

Now to kick off, let's talk about some of the little details we got from the conversation that are more on the business side for Claude. Interviewer Joanna Stern desperately tried to pull out some information around when new models would be coming out, with little success. One of the features she asked about was web access, which Dario said would be coming, quote, "relatively soon." He also discussed a two-way voice mode that would, quote, "come eventually." Interestingly, when Stern asked Amodei about photo and video generation, he said that they were explicitly not on the roadmap.

Effectively, he said that Anthropic did not see those as crucial features for enterprise users, that instead those were features focused on consumers, and that, quote, "the majority of our business is enterprise-focused, so often enterprise-focused things get prioritized first." This makes sense when you look at the market share change of Anthropic relative to their competitors when it comes to LLMs in the enterprise. Whereas between 2023 and 2024, OpenAI saw a 16% decrease in their market share in the enterprise, Anthropic saw a doubling of their market share from 12% to 24%.

Now, it's interesting that they are being so explicit that they do not see photo and video generation as key for enterprise, but this also may just be reflective of the fact that even a well-financed startup like Anthropic can't go after everything all at once. Now, the thing that Stern and the Wall Street Journal were maybe most interested in was whether Anthropic would be releasing a more advanced model sometime in the coming months. His answer was yes, and the only timescale he would give was within the next six months.

When it came to questions of Anthropic's tight usage limits, which have been a frequent complaint on Twitter, he basically said that they're working hard to address resource constraints, but that right now, getting access to compute involves waiting lists, and it's really just a tricky, difficult thing.

He also said that revenue had grown 10x over the last year to close to a billion dollars and that that's not slowing down. The next conversation was one about agents. Dario took the chance to try to reframe the conversation a little bit. He referred to Anthropic's forthcoming agents as, quote, "virtual collaborators," saying, "The thing we have in mind is a model that is able to do anything on a computer screen that a virtual human could do, and you talk to it and give it a task.

"Maybe it's a task it does over a day. You say we're going to implement a product feature and it writes code, tests, deploys that code, talking to coworkers, writing Slack, sending emails. Just like a human, the model goes off and does a bunch of things and checks in with you once in a while."

Now, I understand why people who have been in this space for a long time sometimes bristle at the overuse and overhype around the term agents. I also think that this is just frankly a losing argument. Agents to people right now mean AI that does stuff for me without me. They're not considering degrees of autonomy and technical nuance. And I think trying to get people to use a different set of terms or a more nuanced set of terms is just a losing battle.

In any case, even the virtual collaborators version of agents in Amodei's estimation could imply job replacement, which was something he was asked about as well. Dario noted that every time machines end up automating 90% of a certain type of work, human workers are able to leverage the most crucial 10% for huge efficiency gains.

He also discussed the difference between deploying enterprise AI in a replacement mode versus a complementary mode. Amodei referenced research that suggested complementary AI deployment leads to greater productivity gains. And this, of course, leads to that language of virtual collaborator. I've talked about this a lot before, and I will continue to talk about it. I think it is inevitable that corporations are going to be looking and asking how agents can replace entire categories of tasks, which could impact jobs. But I also think that there's going to be significant social pressure and a new forming of norms that's going to put a lot of pressure on those companies to think in this sort of complementary way rather than in pure replacement terms.

Then the interview got into the big questions. Amodei said that he still believes that AGI will arrive by 2027 or slightly after that, just two years from now. And thinking about the sociological implications, he acknowledged that AGI is likely to require a reorganization of the economy. He said, "The only good thing about it is that we'll all be in the same boat. I'm actually afraid of the world where 30% of human labor becomes fully automated by AI. That's going to cause this incredible class war between the groups that have been replaced and those that haven't. If we're all in the same boat, it's not going to be easy, but I actually feel better about it.

"We're going to have to sit down and recognize that we've reached the point as a technological civilization that there's huge abundance and huge economic value. But the idea that the way to distribute that value is for humans to produce economic labor is invalidated." Another thing that you probably heard me speak about quite a bit is the idea of a renegotiation of the social contract, and this is just further evidence and a different articulation of exactly that idea.

In a separate interview also at Davos, he said, "I've never been more confident that we're close to powerful AI systems. What I've seen inside Anthropic and out of that over the last few months led me to believe that we're on track for human-level systems that surpass humans in every task within two to three years." This is nothing new, but it's important to repeat because it brings up questions, as Johnny Miller here asks, "Sincere question, how, if at all, are you considering adjusting your life trajectory based on this?"

Punchbowl News technology reporter Ben Brody writes, "However much this is hype, however long the timeline actually ends up being, this is by far the most disruptive thing on the horizon of our lives. And I say that as a reporter in Trump's D.C." Now, I won't get too much into it, but Amodei was also asked about the new administration. And effectively, he deflected, saying that Anthropic is a policy actor, but not a political actor, and making it clear that the types of issues that they care about are things like the race with China.

Although he acknowledges that it's really tricky. He said, "Having a lead on China, which is becoming increasingly difficult, gives us the buffer to take care of the risk of our own models. If we have that lead, we're in this catch-22 where if you slow down three months to mitigate the risk of our own models, then China will overtake us. We don't want to end up in that situation in the first place." You might remember that he recently co-authored an article in the Wall Street Journal called "Trump Can Keep America's AI Advantage" that's all about the need to continue to support and even expand the export controls that went into effect under Biden.

All in all, I share this interview because it is once again a leader of one of the frontier labs in the best position to actually know, saying that we are just a couple years out from a massive structural shift in our economy. The time to start thinking about those implications and preparing is, if not yesterday, then certainly today. And I'm glad to be a part of your journey to do just that. For now, that is going to do it for today's AI Daily Brief. Until next time, peace.

