People
Host
A podcast host and content creator focused on electric vehicles and the energy sector.
Topics
Dario Amodei: I think that within three to six months, AI will be writing 90% of code; within a year, AI will be writing essentially all code. This is based on what we're seeing at Anthropic, as well as the rapid progress of AI coding assistants such as Claude 3.7 Sonnet and Claude Code, and the rise of "vibe coding." I understand this may be hard to believe for people who haven't used these tools, but it represents an acceleration of the timelines I previously predicted. This isn't just about writing code; it's about creativity, and how people use these tools to build applications. Even if AI writes all the code, programmers will still be responsible for key tasks such as specifying requirements, designing the application, and ensuring the code is secure. So I think human productivity will actually be enhanced.

Arvind Krishna: I think AI will only be able to write 20% to 30% of code. While AI excels at some simple use cases, it is of little use in many complex ones. However, if we can write 30% more code with the same number of people, we will write more code, not less. Historically, the most productive companies have gained market share and been able to produce more products, winning still more market share. So the winners of AI will be the companies that use AI to do more, not those that use it to do the same with fewer people.

Mark Cuban: AI is never the answer; AI is just a tool. Whatever skills you have, you can use AI to amplify them.

Sundar Pichai: At Google, more than a quarter of all new code is generated by AI, then reviewed and accepted by engineers.

Host: Dario Amodei's prediction sparked wide discussion, touching on the rapid progress of AI code-generation tools, the future of software engineering, and shifts in jobs. Current AI code-generation tools excel at building prototypes and finding bugs, but struggle with large enterprise codebases. The complexity of those codebases, along with multi-engineer collaboration, can block full adoption of AI code generation, and code security and compliance concerns will also slow it down. However, codebases will grow dramatically in the future, and improving AI coding ability will help address that. AI will take over all tasks, but that doesn't mean jobs will disappear; rather, the nature of work will change. People will become managers of AI rather than task executors. AI will raise productivity, but it will also completely reshape how work is structured. And if AI is only good at pattern matching rather than invention, the development of programming languages may stagnate.

Deep Dive

Chapters
This chapter explores the recent OpenAI announcements regarding their new agentic tools, including the Responses API and Agents SDK. These tools aim to simplify agent development, sparking debate about their impact on the AI platform wars and the future of coding.
  • OpenAI released new agentic tools to accelerate agent platform wars.
  • The toolset includes Responses API, web search tool, file search tool, and computer use tool.
  • Agents SDK supports building multi-agents and monitoring workflows.
  • OpenAI aims to be the all-in-one platform for agent development.

Shownotes Transcript

Translations:
Chinese


Today, we have a story that could easily have been the main episode. I just thought the conversation around Dario Amodei's predictions around AI coding was so interesting that I wanted to go a little bit deeper on it. OpenAI has released a new suite of agentic tools which absolutely are going to accelerate the agent platform wars. They released this with a huge new breakdown of everything that was included.

The toolset includes their new Responses API, which they say combines the simplicity of the Chat Completions API with the tool use capabilities of the Assistants API for building agents, built-in tools including web search, file search, and computer use, a new Agents SDK to orchestrate single-agent and multi-agent workflows, and integrated observability tools to trace and inspect agent workflow execution.

Now, sometimes with an announcement like this, the Twitter thread boys are useless because they're so caught up in their own desire to hype things up that they don't actually have anything substantive. But sometimes, especially when it's an extremely dense technical announcement, they can be very useful for breaking it down. So let's turn to Elvis here because he did a great summary of what was actually released. He writes, OpenAI has already launched two big agent solutions like Deep Research and Operator. The tools are now coming to the APIs for developers to build their own agents.

The first built-in tool is called the web search tool. This allows the models to access information from the internet for up-to-date and factual responses. It's the same tool that powers ChatGPT search, powered by a fine-tuned model under the hood.
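To make that concrete, here is a rough sketch of the shape of a Responses API request with the built-in web search tool enabled. The endpoint, field names, and the "web_search_preview" tool type string reflect the launch announcement and may have changed since, so treat this as an illustration rather than the definitive API:

```python
import json

# Sketch of the JSON body for a single POST to /v1/responses.
# Field names and the tool type string are assumptions based on the
# launch announcement; check the current OpenAI API reference.
def build_web_search_request(prompt: str) -> dict:
    return {
        "model": "gpt-4o",                  # any tool-capable model
        "input": prompt,                    # a plain string works; a list of turns also does
        "tools": [
            {"type": "web_search_preview"}  # built-in web search tool
        ],
    }

body = build_web_search_request("What did OpenAI announce this week?")
print(json.dumps(body, indent=2))
```

The point of the design is that a single request can both search the web and produce the final answer, with no orchestration code on your side.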

The second tool is called the File Search tool. This is useful for agentic RAG-related use cases. It now supports metadata filtering and a direct search endpoint, which enables direct queries against your vector stores. The third tool is the Computer Use tool. This is like Operator, available via the API. It allows the model to control a computer on your behalf. It comes with the same computer use model that powers Operator.
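The metadata filtering piece is easiest to see in a request body. Here is a hedged sketch of a file search call against a hosted vector store; the vector store ID is a placeholder, and the comparison-filter shape follows the launch-era documentation, so verify both against the current API reference:

```python
import json

# Sketch of a Responses API body using the built-in file_search tool.
# "vs_abc123" is a made-up vector store ID, and the "filters" shape
# (a comparison filter) follows the launch-era docs and may differ today.
def build_file_search_request(question: str, vector_store_id: str) -> dict:
    return {
        "model": "gpt-4o",
        "input": question,
        "tools": [{
            "type": "file_search",
            "vector_store_ids": [vector_store_id],
            # only search documents whose metadata marks them as policies
            "filters": {"type": "eq", "key": "doc_type", "value": "policy"},
        }],
    }

req = build_file_search_request("What does our refund policy say?", "vs_abc123")
print(json.dumps(req, indent=2))
```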

Elvis continues, they also announced the Responses API. Unlike the traditional chat completions API, this new API is flexible enough to support multiple turns and tools more natively.

Elvis continues, you can also pair tools together with the Responses API. It can call multiple tools at once and give you a final response in one request. The computer use tool can also be used with the Responses API. You can add instructions and customize the display. What about for those multi-agent systems? Well, Elvis continues, OpenAI has also made Swarm, their agent orchestration framework, more production ready. It has been rebranded to the Agents SDK. It uses the Responses API under the hood, but other vendors are also supported.

The Agents SDK, which is open source, supports building multi-agent systems out of the box. A triage agent can hand off tasks with the relevant context to execute them. It also supports monitoring and tracing out of the box, which can be used for debugging your agents. The tracing UI is also available to track traces of your agentic workflows.

Big thanks to Elvis, who is himself an agent builder, for that simpler breakdown. Basically, what's going on here is that OpenAI is asserting its place and its offering for developers in the white-hot agent-building space. It's very clear that even though OpenAI is absolutely, I think, going to build some number of agents that they want to own themselves to keep close to the relationship with the customer, they also recognize that they're not going to be able to build everything, but they do want a piece of everything.

Olivier Godement weighed in as well. Trying to explain the relationship between the Responses API and the Agents SDK, product manager Nikunj Handa writes,

The Responses API is like this atomic unit of using models and tools to do a particular thing. The Agents SDK is having multiple of those atomic units work together to solve even more complicated tasks.

But what does this actually mean in practice? Simon Taylor writes, OpenAI's Responses API and Agents SDK are a huge moment for the AI platform wars. The goal is to make building workflow agents trivially easy. It can do things like connect to browsers, files, and apps, chain multiple agents together, and monitor performance in real time. Most startups spent the last year building what OpenAI just gave away for free. Here's what it replaces: months of prompt engineering and iterating, complex orchestration logic,

endless fine-tuning and testing, and observability and evals. And ultimately, this means that OpenAI is trying to be the all-in-one platform. Will it work? The bargain is, we'll make the tooling easy if you use our LLM, but you can't use Claude 3.7, which many prefer. Yet, for many developers, this will be tempting. This isn't the end of the competition, it's the beginning. There are now two visions for the world: Claude's open Model Context Protocol and OpenAI's Tool Use SDK and Responses API.

And I think he's absolutely right that this is a major, major moment in the agent platform wars, which will dictate the shape of a lot of things to come in the coming months. That was, in fact, not the only OpenAI news, however. One of the things that GPT-4.5 is clearly better at is writing. And yet, OpenAI seems to also have a new writing-focused agent, or at least a new model, in development.

Yesterday, Sam Altman tweeted about the model. Now, for the sake of the headlines, I will not read the short story that Sam attached, but you can be assured that the fourth wall was decimated by this model's metafiction.

Another rumor is percolating from a subtle mention in OpenAI's API changelog. The post referenced a model called O3 Mini Pro. When prompted to fix the typo, Adam.GPT, who does go-to-market for OpenAI, commented, I don't see any typos.

Although we don't have any official information, you can probably figure out what the model does based on the name. If it follows the same convention as O1 Pro, it will be a more capable version of the underlying model that uses significantly more inference. Still, speaking about the naming convention, Chubby commented, Please don't. Don't make an O3 Mini Pro next to O1 Pro and O3 Mini and O3 Mini High and O3 and O3 Pro. Please don't, OpenAI.

Lastly today, Meta has begun testing their in-house chips designed for AI training. According to Reuters, the first batch has arrived from TSMC, and Meta has set up a small cluster for testing. One source mentioned that the chip is a dedicated AI accelerator rather than a GPU, which could make it more power efficient. This is the first so-called tape-out for the chip, the process of finalizing the design and completing the first test run. It's very common for chips to go through multiple tape-outs to refine the design and fix issues before production is ready to ramp up. Each tape-out typically takes between three and six months.

Meta has deployed custom AI chips before, but only for inference rather than training. Indeed, one effort to develop an inference chip in 2022 went pretty badly awry, leading Meta to scrap the project and pivot to becoming NVIDIA's largest customer in an effort to catch up in the AI race. If this test is successful and Meta can ramp up production, it will be a big step towards reducing reliance on NVIDIA.

The timeline for that is still at least six months away, even if everything goes according to plan. Still, the infrastructure build-out continues apace. For now, that is going to do it for today's AI Daily Brief headlines. Next up, the main episode. Today's episode is brought to you by Vanta. Trust isn't just earned, it's demanded.

Whether you're a startup founder navigating your first audit or a seasoned security professional scaling your GRC program, proving your commitment to security has never been more critical or more complex. That's where Vanta comes in. Businesses use Vanta to establish trust by automating compliance needs across over 35 frameworks like SOC 2 and ISO 27001, centralize security workflows, complete questionnaires up to 5x faster, and proactively manage vendor risk.

Vanta can help you start or scale up your security program by connecting you with auditors and experts to conduct your audit and set up your security program quickly. Plus, with automation and AI throughout the platform, Vanta gives you time back so you can focus on building your company. Join over 9,000 global companies like Atlassian, Quora, and Factory who use Vanta to manage risk and prove security in real time.

For a limited time, this audience gets $1,000 off Vanta at vanta.com slash nlw. That's v-a-n-t-a dot com slash nlw for $1,000 off. There is a massive shift taking place right now from using AI to help you do your work

to deploying AI agents to just do your work for you. Of course, in that shift, there is a ton of complication. First of all, of these seemingly thousands of agents out there, which are actually ready for primetime? Which can do what they promise? And beyond even that, which of these agents will actually fit in my workflows? What can integrate with the way that we do business right now? These are the questions at the heart of the super intelligent agent readiness audit.

We've built a voice agent that can scale across your entire team, mapping your processes, better understanding your business, figuring out where you are with AI and agents right now in order to provide recommendations that actually fit you and your company.

Our proprietary agent consulting engine and agent capabilities knowledge base will leave you with action plans, recommendations, and specific follow-ups that will help you make your next steps into the world of a new agentic workforce. To learn more about Super's agent readiness audit, email agent at bsuper.ai or just email me directly, nlw at bsuper.ai, and let's get you set up with the most disruptive technology of our lifetimes.

Today, we are talking about a topic that has absolutely lit up AI Twitter for the past day or so, which is a prediction from Anthropic CEO Dario Amodei that AI will write 100% of code, or nearly 100% of code, within a year. Where this comes from is Amodei sat down at the Council on Foreign Relations for a wide-ranging interview. The discussion covered the future of AI leadership, the role of innovation in geostrategic competition, and the outlook for frontier models.

Still, it was his comments about the pace of worker replacement in the tech sector that have gone viral.

Dario said, if I look at coding, which is one of the areas where AI is making the most progress, what we're finding is that we are not far from a world, I think we'll be there in three to six months, where AI is writing 90% of the code. And then in 12 months, we may be in a world where AI is writing essentially all of the code. Now, to those of you who are now using all of these text-to-code tools, this may seem completely obvious. Still, for Dario, it's a significant acceleration of the timelines that he had previously expressed.

In fact, I think it's the first time, at least the first time that I've seen, that he's actually given a concrete forecast for the adoption of automated AI coding. When he was doing the interview circuit back in Davos in January, Dario only spoke in general terms about the overall workforce, for example, stating, I'm more confident than ever before that we're close to powerful AI systems. What I've seen inside Anthropic and outside of it over the last few months led me to believe that we're on track for human-level systems that surpass humans in every task within two to three years.

So what's been happening since then that might make this timeline feel like it's accelerating, at least when it comes to coding? Dario's company, Anthropic, is obviously a big part of this. They released Claude 3.7 Sonnet alongside their agentic coding tool, Claude Code. Both releases represent major progress for AI coding assistants. Of course, surrounding these tools, we've also seen the rise of vibe coding.

This is the idea of people who were not coders before being able to use a tool like Lovable or Bolt to actually build applications. It's got people thinking in completely new ways about what it means to build software, but also what it means to be a creator more broadly. Riley Brown, who got his start as the biggest AI TikToker, has completely shifted to building apps as a form of content. He's convinced that this is where all creators are going to head. And a few weeks ago on February 18th,

he shared a Google search comparison of prompt engineering as a term versus vibe coding as a term and said, checking in on this in one year. He then came back yesterday to show that vibe coding had actually already started to surpass prompt engineering as a search term. Riley added, it only took three weeks.

Very clearly then we are in the middle of a paradigm shift for software engineering work, and it feels like somewhere along the way we might have passed an inflection point. Now this has been a big topic in industry conversations that are going on right now. For example, the discussion about AI coders replacing human engineers was a big topic at South by Southwest, which is currently still happening down in Austin. Responding to Amodei's prediction, IBM CEO Arvind Krishna was very skeptical.

He commented, "I think the number is going to be more like 20 to 30% of the code could get written by AI, not 90%. Are there some really simple use cases? Yes, but there's an equally complicated number of ones where it's going to be zero." Now, as you'll see, I'm not sure I agree with that point.

He did add something that I think is an extremely important narrative and something that you've probably heard on the show before saying, if you can do 30% more code with the same number of people, are you going to get more code written or less? Because history has shown that the most productive company gains market share, and then you can produce more products, which lets you get more market share.

This is basically a way of him saying the winners of AI will not be those who choose to do the same with less, but those who choose to do either more with the same or way more with a little more. So Arvind, if you happen to be a listener, thanks for amplifying this point.

Mark Cuban had a similar take during his panel at the conference. While he didn't touch on AI coding, he made a more general point about work replacement, saying, AI is never the answer. AI is the tool. Whatever skills you have, you can use AI to amplify them. And while I don't disagree with that, I think the evidence is starting to show something a little bit different. For example, back in October, when, frankly, coding assistants were less capable than they are now, Google CEO Sundar Pichai said during Google's Q3 earnings call,

Today, more than a quarter of all new code at Google is generated by AI, then reviewed and accepted by engineers. So if the IBM CEO is saying that it'll only be 20 to 30%, but back in October, Google was already seeing 25% of new code coming from AI, someone's going to be wrong here.

From a technical perspective, one of the questions raised by Amodei's comments is what improvements do we need to see over the next three to six months to get to 90% of code being AI generated? And by extension, what are the challenges that need to be solved to get that number to 100% within a year's time? One of the big divides in this moment is the entrepreneurs and enthusiasts spinning up new products, this huge rise of vibe coders, for example, and on the other hand, the professional programmers working with enterprise codebases.

Many of the current tools are unbelievably good at building prototypes, hunting bugs, letting people go from zero to one really quickly. That's not the same as being able to scale to huge enterprise code bases where lots of people have to contribute. If you go check out the Claude Reddit, for example, you can find lots of examples of struggles with situations that deal with things like multiple files across an enterprise code base. And that's without even getting into issues involving coordinating multiple engineers working on the same code base in parallel. There are also more mundane constraints

in how fast this shift will happen. Milan Janan, for example, pointed out, you can't sign a contract saying your code is secure if none of your employees have read it or understand how it works. Of course, none of these issues are insurmountable. They make it more difficult to go full AI at the enterprise level, but they also create a pretty damn big honeypot for someone who wants to take on this particular set of challenges. Developer Nick Dobos wrote, keep seeing versions of "AI coding is great until your app gets too complex for the AI to handle."

Some thoughts. One, codebases are going to 1,000x to 1,000,000x in size over the next 10 to 20 years. Keeping it all in your head will not be viable. Two, this unknown large codebase is already the norm for large companies. The size, maturity, and turnover of modern companies, as well as the use of libraries and packages, means huge pieces are and always will be unknown to you. You still need to be productive in this labyrinth.

Three, AI coding will get more powerful, and with it the ability for an AI to understand this labyrinth will grow. You simply need to ask for summaries and ask questions about the code as you go. Trying to keep this much info memorized in your brain will not work. Four, the number one skill of an AI programmer right now is constraining AI and providing just enough context when giving commands for the AI to do things correctly. Five, this will grow to include knowing what questions to ask your codebase so you have only enough info and context to do your tasks correctly.

What's more, Siggy Bilstein, the CTO of Maza, suggested that some amount of this may be a skill issue. And so of course one of the questions becomes, if AI is writing all the code, does that mean we don't have any software engineers? Dario actually addressed this in the next part of the interview. He said,

The programmer still needs to specify what are the conditions of what you're doing? What is the overall app you're trying to make? What's the overall design decision? How do we collaborate with other code that's been written? How do we have some common sense on whether this is a secure design or an insecure design? So long as there are small pieces that a human programmer needs to do that an AI isn't good at, I think human productivity will actually be enhanced. I think that's true. But I also think that when people hear statements like this, they have a tendency to lump them in together with some common tropes that you hear around AI right now.

Things like AI is just going to replace the tedious tasks or this little chestnut that you probably see on LinkedIn about 20 times a day. AI won't replace you. A person using AI will. To put my cards on the table, I think these sentiments reflect staggering amounts of cope and or not seeing where things are headed. When people who are heading these foundation labs are talking about AI being better at every task than humans in two to three years, that means every task, not just the tasks that are tedious.

My base case then is that effectively 100% of the tasks, quote unquote, that we do today are going to be done by AI in the future. Where I think people lose the plot and lose the nuance here is that that does not mean a priori that jobs go away. Instead, I think that on a fundamental level, jobs change. And I think about 100% of our jobs are going to change.

Simply put, we are not going to be task executors and doers in the future. People are going to be generals of their own little armies and CEOs of their own little companies where agents and AIs are doing the tasks that they once had done themselves. The reason that I'm so bullish is that I think that all of this leads to just more being produced. Exactly what the CEO of IBM was talking about, which is a recognition that market winners always produce more and win market share by producing more better things.

That's what AI is going to enable. But the path to it enabling that is going to be a complete, effective 100% replacement of the way that jobs are structured now with a totally new way of working.

And by the way, from where I'm sitting, I don't think that Dario's timelines are all that crazy. Yes, I think that there are very real structural constraints and human constraints and legal constraints and inertia constraints that will slow this down in the enterprise. I think that is absolutely undeniable. But when you get outside of the enterprise sector, it is not hard to see examples of how fast this is changing. Y Combinator partner Jared Friedman recently said that one quarter of YC founders said that 95% of their code base was AI generated.

That means a quarter of the companies that have come out of the most important accelerator in the world are seeing nearly all of their code generated by AI. Sahil from Gumroad says, we're already at 100%. If you're writing code, you're making a conscious choice not to ask AI to write the code for you. Fine, but it's a choice, similar to doing your dishes by hand instead of using a dishwasher.

Like I said, I do not think that this means that all software engineers are going to lose their jobs. As Adi from Trade Your Meme put it, software engineering has never been primarily about writing code. The code is downstream to thinking precisely about how to model a particular domain. I think he's right that that's going to be an even more important skill in the future.

What's more, there is also this other interesting lurking question for me, which is that if AI does write all the code, but AI remains in a place where it's not particularly good at inventing new things, it's just good at pattern matching old things, does that mean that we're going to be stuck with the programming languages we have right now in the state that they are forever? If engineers aren't in there actively working with the code, do we lose on some of that progress?

As Gurgelio Rose points out, there are still interesting things to explore even in this 100% vision of the world.

Anyway, I think at this point you can probably see why this has been such fodder for conversation. It is, yes, about coding, about this huge, important breakout area of AI and agent innovation, which has become even more broadly important with the rise of vibe coding, which has allowed all the people who are non-technical and non-developers to start participating in it. But it's also touching on the broader questions of job displacement more generally and how work looks in the future.

Hopefully your mind's now whirring with some new thoughts. Let me know in the comments, of course, whether you think the prediction is right on, crazy, or not ambitious enough. For now, though, that is going to do it for today's AI Daily Brief. Appreciate you listening as always. And until next time, peace.