
Faster/Slower: Where AI Is Moving Ahead of Expectations and Where It's Lagging

2025/2/27

The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis

People
NLW
A well-known podcast host and analyst, focused on cryptocurrency and macroeconomic analysis.
Topics
NLW: I think that for most of 2024, the underlying capabilities of AI models improved more slowly than expected, with a significant jump arriving only at the end of the year. Pre-training scaling has weakened in effectiveness, showing diminishing returns. Enterprise purchasing of AI has far outpaced expectations, but actual usage and utilization have lagged well behind. Many enterprises have encountered resistance to AI tools among their coders and developers, in stark contrast to the consumer space. AI costs are falling at an astonishing rate, far faster than Moore's law. Policy change has been slower than expected, with little practical progress. Some big societal shifts (for example, people using AI to interact with deceased loved ones) have arrived more slowly than expected. Both China and open-source AI have moved faster than anticipated. Agent capabilities, particularly in specific verticals and functions, are accelerating, while general-purpose agents have progressed relatively slowly. Agent adoption is poised to surge and has become the dominant topic in corporate conversations. Grok: Grok's deep search results show fast enterprise purchasing but slow actual utilization and integration. Perplexity: Perplexity's results likewise show fast enterprise procurement but low actual utilization, a plateau in model innovation, and slow progress on regulatory and ethical frameworks. ChatGPT: ChatGPT's deep research results show rapid growth in enterprise investment and initial adoption, but a slow pilot-to-production pipeline and unrealized ROI at scale. Model capabilities and innovation are advancing quickly, but reliable reasoning and truthfulness are progressing slowly. Urgency in regulatory discussions has increased, yet formal regulation lags; consumer adoption is high, but acceptance within creative professions and public trust remain low. Swix: Swix's list shows deep research, reinforcement learning and agents, and dev agents and low-code agents moving fast, while email agents, scheduling agents, wearables, and real-time voice-to-voice are progressing slowly. Voice agent technology is developing rapidly and has become a major focus area for many entrepreneurs; AI applications in wearables are moving slowly.

Deep Dive

Chapters
The podcast opens by introducing the concept of 'punctuated equilibrium' in technological advancements, comparing it to biological evolution. It then sets the stage for discussing AI developments that have exceeded or lagged behind expectations in 2024. The episode will cover personal observations, insights from AI research tools, and a curated list from a fellow podcast host.
  • Punctuated equilibrium in AI development
  • Faster and slower advancements in AI

Shownotes Transcript


Today on the AI Daily Brief, a fun game called Faster and Slower, where we see what's moving more quickly in AI than expected and what's moving a little bit less quickly than expected. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes. ♪

Hello, friends. Welcome back to another AI Daily Brief. As you know, I am traveling this week, so things are a little bit different. No video for one, some slightly different topics for another, but I think you're going to have fun with this one, or at least I hope you will. One of the things that's absolutely happening right now, and I think everyone who's paying close attention feels like it, is that we are in a punctuated equilibrium moment.

For those of you who aren't familiar with that term, it comes from Stephen Jay Gould and was used to describe, and really change, how we thought about evolution. For a long time, we thought about evolution as a steady, gradual incline, all at kind of the same pace, an up-and-to-the-right curve at the same angle the whole time.

In point of fact, what it actually looks like when you dig into the fossil record is long periods of dormancy followed by massive, explosive periods of change, followed by periods of dormancy, followed by periods of massive change, in this sort of interesting step function that goes up and gets us to the same spot, but happens in a very different and much messier way than we thought.

Technology evolution feels a bit like that as well. Where sometimes, yes, there's just general increases, but you have these periods where it feels like you're kind of on a low burn. And then other times where it feels like everything is shifting all at once.

Now, of course, when you dig underneath, perhaps part of the difference is that things were bubbling and brewing during those theoretically quiet times. But whatever it is, I think it's safe to say that a good chunk of 2024 felt like one of those low periods. So much of the time was spent trying to catch up to GPT-4, and then everything got there. And we just kind of sat there for a while.

That was until the end of the year when it started to feel like things were picking up again with the launch of reasoning models, the emergence of more capable agents, and a number of other trends that have all contributed to the sense that I think people have now that we are in another punctuated equilibrium moment.

So with that as background, let's talk about a few things that are moving faster and slower. And what we're going to do is go through three sets of lists. We're going to talk through first the quick list that I came up with off the top of my head. Then second, we're going to look at what the deep research tools from Grok, OpenAI, and Perplexity thought. And then we're going to look at one list curated from the web, which I thought was particularly interesting and had some different details than I had put in mine.

All right, so starting with my list, and I'm going to bounce between faster and slower because as you'll see, sometimes they're a both and. So just to really level set, let's talk about capabilities. I kind of gave this away a little bit in the intro, but I think that for most of 2024, it felt like capabilities, and by that I obviously mean the specific capabilities of the underlying models and the state of the art, was a little bit slower than people expected.

It felt like there was this blistering race across 2023, but then we stagnated for most of 2024 at GPT-4 level, roughly speaking. That seemed really weird, and in fact, some people wondered if this was just OpenAI slow-walking it, because that made more strategic sense than getting as far out ahead as it seemed like they were probably going to. Obviously, now that has started to shift a little bit, and it feels like there has been a major capabilities increase.

Part of that has to do with the switch to a new approach to scaling that isn't strictly based on the amount of compute and data thrown in pre-training, but is based on new strategies like test time compute. And in fact, that leads me to my next slower. It's very clear that the pre-training scaling model has slowed down in terms of its efficacy. It's not that there are no gains to eke out. It's that if you look at the difference between something like Claude 3.7 Sonnet and the previous models, or Grok 3 as compared to previous models, which was of course trained on the Colossus supercluster,

the type of gains that you might have expected are just not as high. That doesn't mean that the scaling model has broken entirely, as many will point out, but it certainly suggests diminishing returns. Next up, in terms of one that's both faster and slower, is enterprise adoption. I think when it comes to reorganizing structures to try to adopt AI and actually making AI purchases, enterprise adoption has gone way faster than I think anyone would have anticipated.

I had this thought as I was speaking to a 3,000-person live conference at Nationwide that was exclusively about Gen AI, less than two years after ChatGPT was the starting gun for this whole industry. Enterprises have never jumped on anything this fast. There is a very clear recognition of just how disruptive and transformative this technology is going to be that goes up and down the org chart, but certainly comes straight from the top. And that's showing up in how these companies are engaging with this.

Now on the flip side, enterprise adoption in practice, specifically the utilization of these tools, has been much, much slower. The caveat here is of course that a lot of usage is in this secret cyborg category where people are keeping it under wraps. This was a big topic of discussion throughout the last year, where people were concerned either that, A, their work wouldn't be looked at as legitimate, or, B, they just wanted to use the toolset that they could use personally, which was ahead of the available enterprise toolset.

But in any case, it is absolutely true that there are many situations in which a big company has paid for 10,000 licenses of some tool, often Microsoft Copilot, and is only using 20% of them or 30% of them. Like I said, I think there are a lot of explanations for that, including the quality differential between enterprise tools and consumer-grade tools. But one area that I think is particularly interesting where it has been absolutely slower is the resistance that many enterprises have found among their coders and developers.

Part of why this is so interesting is that it's in such stark contrast to the broader consumer AI space where coding tools have revolutionized how developers work. If you look at startups or individual developers, tinkerers, hackers, entrepreneurs, builders, solopreneurs, these people are outputting five, ten times the amount of work that a person in their position would have been before thanks to this new slate of coding tools. But inside the enterprises, there is real hesitance.

Now, some of that is cultural and the uncomfortable friction between different expectations of productivity. There are also some real technical issues. The type of trade-offs that individual tinkerers and startups might be willing to make don't necessarily always apply in the enterprise. And a lot of those extremely high value, low code or no code tools aren't necessarily optimized for interacting with enterprise code bases. Still, it does surprise me, I have to say, every time that I talk to a new big company and they're struggling to get their developers to dig in and experiment with these new tools.

I think something's got to give in that area because I don't really believe it's a fight that the coders who are trying to keep doing it the old way have any chance of winning. Now, is there a market opportunity to retrofit certain types of AI coding tools specifically for the enterprise? Absolutely. And maybe that's what it takes. Still, I think it's a really interesting area that shows both faster and slower.

Moving back to the faster side for a minute, I think that the cost reduction speed in AI has been head spinning for everyone. AI is very expensive and there are these big questions of business model and how it's going to be possible for the big tech companies to make back the amount that they're spending on CapEx. But hold aside that when it comes to the end user or the end developer who's building with these tools, the cost of intelligence is just cratering at such an incredible rate.

I think Sam Altman recently said that it was down something like by a factor of 10 each year, which is obviously radically faster than Moore's law. One of the negative externalities, in fact, of how fast this is changing is that I think it's going to make it really difficult to sort out how in particular agents are supposed to be priced. My guess is that agent companies are going to try to price it as a comparison to the equivalent human labor, but then other agent companies are going to say, screw that, we should base it on cost of goods sold, which is effectively negligible.

Anyways, there's going to be a lot of really interesting things that play out based on the fact that cost reduction is happening at a much faster clip than I think anyone would have thought.

Over on the slower side, one of the areas that I think has been most dramatic is policy shifts. 2023 came out screaming, looking like there was going to be a big policy discussion that really got everyone talking about AI. And we got all these AI safety institutes and different conferences and conventions and all this sort of stuff. But in terms of actual practical policy, there's been almost nothing. The only region of the world that's really put anything big into place is the EU with its AI Act.

And the vast majority of that was created way in advance of generative AI. In fact, the EU is now concerned that they overplayed their hand with generative AI and they're losing out because of it. Now, this one might be more understandable given that the US had a very contentious presidential election in 2024, and that's never a recipe for big policy changes getting done, but it still seems extremely notable to me.

Another one that's slower that I wouldn't have expected is weird societal changes. I thought we were going to almost instantly see things like people AI-ifying their loved ones and trying to interact with dead relatives. And there's certainly been some of that type of experimenting, but there hasn't been nearly as much mainstream conversation around things like that as I would have expected.

Now, to some extent, this might just be me missing out on big trends that are happening because they're not up in my face. Certainly, for example, every time I heard character AI statistics of kids interacting with AI bots, they sounded crazy to me. So it's totally possible that I'm just missing something here.

It's also possible that it was always wrong to expect this to move as fast as I thought it was going to, and that it'll take an entire generational shift for things like that, people AI-ifying their loved ones, for example, to be normalized. But I still, and this is just me personally, do feel like some of those big weird changes haven't happened quite as fast as I thought they would have. Flipping over again to the faster side, both China and open source have been moving, I think, much more quickly than people would have anticipated.

With Llama 3 last year, open source moved very close to the state of the art. And really, ever since the beginning of this, open source has been, I think, out-competing the closed source models in ways that people outside of the open source movement, at least, might not have expected. Now, how related that is to the fact that China is clearly not nearly as far behind as we would have thought is an open question.

Obviously, US presidential administrations have been very aggressive with China when it comes to access to advanced AI chips. And yet still, the big shocking event of the last couple of months is DeepSeek, a model which, while not necessarily beating out the state of the art from companies like OpenAI, was close enough and good enough that it has absolutely changed the competitive landscape. You'll know if you're a regular listener that there are a lot of geopolitical implications of China being as hot on the US's heels as they are. And that I think is going to be a big factor in how things play out over the next year.

Finally, let's talk about agents. Once again, I think agent capabilities up until about the last couple of months felt to many like they were moving slower than people might have anticipated. And in some areas, I still think that that's true even to today. For example, I think agentic computer use is a bit behind where people thought it would have been. And I think that historically, there's been so much concentration around general purpose agents, right? Like people's personal agent assistants, that the fact that that use case hasn't really come to fruition has been surprising for some.

Now, I never thought that that was where agents were going to go, so that doesn't surprise me as much. But I also think that we are now officially heading into the faster category as agent capabilities, particularly in specific verticals and in specific functions, start to come online. Basically, we needed to make a shift from thinking about agents as general purpose to specific purpose, and now things are really starting to accelerate.

Alongside that, agent adoption is poised to absolutely explode as well. Something that you can probably tell if you listen regularly is that agent adoption has completely sucked all of the oxygen out of the room when it comes to every other type of AI discussion in corporate boardrooms right now. And I think that that is going to do nothing but accelerate. Today's episode is brought to you by Vanta. Trust isn't just earned, it's demanded.

Whether you're a startup founder navigating your first audit or a seasoned security professional scaling your GRC program, proving your commitment to security has never been more critical or more complex. That's where Vanta comes in. Businesses use Vanta to establish trust by automating compliance needs across over 35 frameworks like SOC 2 and ISO 27001. Centralize security workflows, complete questionnaires up to 5x faster, and proactively manage vendor risk.

Vanta can help you start or scale up your security program by connecting you with auditors and experts to conduct your audit and set up your security program quickly. Plus, with automation and AI throughout the platform, Vanta gives you time back so you can focus on building your company. Join over 9,000 global companies like Atlassian, Quora, and Factory who use Vanta to manage risk and prove security in real time.

If there is one thing that's clear about AI in 2025, it's that the agents are coming. Vertical agents by industry, horizontal agent platforms, HLAs.

Agents per function. If you are running a large enterprise, you will be experimenting with agents next year. And given how new this is, all of us are going to be back in pilot mode.

That's why Superintelligent is offering a new product for the beginning of this year. It's an agent readiness and opportunity audit. Over the course of a couple quick weeks, we dig in with your team to understand what type of agents make sense for you to test, what type of infrastructure support you need to be ready, and to ultimately come away with a set of actionable recommendations that get you prepared to figure out how agents can transform your business.

If you are interested in the agent readiness and opportunity audit, reach out directly to me, nlw at bsuper.ai. Put the word agent in the subject line so I know what you're talking about. And let's have you be a leader in the most dynamic part of the AI market. So that's my personal list of faster and slower. But now let's head over and see what a couple of these different research models think. First, I used Grok 3's deep search.

In faster progress areas, they had model performance and capabilities, corporate purchasing, and new applications and use cases. Under slower progress areas, they had actual utilization and integration, ethical and bias issues, data quality and management, employee adoption and change management, and regulation and governance. So basically where Grok agrees with me is around the divide between corporate purchasing, which has moved faster, and every other aspect of enterprise adoption, which has gone slower.

In fact, basically all of Grok's slower progress areas have something to do with actual utilization or adoption inside the enterprise. What about perplexity? Perplexity, once again, pointed first to enterprise procurement and tool acquisition as an accelerated area. Another one that they thought was interesting was synthetic data adoption, and I think this is actually a good call-out. We basically have run up against the wall of available information much faster than we thought, which has kind of forced a shift to synthetic data adoption in a lot of cases.

Now, on the slower progress side, once again, they pointed to organizational maturity and integration as well as underutilization of purchased tools. So at this point, definitively, between me, Grok, Perplexity, there is some big divide between enterprise purchases and enterprise utilization. Perplexity also got that there had been model innovation plateaus, which is obviously something I talked about, and they also pointed to an unexpected slowness in regulatory and ethical frameworks as well.

And what about Big Daddy ChatGPT with its deep research? The way the deep research took it is instead of producing a list of faster and a list of slower, they went category by category and looked at what was faster or slower within that category. So on enterprise and business adoption, they once again identified that evolving faster than expected was the rapid uptake in investment as well as initial ROI and use case delivery, but that the pilot-to-production pipeline was happening at a crawl and that ROI was unrealized at scale.

On the research and breakthrough side, they identified surprising leaps in capability and an explosion of model innovation and diversity, but pointed to reliable reasoning and truthfulness as evolving slower than expected.

Over in regulation and policy, they pointed to a sudden urgency in oversight discussions as evolving faster, but formal regulation lagging much behind. On creative and consumer adoption, maybe the most obvious one, they pointed out that consumer uptake is happening at record speed, that there's incredible creative tool adoption and output, but that evolving slower is acceptance within creative professions along with public trust and content quality issues.

On the infrastructure and compute side, they pointed to a surge in AI infrastructure investment and advances in specialized hardware and tools as evolving faster than expected, but energy efficiency and cost dilemmas as well as supply constraints and GPU crunch as evolving slower than expected. So a lot of the same themes that we heard from these other areas as well.

Lastly, a couple that come from Swix, who's the host of the Latent Space podcast, as well as the curator of the AI Engineer Summit, which I emceed in New York on Friday. His list of faster than expected included deep research, reinforcement learning and agents, dev agents and low-code agents like Cursor and Bolt, voice customer support agents like Sierra and Decagon. But then on the slower side, he pointed to email agents, scheduling agents, wearables, and real-time voice-to-voice.

A couple of specifics that I wanted to call out from Swix's list. First of all, voice agents as a theme is totally ascendant right now. I think it's too early to tell exactly how good it's going to be in practice, but it is absolutely happening at breakneck speed and is a huge and dominant area of building for many, many entrepreneurs, including us. One of the things that we are constantly looking at is where voice is a better input method for information because voice agents are just capable of doing that.

I think the other really obvious one that Swix pointed out that has to have a mention is wearables. It has obviously been just a sea of carnage in the AI wearable space, perhaps best exemplified in the entire Humane pin team going to work on AI-connected printers recently. So that is the faster and slower list for now. Obviously, this conversation is meant to provoke conversation. Come join us in the comments, share what you think, hit me up on Twitter. But for now, that is going to do it for today's AI Daily Brief. Appreciate you listening as always. Until next time, peace.
