
890: The “State of AI” Report 2025

2025/5/23

Super Data Science: ML & AI Podcast with Jon Krohn

Topics
Jon Krohn: Small models have improved dramatically in performance while becoming far cheaper to run. In 2022, it took the 540-billion-parameter PaLM model to score above 60% on the MMLU benchmark, but by 2024, Microsoft's Phi-3 Mini achieved the same performance with only 3.8 billion parameters, a 142-fold reduction in model size in two years. The cost of querying AI models has also plummeted, from $20 per million tokens in November 2022 to $0.07 per million tokens in October 2024, a roughly 280-fold drop. Depending on the task, LLM inference prices fell between 9-fold and 900-fold per year from 2023 to 2024, making AI applications far more affordable.


Transcript

This is episode number 890, on the state of AI in 2025. Welcome back to the Super Data Science Podcast. I am your host, Jon Krohn. In today's Five-Minute Friday episode, I'll cover the five biggest takeaways from the 2025 edition of the renowned AI Index Report, which was published a few weeks ago by the Stanford Institute for Human-Centered AI. Every year, this popular report, often called the State of AI Report, covers the biggest technical advances, new achievements in benchmarking, investment flowing into AI, and more. We've got a link to the colossal full report in the show notes.

Today's episode will cover the five most essential items, as curated by yours truly. So first, smaller models have become way better. Get this: in 2022, PaLM, a model from Google with 540 billion parameters, was the smallest model scoring above 60% on a very common, very important benchmark for LLMs called MMLU.

By 2024, two years later, Microsoft's Phi-3 Mini achieved the same performance threshold with only 3.8 billion parameters. So we went from 540 billion parameters to about 4 billion parameters for the same effectiveness, the same capabilities. That represents a 142-fold reduction in model size over two years. That's crazy.
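As a quick sanity check, that 142-fold figure falls right out of the two parameter counts quoted above (the Python below is just illustrative arithmetic, not anything from the report itself):

```python
# Parameter counts as cited in the AI Index Report discussion above.
palm_params = 540e9        # PaLM (Google, 2022): 540 billion parameters
phi_3_mini_params = 3.8e9  # Phi-3 Mini (Microsoft, 2024): 3.8 billion parameters

# How many times smaller the 2024 model is at the same MMLU threshold.
reduction = palm_params / phi_3_mini_params
print(f"{reduction:.0f}-fold reduction in model size")  # 142-fold reduction in model size
```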

We're down to less than 1% of the model size to get the same results in a two-year period. All right. My second takeaway is that not only have models become way better for their size, LLMs have become way, way cheaper to run as well. The cost of querying an AI model with GPT-3.5-equivalent performance, that's 65% accuracy on MMLU, fell from about $20 per million tokens in November 2022 to just $0.07 per million tokens by October 2024, using Google's Gemini 1.5 Flash-8B. That represents a roughly 280-fold reduction in cost in approximately two years.
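The cost arithmetic checks out the same way, using the two per-million-token prices quoted above (again, just an illustrative check):

```python
# Cost per million tokens for GPT-3.5-level (65% MMLU) performance, as quoted above.
cost_nov_2022 = 20.00  # USD, GPT-3.5-class pricing in November 2022
cost_oct_2024 = 0.07   # USD, Gemini 1.5 Flash-8B pricing in October 2024

fold_drop = cost_nov_2022 / cost_oct_2024
print(f"{fold_drop:.0f}-fold cost reduction")  # ~286, which rounds to the ~280-fold cited
```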

Depending on the task, LLM inference prices have declined between 9 and 900 times annually from 2023 to 2024. Crazy.

Third, the AI agents being powered by these increasingly powerful, increasingly economical LLMs are showing great promise. 2024 saw the launch of a new benchmark called RE-Bench, which introduced a rigorous way of evaluating AI agents on complex tasks.

In short-time-horizon settings of two hours or less, top AI systems score as much as four times higher than human experts. But over longer timeframes, humans still outperform AI, achieving scores twice as high at the 32-hour mark.

Nevertheless, AI agents already match human expertise in select tasks, such as writing specific types of code while delivering faster results. The takeaway from this one is that AI agents can already handle a lot of complex tasks, including ones taking many minutes, even an hour or two, and outperform humans on those tasks.

I wouldn't be surprised if in the next year we're talking about a dozen hours or more that AI agents can outperform humans on.

All right, my fourth takeaway is that the increasing capabilities of AI models and their lower prices have led businesses to use AI in droves. Survey data show organizational AI adoption grew significantly, with respondents reporting company-wide AI implementation rising from just about half in 2023 to 78% in 2024. Similarly, the percentage of participants indicating generative AI use in business functions saw a dramatic increase, climbing from just 38% in 2023 to a remarkable 71% by the following year. So we went from a clear minority of businesses using gen AI in 2023 to almost three quarters of them using it in 2024.

And finally, with all that corporate use of AI happening, it is perhaps unsurprising that private investment in AI, for example by venture capital firms, was higher than ever in 2024. In the US alone, there was $109 billion of private investment in AI in 2024, topping the previous peak in 2021, which was about $20 billion lower. So, absolutely crushing it, with a new high-water mark for private investment in 2024.

In Europe as well, private investment in AI reached new heights in 2024, although it was less than a fifth of US investment in that year, coming in at $19 billion. Interestingly, China bucked the trend seen in the US and Europe: since peaking at around $25 billion of private investment in 2021, Chinese investment in AI has actually decreased every single year, now coming in at just $9 billion, which is less than half of private European investment and less than a tenth of private American investment in AI in 2024.
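Those three comparisons are easy to verify from the rounded figures above (USD billions; an illustrative check, not from the report):

```python
# 2024 private AI investment, USD billions, as quoted above (rounded).
us, europe, china = 109, 19, 9

assert europe / us < 1 / 5     # Europe: less than a fifth of US investment
assert china / europe < 1 / 2  # China: less than half of European investment
assert china / us < 1 / 10     # China: less than a tenth of US investment
print("all three comparisons hold")
```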

All right, that's it for today's episode. I hope you enjoyed this quick snapshot of the state of AI in 2025. I'm Jon Krohn, and you've been listening to the Super Data Science Podcast. If you enjoyed today's episode, or know someone who might, consider sharing this episode with them, leave a review of the show on your favorite podcasting platform, tag me in a LinkedIn or Twitter post with your thoughts, and if you haven't already, obviously, subscribe to the show. Most importantly,

Whatever you do, I just hope you'll keep on listening. Until next time, keep on rocking it out there. And I'm looking forward to enjoying another round of the Super Data Science Podcast with you very soon.