
Google DeepMind CEO Demis Hassabis + Google Co-Founder Sergey Brin: Scaling AI, AGI Timeline, Simulation Theory

2025/5/21

Big Technology Podcast

People
Demis Hassabis
Sergey Brin
Topics
Demis Hassabis: I think getting all the way to artificial general intelligence (AGI) may require one or two more new breakthroughs. At DeepMind we have many promising ideas cooking, and we hope to bring them into the main branch of Gemini. At the same time, I think we should both scale existing techniques as far as they can go and work hard on new breakthrough techniques. We need algorithms and compute to advance together to reach AGI. We are working to improve model accuracy, build better world models, and explore how AI systems might do genuinely creative invention.

Sergey Brin: I think algorithmic advances may matter even more than advances in compute. Adding thinking to AI significantly improves its capabilities, and an AI's thinking process can use tools, or other AIs, to improve the final output. This is a unique moment in computer science; every computer scientist should be working on AI. Scientifically, AI is more exciting than previous technological revolutions, and its impact on the world will be even greater.


Chapters
This chapter explores the potential for improvement in frontier models, discussing the roles of scaling and reasoning techniques. Experts debate whether scaling alone is sufficient or if algorithmic advancements are crucial for significant progress toward Artificial General Intelligence (AGI). The discussion includes the impact of reasoning paradigms and test-time compute on model performance.
  • Incredible progress is being made with existing techniques, but breakthroughs are needed for AGI.
  • Scaling and algorithmic advancements are both necessary for progress.
  • Reasoning paradigms significantly improve model performance, as seen in AlphaGo and AlphaZero.
  • Test-time compute allows models to improve their output given more time to think.

Transcript


From LinkedIn News, I'm Leah Smart, host of Every Day Better, an award-winning podcast dedicated to personal development. Join me every week for captivating stories and research to find more fulfillment in your work and personal life. Listen to Every Day Better on the LinkedIn Podcast Network, Apple Podcasts, or wherever you get your podcasts.

From LinkedIn News, I'm Jessi Hempel, host of the Hello Monday podcast. Start your week with the Hello Monday podcast. We'll navigate career pivots. We'll learn where happiness fits in. Listen to Hello Monday with me, Jessi Hempel, on the LinkedIn Podcast Network or wherever you get your podcasts.

All right, everybody, we have an amazing crowd here today. We're going to be live streaming this. So let's hear you make some noise so everybody can hear that you're here. Let's go. Woo! Woo!

I'm Alex Kantrowitz. I'm the host of Big Technology Podcast, and I'm here to speak with you about the frontiers of AI with two amazing guests. Demis Hassabis, the CEO of DeepMind, is here. Google DeepMind. Good to see you, Demis. Good to see you too. And we have a special guest: Sergey Brin, the co-founder of Google, is also here. All right. So this is going to be fun. Let's start with the frontier models. Demis, this is for you.

With what we know today about frontier models, how much improvement is there left to be unlocked? And why do you think so many smart people are saying that the gains are about to level off?

I think we're seeing incredible progress. You've all seen it today, all the amazing stuff we showed in the keynote. So I think we're seeing incredible gains with the existing techniques, pushing them to the limit. But we're also inventing new things all the time as well. And to get all the way to something like AGI, I think it may require one or two more new breakthroughs. We have lots of promising ideas that we're cooking up, and we hope to bring them into the main branch of Gemini.

All right. And so there's been this discussion about scale. You know, does scale solve all problems or does it not? So I want to ask you: in terms of the improvement that's available today, is scale still the star, or is it a supporting actor?

I think I've always been of the opinion you need both. You need to scale to the maximum the techniques that you know about. You want to exploit them to the limit, whether that's data or compute scale. And at the same time, you want to spend a bunch of effort on what's coming next, maybe six months, a year down the line, so you have the next innovation that might do a 10x leap in some way to kind of intersect with the scale. So you want both, in my opinion. But I don't know. Sergey, what do you think?

I mean, I agree it takes both. You can have algorithmic improvements and simply compute improvements: better chips, more chips, more power, bigger data centers. But historically, if you look at things like the n-body problem, simulating gravitational bodies and things like that, as you plot it, the algorithmic advances have actually beaten out the computational advances, even with Moore's law. If I had to guess, I would say the algorithmic advances are probably going to be even more significant than the computational advances. But both of them are coming now, so we're kind of getting the benefits of both.
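
As a rough check on Sergey's n-body point, compare operation counts: direct force summation is O(n²) per timestep, while a tree method like Barnes-Hut is O(n log n). A small back-of-the-envelope sketch in Python (the numbers are illustrative, not from the conversation):

```python
import math

# Pairwise-force work per timestep for n gravitational bodies:
# direct summation is O(n^2); a Barnes-Hut tree code is O(n log n).
for n in (1_000, 1_000_000):
    direct = n * n
    tree = n * math.log2(n)
    print(f"n={n:>9,}: direct={direct:.1e}  tree={tree:.1e}  "
          f"speedup ~{direct / tree:,.0f}x")

# At n = 1,000,000 the algorithmic speedup is ~50,000x, far beyond the
# ~100x a decade of Moore's-law hardware improvement would buy.
```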

And Demis, do you think the majority of your improvement is coming from building bigger data centers and using more chips? There's talk about how the world will be just wallpapered with data centers. Is that your vision? Well, no, look, we're definitely gonna need a lot more data centers. It still amazes me from a scientific point of view that we turn sand into thinking machines. It's pretty incredible. But actually, it's not just for the training.

It's now we've got these models that everyone wants to use. We're seeing incredible demand for 2.5 Pro, and with Flash, we're really excited about how performant it is for such an incredibly low cost. I think the whole world's gonna want to use these things.

And so we're gonna need a lot of data centers for serving, and also for inference-time compute. You saw Deep Think today, 2.5 Pro Deep Think: the more time you give it, the better it'll be. And for certain tasks, very high-value, very difficult tasks, it will be worth letting it think for a very long time. And we're thinking about how to push that even further. And again, that's gonna require a lot of chips at runtime. - Okay, so you brought up test-time compute.

We've been about a year into this reasoning paradigm, and you and I have spoken about it twice in the past as something you might be able to add on to traditional LLMs to get gains. So I think this is a pretty good time for me to ask: what's happening? Can you help us contextualize the magnitude of improvement we're seeing from reasoning?

We've always been big believers in what we're now calling this thinking paradigm. If you go back to our very early work on things like AlphaGo and AlphaZero, our agent work

on playing games, they all had this attribute of a thinking system on top of a model. And you can actually quantify how much difference that makes if you look at a game like chess or Go. We had versions of AlphaGo and AlphaZero with the thinking turned off, so it was just the model telling you its first idea. And it's not bad, maybe master level, something like that. But if you turn the thinking on, it's way beyond world champion level. It's like a 600-plus Elo difference between the two versions.
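
To put that 600-point figure in perspective, the standard Elo expected-score formula, E = 1 / (1 + 10^(-diff/400)), turns a rating gap into a win probability. A quick check in Python (the 600 number is Demis's; the formula is the standard Elo model):

```python
# Expected score of the stronger player under the Elo model.
diff = 600
expected = 1 / (1 + 10 ** (-diff / 400))
print(f"{expected:.3f}")  # ~0.969: the thinking-on version wins ~97% of games
```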

So you can see that in games, let alone the real world, which is way more complicated. And I think the gains will potentially be even bigger from adding this thinking paradigm on top. Of course, the challenge is that your model, and I talked about this earlier in the talk, needs to be a kind of world model. And that's much harder than building a model of a simple game, of course,

and it has errors in it, and those can compound over longer-term plans. But I think we're making really good progress on all those fronts.

Yeah, look, as Demis said, DeepMind really pioneered a lot of this reinforcement learning work, and what they did with AlphaGo and AlphaZero, as you mentioned, showed, as I recall,

something like this: it would take 5,000 times as much training to match what you were able to do with a lot of training plus the inference-time compute you were doing with Go. So it's obviously a huge advantage. And obviously, like most of us, we get some benefit by thinking before we speak. Although... Not always. I always get reminded to do that. But

I think that the AIs obviously are much stronger once you add that capability. And I think we're just at the tip of the iceberg right now in that sense. It's been less than a year that these models have really been out. Especially if you think about how, during its thinking process, an AI can also use a bunch of tools, or even other AIs, to improve the final output. So I think it's going to be an incredibly powerful paradigm.
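
One way to picture this thinking paradigm, and the "parallel reasoning processes checking each other" idea Alex describes next, is best-of-N sampling with a verifier: spend extra inference-time compute drawing several candidate answers and keep the one that scores highest. The generate and score functions below are hypothetical stand-ins, not Gemini's actual mechanism:

```python
import random

def generate(prompt: str, seed: int) -> str:
    # Hypothetical stand-in for sampling one reasoning chain from a model.
    rng = random.Random(seed)
    return f"candidate {seed} (score hint {rng.random():.2f})"

def score(prompt: str, answer: str) -> float:
    # Hypothetical verifier: in practice this could be a checker model,
    # a tool call (running code, a calculator), or even another AI.
    return float(answer.rsplit(" ", 1)[-1].rstrip(")"))

def think_longer(prompt: str, n_samples: int) -> str:
    # More test-time compute = more parallel candidates, checked against
    # each other; return the one the verifier rates best.
    candidates = [generate(prompt, s) for s in range(n_samples)]
    return max(candidates, key=lambda c: score(prompt, c))

print(think_longer("prove the lemma", n_samples=16))
```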

Deep Think is very interesting. I'm going to try to describe it right: it's basically a bunch of parallel reasoning processes working and then checking each other. It's like reasoning on steroids. Now, Demis, you mentioned that the industry needs a couple more advances to get to AGI.

Where would you put this type of mechanism? Is this one of those that might get the industry closer? I think so. I think it's maybe part of one, shall we say. And there are others too that we need... Maybe this can be part of improving reasoning. But where does true invention come from, where you're not just solving a math conjecture, you're actually proposing one, or hypothesizing a new theory in physics?

I think we don't have systems yet that can do that type of creativity. I think they're coming. These types of paradigms might be helpful in that, things like thinking, and then probably many other things. I think we need a lot of advances on the accuracy of the world models that we're building. You saw that with Veo, the potential of Veo 3; it amazes me how it can intuit the physics

of light and gravity. I used to work on computer games, not just the AI but also the graphics engines, early in my career. And I remember having to do all of this by hand and program all the lighting and the shaders and all of these things, incredibly complicated stuff we used to do in early games. And now the model is just intuiting it. It's pretty astounding.
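
For a sense of what "programming the lighting by hand" involved, here is the most basic building block of that era's shading code, a Lambertian diffuse term. This is a generic graphics illustration, not code from Demis's games:

```python
# Hand-coded Lambertian (diffuse) shading: a surface point's brightness is
# proportional to the cosine of the angle between its normal and the light.
def normalize(v):
    mag = sum(c * c for c in v) ** 0.5
    return tuple(c / mag for c in v)

def lambert(normal, to_light, light_intensity=1.0):
    n, l = normalize(normal), normalize(to_light)
    cos_angle = sum(a * b for a, b in zip(n, l))
    return light_intensity * max(0.0, cos_angle)  # clamp: no negative light

# A surface tilted 45 degrees away from the light is ~71% as bright.
print(round(lambert(normal=(0, 1, 0), to_light=(1, 1, 0)), 3))  # 0.707
```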

I saw you shared an image of a frying pan with some onions and some oil. There was no subliminal messaging about that? No, not really. Just maybe a subtle message.

So we said the word AGI or the acronym AGI a couple of times. There's, I think, a movement within the AI world right now to say, let's not say AGI anymore. The term is so overused as to be meaningless. But Demis, it seems like you think it's important. Why? Yeah, I think it's very important. But I think, I mean, maybe I need to write something about this also with Shane Legg, who's our chief scientist, who was one of the people who invented the term 25 years back.

I think there's sort of two things that are getting a little bit conflated. One is like, what can a typical person do, an individual do? And we're all very capable, but we can only do-- however capable we are, there's only a certain slice of things that one is expert in.

Or you could ask: what can 90% of humans do? That's obviously going to be economically very important, and I think from a product perspective also very important. So it's a very important milestone. So maybe we should call that typical human intelligence. But what I'm interested in, and what I would call AGI, is really a more theoretical construct, which is: what is the human brain, as an architecture, able to do?

Right? And the human brain is an important reference point, because it's the only evidence we have, maybe in the universe, that general intelligence is possible. And there, you would have to show your system was capable of doing the range of things even the best humans in history were able to do with the same brain architecture. Not one brain, but the same brain architecture. So what Einstein did, what Mozart was able to do, what Marie Curie,

and so on. And it's clear to me that today's systems don't have that. The other reason I think the hype today on AGI is sort of overblown is that our systems are not consistent enough to be considered fully general yet. They're quite general, so they can do thousands of things, and you've seen many impressive things today. But every one of us has experience with today's chatbots and assistants: you can easily, within a few minutes, find some obvious flaw with them,

some high-school math problem that it doesn't solve, some basic game it can't play. It's not very difficult to find those holes in the system. And for me, for something to be called AGI, it would need to be much more consistent across the board than it is today. It should take like...

a couple of months for maybe a team of experts to find a hole in it, an obvious hole in it. Whereas today it takes an individual minutes to find one. - Sergey, this is a good one for you. Do you think that AGI is gonna be reached by one company and it's game over? Or could you see Google having AGI, OpenAI having AGI, Anthropic having AGI, China having AGI? - Wow, that's a great question.

I guess I would suppose that one company or country or entity will reach AGI first. Now it is a little bit of a spectrum. It's not like a completely precise thing, so it's conceivable that there will be more than one roughly in that range at the same time. After that what happens, I think it's very hard to foresee, but you could certainly imagine there's going to be multiple entities that come through and

In our AI space, we've seen that when we make a certain kind of advance, other companies are quick to follow, and vice versa: when other companies make certain advances, it's a constant leapfrog. I do think there's an inspiration element that you see, and that would probably encourage more and more entities to cross that threshold. Demis, what do you think?

Well, I think we probably do. I think it is important for the field to agree on a definition of AGI, so maybe we should try and help that coalesce. Assuming there is one, there probably will be some organizations that get there first, and I think it's important that those first systems are built reliably and safely.

And I think after that, if that's the case, we can imagine using them to shard off many systems that have safe architectures sort of provably built underneath them. And then you could have personal AGIs and all sorts of things happening. But it's quite difficult. As Sergey says, it's pretty difficult to

sort of see beyond the event horizon to predict what that's going to be like. Right, so we talked a little bit about the definition of AGI and a lot of people have said AGI must be knowledge, right? The intelligence of the brain. What about the intelligence of the heart? Demis, briefly, does AI have to have emotion to be considered AGI? Can it have emotion? I think it will need to understand emotion. I don't know if... I think it will be a sort of almost a design decision if we wanted to mimic emotions.

I don't see any reason why it couldn't in theory. But it might be different, or it might be not necessary, or in fact not desirable, for them to have the sort of emotional reactions that we do as humans. So I think again it's a bit of an open question as we get closer to this AGI timeframe and those sorts of events, which I think is more on a five-to-ten-year time scale. So I think we have a bit of time, not much time, but some time, to research those kinds of questions.

When I think about how the timeframe might be shrunk, I wonder if it's going to be the creation of self-improving systems. And last week I almost fell out of my chair reading a headline about something called AlphaEvolve, an AI that helps design better algorithms and even improve the way LLMs train. So, Demis, are you trying to cause an intelligence explosion?

No, not an uncontrolled one. Look, I think it's an interesting first experiment, an amazing system with a great team working on it. It's interesting now to start pairing other types of techniques, in this case evolutionary programming techniques, with the latest foundation models, which are getting increasingly powerful. And I actually want to see a lot more of these kinds of combinatorial systems in our exploratory work, pairing different approaches together.

And you're right, that is one of the things: someone discovering a kind of self-improvement loop would be one way things might accelerate even further than they're going today. And we've seen it before with our own work, with things like AlphaZero learning chess and Go, and any two-player game, from scratch

within less than 24 hours, starting from random, with self-improving processes. So we know it's possible. But again, those are quite limited game domains, which are very well described; the real world is far messier and far more complex. So it remains to be seen if that type of approach can work in a more general way.
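
As a toy illustration of that self-improvement loop, here is tabular self-play learning on the take-away game Nim: the agent starts from random play and gets stronger purely by playing itself and updating its value estimates. This is a deliberately tiny sketch in the spirit of AlphaZero's loop, not DeepMind's actual method:

```python
import random
from collections import defaultdict

PILE, MAX_TAKE = 10, 3      # take 1-3 stones; taking the last stone wins
Q = defaultdict(float)      # value estimate for each (stones, action) pair
ALPHA, EPS = 0.5, 0.1       # learning rate, exploration rate

def choose(stones, greedy=False):
    acts = list(range(1, min(MAX_TAKE, stones) + 1))
    if not greedy and random.random() < EPS:
        return random.choice(acts)
    return max(acts, key=lambda a: Q[(stones, a)])

def self_play_episode():
    history, stones = [], PILE
    while stones > 0:
        action = choose(stones)
        history.append((stones, action))
        stones -= action
    reward = 1.0                       # the player who moved last won
    for (s, a) in reversed(history):   # propagate the outcome backwards,
        Q[(s, a)] += ALPHA * (reward - Q[(s, a)])
        reward = -reward               # flipping perspective each move

for _ in range(20_000):
    self_play_episode()

# The learned greedy policy recovers the known optimal strategy:
# always leave the opponent a multiple of 4 stones.
print([choose(s, greedy=True) for s in range(1, PILE + 1)])
```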

Sergey, we've talked about some very powerful systems, and it's a race. It's a race to develop these systems. Is that why you came back to Google?

I think as a computer scientist, it's a very unique time in history. Honestly, anybody who's a computer scientist should not be retired right now; they should be working on AI. That's what I would say. There's just never been a greater problem, a greater opportunity, a greater cusp of technology. I wouldn't say it's because of the race.

Although we fully intend that Gemini will be the very first AGI, to clarify that. But to be immersed in this incredible technological revolution, I mean it's unlike, you know, I went through sort of the web 1.0 thing, it was very exciting, and whatever, we had mobile, we had this, we had that, but I think this is scientifically

far more exciting. And I think ultimately the impact on the world is going to be even greater. And as much as the web and mobile phones have had a lot of impact, I think AI is going to be vastly more transformative. So what do you do day to day?

I think I torture people like Demis. It was amazing, by the way. He tolerated me crashing this fireside. I'm across the street pretty much every day, and they're just...

people who are working on the key Gemini text models, on the pre-training, on the post-training. Mostly those. I periodically delve into some of the multimodal work; Veo 3, as you've all seen. But I tend to be pretty deep in the technical details. And that's a luxury I really enjoy, fortunately, because guys like Demis are minding the shop.

And yeah, that's just where my scientific interest is. It's deep in the algorithms and how they can evolve.

OK. Let's talk about the products a little bit, some that were introduced recently. I just want to ask you a broad question about agents, Demis. Because when I look at other tech companies building agents, what we see in the demos is usually something that's contextually aware, has a disembodied voice, and you often interact with it on a screen. When I see DeepMind and Google demos, oftentimes it's through the camera. It's very visual.

there was an announcement about smart glasses today. So talk a little bit about whether that's the right read, and why Google is so interested in having an assistant or a companion that sees the world as you see it. Well, it's for several reasons; several threads come together. As we talked about earlier, we've always been interested in agents. That's the heritage of DeepMind, actually. We started with agent-based systems in games.

We are trying to build AGI, which is a full general intelligence. Clearly, that would have to understand the physical environment, the physical world around you. And two of the massive use cases for that, in my opinion, are a truly useful assistant that can come around with you in your daily life, not just stuck on your computer or one device. We want it to be useful in your everyday life for everything. And so it needs to come around you and understand your physical context.

And then the other big thing is that I've always felt that for robotics to work, you sort of want what you saw with Astra on a robot. I've always felt the bottleneck in robotics isn't so much the hardware, although obviously there are many, many companies working on fantastic hardware, and we partner with a lot of them. But it's actually the software intelligence that I think has always held robotics back. I think we're in a really exciting moment now where finally, with these latest versions, especially 2.5 Gemini, and more things that we're going to bring in, this kind of Veo technology and other things, we're going to have really exciting algorithms to make robotics finally work and realize its potential, which could be enormous.

So there's that, and then in the end, AGI needs to be able to do all of those things.

So for us, you can see we always had this in mind. That's why Gemini was built from the beginning, even the earliest versions, to be multimodal. And that made it harder at the start, because it's harder to make things multimodal than text-only. But in the end, I think we're reaping the benefits of those decisions now. And I see many of the Gemini team here in the front row who made the correct decisions. They were the hardest decisions, but we made the right ones. And now you can see the fruits of that in all of what you've seen today.
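
To make "multimodal from the beginning" concrete: a natively multimodal model accepts text and images (and audio, video) together in a single request, rather than bolting vision onto a text-only system. A minimal sketch using the google-generativeai Python SDK; the model name and file are illustrative, so check the current docs before relying on them:

```python
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")

# One model, one call, mixed modalities: the image and the text question
# travel together in the same prompt.
model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model name
response = model.generate_content([
    Image.open("frying_pan.jpg"),  # hypothetical local image
    "What's being cooked here, and what should happen next?",
])
print(response.text)
```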

Sergey, I've been thinking about whether to ask you a Google Glass question. Oh, fire away. What did you learn from Glass that Google might be able to apply today, now that it seems like smart glasses have made a reappearance? Wow, yeah, great question. I learned a lot. I mean, that was... I definitely feel like I made a lot of mistakes with Google Glass, I'll be honest.

I am still a big believer in the form factor, so I'm glad that we have it now. And now it looks like normal glasses, doesn't have the thing in front. I think there was a technology gap, honestly. Now in the AI world, the things that these glasses can do to help you out without constantly distracting you, that capability is much higher. There's also just a...

I just didn't know anything about consumer electronics supply chains really, and how hard it would be to build that and have it be at a reasonable price point, managing all the manufacturing and so forth. This time we have great partners that are helping us build this. So that's another step forward. What else can I say? I do have to say I miss the airship with the wingsuiting skydivers for the demo.

Honestly, it would have been even cooler here at Shoreline Amphitheater than it was up in Moscone back in the day. But maybe we'll have to...

We should probably polish the product first this time. We'll do it that way around this time: make sure it's ready and available, and then we'll do a really cool demo. So that's probably a smart move. Yeah, what I will say is, I mean, look, we've got, obviously, an incredible history of Glass devices and smart devices, and we can bring all those learnings to today. And I'm very excited about our new glasses, as you saw. But what I was always talking to our team and Shahram and the team about is that, I mean, I don't know if Sergey would agree, but I feel like

the universal assistant is the killer app for smart glasses. And I think that's what's going to make it work, apart from the fact that the hardware technology has also moved on and improved a lot. I feel like this is the actual killer app, the natural killer app for it. Okay. Briefly on video generation, I sat in the audience in the keynote today and was like,

fairly blown away by the level of improvement we've seen from these models. And I mean you had filmmakers talking about it in the presentation. I want to ask you, Demis, specifically about model quality. If the internet fills with video that's been made with artificial intelligence, does that then go back into the training and lead to a lower quality model than if you were training just from human generated content?

Yeah, well, look, there's a lot of worries about this so-called model collapse. And video is just one thing; it applies in any modality, text included.

There are a few things to say about that. First of all, we're very rigorous with our data quality management and curation. We also attach SynthID to all of our generative models: an invisible, AI-made watermark that is very robust and has held up now for the year to 18 months since we released it. All of our images and videos are embedded with this watermark so we can detect them,

and we're releasing tools to allow anyone to detect these watermarks and know that something was an AI-generated image or video. And of course that's important to combat deepfakes and misinformation, but you could also use it, if you wanted to, to filter such content out of your training data.
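
That filtering step might look something like the sketch below. The detect_watermark function is a hypothetical stand-in; Google has published SynthID detection tools, but this is not their API:

```python
def detect_watermark(image_bytes: bytes) -> bool:
    # Hypothetical stand-in for a SynthID-style detector that reports
    # whether an invisible AI-generation watermark is present.
    return image_bytes.startswith(b"AI:")  # toy rule, illustration only

def curate_training_images(candidates: list[bytes]) -> list[bytes]:
    # Keep only images that do NOT carry an AI-generation watermark,
    # so generated media doesn't feed back into the training set.
    return [img for img in candidates if not detect_watermark(img)]

pool = [b"AI:frame-001", b"real photo", b"AI:clip-7", b"another photo"]
print(curate_training_images(pool))  # -> only the non-watermarked items
```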

So I don't actually see that as a big problem. Eventually, we may have video models that are so good you could put them back into the loop as a source of additional data, synthetic data, it's called. And there, you've just got to be very careful that you're actually generating from the same distribution that you're going to model, that you're not distorting that distribution somehow, and that the quality is high enough. We have some experience of this in a completely different domain with things like AlphaFold,

where there wasn't actually enough real experimental data to build the final AlphaFold. So we had to build an earlier version that then predicted about a million protein structures. It had a confidence level on each one, and we selected the top 300,000 to 400,000 and put them back in the training data. So there's lots of

very cutting-edge research on mixing synthetic data with real data. So there are ways of doing that. But in terms of the video-generation stuff, you can just exclude it if you want to, at least with our own work, and hopefully other gen-media companies follow suit and put robust watermarks in, first and foremost to combat deepfakes and misinformation.
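
The AlphaFold recipe Demis describes, labeling new examples with an earlier model and keeping only its most confident predictions, is a form of confidence-filtered self-distillation. A toy sketch; the model, threshold, and cap here are illustrative stand-ins, not AlphaFold's actual pipeline:

```python
import random

def model_predict(x):
    # Stand-in for an earlier trained model returning (label, confidence).
    return x * 2, random.random()

real_data = [(x, x * 2) for x in range(1_000)]   # trusted "experimental" data
unlabeled = list(range(1_000, 5_000))            # examples with no labels yet

# Keep only predictions the model is confident about.
synthetic = []
for x in unlabeled:
    label, confidence = model_predict(x)
    if confidence >= 0.9:
        synthetic.append((x, label))

# Guard the distribution: cap synthetic data so it can't swamp the real data.
synthetic = synthetic[: len(real_data) // 2]
train_set = real_data + synthetic
print(f"{len(real_data)} real + {len(synthetic)} synthetic training examples")
```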

Okay, we have four minutes, and I've got four questions left. We now move to the miscellaneous part of my questions. Let's see how many we can get through, as fast as we can get through them. Let's go to Sergey with this one: what does the web look like in 10 years? What does the web look like in 10 years? I mean... You've got one minute. Boy, I think 10 years, because of the rate of progress in AI, is so far beyond anything we can see. Best guess.

Not just the web. I don't think we really know what the world looks like in 10 years. OK. Demis?

Well, I think that's a good answer. I do think that in the nearer term, the web is going to change quite a lot if you think about an agent-first web. Like, it doesn't necessarily need to be rendered the way it is for us humans using the web. So I think things will be pretty different in a few years. Okay. This is kind of an over-under question: AGI before 2030 or after 2030?

2030? Boy, you really kind of put it on that fine line. I'm going to say before. Before? Yeah. Demis? I'm just after. Just after. Okay. No pressure, Demis. Exactly, I have to go back and get working harder. I can ask for it; he needs to deliver it. Exactly. Stop sandbagging. We need that next week. That's true.

I'll come to the review. All right. So would you hire someone who used AI in their interview? Demis? Oh, in their interview? Depends how they used it. I think using today's models and tools, probably not. But I think that would be... Well, it depends how they would use it, actually. I think that's probably the answer. Sergey? I mean, I never interviewed at all. So...

I don't know. I feel it would be hypocritical for me to judge people on exactly how they interview. Yeah, I haven't either, actually. So, snap on that. I've never done a job interview like that. Okay.

So, Demis, I've been reading your tweets. You put up a very interesting tweet where there was a prompt that created some sort of natural scene. Oh, yeah. Here was the tweet: "nature to simulation at the press of a button, does make you wonder," with a couple of emojis. And people ran with that and wrote headlines saying Demis thinks we're in a simulation.

Are we in a simulation? Not in the way that Nick Bostrom and people talk about. I don't think this is some kind of game, even though I wrote a lot of games. I do think that ultimately underlying physics is information theory. So I do think we're in a computational universe, but it's not just a straightforward simulation. I can't answer you in one minute.

But I think the fact that these systems are able to model real structures in nature is quite interesting and telling. And I've been thinking a lot about our work we've done with AlphaGo and AlphaFold and these types of systems. I've spoken a little bit about it. Maybe at some point I'll write up a scientific paper about what I think that really means in terms of what's actually going on here in reality. Sergey, you want to make a headline?

Well, I think that argument applies recursively, right? If we're in a simulation, then by the same argument, whatever beings are making the simulation are themselves in a simulation, for roughly the same reasons, and so on and so forth. So I think you're gonna have to either accept that we're in an infinite stack of simulations or that there's got to be some stopping criterion. And what's your best guess? I think that

we're taking a very anthropocentric view, like when we say simulation in the sense that some kind of conscious being is running a simulation that we are then in and that they have some kind of semblance of desire and consciousness that's similar to us. I think that's where it kind of breaks down for me. So I just don't think that we're really equipped to reason about sort of one level up in the hierarchy.

Okay, well, Demis, Sergey, thank you so much. This has been such a fascinating conversation. Thank you. Thank you all. All right. Thanks, Alex. Thank you. Sergey. Pleasure.