
Demis Hassabis on AI, game theory, multimodality, and the nature of creativity

2025/4/9

Possible

People
Demis Hassabis
Topics
Demis Hassabis: I believe AI will be the most transformative technology in history; it will affect every industry and every country in the world. Global cooperation is therefore essential: AI's design and application should not be confined to a handful of regions or companies, but should draw broadly on expertise from every field worldwide, including philosophy, the social sciences, and economics, to ensure it is used fairly and equitably. I firmly believe AI should be designed with participation from the whole world, not decided solely by a few tech companies or scientists. Only then can we ensure AI benefits all of humanity and avoid its misuse. Reid Hoffman: (no central argument; mainly guiding questions) Aria Finger: (no central argument; mainly guiding questions)


AI is going to affect the whole world. It's going to affect every industry. It's going to affect every country. It's going to be the most transformative technology ever, in my opinion. So if that's true, and it's going to be like electricity or fire, then I think it's important that the whole world participates in its design. I think it's important that it's not just a hundred square miles of a patch of California. I do actually think it's important that we get these other inputs, the broader inputs, not just geographically, but also different subjects: philosophy, social sciences, economics, not just the tech companies, not just the scientists involved in deciding how this gets built and what it gets used for.

Hi, I'm Reid Hoffman. And I'm Aria Finger. We want to know how, together, we can use technology like AI to help us shape the best possible future. With support from Stripe, we ask technologists, ambitious builders, and deep thinkers to help us sketch out the brightest version of the future, and we learn what it'll take to get there. This is Possible.

In the 13th century, Sir Galahad embarked on a treacherous journey in pursuit of the elusive Holy Grail. The Grail, known in Christian lore as the cup Christ used at the Last Supper, had disappeared from King Arthur's table. The Knights of the Round Table swore to find it. After many trials, Galahad's pure heart allowed him the unique ability to look into the Grail and observe divine mysteries that could not be described by the human tongue.

In 2020, a team of researchers at DeepMind successfully created a model called AlphaFold that could predict how proteins will fold. This model helped answer one of the holy grail questions of biology. How does a long line of amino acids configure itself into a 3D structure that becomes the building block of life itself?

In October 2024, three scientists involved with AlphaFold won a Nobel Prize for these efforts.

This is just one of the striking achievements spearheaded by our guest today. Demis Hassabis is a British artificial intelligence researcher, co-founder and CEO of the AI company DeepMind. Under his leadership, DeepMind developed AlphaGo, the first AI to defeat a human world champion in Go, and later created AlphaFold, which solved the 50-year-old protein-folding problem.

He is considered one of the most influential figures in AI. Reid and I sat down for an interview with Demis, in which we talked about everything from game theory to medicine to multimodality and the nature of innovation and creativity. Here's our conversation with Demis Hassabis.

Demis, welcome to Possible. It was awesome dining with you at Queens. It was kind of a special moment in all kinds of ways. And, you know, I think I'm going to start with a question that came from your Babbage Theater lecture and also from the fireside chat that you did with Mohamed El-Erian. Share with us the moment where you went from thinking, chess is the thing that I have spent my childhood doing, to, what I want to do is start thinking about thinking, I want to accelerate the process of thinking, and computers are a way to do that. How did you arrive at that? What age were you? What was that turn into metacognition?

Well, first of all, thanks for having me on the podcast. Chess for me is where it all started, actually, in gaming. I started playing chess when I was four, very seriously, all through my childhood, playing for most of the England junior teams and captaining a lot of the teams. For a long while, my main aim was to become a professional chess player, a grandmaster, maybe one day possibly a world champion. That was my whole childhood, really.

Every spare moment not at school, I was playing, going around the world, playing chess against adults in international tournaments. And then around 11 years old, I had an epiphany, really: although I loved chess, and I still love chess today, is it really something one should spend their entire life on? Is it the best use of my mind?

So that was one thing that was troubling me a little bit. But the other thing was, as we were going to training camps with the England chess team, we started to use early chess computers to try and improve our chess. And I remember thinking,

of course, we were supposed to be focusing on improving our chess openings and chess theory and tactics. But actually, I was more fascinated by the fact that someone had programmed this inanimate lump of plastic to play very good chess against me. I was fascinated by how that was done, and I really wanted to understand it and then eventually try and make my own chess programs.

I mean, it's so funny. I was saying to Reid before this, my seven-year-old's school just won the New York State Chess Championship. So they have a long way to go before they get to you. But he takes it on faith, like, oh yeah, mom, I'm just going to go play ChessKid on the computer, I'll play a few games against the computer, which, of course, was a revelation decades ago.

I remember, when I was in middle school, the Deep Blue versus Garry Kasparov match. And this was a man-versus-machine moment. One thing that you've gestured at about this moment is that it illustrated, in this case, a system based on grandmaster data, brute force versus a self-learning system. Can you say more about that dichotomy?

Yeah, well, look, first of all, it's great your son's playing chess; I think it's fantastic. I'm a big advocate for teaching chess in schools as part of the curriculum. I think it's fantastic training for the mind, just like doing maths or programming would be. And it's certainly affected the way I approach problems and problem-solve and visualize solutions and plan. It teaches you all these amazing meta skills, dealing with pressure. So you learn all of that as a young kid,

which is fantastic for anything else you're going to do. And as far as Deep Blue goes, you're right. Most of these early chess programs, and Deep Blue became the pinnacle of that, were these types of expert systems, which at the time was the favored way of approaching AI, where actually it's the programmers that solve the problem, in this case playing chess.

They then encapsulate that solution in a set of heuristics and rules, which guides a kind of brute-force search towards, in this case, making a good chess move. And although I was fascinated that these early chess programs could do that, I was also slightly disappointed by them. Actually, by the time it got to Deep Blue, I was already studying at Cambridge in my undergrad, and I was actually more impressed with Kasparov's mind,

because I'd already started studying neuroscience, than I was with the machine. Here was this brute of a machine, and all it could do was play chess, while Kasparov could play chess at roughly the same level but could also do all the other amazing things humans can do. And I thought, doesn't that speak to the wonderfulness of the human mind? More importantly, it means something very fundamental was missing from Deep Blue and these expert system approaches to AI.

And that was very clear, because even though Deep Blue was a pinnacle of AI at the time, it did not seem intelligent. What was missing was its ability to learn, to learn new things. So for example,

it was crazy that Deep Blue could play chess to world champion level, but it couldn't even play tic-tac-toe. You'd have to reprogram it; nothing in the system would allow it to play tic-tac-toe. So that's odd, right? That's very different to a human grandmaster, who could obviously play a simpler game trivially. And it was also not general in the way that the human mind is. I think those are the hallmarks, that's what I took away from that match: those are the hallmarks of intelligence, and they were needed if we wanted to crack AI.

And go a little bit into the deep learning, which obviously is part of the reason why DeepMind was so named. Part of the, I think, seemingly completely contrarian hypothesis that you played out with self-play and learning systems was that this learning approach was the right way to generate these significant systems. So say a little bit about having the hypothesis, what the trek through the desert looked like, and then what finding the Nile ended up with.

Yes. Well, look, of course, we started DeepMind in 2010, before anyone was working on this in industry, and there was barely any work on it in academia. And we partially named the company DeepMind, the deep part, because of deep learning. It was also a nod to Deep Thought in Hitchhiker's Guide to the Galaxy, and Deep Blue, and other AI things. But it was mostly around the idea that we were betting on these learning techniques.

Deep learning, hierarchical neural networks, had just been invented in seminal work by Geoff Hinton and colleagues in 2006, so it was very, very new. And then there was reinforcement learning, which has always been a speciality of DeepMind: the idea of learning from trial and error, learning from your experience,

right? Making plans and acting in the world. And we combined those two things; we pioneered doing that and called it deep reinforcement learning: deep learning to build a model of the environment, in this case a game, and then reinforcement learning to do the planning

and the acting, to build agent systems that could accomplish goals, which in the case of games means maximizing the score and winning the game. And we felt that was actually the entirety of what's needed for intelligence. And the reason we were pretty confident about that is actually

from using the brain as an example, right? Basically, those are the two major components of how the brain works. The brain is a neural network; it's a pattern-matching and structure-finding system. But then it also has

reinforcement learning, this idea of planning and learning from trial and error and trying to maximize reward, which in the human brain, and the mammal brain generally, the dopamine system implements via a form of reinforcement learning called TD learning. So that gave us confidence that if we pushed hard enough in this direction, even though no one else was really doing that, eventually this should work, because we have the existence proof of the human mind.
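To make the idea concrete, here is a minimal sketch of tabular TD (temporal-difference) learning, the prediction-error rule referred to above. It is illustrative only; the function name and parameters are not from DeepMind's code.

```python
def td0_value_estimation(episodes, alpha=0.1, gamma=0.99):
    """Estimate state values V(s) with tabular TD(0).

    `episodes` is an iterable of trajectories, each a list of
    (state, reward, next_state) tuples, with next_state None at the end.
    `alpha` is the learning rate, `gamma` the discount factor.
    """
    V = {}  # state -> estimated value, defaulting to 0
    for trajectory in episodes:
        for state, reward, next_state in trajectory:
            v_next = V.get(next_state, 0.0) if next_state is not None else 0.0
            # TD error: how much better or worse things went than predicted.
            # In the brain analogy above, this plays the role of the
            # dopamine prediction-error signal.
            td_error = reward + gamma * v_next - V.get(state, 0.0)
            V[state] = V.get(state, 0.0) + alpha * td_error
    return V
```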

And of course, that's why I also studied neuroscience. Because when you're in the desert, like you say, you need any source of water, any evidence that you might get out of the desert. Even a mirage in the distance is useful in giving you some direction when you're in the midst of that desert. And AI was itself in the midst of that, because it had failed several times; the expert system approach had basically reached a ceiling.

I could easily hog the entire interview, so I'm trying not to. One of the things the learning systems ended up doing was solving what was previously considered an insoluble problem. There were even people who thought classical computational techniques couldn't solve Go, and it did.

But not only did it solve Go; in the classic Move 37, it demonstrated originality, creativity, beyond the thousands of years of Go play and books and the hundreds of years of very serious play. What was that moment of Move 37 like for understanding where AI is? And what do you think the next Move 37 is?

Well, look, Go was considered to be, and ended up being, so much harder than chess; it took another 20 years, even for us with AlphaGo. All the approaches that had been taken with chess, these expert systems approaches, had failed with Go. They basically couldn't even reach professional level, let alone world champion.

And there were two main reasons. One is that the complexity of Go is so enormous. One way to measure that: there are 10 to the power 170 possible positions, far more than there are atoms in the universe. There's no way you can brute-force a solution to Go; it's impossible. But even harder than that is that it's such a beautiful, esoteric, elegant game. It's considered an art form in Asia, really, because it's both aesthetically beautiful and all about patterns rather than brute calculation, which chess is more about. And so even the best players in the world

can't really describe to you very clearly what heuristics they're using. They just intuitively feel the right moves. Ask them, why did you play this move? Well, it felt right. And it turns out the intuition of these brilliant players is fantastic, and it's an amazingly beautiful and effective move.

But that's very difficult to encapsulate in a set of heuristics and rules to direct how a machine should play Go. And that's why all of these Deep Blue-style methods didn't work.
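As a rough sanity check on the complexity figure quoted above (illustrative arithmetic only): each of the 361 points on a 19x19 board can be empty, black, or white, which gives a loose upper bound on the number of positions; the exact count of legal positions, computed in 2016, is about 2.1 x 10^170, in line with the 10^170 cited here.

```python
import math

# Loose upper bound on 19x19 Go positions: each of the 361 intersections
# is empty, black, or white. Most of these are illegal, but it shows the
# scale that rules out brute-force search.
upper_bound = 3 ** 361
print(f"3^361 is about 10^{math.log10(upper_bound):.0f}")  # ~ 10^172
```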

Now, we got around that by having the system learn for itself what good patterns are, what good moves are, what good motifs and approaches are, and which positions are valuable, with a high probability of winning. It learned that for itself through experience, by seeing millions of games and playing millions of games against itself. So that's how we got AlphaGo to be better than world champion level.
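In outline, that self-play loop looks something like the following sketch. The helper functions are placeholders passed in as parameters, not DeepMind's actual API, and the training details (networks, search) are elided.

```python
def self_play_training(model, play_one_game, update_model,
                       num_iterations=100, games_per_iter=1000):
    """Hypothetical sketch of a self-play training loop.

    `play_one_game(m1, m2)` plays one game between two models and returns
    a list of (position, final_outcome) pairs; `update_model(model,
    examples)` performs the learning step. Both are placeholders.
    """
    for _ in range(num_iterations):
        # The model generates its own training data by playing itself.
        games = [play_one_game(model, model) for _ in range(games_per_iter)]
        # Every position is labeled with the eventual game outcome,
        # giving fresh, self-generated training targets.
        examples = [pair for game in games for pair in game]
        model = update_model(model, examples)
    return model
```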

But the additional exciting thing about that is that those kinds of systems can actually go beyond what we, as the programmers or system designers, know how to do. No expert system can do that, because an expert system is strictly limited by what we already know and can describe to the machine,

but these systems can learn for themselves. And that's what resulted in Move 37 in game two of the famous challenge match we had against Lee Sedol in Seoul in 2016. That was a truly creative move. Go has been played for thousands of years.

It's the oldest game humans have invented, and it's the most complex. It's been played professionally for hundreds of years in places like Japan. And still, despite all of that exploration by brilliant human players,

this Move 37 was something never seen before. And actually, worse than that, it was thought to be a terrible strategy. In fact, if you go and watch the AlphaGo documentary, which I recommend, it's on YouTube now, you'll see the professional commentators nearly fell off their chairs when they saw Move 37, because they thought it was a mistake.

They thought that the computer operator, Aja Huang, had misclicked, because it was so unthinkable that someone would play that. And then, of course, it turned out 100 moves later that Move 37, the stone that was put down on the board, was in exactly the right place to be decisive

for the whole game. So now that game and that move are studied as a great classic of the history of Go. And even more exciting is that this is exactly what we hoped these systems would do, because my whole motivation, my whole life of working on AI, was to use AI to accelerate scientific discovery. And it's those kinds of new innovations, albeit in a game, that we were looking for from our systems.

And that, I think, is an awesome rendition of why these learning systems are even now doing original discovery. What do you think the next Move 37 might be for opening our minds to the ways AI can add a whole lot to the quality of human thought, human existence, human science? Yeah, well, look,

I think there'll be a lot of Move 37s in almost every area of human endeavor. The thing I've been focusing on since then is mostly how we can apply those types of general learning techniques to science. Big areas of science: I call them root node problems. If you think of the tree of all knowledge that's out there in the universe, can you unlock some root nodes that open up entire branches or new avenues of discovery that people can build on afterwards? For us, protein folding and AlphaFold was one of those. It was always top of my list. I have a kind of mental list of all these types of problems that I've come across throughout my life, just from being generally interested in all areas of science,

and thinking through which ones would both be hugely impactful and also suitable for these types of techniques. And I think we're going to see a kind of new golden era of these types of new strategies, new ideas in very important areas of human endeavor. One thing I would say, though, is that we haven't

fully cracked creativity yet, so I don't want to claim that. I'd describe three levels of creativity, and I think AI is capable of the first two. The first one would be interpolation. You give an AI system a million pictures of cats and you say, create me a prototypical cat, and it will just average all the million cat pictures it's seen. That prototypical cat won't be in the training set, so it will be a unique cat, but that's not very interesting from a creative point of view. It's just an averaging.

The second level would be what I call extrapolation. That's more like AlphaGo: you've played 10 million games of Go, you've looked at a few million human games of Go, but then you extrapolate from what's known to a new strategy never seen before, like Move 37.

So that's very valuable already; I think that is true creativity. But then there's a third level, which I call invention, or out-of-the-box thinking: not only can you come up with a Move 37, but could you have invented Go? Or, another measure I like to use: if we went back to the time of Einstein in the early 1900s, could an AI system actually come up with general relativity with the same information that Einstein had at the time?

And clearly today, the answer is no to those things. It can't invent a game as great as Go, and it wouldn't be able to invent general relativity just from the information that Einstein had at the time. So there's still something missing from our systems for true out-of-the-box thinking. I think it will come, but we just don't have it yet.

I think so many people outside of the AI realm would be surprised that it all starts with gaming, but that's sort of gospel for what you're doing; that's how you created these systems. And so, switching gears from board games to video games, can you give us the elevator-pitch explanation for what exactly makes an AI that can play StarCraft II, like AlphaStar, so much more advanced and fascinating than one that can play chess or Go?

Yeah, with AlphaGo we cracked the pinnacle of board games; Go was always considered the Mount Everest, if you like, of games AI for board games. But by some measures there are even more complex games, if you take on the most complex strategy games that you can play online, on computers. StarCraft II is acknowledged to be the classic of the genre of real-time strategy games, and it's a very complex game. You've got to build up your base and your units and other things, so every game is different. The game is very fluid, and you've got to move many units around in real time. And the way we cracked that was to add in an additional level: a league of agents competing against each other, all seeded

with slightly different initial strategies. Then you get a sort of survival of the fittest: you have a tournament between them all, so it's a kind of multi-agent setup now, and the strategies that win out in that tournament go through to the next epoch. Then you generate new strategies around those, and you keep doing that for many generations. So you have the idea of self-play that we had in AlphaGo, but you're adding in this multi-agent, competitive, almost evolutionary dynamic.

And then eventually you get a set of agents that are kind of the Nash distribution of agents: no other strategy dominates them, and they dominate the largest number of other strategies. You have this kind of Nash equilibrium, and then you pick out the top agents from that.
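A hedged sketch of that league idea, in outline: a pool of agents plays a round-robin tournament, the strongest survive, and the league is refilled with perturbed copies of the survivors. `play_match` and `mutate` are placeholder parameters, not AlphaStar's real machinery.

```python
import random

def league_training(initial_agents, play_match, mutate,
                    generations=50, keep=4):
    """Toy population-based league. `play_match(a, b)` returns True if
    agent `a` beats agent `b`; `mutate(a)` returns a perturbed copy."""
    pool = list(initial_agents)
    for _ in range(generations):
        wins = {id(a): 0 for a in pool}
        # Round-robin tournament: every agent plays every other agent.
        for i, a in enumerate(pool):
            for b in pool[i + 1:]:
                winner = a if play_match(a, b) else b
                wins[id(winner)] += 1
        # Survival of the fittest: keep the top `keep` agents...
        pool.sort(key=lambda agent: wins[id(agent)], reverse=True)
        survivors = pool[:keep]
        # ...and refill the league with mutated variants seeded from them.
        pool = survivors + [mutate(random.choice(survivors))
                            for _ in range(len(initial_agents) - keep)]
    return pool
```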

And that succeeded very well with this type of very open-ended gameplay. So it's quite different from chess or Go, where the rules are very prescribed and the pieces you get are always the same.

Those are very ordered games; something like StarCraft is much more chaotic, so it's interesting to have to deal with that. It has hidden information too: you can't see the whole map at once, you have to explore it. So it's not a perfect-information game, which is another thing we wanted our systems to be able to cope with, partial-information situations, which are actually more like the real world.

Very rarely in the real world do you actually have full information about everything. Usually you only have partial information, and then you have to infer everything else in order to come up with the right strategies. And part of the game side of this is, I presume you've heard of this theory of Homo ludens. Yes. That we're game players. Is that informing your thinking about how games are both a strategic and an effective framing for science acceleration, for the serendipity of innovation? In addition to the fitness function, the evolution of self-play, and the ability to scale compute, are there other deeper elements to the game-playing nature that allow this thinking about thinking?

Well, look, I'm glad you brought up Homo Ludens. It's a wonderful book, and it basically argues that game playing is actually a fundamental part of being human. In many ways, the act of play, what could be more human than that? And then, of course, creativity, fun, all of these things

kind of get built on top of that. I've always loved games as a way to practice and train your own mind for situations that you might only ever get a handful of times in real life, but that are usually very critical: what company to start, what deal to make, things like that. Games are a way to practice those scenarios. And if you take games seriously, then you can actually simulate a lot of the pressures one would have in decision-making situations.

And going back to earlier, that's why I think chess is such a great training ground for kids, because it does teach them about all of these situations. And of course, it's the same for AI systems too. Games were the perfect

proving ground for our early AI system ideas, partly because they were invented to be challenging and fun for humans to play. And there are different levels of gameplay, so we could start with very simple games, like Atari games,

and then go all the way up to the most complex computer games, like StarCraft, and continue to challenge our systems. We wanted to be in the sweet spot of the S-curve: not so easy that it's trivial, and not so hard that you can't even see whether you're making any progress. You want to be in that maximum part of the S-curve where you're making almost exponential progress, and we could keep picking harder and harder games as our systems improved. And then the other nice feature about games is

because they're some kind of microcosm of the real world, they've usually been boiled down to very clear objective functions. Winning the game or maximizing the score is usually the objective, and that's very easy to specify to a reinforcement learning system or an agent-based system. So it's perfect for hill-climbing against, and for measuring Elo scores and ratings, exactly where you are.

And then finally, of course, you can calibrate yourselves against the best human players, so you can calibrate what your agents are doing in their own tournaments. In the end, even with the StarCraft agent, we eventually had to challenge a professional grandmaster at StarCraft to make sure our systems hadn't somehow overfitted to their own tournament strategies. It needed to be grounded: can it actually beat a genuine human grandmaster StarCraft player?
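For reference, the Elo rating mentioned here updates after every game according to a standard formula; this small sketch shows it (the K-factor of 32 is a common but arbitrary choice).

```python
def elo_update(rating_a, rating_b, score_a, k=32):
    """Standard Elo update for one game.

    `score_a` is 1.0 if player A wins, 0.5 for a draw, 0.0 for a loss.
    Returns the two updated ratings.
    """
    # Expected score for A, given the rating gap.
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b

# An upset win against a stronger opponent moves ratings the most.
print(elo_update(1600, 1800, 1.0))  # -> (approx. 1624.3, 1775.7)
```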

The final thing is, of course, that you can generate as much synthetic data as you want with games, which is coming into vogue right now with the debates about data limitations and large language models: how many tokens are left in the world, and have they read everything in the world? For things like games, you can just play the system against itself and generate lots more data from the right distribution.

Can you double-click on that for a moment? Like you said, it is in vogue to talk about, are we running out of data? Do we need synthetic data? Where do you stand on that issue? Well, I've always been a huge proponent of simulations, and of simulations in AI. It's also interesting to think about what the real world is in terms of a computational system. So I've always been involved with trying to build very realistic simulations of things.

And now, of course, that interacts with AI, because you can have an AI that learns a simulator of some real-world system just by observing that system, or all the data from that system. So I think the current debate is that these large foundation models now use pretty much the whole internet. Once you've learned from that, what's left? That's all the language that's out there.

Of course, there are other modalities, like video and audio. I don't think we've exhausted all of those multimodal tokens, but even that will reach some limit. So then the question becomes, can you generate synthetic data? And I think that's why you're seeing quite a lot of progress with maths and coding, because in those domains

it's quite easy to generate synthetic data. The problem with synthetic data is, are you creating data from the right distribution, the actual distribution? Does it mimic the real distribution? And also, are you generating data that's correct? For things like maths and coding and gaming, you can actually test the final data and verify it's correct

before you feed it in as input to the training data for a new system. So certain areas are very amenable to this; in fact, it turns out to be the more abstract areas of human thinking where you can verify and prove that the data is correct. And that unlocks the ability to create a lot of synthetic data.
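A hedged sketch of that generate-then-verify loop: only candidate examples that pass an automatic check (unit tests, a proof checker) are kept as training data. `generate_candidate` and `verify` are placeholder parameters, not any lab's actual pipeline.

```python
def build_verified_dataset(generate_candidate, verify, target_size):
    """Collect synthetic (problem, solution) pairs that pass verification.

    `generate_candidate()` samples a (problem, solution) pair from a model;
    `verify(problem, solution)` returns True only if the solution checks
    out, e.g. the code passes tests or the proof is machine-checked.
    """
    dataset = []
    while len(dataset) < target_size:
        problem, solution = generate_candidate()
        if verify(problem, solution):  # keep only verifiably correct data
            dataset.append((problem, solution))
    return dataset
```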

On this podcast, we like to focus on what's possible with AI because we know it's the key to the next era of growth. A truth well understood by Stripe, makers of Stripe Billing, the go-to monetization solution for AI companies. Stripe knows that when launching a new product, your revenue model can be just as important as the product itself.

In fact, every single one of the Forbes top 50 AI companies that has a product on the market today uses Stripe to monetize it. See what Stripe can do for your business at stripe.com. So, in addition to the frequent discussion around data and how we get more of it, one of the questions is: in order to do AI right,

is it important to actually have it embedded in the world? Yeah. Well, interestingly, if we had talked about this five years ago, or certainly ten years ago, I would have said that

some real-world experience, maybe through robotics, would be needed. Usually when we talk about embodied intelligence we mean robotics, but it could also be a very accurate simulator, some kind of ultra-realistic game environment, that would be needed to fully understand, say, the physics of the world around you and the physical context around you. And there's actually a whole branch of neuroscience

predicated on this, called action in perception. This is the idea that one can't actually fully perceive the world unless one can also act in it. The arguments go: how can you really understand the concept of the weight of something unless you can pick things up and compare them with each other, and from that get the idea of weight? Can you really get that notion just by looking at things?

It seems hard, certainly for humans; I think you need to act in the world. So this is the idea that acting in the world is part of your learning; you're an active learner. And in fact, reinforcement learning is like that, because the decisions you make give you new experiences, those experiences depend on the actions you took, and they are the experiences you'll subsequently learn from. So in a sense, reinforcement learning systems are involved in their own learning process,

right? Because they're active learners. And I think you can make a good argument that that's also required in the physical world. Now, it turns out I'm not sure I believe that anymore, because of our systems, especially our video models. If you've seen Veo 2, our latest video model, completely state of the art, which we released late last year,

it kind of shocked even me, even though we're building this thing, that basically by watching a lot of YouTube videos it can figure out the physics of the world. There's a sort of funny Turing test, in inverted commas, of video models, which is: can you chop a tomato? Can you show a video of a knife chopping a tomato, with the fingers and everything in the right place, where the tomato doesn't magically spring back together and the knife doesn't pass through the tomato without cutting it,

et cetera? And Veo can do it. If you think through the complexity of the physics it has to understand, what it has to keep consistent and so on, it's pretty amazing. It's hard to argue that it doesn't understand something about the physics of the world. And it's done that without acting in the world, certainly not acting as a robot in the world.

So it's no longer clear to me that there is a limit with just passive perception. Now, the interesting thing is that I think this has huge consequences for robotics as an embodied-intelligence application, because of the types of models we've built, Gemini and now Veo, and we'll be combining those together at some point in the future. We've always built Gemini, our foundation model, to be multimodal from the beginning.

And the reason we did that, and we still lead on all the multimodal benchmarks, is twofold. One is that we have a vision for a universal digital assistant, an assistant that goes around with you on your digital devices but also in the real world, maybe on your phone or a glasses device, and actually helps you in the real world:

recommending things to you, helping you navigate around, helping with physical things in the world like cooking, stuff like that. For that to work, you obviously need to understand the context you're in. It's not just the language I'm typing into a chatbot; you actually have to understand the 3D world I'm living in. To be a really good assistant, you need to do that.

The second thing is that this is exactly what you need for robotics as well. We released our first big Gemini robotics work, which has caused a bit of a stir. And that's the beginning of showcasing what we can do with these multimodal models that understand the physics of the world, with a little bit of robotics fine-tuning on top for the motor actions and the planning a robot needs to do. And it looks like it's going to work.

So actually, now I think these general models are going to transfer to the embodied robotics setting without too much extra special-casing or extra data or extra effort, which is probably not what most people, even the top roboticists, would have predicted five years ago.

I mean, that's wild. And thinking about benchmarks, and what we're going to need these digital assistants to do: when we look under the hood of these big AI models, well, some people would say it's attention. So there's a trade-off of thinking time versus output quality. We need them to be fast, but of course we need them to be accurate. Can you talk about that trade-off and how it's playing out in the world right now?

Well, look, we of course pioneered all of that area of thinking systems, because that's what our original gaming systems did: AlphaGo, but most famously AlphaZero, our follow-up system that could play any two-player game.

And there you always have to think about your time budget, the compute budget you've got to actually do the planning part. The model you can pre-train, just like we do with our foundation models today; you can play millions of games offline, and then you have your model of chess or your model of Go or whatever it is. But at test time, at runtime, you've only got one minute to think about your move, one minute times however many computers you've got running. So that's still a limited

compute budget. So what's very interesting today is this trade-off between whether you use a more expensive, larger base foundation model or a smaller one. In our case, we have different sizes with names like Gemini Flash, or Pro, or even bigger, which is Ultra. The bigger models are more costly to run, so they take longer, but they're more accurate and more capable. So you can run a bigger model

with a smaller number of planning steps, or you can run a very efficient, smaller model that's slightly less powerful but that you can run for many more steps.

What we're currently finding is that it's roughly about equal. But of course, what we want to find is the Pareto frontier of that: the exact right trade-off between the size of the model, and the expense of running it, versus the number of thinking steps you're able to do per unit of compute time. And I think that's fairly cutting-edge research right now that all the leading labs are probably experimenting on, and there's not a clear answer to it yet.
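An illustrative sketch of the budget trade-off being described: under a fixed compute budget, a larger model affords fewer thinking steps. The per-step costs below are invented for illustration; only the model-tier names come from the conversation.

```python
BUDGET = 1_000_000  # arbitrary compute units available per query

# Hypothetical cost per reasoning step for each model tier (made-up numbers).
COST_PER_STEP = {
    "flash": 1_000,   # small, efficient model
    "pro":   5_000,   # mid-size model
    "ultra": 25_000,  # largest, most capable model
}

for name, cost in COST_PER_STEP.items():
    steps = BUDGET // cost
    print(f"{name:>5}: {steps:>5} thinking steps within budget")

# Finding the Pareto frontier means measuring, at each budget, which
# (model size, step count) pairing actually produces the best answers.
```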

You know, all the major labs, DeepMind and others, are working intensely on coding assistants, and there are a number of reasons. A, it's one of the things that accelerates productivity across the whole front, and it has a kind of good fitness function. It's also, of course, one of the ways

everyone is going to get hands-on productivity: having a kind of software copilot agent helping. There are just a ton of reasons. Now, one of the things that gets interesting here is that as you're building these, obviously there's a tendency to start with computer languages that have been designed for humans. What would computer languages look like that were designed for AIs, or for an agentic world, or for this hybrid process of a human plus an AI?

Is that a good place to start looking at those kinds of computer languages? How would it change our theory of computation, linguistics, et cetera? I think we are entering a new era in coding, which is going to be very interesting. And as you say, all the leading labs are pushing on this frontier for many reasons; it's easy to create synthetic data, so that's another reason everyone's pushing on this vector. And I think we're going to move into a world where,

sometimes it's called vibe coding, you're basically coding with natural language, really. And we've seen this before with computers. I remember when I first started programming in the eighties, we were doing assembler. That seems crazy now: why would you do machine code? You start with C, and then you get Python, and so on. One could see this as the natural evolution of going higher and higher up the abstraction stack

of programming languages, leaving more and more of the lower-level implementation details to the compiler, in a sense. And now one could view this as the natural final step: we just use natural language, and everything is a super-high-level programming language.

And I think eventually that's maybe what we'll get to. The exciting thing is that it will make coding accessible to a whole new range of people: creatives, designers, game designers, app writers, who normally would not have been able to implement their ideas without the help of teams of programmers. So that's going to be pretty exciting, I think, from a creativity point of view. But it may also be very good,

certainly in the next few years, for coders as well. I think, in general with these AI tools, the people who are going to get the most benefit out of them initially will be the experts in an area who also know how to use these tools in precisely the right way, whether that's prompting or interfacing with an existing code base. There's going to be this interim period

where I think the current experts who embrace these new tools, whether that's filmmakers, game designers, or coders, are going to be superhuman in terms of what they're able to do. And I see that with some film director and film designer friends of mine, who are able to create pitch decks for new film ideas in a day,

on their own, and it's a very high-quality pitch deck that they can use to pitch for a $10 million budget. Normally they would have had to spend tens of thousands of dollars just to get to that pitch deck, which is a huge risk for them. So I think there's going to be a whole new, incredible set of opportunities. And then, if you think about the creative arts, there's the question of whether there'll be new, much more fluid ways of working.

Instead of using Adobe Photoshop or something, you're actually co-creating with this fluid, responsive tool. That could feel more like Minority Report, I imagine, with that kind of interface and this thing swirling around you. But it will require people to get used to a very new workflow

to take maximum advantage of it. But when they do, it will probably be incredible for those people; they'll be 10x more productive. So I want to go back to the world of multimodal that we were talking about before, with robots in the real world.

Right now, most AI doesn't need to be multimodal in real time, because the internet is not multimodal. And for our listeners, multimodal means absorbing many types of input, voice, text, vision, at once. So can you go deeper into what you think the benefits of truly real-time multimodal AI will be? And what are the challenges to get to that point?

I think, first of all, we live in a multimodal world, and we have our five senses; that's what makes us human. So if we want our systems to be brilliant tools or fantastic assistants, in the end they're going to have to understand the spatial-temporal world that we live in, not just our linguistic, mathematical,

abstract-thinking world. They'll need to be able to act in, plan in, and process things in the real world, and understand the real world.

I think the potential for robotics is huge. I don't think it's had its ChatGPT moment yet, or its AlphaFold or AlphaGo moment, as in language and science. I think that's to come, but I think we're close. And as we talked about before, the shortest path I see to that happening now is these general multimodal models

eventually being good enough, and maybe we're not very far from that, to install on a robot, perhaps a humanoid robot with cameras. Now, there are additional challenges: you've got to fit it locally, on local chips, to get the latency fast enough, and so on. But as we all know, just wait a couple of years and today's systems will fit on a little mobile chip tomorrow. So I think multimodal is very exciting from the point of view of

robotics and assistants. And then finally, for creativity: I think ours is the first model in the world, Gemini 2.0, which you can try now in AI Studio, that allows native image generation. So it's not calling a separate program, a separate model, in our case Imagen 3, which you can try separately; it's Gemini itself natively producing images in the chat flow. And I think

people seem to be really enjoying using that. It's like you're now talking to a multimodal chatbot. You can get it to express emotions in pictures, or you can give it a picture, tell it to modify it, and then continue to work on it with word descriptions: can you remove that background? Can you do this?

This goes back to the earlier thing we said about programming, or any of these creative things, having a new workflow. I think we're just seeing a glimpse, if you try out this new Gemini 2.0 experimental model, of how that might look for image creation. And that's just the beginning; of course, it will work with video and coding and all sorts of things. So, in the land of the real world and multimodal:

one of the things people frequently speculate about is the geography of AI work. Obviously, in the U.S., we intensely track everything that's happening on the West Coast. We also intensely track DeepMind, and somewhat less Mistral and others. What's some of the stuff that's really key for the world to understand that's coming out of Europe?

What's the benefit of having multiple major centers of innovation and invention, not just on the West Coast but also, obviously, DeepMind in London and Mistral in Paris and others? And what are some of the things for people to pay attention to, why it's important, and what's happening, especially within the UK and European AI ecosystems?

We started DeepMind in London, and we're still headquartered here, for several reasons. I mean, this is where I grew up; it's what I know; it's where I had all the contacts I had. But the competitive reason was that we felt the talent in the UK and in Europe

coming out of universities was the equivalent of that from the top US ones. Cambridge, my alma mater, and Oxford are up there with MIT and Harvard and the Ivy League; they're always in the top 10 together in the university world tables.

But, and this was certainly true in 2010, if you had, say, a PhD in physics out of Cambridge and you didn't want to work in finance at a hedge fund in the City, but you wanted to stay in the UK and be intellectually challenged, there were not that many options for you. There were not that many deep tech startups.

So we were the first, really, and proof that it could be done. And actually, we were a big draw for the whole of Europe, so we got the best people from the technical universities in Munich and in Switzerland and so on. For a long while, that was a huge competitive advantage. Salaries were also cheaper here than on the West Coast, and you weren't competing against the big incumbents. And it was conducive in another way. The other reason I chose to do it here was that

I knew that AGI, which was our plan from the beginning, solve intelligence and then use it to solve everything else, which was how we articulated our mission statement, and I still like that framing of it, was a 20-year mission. And

if you're on a 20-year mission, and we're now 15 years in and I think we're roughly on track, unbelievably, which is strange for any 20-year mission, you don't want to be too distracted along the way on a deep science, deep technology mission. So,

one of the issues I find with Silicon Valley is that it has lots of benefits, obviously, contacts and support systems and funding and amazing things, and the density of talent there. But it is quite distracting, I feel. Everyone and their dog is trying to do a startup that they think is going to change the world, but it's just a photo app or something, and the cafes are filled with this. Of course, it leads to some great things, but it's also a lot of noise if one

actually wants to commit to a long-term mission that you think is the most important thing ever. You don't want you and your staff to be too distracted by, oh, maybe I could make a hundred million if I jumped over and quickly did this gaming app or something. And I think that was the milieu you were in

in the Valley, at least back then. Maybe this is less true now; there are probably more mission-focused startups now. But I kind of also wanted to prove it could be done elsewhere. And then the final reason I think it's important is that AI is going to affect...

the whole world, right? It's going to affect every industry. It's going to affect every country. It's going to be the most transformative technology ever, in my opinion. So if that's true, and it's going to be like electricity or fire, more impactful than even the internet or mobile, then

I think it's important that the whole world participates in its design, with the different value systems and good philosophies that are out there, from the democratic values of Western Europe and the

US. I think it's important that it's not just a hundred square miles of a patch of California. I do actually think it's important that we get these other inputs, the broader inputs, not just geographically but also, and I know you agree with this, Reid, from different subjects: philosophy, social sciences, economics,

academia, civil society, not just the tech companies, not just the scientists involved in deciding how this gets built and what it gets used for. And I feel that I've always felt that very strongly from the beginning. And I think having some European involvement and some UK involvement at the top table of the innovation is a good thing.

So Demis, when anyone asks me, hey, Aria, I know you're interested in AI, but, well, it can write my emails, why is it so special? I just say, no, think about what it can do in medicine. I always talk about AlphaFold. I tell them about what Reid is doing. I'm just so excited for those breakthroughs. Can you tell us a little bit about it? You had the seminal breakthrough with AlphaFold; what is it going to do for the future of medicine?

I've always asked, what are the most important things AI can be used for? And I think there are two. One is human health; that's number one, trying to solve and cure terrible diseases.

And then number two is to help with energy, sustainability, and climate, the planet's health, let's call it. So there's human health and then there's planetary health. Those are the two areas we have focused on in our science group, which I think is fairly unique amongst the AI labs, actually, in terms of how much we've pushed on that from the beginning.

And protein folding specifically was the canonical problem for me. I came across it when I was an undergrad at Cambridge, 30 years ago, and it always stuck with me as this fantastic puzzle that would unlock so many possibilities. Everything in life depends on proteins, and we need to understand their structure so we know their function.

And if we know the function, then we can understand what goes wrong in disease and we can design drugs and molecules that will bind to the right part of the surface of the protein if you know the 3D structure.

So it's a fascinating problem, and it goes to all of the computational things we were discussing earlier as well. Can you see through this forest of possibilities, all these different ways a protein could fold? Levinthal, very famously in the 1960s, estimated that an average protein can fold in 10 to the power 300 possible ways.

So how do you enumerate those astronomical possibilities? And yet it is possible with these learning systems, and that's what we did with AlphaFold. And then we spun out a company, Isomorphic, and I know Reid's very interested in this area too, with his new company, to see if we can reduce the time it takes to discover a protein structure. As a rule of thumb, it used to take a PhD student their entire PhD to discover one protein structure, so four or five years.

And there are 200 million proteins known to science, and we folded them all in one year. So another way to think of it: we did a billion years of PhD time in one year. And then we gave it to the world, freely, to use, and 2 million researchers around the world have used it.
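The arithmetic behind that claim, spelled out (using only the figures quoted above):

```python
proteins_known = 200_000_000   # ~200 million proteins known to science
years_per_structure = 5        # rule of thumb: one structure per PhD (4-5 yrs)

phd_years = proteins_known * years_per_structure
print(f"{phd_years:,} PhD-years")  # 1,000,000,000 -> a billion years
```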

And we've now taken Isomorphic further downstream, to develop the drugs needed and try and reduce that time too. I mean, it's just amazing. Demis, there's a reason they gave you the Nobel Prize. Thank you so much for all of your work in this area. It's truly amazing. Thank you.

And now to rapid fire. Is there a movie, song, or book that fills you with optimism for the future? There are lots of movies I've watched that have been super inspiring for me. Blade Runner is probably my favorite sci-fi movie, but maybe it's not that optimistic. So if you want an optimistic thing, I would say the Culture series by Iain Banks. I think that's the best depiction

of a post-AGI universe, where you've basically got societies of AIs and humans, and alien species actually, and maximum human flourishing across the galaxy. That's an amazing, compelling future that I would hope for humanity. What is a question that you wish people asked you more often?

The questions I often wonder why people don't discuss a lot more, including with me, are some of the really fundamental properties of reality that

actually drove me, in the beginning when I was a kid, to think about building AI as this ultimate tool for science. For example, I don't understand why people don't worry more about what time is, what gravity is, basically the fundamental fabric of reality, which is staring us in the face all the time. All these very obvious things impact us all the time, and we don't really have any idea how they work. And I don't know why that doesn't trouble people more.

It troubles me, and I'd love to have more debates with people about those things. But most people seem to shy away from those topics. Where do you see progress or momentum outside of your industry that inspires you? That's a tough one, because AI is so general that it touches almost everything. What industry is outside of the AI industry? I'm not sure there are many.

Maybe the progress going on in quantum is kind of interesting. I still believe AI is going to get built first and will then maybe help us perfect our quantum systems. But I have ongoing bets with some of my quantum friends, like Hartmut Neven, that they're going to build quantum systems first and then those will help us accelerate AI. So I always keep a close eye on the advances in quantum computing systems.

Final question. Can you leave us with a final thought on what's possible over the next 15 years if everything breaks humanity's way, and what the first step to get there is? Well, what I hope for in the next 10 to 15 years is for what we're doing in medicine to really deliver breakthrough after breakthrough.

I think maybe in the next 10 to 15 years, we can actually have a real crack at solving all disease. That's the mission of Isomorphic. With AlphaFold, we showed the potential of doing what I like to call science at digital speed. And why couldn't that also be applied

to finding medicines? So my hope is that in 10 or 15 years' time, we'll look back on the medicine we have today a bit like how we look back on how medicine was done in medieval times. And that would be, I think, the most incredible benefit we could imagine from AI. Possible is produced by Wonder Media Network. It's hosted by Aria Finger and me, Reid Hoffman. Our showrunner is Sean Young.

Possible is produced by Katie Sanders, Edie Allard, Sarah Schleid, Vanessa Handy, Aaliyah Yates, Paloma Moreno-Jimenez, and Malia Agudelo. Jenny Kaplan is our executive producer and editor. Special thanks to Surya Yalamanchili, Sayida Sepiyeva, Thanasi Dilos, Ian Ellis, Greg Beato, Parth Patil, and Ben Rellis.

And a big thanks to Leila Hajjaj, Alice Talbert, and Denise Owusu-Afrie.