Hi, listeners, and welcome back to No Priors. Today, we're joined by two of the key folks behind one of the most compelling developments in AI this year, Alpha Evolve. Pushmeet Kohli and Matej Balog worked on this autonomous coding agent that uses Gemini models and evolutionary search to discover new algorithms. It marks a major leap in AI's ability to contribute to core computer science and math, and perhaps sciences beyond that.
It's not just a stochastic parrot or a boilerplate generator. It has shown what you might consider technical creativity in the way that Move 37 did with AlphaGo, something humans hadn't done before, even in thousands of years of play. It might even be a real step on the path to self-improving AI.
Pushmeet, Matej, thank you so much for being here. Thank you for having us. It's a pleasure. Congratulations on the success and the launch of Alpha Evolve. Can you give me a brief description of what it is broadly? Yeah, so in maybe one sentence, Alpha Evolve is an AI coding agent that is able to discover new algorithms that make new discoveries on open scientific problems.
And at the same time, those algorithms can be so practical that they are already deployed in key parts of Google's own infrastructure. What is the origin story of working on this particular form of coding agent or this problem statement?
So we are not new to this space of algorithm discovery. As you might know, the mission of all of DeepMind is to build AI responsibly to benefit humanity. And the way our particular team has been doing it for years now is to look for ways in which AI can discover new algorithms.
New algorithms are everywhere around us. So this is a very, very important question and can have very high impact when we can discover algorithms that solve important computational problems with higher efficiency than what we have been able to do so far. And kind of the first breakthrough we had in this space was in 2022, when we released a system called Alpha Tensor.
And so that was an AI system, using reinforcement learning, that for a very specific but fundamental computational task, multiplying matrices, showed for the first time that AI agents can discover better algorithms than what humans had been able to do before them.
So this was the first system that gave weight to this idea that indeed with AI, we'll be able to go into the superhuman region of algorithms that we as humans have not been able to discover ourselves. How do you differentiate Alpha Evolve from, like, AlphaTensor and FunSearch and some other projects in the sort of lineage of this? One way to also describe what we have done is if you look back at the history of DeepMind,
you can see a number of projects that came even before we started working on computer science. If you go back to our earlier work on AlphaGo,
the AlphaGo agent was able to beat the world champion in the game of Go. And the remarkable thing about that agent was that it was able to explore this amazingly large search space
of all possible Go positions in such an efficient manner that it could come up with the optimal move at that time, right? And that really surprised people, both Go professionals as well as scientists. Scientists believed that that event would come much, much later because it was a very hard problem, right?
And so what that gave evidence for is the ability of these large-scale neural network-based systems to be able to reason and do very efficient exploration in these large search spaces and come up with amazing new insights about
the particular domain. And in the game of Go, I mean, there is this move called Move 37, which is a very creative new move that the agent discovered that was not in the Go literature, right? That really surprised the Go professionals.
So in some sense, we asked ourselves the question: if you have an agent which can do very efficient search in the domain of Go, why can't you use the same kind of philosophy to search in the space of algorithms?
And in fact, that was the underlying basis of our first attempt at that problem, which culminated in AlphaTensor. The way we structured the algorithmic discovery problem is that we first looked at a very important problem, and that problem was matrix multiplication. It is a problem that is ubiquitous in computer science. It's one of the key fundamental operators that underlies not only computer science, but also neural networks and machine learning and AI. We said, can we find a way to improve matrix multiplication algorithms? There's a history of matrix multiplication which is very interesting for people who might want to look into it. Even though it's such a fundamental operator,
people thought that the complexity, the time it takes to multiply two matrices, is on the order of n cubed, where n is the dimensionality of the matrix. And more than 50 years back now, a German mathematician, Strassen, came up with a very counterintuitive construction which showed that, in fact, the complexity is lower than cubic. That was a very counterintuitive result, and it stood for more than 50 years, until AlphaTensor came along and we asked: can we actually improve this result? And remarkably, we were able to show that we could. AlphaTensor, by having this amazing ability to search in this very large space, even much larger than the space of possible Go moves, was able to come up with an amazing new algorithm which improved things.
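As a concrete illustration of how counterintuitive even the original construction already is, here is a minimal sketch of Strassen's classic two-by-two scheme, which uses seven multiplications instead of the naive eight. This is the historical 1969 result mentioned above, not the algorithm AlphaTensor discovered.

```python
# Strassen's 2x2 scheme: 7 multiplications instead of the naive 8.
# Applied recursively to matrix blocks, it gives O(n^log2(7)) ~ O(n^2.81)
# rather than O(n^3). This is the classical construction, not AlphaTensor's.

def strassen_2x2(A, B):
    """Multiply two 2x2 matrices (nested lists) with 7 scalar multiplications."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B

    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4,           m1 - m2 + m3 + m6],
    ]

# Sanity check against the naive definition.
assert strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]) == [[19, 22], [43, 50]]
```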
But then the question was, well, we have now proved the thesis that you can have these superintelligent agents which can go beyond what human computer scientists have been able to do. But can we generalize them? AlphaTensor was very smart, but it was purpose-built for the matrix multiplication problem. Can we build an agent that is more general, both in the sense that it can handle more general problems, and in the sense that it can search more naturally in the space of programs, rather than in the space of the very specific operations that were required for matrix multiplication? And that was the origin of our first attempt with FunSearch, which was an LLM-based agent which, for the first time, by searching in the space of programs, showed that you can come up with completely new solutions, and it made the first scientific discovery from an LLM. And Alpha Evolve is basically an extension of that. I'm very inspired by the idea, I think as many people are, that AI will actually have creativity, does actually have technical creativity, as you are describing, as one way to conceptualize this, where you're outside of the patterns that we already know
as engineers. I want to go back to some of the mechanics here and the limits to generalization and how to think about automated evaluators and a lot of different topics. But, you know, when you think about these problems that are clearly like economically valuable and interesting, like matrix multiplication, the potential efficiency of it, what is your intuition for
So why, you know, those solutions have not been found before? Is it simply like the search space is too large or people in this field were complacent in that they believed a certain solution was like the maximum efficiency or whatever?
Because clearly there's value to be had here. My opinion on this is basically that if you look at the structure of the algorithm that Strassen produced, it was quite ingenious. It was not a natural thing that you would think of. And that was for only two-by-two matrices.
As you sort of go to larger sizes, the space is so huge. The constructions are not sort of something which is very natural.
These are very involved and intricate sort of constructions that would be very hard to discover by chance. So it's quite interesting that it has this very special structure, but it's not something that comes naturally to a human computer scientist. Just to add to that, so I definitely agree. The search space is just unbelievably vast.
The solutions are maybe non-intuitive. And the third thing I want to emphasize is that I really believe the people who worked on this in the past were definitely not complacent. And in fact, the problems we chose to apply Alpha Evolve to in the first instance, both on the scientific side and the practical side,
We deliberately chose problems which have been worked on for a very long time by the very best people. So on the scientific side, since we're talking about matrix multiplication, this has been a known open problem for decades and many people have been working on it.
And similarly for the practical applications that we mentioned in our Alpha Evolve release in key parts of Google's infrastructure. Again, like these are things that have been heavily optimized inside Google because they are so important. And so by having a system like Alpha Evolve or any other platform
discover something new on these problems. I think that's as strong a demonstration as I can imagine of the fact that this is indeed something that is new because no one found it before. And also it is something that was not easy to discover because those results stood for such a long time and have been worked on by such strong people.
I noted that this is not a comment on the, you know, broad efforts of the computer science industry to date on matrix multiplication or data center optimization. I think this is a good moment to try to demystify what's happening under the hood for a broader set of people.
Can you walk us through a concrete example of how Alpha Evolve actually evolves code? And say, let's take the example of trying to optimize data center scheduling, right? What does the step-by-step process look like from initial random code to final solution that saves millions of dollars of power? I can walk you through that. So the user of a system like Alpha Evolve, they basically specify what is the problem that they are trying to solve. So that's the most important thing.
And you specify it by providing what is called an evaluation function. What this function does is whenever there is a proposed solution for solving the problem, you're able to tell how good this solution is. So you basically define what makes a good solution.
For discovering an algorithm for scheduling jobs on a data center, this evaluation function could be something like a simulator of jobs in a data center that, given an algorithm for doing the scheduling, tells you how good this algorithm is. So that's what the user provides. And this is a simulator you already had. Yes. So that's a simulator that we already had. And I would say it's something that is
quite natural to have in many domains because whenever you want to innovate on something, you need to have a way of telling, okay, is the innovation actually good or not? So it's a very natural object to have at least in principle.
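To make that concrete, here is a minimal sketch of what such an evaluation function could look like for the scheduling example. The job format, machine model, and scoring here are hypothetical stand-ins for illustration, not Google's actual simulator or interfaces.

```python
# Hypothetical sketch of an evaluation function for a scheduling problem.
# Everything here (job fields, machine fields, the scoring rule) is an
# illustrative assumption, not Google's real internal system.

from typing import Callable, List

Job = dict          # e.g. {"cpu": 2.0, "mem": 4.0}
Machine = dict      # e.g. {"cpu": 64.0, "mem": 256.0}
Scheduler = Callable[[Job, List[Machine]], int]  # returns index of chosen machine

def evaluate(scheduler: Scheduler, job_trace: List[Job], machines: List[Machine]) -> float:
    """Score a candidate scheduling heuristic: higher is better.

    Replays a recorded trace of jobs through the heuristic and measures how
    much capacity ends up stranded (one resource exhausted while the other
    sits idle). A real evaluator would be a much richer simulator; the key
    point is that any proposed program is reduced to a single number.
    """
    free = [dict(m) for m in machines]  # remaining capacity per machine
    for job in job_trace:
        i = scheduler(job, free)
        if free[i]["cpu"] < job["cpu"] or free[i]["mem"] < job["mem"]:
            return float("-inf")  # infeasible placement: reject this candidate
        free[i]["cpu"] -= job["cpu"]
        free[i]["mem"] -= job["mem"]

    # Penalize imbalance between leftover CPU and memory on each machine.
    stranded = 0.0
    for orig, m in zip(machines, free):
        stranded += abs(m["cpu"] / orig["cpu"] - m["mem"] / orig["mem"])
    return -stranded
```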
So you define the what by providing the evaluation function and then Alpha Evolve fills in the how. So that's the job of our system. And you can do it in two fairly different ways. One is you tell Alpha Evolve, I have no idea how to solve this problem. Let's start completely from scratch and let's try to be creative and come up with something completely new. So that's one option you can take.
Another option you can take is, actually, we have already worked on this problem for a really long time. Here is a very strong initial solution that we can provide to the system. And you can start from here. And that's what we did for the application to discovering new algorithms for scheduling jobs in a data center. So Alpha Evolve takes this initial solution.
And then, on a high level, it combines the creative power of large language models, which propose creative new ways to improve that solution, with the strictness of the evaluation function provided by the user, which is able to actually filter out the things that work from the ones that don't.
And then this is wrapped inside an evolutionary algorithm that makes sure we explore the whole space of algorithms in that region, so that we don't commit to a very specific type of solution early on. Instead, we maintain a diverse pool of potential solutions. Over time, maybe we combine ideas from different solutions that are already strong, until we actually have an algorithm that's so strong that we are happy to deploy it to a critical part of Google's infrastructure, let's say.
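A rough sketch of the loop just described, to make the moving parts concrete: `propose_improvement` stands in for a prompted Gemini call and `evaluate` for the user-supplied evaluation function. The real Alpha Evolve pipeline (prompt construction, program databases, parallel evaluation) is far more elaborate, so treat this only as an illustration of the evolutionary scaffolding.

```python
import random

# Illustrative sketch of the evolutionary scaffolding described above.
# `propose_improvement` stands in for an LLM call that edits a candidate
# program; `evaluate` is the user-supplied evaluation function.

def evolve(initial_program: str,
           evaluate,             # program -> score (higher is better)
           propose_improvement,  # (parent_program, inspirations) -> new program
           generations: int = 1000,
           population_size: int = 50) -> str:
    # Maintain a diverse pool rather than committing to one solution early.
    population = [(evaluate(initial_program), initial_program)]

    for _ in range(generations):
        # Pick a strong parent, plus a few other programs as "inspirations"
        # whose ideas the model may combine into the new proposal.
        parent = max(random.sample(population, k=min(3, len(population))))[1]
        inspirations = [p for _, p in random.sample(population, k=min(2, len(population)))]

        child = propose_improvement(parent, inspirations)  # creative step (LLM)
        score = evaluate(child)                            # strict filtering step

        if score > float("-inf"):
            population.append((score, child))
            # Keep the pool bounded, biased towards stronger programs.
            population.sort(key=lambda sc: sc[0], reverse=True)
            population = population[:population_size]

    return max(population)[1]  # best program found
```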
And intuitively, not in the machine learning sense, but in the evolution sense, you have different generations where you're getting closer to an optimal solution. Yeah, that's right. Like you would expect that in each iteration of evolution, what you're doing is you're looking at the previous iteration, looking at maybe the strongest solutions you have, and then trying to be creative about how can I combine ideas from those solutions or maybe bring in completely new ideas to come up with something even better.
And so, yes, each generation gets stronger and stronger. How much scaling are we talking about? Like, is there a way to predict how many generations it takes or how do you constrain the number of iterations that the model can use? So there are two parts to your question. One is about, OK, how does scaling work and then how can you predict it? So for the first part, this is actually a really nice feature of Alpha Evolve that it can adapt to the difficulty of the problem.
If you ask Alpha Evolve to find a solution to a problem that's actually unexpectedly easy, then it will just do it very, very quickly. Like almost immediately you will have the solution.
But if you ask it a problem that's really, really difficult, and by really, really difficult, I mean like really difficult, maybe an open question that has stood for decades in the sciences, or you want the practical algorithm for a really high value application in Google, then you would of course expect this is not an easy problem. You might need to spend longer time considering different solutions, exploring the space, combining ideas.
But what's really nice about Alpha Evolve is that it is able to sustain this scaling in a way that it keeps improving over time. And it keeps improving for so long that you can make discoveries on this level of difficulty, like breaking decades old scientific challenges or discovering high value algorithms.
Now, I know it maybe sounds trivial that if you wait longer, you'll get better results. But in practice, it's actually really difficult to build automated agents that are able to sustain this continual improvement without plateauing quite early. So this is, I think, a nice feature. There was a second part to the question, about predicting how many iterations you will need. That is something that is actually not so easy, because it's like asking,
a priori, do you know how difficult this question is going to be? And especially in the sciences, that's something that often has a very surprising answer. Very trivial questions can turn out to be extremely, extremely difficult and vice versa. But the nice thing is that you have continual improvement if you run the system and
As long as you can run it, you can expect to get better and better results. And you just have to see where this gets you. If you think about the coding agents that general developers have access to and are increasingly using today, one frustration with them is that, on relatively trivial problems they are set out to do autonomously, they will get lost and blow themselves up, or plateau, as you said, in frustrating ways. Can you talk about whether you think there are implications from Alpha Evolve for these other general coding agents? While large language models and coding agents are getting much better in their understanding of code,
They're not perfect, right? So they do make mistakes. The other sort of element is to think about what is the task that these agents have been assigned. Mostly, if you are asking an agent to solve a particular task or write a particular program, you are providing a specification. Right.
You are specifying the task either in natural language, or you're saying, well, I'm trying to do something like this, right? So it's not a complete characterization of what you want. It's a partial specification of what you want. And the agents then try to solve the problem, and they might get lucky and get the right result, or they might hallucinate and get the wrong result.
And the issue is, how do you know whether the result is right or wrong? And that depends on having a good evaluator. That's how Alpha Evolve solves the problem. So in some sense, we are able to leverage the hallucinations
for a beneficial purpose, right? So the creativity and the wrong answers that Alpha Evolve can somehow come up with, how do we know that they're wrong? They might be very good. We just don't see them in that way. And which is why the role of the evaluator is really important. And how do we even do the evaluation is very important because when you come up with a new idea,
Should you try to explore that idea much further? Or how deep should you go into stress testing that idea? Should you try that idea out on a few different instances?
or a thousand different instances or really stress test that the idea actually works for the whole thing. This is one of the interesting parts of Alpha Evolve, getting that balance right.
is really important so that you can look at where are the creative solutions? How can you sort of filter out the ones that are promising and then use them later to refine the search process to get the final solution? - If evaluation functions, automated evaluators are,
really like such a limiting constraint here in terms of what we can get agents to do. Any intuition from this project or others on how to overcome that? Like, can models get good at helping us create automated evaluators? Should we imagine simulators that are better for lots of different domains? If I, you know, lame product manager putting in incomplete natural language spec to coding agent,
Should I work with an assistant to complete that spec? Do I use traces? How do you think that gets solved? That's a really great question. And I think you can view it from two perspectives that I think will happen at the same time. So one is that, yes, currently the strict evaluation function plays a key role in Alpha Evolve. And one takeaway you can take from this thinking about the future is that it shows the really high value of having these evaluators available.
Because in many cases, it might be that you have a really important problem, but you don't actually have a very precise definition of what makes for a good solution. And one takeaway you can have from a system like this is that if you actually do build a very precise evaluation function, then this unlocks the possibility of having an agent like Alpha Evolve discover something that's way beyond what, let's say, humans have been able to discover or your best developers have been able to discover. So that's one takeaway.
But the other takeaway that I'm maybe even more excited about from the research perspective is that we don't actually think this is a conceptual limitation. So today we have, this was maybe the easiest way to get into this game of discovering new things by looking at problems that already come with these very precise evaluation functions. So that's just a natural first step to take. But I do believe that this assumption can be relaxed in very significant ways.
And in particular, you already mentioned one example where maybe language models themselves will be able to evaluate whether proposed solutions look promising or not, or whether they fail in some particular ways. And indeed, there is parallel work from DeepMind as well, called AI co-scientist, which demonstrates this very clearly: that
if you propose ideas in natural language, then you can get language models to provide meaningful critiques and identify the ones that work from the ones that don't. So I really do see a lot of hope on relaxing this assumption.
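As a toy illustration of that relaxed setting, the evaluator can itself be a language model that critiques a proposal and returns a rough score. The `call_llm` function below is a hypothetical placeholder for whatever model API is available, not a specific DeepMind interface.

```python
# Toy sketch of an LLM-based evaluator in the relaxed setting discussed above.
# `call_llm` is a hypothetical placeholder for a model API, passed in by the
# caller; nothing here reflects a specific Google DeepMind interface.

def llm_evaluate(proposal: str, problem_statement: str, call_llm) -> float:
    """Ask a language model to critique a proposed solution and return a score in [0, 10]."""
    prompt = (
        "You are reviewing a proposed solution to the following problem.\n"
        f"Problem: {problem_statement}\n"
        f"Proposed solution:\n{proposal}\n\n"
        "Point out concrete flaws, then end with a line 'SCORE: <0-10>' "
        "rating how promising the proposal is."
    )
    critique = call_llm(prompt)
    # Parse the trailing score line; treat unparseable output as not promising.
    for line in reversed(critique.splitlines()):
        if line.strip().upper().startswith("SCORE:"):
            try:
                return float(line.split(":", 1)[1])
            except ValueError:
                break
    return 0.0
```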
And then even in between these two extremes of strict evaluation that exactly tells you how good a solution is on one end, and then natural language evaluation by a language model on the other end, there is a continual spectrum of simulators and auxiliary evaluation functions, which are maybe not perfect, but as long as they are correlated with the true signal,
then we can build the algorithmic scaffolding of the evolutionary algorithm around this in such a way that we still make meaningful progress. Maybe it will take a few more iterations, but we can still go really, really far. So, just to add to what Matej mentioned, I think...
One of the takeaways is basically that LLM-based agents like AlphaEvolve, especially when we structure them in this way with population-based sort of search, right, with evolutionary approaches, they are extremely effective in searching.
They can search very convincingly and very effectively in very large spaces and come up with very counterintuitive new solutions for important problems, problems that we have studied for many, many years and in some cases decades. So that's one. The other element is the evaluator. Like Matej mentioned, there is work
on using other sources for evaluation. So you don't have the perfect evaluator. Even for Alpha Evolve, even if you have a simulator, that's not a perfect evaluator, right? Because you are sort of going to evaluate things on a specific distribution of problem instances. You might want to sort of prove certain properties of the solution, right? You might want to say that the solution always
has certain performance. So if you want to prove certain properties of the solution, that might require other work, right? You might have to have a proof agent which tries to prove certain properties of the solution. While on the other hand, you have these LLM-based evaluators
which can look at the solution, and even if nobody has built a simulator, they can just make a guess about how good that solution is. And in fact, that approach also works very well. We have shown that with the AI co-scientist,
which we have used for hypothesis generation. It basically uses a multi-agent setup where LLMs themselves are able to figure out that certain hypotheses are better in terms of novelty and significance and impact, and should be propagated.
And that whole process ends up, and this might be surprising and counterintuitive to some, producing much, much, much better results than the base large-language model.
right? So you are really able to discover new information beyond what the large language model itself alone was able to produce. That begs the question, which I think is like one of the biggest meta questions proposed by this sort of work, which is like, do we get self-improving AI, right? One of the things you demonstrated with AlphaEvolve is you can optimize the systems used to train AlphaEvolve, right? So you have this
you know, 23% speed up in part of the training infrastructure, if I recall correctly, are we now witnessing the early stages of recursive self-improvement in AI? And, you know, what do you think the implications are, if that's true? I think in some senses, sort of, yes. But at the moment, what we have seen is basically improvements in computation time. So what Alpha Evolve has been able to do is basically make training more efficient, right?
But you can ask the question: can you improve the training process such that the underlying model is not only trained faster, but is actually fundamentally better at certain cognitive tasks? That is something that still has to be validated, right? But it is a direction that is definitely very important and appealing, and something that is being actively looked at by many people. Do you have a reason to believe it won't work? It should work. But as we mentioned, having good evaluators is an important element, right? So you need an evaluator which can say whether this proposal that you have just suggested to me for improving the training process will yield a good result. If you have that kind of evaluator, then it will work. And there is no reason why such an evaluator cannot exist, but we need to work on building such evaluation functions. Maybe just one thing to add to it is that I would also agree that we are maybe seeing the first
sign of self-improvement, but one also needs to be very specific about what we have shown so far. Like Pushmeet mentioned, it's speeding up the training of the next generation of the Gemini model. So the feedback loop is fairly long, at least currently, maybe on the order of months. But there is, you can call it self-improvement, for sure.
Maybe the big question that many people are curious about is how does this extrapolate into the future? And you can have different types of self-improvement. One is where you get maybe just a one-off benefit, like the model improves itself once and that's it. Another one is, okay, the model keeps improving itself continuously, but maybe the improvements get marginally smaller and smaller and smaller and you converge to some limits.
Or maybe the improvements will keep accumulating up and up and up. And that's a big open question that we don't have an answer to today. Let's take that projection to other fields. And obviously, these are all interrelated. But one of the things, Matej, you're really excited about is just how AI applies to the sciences. When you think about new mathematical constructions, improved solutions to open problems or problems that
looked solved to humanity 50 years ago. What do you think the implication is in different fields? Like, is it a fundamental shift in how scientific discovery or mathematics gets done? First of all, yes, I'm super excited working in this area of using AI to accelerate the sciences because in a way it's the most
exciting application of AI that I can imagine. Like what could be more valuable or exciting to advancing the frontiers of human knowledge? So yes, that is definitely there. And then of course, in different fields of science, the speed of progress or the advance you get from AI might be slightly different. So in Alpha Evolve, we've primarily focused on mathematics and computer science because
These are the domains where it's the easiest to get these automated evaluation functions. You often get them basically for free. That's not to say that you cannot get them in other branches of science, but in maths and computer science, they're just most common.
If you think about biology or chemistry, you want to design a molecule, then you can have an evaluation function again in the form of a simulator or a predictive model that given a candidate molecule will make a meaningful prediction about, okay, is this actually going to work in practice? And then if you are in this regime, then again, Alpha Evolve would be applicable.
And we are only talking about the version of Alpha Evolve that we have built today. And these are problems that we can address today. But we don't think that the journey of Alpha Evolve finishes here. We have many ideas about how to make this system more powerful and more broadly applicable. And I'm fairly confident that we will see many applications across many branches of science.
And then this is only talking about Alpha Evolve. There are many other agents, Pushmeet mentioned AI co-scientist, and many others that I'm sure will keep transforming how science is being done across the whole spectrum. Yeah, so I think broadly, if you look at it, right, a lot of science involves searching.
Right. Searching for the right idea, searching for the right construction, searching for the right sort of solution, the right drug candidate and so on. And in some sense, like what scientists have been trying to do is sort of somehow make that process repeatable.
At the moment, there is still sort of an element of certain serendipity to some of the discoveries, but we are, as we move towards sort of rational material discovery or rational drug discovery, you are sort of seeing computational approaches and very systematic evaluations playing a much more important role in many areas of science.
And I think as that work propagates, you will have systems like Alpha Evolve, which will be able to search in those spaces and use these evaluations much more effectively. So you can see this as a tool that will give scientists a superpower in their ability to search over very complex and sometimes counterintuitive solution spaces. When I think about one logical extension to this approach, it is, let's say, automated evaluation in the real world, right? So a lab, assays, you know, a bunch of robotic arms doing experimentation if you're screening molecules or something. What do you think the role of the human scientist or engineer is, let's just say very near term, if that vision is true? Is it the problem framing, like determining the evaluation? Is it constraining, like giving some intuition for a starting point or a search space? Like, what should the human scientist be good at
from here? There are many sort of elements, right? First of all, as we have been talking about a lot, the role of the evaluation function, right? So that needs to be defined. Like, what do we really, how do we want to assess these solutions? But then there are many other sort of elements as well, right? When we are trying to find a solution, it has to have certain properties, right?
What are those properties? Right. Giving hints, giving sort of, for example, if you're trying to discover a new drug, you want to make sure that that drug sort of treats the disease but does not kill the patient.
Its side effects have to be low. Or it can be the question of what the delivery mechanism for it is. So there are so many different requirements that a solution might need to satisfy. Some of them are encoded in the evaluator, in the evaluation function. And some of them you might want to hard-constrain
in the solution, right? And so can you specify those, so that an agent like Alpha Evolve can take that into account while it is thinking about how it explores the search space or how it constructs
the solutions that it will generate. These are all very interesting places where human input might be required, but especially as we look at many different types of domains. So yeah, I think we should definitely see this as an amazing tool for scientists, for computer scientists, mathematicians. And in fact, this has been our experience as well, that in the right hands,
it is a very powerful tool, right? So mathematicians who have tried to explore it, and who have been able to specify what types of solutions they're looking for, can be much more
productive and much more effective in finding the solutions. I just wanted to highlight that even though we have been describing Alpha Evolve as this kind of autonomous agent that does things on its own, actually, in practice, using this agent often turns out to be surprisingly collaborative. And we have seen this in particular with mathematicians that we have collaborated with.
And there are a few reasons for this. But one is that Alpha Evolve is an agent that doesn't just give you the solution.
It searches for an algorithm that constructs that solution. And so depending on how you set up your problem definition, often it's actually the algorithm that's even more valuable than the solution itself. Because the algorithm, it tells you how to construct the solution. So that means you understand what are the ideas that go into building that solution. And
Maybe especially or definitely it's true in mathematics. That's what people really care about, to understand the nature of our universe and build up the understanding of fundamental ideas. And so it's actually often not interesting almost at all what the solution is, but what you care about is how you build it.
And so we had a firsthand experience collaborating with multiple mathematicians and it's been really fascinating to see where we would share with them the output from Alpha Evolve and they'll be really fascinated looking at the code that it found and trying to understand, okay, what is it actually doing? And then understanding, oh, okay, this is doing this, this is doing that. And now I can see why if you put it together, then it leads to a really good solution.
Yeah, I can also confirm from my own personal experience that looking at the code or the algorithms that the system finds, it's often a really interesting experience because it's code that kind of looks human-like. It's something that you could have written, but would you have thought of writing it in exactly this way and then trying to understand, okay, what exactly is it doing? That's a really interesting experience.
But at the same time, it's one of the key strengths of the system, not only for scientific applications where you can look at the code and get some understanding out of it, but also for many of the practical applications. It's hugely valuable that the artifact you get out of Alpha Evolve is a piece of code
and then you deploy that piece of code. And so before you do that, experts, engineers who have worked on that system can visually inspect that piece of code, understand it and make the final decision of whether it's going to be deployed. So it's in a
completely different league from, let's say, considering using a neural network to make decisions in some production system where you kind of need to trust that the neural network is going to always behave in the way that you hope it will. With the code, you can look at it, understand it and make the decision yourself. I might add that basically
Not all code is interpretable by humans, right? The solutions and the programs that Alpha Evolve finds are interpretable by human programmers. So this is going to be a very interesting area of work in the future: when you find these solutions, what can we learn from them?
As Matej was mentioning, this was a very interesting experience that we had working with Jordan Ellenberg with the earlier version of Alpha Evolve, when we were working on the cap set problem. The programs that it discovered had very interesting symmetries
that mathematicians did not know about. And so not only was the solution, the actual construction, mathematically interesting, but the structure of the algorithm for producing that construction was interesting in itself. For listeners who are thinking about
accessibility or implications for themselves where they're not professional mathematicians in collaboration with Alpha Evolve. What are the considerations in making some of these capabilities more broadly available? We want to make these capabilities accessible to as many people as we can to the wider community.
Now, we have started a trusted tester program where we have asked people to submit proposals. And what we intend to do with that program is to figure out what are the right ways in which people can really leverage
Alpha Evolve. So we have internally used it across Google, but as you know, it requires certain things, sort of the need for a function evaluator. As part of the Trusted Testers program, we are going to be evaluating Alpha Evolve on a bunch of different types of applications, and that will inform our future release strategy as to how do we make it more broadly applicable. The second sort of element is that
not only you need the evaluator, but you also need a significant amount of computational resources, right? Because it's not just one single LLM call. It requires a significant amount of function evaluation depending on the difficulty of the problem, right? If it's an easy problem, then you can do it very quickly. But if you really are going for some very hard problems with a very large extended search space,
and you want to spend a significant amount of time searching over it, then how do you build the overall system that people can use effectively and efficiently? That's the other sort of thing that we'll be thinking about.
Last question for you both. Is there a practical application within Google that you think will be interesting that you haven't tried Alpha Evolve on yet? In the white paper, we tried to think holistically: when we look at the computational infrastructure of Google, what are the key parts of this infrastructure where we can demonstrate that Alpha Evolve can make discoveries across the stack, not only in one part of it, and that those discoveries are highly valuable?
And so we tried to cover the entire spectrum. We showed that Alpha Evolve can improve the efficiency of the data center, it can contribute to hardware design, and it can contribute to improving the efficiency of some of the most important pieces of software that are being run inside Google. And one intention here was to demonstrate that this is a really versatile tool that you can apply across the spectrum.
And as Pushmeet was saying, this is a tool that is already available inside Google and it is being used for many, many problems. There are quite a few exciting ones. I'm not ready to share the particulars yet, but as you can imagine,
there are so many exciting computational problems in a place like Google, within AI and also outside, that, yeah, I'm sure there will be many, many really cool results coming in the future. I think that's a great note to end on. Pushmeet, Matej, anything we didn't cover? No, I think that was great. Thank you guys so much for being here. Congrats. Okay, great. Thank you very much.
Find us on Twitter at NoPriorsPod. Subscribe to our YouTube channel if you want to see our faces. Follow the show on Apple Podcasts, Spotify, or wherever you listen. That way you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.