It could be really quite destabilizing internationally if countries think that they are in a race to who gets the most powerful AI forever and that country or person or a small group is going to get to determine how the future goes in the long term, have a permanent monopoly on power. If you were China and you thought the US was doing that, you would obviously be reacting.
Welcome to Stop the World, the ASPI podcast. I'm David Wroe. And I'm Olivia Nelson. Today's guest has a fascinating perspective on artificial intelligence that is informed by her equally fascinating career so far, noting that she's only in her early 30s. Her name is Helen Toner, and she's the Director of Strategy and Foundational Research Grants at Georgetown University's Center for Security and Emerging Technology, known as CSET.
She also spent two years on the board of OpenAI and that put her at the centre of the dramatic events in late 2023 when OpenAI CEO Sam Altman was briefly sacked before being reinstated, after which most of the board stood down.
Helen's also Australian, having grown up in Melbourne and studied chemical engineering at Melbourne University before living and working in Beijing as an AI researcher with Oxford University and then going on to work for the non-profit Open Philanthropy. Helen and I didn't discuss her time at OpenAI, which was perfectly fine with me. She said what she had to say about that period and I was very happy to concentrate on her very compelling insights into AI as a field.
So what we did talk about was the curve we are on towards artificial general intelligence, which will be able to do everything humans can do. We talked about progress with reasoning models, the arrival of the Chinese model DeepSeek, the need for regulation, AI and democracy, and AI risks, both large and small. I found myself agreeing with a lot of what she had to say. She has this healthy perspective that is in favour of progress and science, but also is very aware of the risks and the need to come up with smart safeguards against them.
Helen is a big brain on AI who can articulate the complex ideas to make them accessible. It's a really, really interesting conversation that lays out the key issues in this transformative technology. Let's hear from Helen Toner.
Hi, everyone. This is David Wroe, and I'm here with Helen Toner. Helen, thanks for coming on the podcast. Thanks for having me. So let's start by talking about where we're at on the AI trajectory and towards, it seems, AGI, or artificial general intelligence. Clearly, AI is already changing our lives in a number of very interesting
demonstrable ways, but anyone who follows the area closely will be tracking the fact that it's still on a steep path and the really big transformative changes are probably yet to come. As artificial intelligence becomes more capable and more general, the transformations will really start to show themselves.
I'd like to get your views on where we're up to with this pathway that we're on. Clearly, large language models, the current approach, which is essentially training neural networks on vast amounts of data to predict the next word or token, that is proving very, very effective and
clearly scaling it up with ever larger amounts of computing power seems to keep working for the time being, but there's a question about whether it'll hit a ceiling. So what's your view? How far along that pathway are we? And will some more fundamental breakthroughs be needed for us to take the next big steps? - It's a great question. And the problem is that we don't really know what the pathway we're on is, or what the destination is, or, you know, what we're even talking about here.
Because we talk about AI as if it's one thing, we talk about AGI, artificial general intelligence, as if it's this one target that we're building towards, and we all agree that's what we're trying to build. But in fact, we don't really know. And the history of progress in AI has been people trying to set out tests for what would really count as being really intelligent in the way that a human is.
Typically, every time we manage to pass one of those tests, it doesn't actually strike us that the AI is that sophisticated. The original version of this was the thinking, in the second half of the 20th century, that something like chess involves some deep strategic thinking. If you can build an AI that's very good at chess, it's going to have some deep understanding of strategy. It turns out that's actually not right. It's possible to build computers in ways that are very, very good at playing chess specifically by looking down the branching trees of possibilities.
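To make the "branching trees of possibilities" idea concrete, here is a minimal sketch of plain minimax search, the brute-force family of techniques classic chess engines are built around (real engines add pruning and hand-tuned evaluation on top). The helpers `legal_moves`, `apply_move` and `evaluate` are hypothetical placeholders, not any particular engine's API.

```python
# Minimal minimax sketch: explore the tree of possible moves a few plies deep
# and back up the best achievable score. The three helper functions are
# hypothetical placeholders for whatever game representation you have.

def minimax(state, depth, maximizing, legal_moves, apply_move, evaluate):
    """Return the best score reachable from `state`, looking `depth` moves ahead."""
    moves = legal_moves(state)
    if depth == 0 or not moves:
        return evaluate(state)  # static judgement of the position
    scores = (minimax(apply_move(state, m), depth - 1, not maximizing,
                      legal_moves, apply_move, evaluate) for m in moves)
    return max(scores) if maximizing else min(scores)
```

The point of the sketch is that everything chess-specific lives in those helper functions; the search itself carries no strategic understanding that transfers anywhere else.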
That kind of chess-specific search doesn't generalize whatsoever to any other, non-chess situation. A more recent example involves Alan Turing, one of the pioneers of computing, who
put out his Turing test, right? The basic idea was, can you distinguish between a human and a computer in a chat setting? And his idea there, I don't think he was necessarily saying this is the pinnacle of intelligence, but he was saying, you know, we need to have some kind of test, some kind of bar that we can work towards or, you know, that we would use. And in just the last couple of years, we've seen systems that
clearly can do very well on that test. Whether they pass it or fail depends on the specifics. Are you having an expert judge who's really trying to make the system mess up? Are you just using an amateur who's just having a nice chat with it? There was a paper this week, or in the last couple of weeks, showing that we've
basically passed that test; it depends exactly how you set it up. But again, we have these models that are very, very smart in some ways, but still make very silly mistakes in other ways. A clear example right now is a lot of the best so-called large language models. They can also do things other than language. They can look at images and things. They can process images in some very sophisticated ways, but also if you show them some simple things, they just totally mess it up. So there's these tests of, like, how many times do these lines cross? And it's these jagged lines where it's pretty easy to count one, two, three, any five-year-old would be able to do it,
and these AI systems can't. So there's these very jagged capabilities, meaning some of the capabilities are very, very sophisticated and some of them are very, very basic and well below what you would expect from a human.
This is all a long way of saying it's very hard to know where we are. People sometimes compare it to, there's this famous essay sort of saying that five years ago we had something that was at the level of a preschooler, and then we had a high schooler, and soon we're going to have, you know, a best-in-the-world AI researcher. But actually it's not at all that linear. It's really much more confusing, much more uneven. And so it's very hard to say. I do think, to the question about large language models and scaling the current approach that we're using,
I tend to think people set up a bit of a false binary of: can we just keep scaling exactly what we've been doing, or do we need to do something fundamentally different?
Whereas I think in reality, if you look at the history of how we got to where we are with the systems we have now, it's not been sort of really big breakthroughs, really big paradigm shifts. Instead, it's been lots of these sort of small to moderate discoveries or approaches that let you scale them, that work better if you scale them more. And so, you know, the original one was when deep learning started to work at all. So this is sort of artificial neural networks when you have these very networked
statistical models. And that started working around sort of 2011, 2012, when we had enough data and we had enough hardware to make it run at all. And that was just doing, you know, things like image processing or speech recognition.
And you've got this series of more sort of small to medium-sized breakthroughs. So things like this idea of attention, so having the model be looking more closely at different parts of the inputs that led to the transformer, which was a development released by Google in 2017, which is what ChatGPT was built on top of. And then there was figuring out how to use all of the text that is on the internet, another sort of small to medium scale breakthrough.
Then there was figuring out how to get these chat models to not just be incredibly racist, incredibly toxic immediately, using a technique called reinforcement learning from human feedback. That's another sort of small to medium size breakthrough. And so these all kind of stack on top of each other.
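As a rough illustration of the "attention" idea mentioned above, here is a minimal sketch of scaled dot-product attention: each position's output is a weighted mix of every position's values, with the weights derived from how relevant the positions look to one another. Real transformers add learned projections, multiple heads and much else; this is just the core operation, written with plain NumPy.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: arrays of shape (sequence_length, dimension)."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])        # relevance of every position to every other position
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability before the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax: attention weights for each position sum to 1
    return weights @ V                             # each output is a weighted mix of the value vectors
```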
So it's not the case that we're just doing the same thing at larger scale, but it's also not the case that you kind of have to totally backtrack and do something totally different. And it looks to me like we should expect that process to keep going for the foreseeable future. The problem is, you know, foreseeable is not very long. Is that one year? Is that five years? It's hard to sort of see far beyond there. So to try and kind of sum up
where I think that leaves us overall is I think it makes sense to expect that we'll keep seeing progress, but we don't really know what exactly it's going to be progress in. We don't know which areas are going to suddenly be much more advanced and which areas are going to continue to have these pretty fundamental limitations.
But I think it would not be a good bet right now to think that things are going to plateau or that we're not going to be able to overcome some of the limitations that we're seeing. It's interesting that part of what we're learning as we try to feel our way through this amazing period in discovery is that it's very unclear what actually constitutes intelligence, what constitutes intelligence capability. Apart from anything else, one example that I find interesting is
If you look at workforce and the impact on different kinds of jobs, everyone would have assumed, in fact, everyone I think did assume in previous decades that the first jobs to go would be the more sort of blue collar type ones, the more sort of, you know, the repetitive manual jobs. And it turns out that's not the case at all because AI so far is quite...
poor at manual dexterity, for instance. So hairdressers are set for the time being. And in fact, it's white collar jobs, especially repetitive ones that involve processing large amounts of data, that are most affected. So I suppose, nonetheless, more and more tasks seem to be falling to
AI. It feels to me as if more and more things, we're reading more and more stories that say, "Oh, okay, AI now appears to be doing better than 97% of humans on such and such a task."
Does it feel to you as if we're going to end up with something that basically just sort of strings together a whole bunch of more narrow AI capabilities into something that feels more generalized? Or will we come up with something that is
fundamentally generalized and can actually just do all of these sorts of tasks because of the sheer breadth of its intelligence capability? It's a great question and it's one that experts disagree on like lots of things with AI, unfortunately. I tend to come down on the side of thinking we'll build something that is more general.
Just because the incentives to do that, the incentives to have one system that can actually encompass all kinds of different tasks, are pretty strong. The economic incentives, the research incentives, that's what the scientists want to build. It's what they think is most interesting and it's what will be most productive and profitable for the companies building and deploying these systems. It may not be right, though; I think in many ways, something that is more like different things strapped together, with different components doing different tasks,
is probably going to be easier to manage, easier to sort of understand and test and validate and make sure that it does what you want. But the challenge is that over and over in AI, what we've seen is that if you try and build a specialized system and you put a lot of sort of special effort into designing a system that can do one particular thing,
And then you wait, you wait a year or two years or five years, then some other person is going to come along and build a system that's more generalized, maybe larger scale, and it's just more powerful overall and can just wipe the floor with your specialized carefully built system. And because that's a pattern that we've seen so many times, I tend to think that it will continue. And I also tend to think that the researchers are going to expect that and try to build in that direction as well. Interesting. Yeah.
You mentioned that often it's the small iterations, the small evolutions on top of existing abilities that make the difference. One of the iterations that we're seeing currently is the introduction of so-called reasoning models. My very inexpert, untechnical understanding of those is that they use the same basic approach. You mentioned reinforcement learning. It's still using reinforcement learning, but
to learn what we think of as reasoning skills, like breaking big complex questions down into more basic ones, sort of taking a step back and revising where it's up to with its thought processes and basically pausing and thinking rather than just spitting an answer out. Sometimes it might involve retracing its steps, for instance.
This feels pretty significant and it's being talked up as a big development and a big stage towards better AI. How significant do you see it as being? I think it's pretty significant. Yeah, I think it definitely should be the most recent addition to that list I was giving of those kind of small to medium size breakthroughs. I think it remains to be seen exactly where it will go. It's very interesting. So one thing that's interesting about it is I think sometimes...
in these conversations about scaling AI, people think about like just going bigger, but actually when you talk to researchers, what they are working desperately hard to do is find things where the returns to scale are good. So things that you can scale and they keep getting better,
Because not every algorithm, not every approach actually keeps giving you better results as you scale it up. And so something that's really exciting for the researchers about reasoning is that it gives you a couple of different places where you can scale things up and get better results. And that's very exciting for them because that means that they know that as their bigger computing clusters are coming online over the next year, three years, five years, and as they're more able to find new sources of data and synthesize new kinds of data, they can turn that scale dial and they will get the returns that they're looking for.
With reasoning, there's a couple of places that comes from. One that's been talked about quite a bit, and that people who have come across the reasoning models might have heard of, is that you can scale up the amount of time and computation that is used at the thinking stage, meaning when it's trying to solve a given problem. This also gets called inference time or test time. When it's at the point of solving a particular problem, you can use more compute either to just have it think for longer in one go,
or potentially sort of parallelize it: get it to take 16 stabs at the problem and then choose its favorite, or 128 stabs at the problem and choose its best answer. So that's one way where you can just keep dialing up how much computational power it has access to and potentially get better and better answers.
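A minimal sketch of that "take several stabs and keep the best one" pattern, often called best-of-N sampling, might look like the following; `generate_answer` and `score_answer` are hypothetical stand-ins for a model call and whatever grader or preference model is used to pick a favourite.

```python
def best_of_n(problem, generate_answer, score_answer, n=16):
    """Sample n candidate answers and return the highest-scoring one."""
    candidates = [generate_answer(problem) for _ in range(n)]  # in practice these calls run in parallel
    return max(candidates, key=score_answer)
```

Dialling n up from 16 to 128 is exactly the "turn the scale dial" move described above: more inference-time compute spent on one problem, in the hope of a better final answer.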
The one that's been a bit under-discussed, I think, is that it also lets you scale up at the training stage. I don't know how technical we want to go, but the way that these language models are trained, the big part you talked about, the predicting the next word, is what gets called the pre-training phase. That's sort of the initial training where they are shown just huge amounts of text and they're learning to predict what comes after what.
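For readers who want that pre-training objective spelled out, here is a minimal sketch of what "predicting the next word" means in practice: the raw text supplies its own labels, so no human annotation is needed. The function name is illustrative, not from any particular framework.

```python
def next_token_training_pairs(tokens, context_length=4):
    """Turn a token sequence into (context, next_token) training examples."""
    pairs = []
    for i in range(len(tokens) - 1):
        context = tuple(tokens[max(0, i - context_length + 1): i + 1])
        pairs.append((context, tokens[i + 1]))
    return pairs

# next_token_training_pairs(["the", "cat", "sat", "on", "the", "mat"]) yields
# (("the",), "cat"), (("the", "cat"), "sat"), and so on. A neural network is then
# trained, over enormous amounts of text, to predict the second element from the first.
```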
But there are other phases, what gets called post-training, which in the past has been where they've learned not to be incredibly toxic, not to provide dangerous advice, that kind of thing. And what's happening with the reasoning models is there's a new kind of post-training being done, which is where they learn to do this step-by-step thinking. And essentially how it works is they get given a problem, they get to take a stab at the problem, then it's automatically sort of graded, did they get the
problem right or how well did they do. And then the sort of so-called chains of thought where they did well and where they ended up at the correct answer, those get reinforced. So the model is sort of told: do more of that.
I think we'll also see researchers trying to scale up that training: how much of it can you do, how do you automatically generate more problems for them to try. A really interesting thing I'm keeping an eye on for the next year or two is that, when you're scaling up that training, you need to have lots of problems that the system can try to answer where you can automatically tell whether it did a good job, because you need to be able to tell: should we do more of that or less of that?
So far we've seen them being trained a lot on programming problems, on maths problems, because those are domains where it's possible both to automatically generate problems that are reasonable and to automatically tell whether the system is doing well or poorly. That's what lets you do it at huge scale, because if you have millions of problems, or a thousand problems where the AI is having 100 tries at each, you can't have a human going through and grading the answers. You can't go through and hand grade all of those.
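A minimal sketch of that loop, assuming a hypothetical `model` object with `sample_solution` and `reinforce` methods (not any particular lab's API), might look like this: auto-generate a problem whose answer can be checked by a program, let the model attempt it several times, and reinforce only the chains of thought that reached the right answer.

```python
import random

def make_arithmetic_problem():
    """Auto-generate a problem that is cheap to create and trivial to grade."""
    a, b = random.randint(1, 999), random.randint(1, 999)
    return f"What is {a} + {b}?", a + b

def reasoning_post_training_step(model, make_problem=make_arithmetic_problem, attempts=8):
    problem, correct_answer = make_problem()
    for _ in range(attempts):
        chain_of_thought, answer = model.sample_solution(problem)  # hypothetical model call
        reward = 1.0 if answer == correct_answer else 0.0          # automatic grading, no human needed
        model.reinforce(problem, chain_of_thought, reward)         # "do more of that" when reward is 1
```

Maths and code fit this mould because the grading step is automatic; the open question raised in the conversation is which other domains can be made to fit it.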
And so to me, a really big question for the next year or two is, or maybe two questions. One is, are there other domains where you can do that kind of automatic generation of problems and automatic grading of how well the system is doing?
And if not, how much will the performance generalize, as AI researchers would call it, meaning how much will it help you outside of just maths and programming? Because we're already seeing that it helps at least somewhat. So the behavior you described, of not just thinking sort of step by step, but also potentially backtracking, noticing when it's made an error, going back and saying, well, why don't I take another stab at this and see if it gets me a different answer?
That's something that's not just useful if you're solving a maths problem, that's also useful if you're solving lots of other kinds of problems. So we're already definitely seeing some kinds of generalization outside of the specific topic areas that they've been most heavily trained on.
But I think it's a big open question how far that goes. And I know researchers on both sides of that: some who think it will really not go very far beyond having very good maths and coding, and researchers who think it will generalize quite a lot. And so this comes back to the kind of jagged capabilities idea of them being very good at some things and less good at others. I think we're very, very likely to have AI systems that are really good at maths and programming and kind of adjacent things, like really very good, as good as top experts, if not better, in the next
one year, two years, conservatively three years. And maybe, if you talk to the people who are most sort of bullish about how fast progress will go, maybe that means they'll also be systems that are good at basically every other cognitive task that humans do, because maths and coding are some of the hardest ones. And so if you can do that well, you can do lots of things well. Or maybe it keeps being this sort of jagged, uneven skills profile where they're very, very good at that, and also still can't tell you how many times the lines are crossing, which a five-year-old could do.
It's a bit hard to know what exactly the sort of combination will be. And that will obviously, to your point about the confusing labor market effects it's going to have, it's very hard to tell what sorts of tasks and what sorts of jobs are going to be in danger from the kinds of capabilities that we see. I'd like to be able to tell my kids, study this, but don't study this. But I've just got absolutely no idea what kind of guidance to give them at this point. Me neither. It's very difficult. Yeah.
And let me say, as technical explanations go, that was a nice and accessible one. So thank you. It's a good point to calibrate ourselves at. I want to come to safety and risk in a moment. But what you've just been talking about is a good opportunity to bring in DeepSeek. Because, I mean, again, on my non-expert understanding, DeepSeek did make some advances, certainly nothing revolutionary, but they did a really good job of building a model that didn't require as much
training, but it requires more compute at the actual answering phase. You can tell me if I'm misunderstanding things, but it seems to be a sort of a demonstration of what you were talking about a moment ago, which was innovation on existing
approaches that has simply made a model that is just as good, cheaper, and therefore more accessible to a variety of users. I'd like your view on DeepSeek and how big a deal it was. What did we learn from DeepSeek and how do you see the, I suppose, the AI race between the US and China poised at the moment?
DeepSeek was a really fascinating thing to watch as someone who has followed the Chinese AI space for a long time and thought a lot about US-China competition and been tracking that. It was really interesting to see how it caught fire and how much attention it's suddenly garnered. So I would say DeepSeek is a big deal, but I don't think it's a... I don't know if you saw this, like, you know, front page of the BBC live update breaking news ticker going through sort of people's reactions to DeepSeek. You know, I think it was a Monday when it suddenly...
Everyone was looking at DeepSeek. Everyone was freaking out about DeepSeek. I don't think it was that big of a deal. I think it was a big deal, but there were other things that had happened. For example, just a few weeks earlier, OpenAI had released its most recent reasoning model, which was called O3. It's very confusing. It's the second generation of their reasoning model. The first one was O1. O2 is a big telecommunications company, so they skipped that, went straight to O3. And I think the performance change from O1 to O3 from OpenAI was at least as newsworthy and big of a deal as the DeepSeek
story, for example. But that's not to say that it wasn't, I think it mattered and was important. And yeah, what was interesting there, I think it was a really good illustration of this tricky dynamic we're seeing in AI development right now, which is two things are happening at once that seem like they contradict each other, but they actually don't. The two things are, on the one hand, there's this scaling we've been talking about, which is if you're trying to build the best models, if you're trying to build the most advanced models at any point in time,
the amount of money, the amount of computing power, the amount of energy, the amount of expertise you need to do that is going up over time. So it's getting less and less accessible. It's getting more and more difficult to build the more advanced, the most advanced systems in the world.
At the same time, every time we actually successfully build something from that point forward, it gets easier and easier and more and more accessible to rebuild that same thing. And so what DeepSeek was doing was continuing this trend we've seen where the model that they built or the models actually, I would argue, I think most people in the field think that the more impressive model they built was released in December, you know, a month or two before the big freak out.
That model was called DeepSeek-V3, and it was not a reasoning model. And then the thing they released in January was the reasoning version. That's probably more detail than you need, but essentially they were recreating things that had been built by US companies; depending on how you count, something like six to nine months is the relevant timeframe. And that was a lot of time for them to work out ways to do it more efficiently, ways to use less computational power. Definitely very impressive, a lot of the engineering that they put into it, I don't want to minimize that,
but it was sort of a story of fast following as opposed to a story of kind of innovation per se. And so I think what it shows, I don't think it shows, some people were interpreting it as sort of China taking the lead in the AI race. I don't think that's what it shows. I think what it does show is the real difficulty of,
preventing sort of diffusion of these capabilities. Any time we reach the ability to do something new with AI, any time we build a new capability, it's going to be very, very hard to sort of keep that contained and keep that to a limited number of actors, because it's just going to get so much easier to build over time. So DeepSeek is obviously still a very talented team with lots of resources. They had lots of the most advanced chips. So that's why they were able to do it within sort of six or nine months.
But then if we wait a year, if we wait two years, that's going to expand even further and become much more accessible, need less hardware, less expertise. Certainly if we wait five years, 10 years, it'll get very, very accessible, because that's just a repeated trend that we've seen across AI for at least the last 15 years. Okay. So is the US lead then-
I mean, is it real and is it even still meaningful if that diffusion is very, very hard? And obviously, I mean, we can go into, and I sort of don't want to go down the rabbit hole of export controls and, well, it's not a rabbit hole, but I suppose that's just possibly another conversation or the release of open source or open weights models as opposed to closed ones. But I mean, is the US lead still meaningful?
It depends on what lead we're talking about. So I think there's different conversations to be had about, are we talking about AI's economic impacts and sort of using AI to boost the economy? We talk about AI for the military, which I think is more of a kind of diffusion and applications and how do you make practical use of the systems question. If we're talking about what gets called these frontier models, which is what we've been talking about, right? That sort of general purpose, best in class, best in the world systems.
I think it really depends how you measure it, and we don't have good ways of measuring, which is the challenge. So we have these different benchmarks, these sort of test data sets that models get compared on, and it depends which ones you look at. So it depends. It's complicated. I think the pessimistic case would be to say, well, look,
DeepSeek seems to be something like six to nine months behind. You could say more like, you know, one to three months behind if you look at the reasoning model rather than the base model. And so that's where they're going to stay. And I think that's, you know, a smaller gap than it has been in the past, so the gap is shrinking, so they're catching up. I think the counterargument to that, which I feel pretty persuaded by, is coming back to this scale question of, you know, it takes more and more scale, more and more resources to build the most advanced systems.
What we've seen in the US is that actually this year is when a bunch of these companies are having very, very large computing clusters coming online that they have been planning for years and building for years, and they haven't had access to yet, or they're just starting to get access to. If you look at how much they can scale, companies like OpenAI, Anthropic, Google, xAI, Meta are at the
point where they are actually just getting access to this new generation of very, very large, you can basically think of it as like a supercomputer. It's a supercomputer that's designed for this kind of AI, as opposed to designed for the sort of traditional things we've used older generations of supercomputers for. And so I think the people whose
expertise on this I tend to trust most sort of think of it as DeepSeek coming in at the end of a cycle, where the US was just reaching the end of its generation, essentially, of the computing resources it had. And now they're coming into this new generation of much, much bigger computing clusters, which are going to be very difficult for Chinese companies to access because of the restrictions that have been placed on chips.
And so the question is, what will the US companies be able to build with that? Will they be able to keep progressing forward? Will they be able to keep making big progress? Because if so, it seems unlikely that the Chinese companies will be able to follow quite as quickly as they, certainly as DeepSeek did in this most recent iteration.
Of course, it's also possible that the US companies aren't able to sort of squeeze that much juice out of these new clusters. We'll see. And if we're sort of starting to plateau or it's not as useful, then maybe we do stick with this sort of smaller lead. But again, this comes back to with the reasoning models, what was exciting for researchers about that was this ability to scale that up quite a lot. And so I think-
The expectations in the industry are that if you take what we've seen so far from OpenAI or Google or Anthropic and you can throw these new supercomputers at it, we would expect to see quite a lot of progress, which will then be difficult for the Chinese to replicate. But the US lead, in other words, is
based on its hardware access, the amount of computing power that it can actually build rather than anything in the algorithmic magic that sort of sits behind the weird bit that most of us don't understand. I mean, you're actually sort of telling, you're opening my eyes to something here that I probably hadn't fully appreciated. I mean, do you see that as the case? It really is based on the hardware? I think the hardware is a really important component. I do think that the expertise and the tacit knowledge inside these companies is important as well. I think there's a lot of tips and tricks
a lot of hard to convey details that matter for how they build these systems.
But China has made a lot of progress on catching up there. And certainly, the more that the US becomes a hostile environment for people with Chinese nationality, the more that's going to incentivize people who have that experience in the US labs to go back home and take with them things that they've learned. And in the DeepSeek case, it seems like maybe that wasn't even that necessary, because they pride themselves, at least in the most recent interview I heard from the CEO, on having no returnees from international study or international companies.
Yes, I think in the past I would have said that the talent was a huge factor for US companies, but it looks like that is maybe less decisive now than it has been in the past. And so the hardware is really a big potential edge. Let's just touch for a moment then on
the importance of a democratic country like the United States maintaining its lead in AI. I mean, people like me, who are obviously very attached to the values of democracy, sort of instinctively feel that it's just important that the US stays in front. That's just a sort of foundational principle, an
instinct that I have. But what's the tangible material value of it? I mean, does the leading country actually get to sort of shape the international environment in terms of standard setting and all these sorts of things, especially if they're only a few months ahead, as you're suggesting might be one of the possibilities?
A powerful authoritarian country like China is going to use AI for what it uses AI for, and some of that might involve things like, you know, better facial recognition, better social control, the sorts of things that we're not comfortable with. But at the end of the day, what is the value of the US actually staying in front? You know, other than just, you know, being better at it and being able to sort of enjoy all of the flow-on benefits of being ahead. I think there are different possibilities that depend on how you think things will go with AI.
One that I'm really not sold on, but I'll say because I think it is in a lot of people's minds is this idea of sort of setting standards or kind of showing the way, setting norms, wanting to have, for example, if you're gonna have an open source model being used all around the world, you want that to be from Meta or from another American company because of the values that are embedded in it. I have a hard time seeing how that matters so much. And I wonder if, I'm based in Washington, DC, I wonder if in the sort of US national security conversations that I'm in a lot,
People are a little bit over fixated on the lessons from 5G and Huawei, where there really was, you know, standard setting really was a very big deal there because you're trying to have one global interoperable telecommunication standard. And so, you know, actually writing like very technical specific protocols that get locked in and then are, you know, used all around the world. And I don't think there's a comparable thing for AI.
So that's an argument that I know some people are very concerned about that I personally don't find as compelling. So then I think it comes down to two possibilities for why this could matter. One that is, I think, pretty understandable and pretty consistent with what we've seen in the past a lot is just a general, you know,
state power, national power argument of, in the past we've seen a transition from, it used to be that your national power depended a lot on the size of your army that you could field. How many young men did you have and how much could you feed them and that kind of thing. And of course, we went through a big transition where that national power became much more about industrial strength and how much could you manufacture and how many ships could you build, how quickly and that kind of thing.
And that's also not just military, right? Also at an economic level, you know, how much of a high-functioning, highly productive economy do you have, and the economic strength that that gives you. And I tend to think that AI will, you know, be another wave of that sort of advancement, where countries that are using more advanced AI systems and have those diffused throughout their economy, throughout their military, that will be a really big determinant of sort of overall national strength and national power.
I agree with you though that that argument doesn't really seem to get you very far if we're talking about a gap of six to nine months, even if we're talking about a gap of one or two years, it seems like probably the question of how AI is being applied, how it's being used in different ways in practice is probably going to matter more than who has the most cutting edge thing in the lab.
Then a third way that it could matter, which I go back and forth on how plausible I find, is this idea of do you reach a certain point in AI development where if you have an advantage, that advantage
compounds and gets bigger and bigger and bigger because, for example, maybe you're able to use your AI systems to build the next generation of AI systems, and they're better, and then they're able to build the generation after that. This sometimes gets called recursive self-improvement or just recursive development. I think if that is a possibility, if that is what we're looking at, then that does potentially mean that even a gap of a few months could be really important.
But I think you have to have a pretty specific set of assumptions to think that that is how development will go. And I worry sometimes that strategic decisions about AI, strategic thinking about AI, is assuming in the background that we'll reach that point, that it is a clear point that we need to be racing towards, without actually trying to unpick:
Is that true? And how would we tell in advance? And I worry about that because I think that idea could be quite destabilizing, especially if it's unnecessarily destabilizing, if it's not right. It could be really quite destabilizing internationally if countries think that they are in a race to who gets the most powerful AI forever, and that country or person or small group is going to get to determine how the future goes in the long term,
have a permanent monopoly on power. If you were China and you thought the US was doing that, you would obviously be reacting. But also if you were Russia, if you were North Korea, if you were Iran, if you were all kinds of countries. It would make sense to launch an airstrike on a data center, for instance, or something that serious. I mean, at a minimum, it would make sense to use non-kinetic means, so to use sabotage of some kind, a cyber attack.
There's actually an interesting paper recently by an interesting trio. It was the guy who runs the Center for AI Safety, who has a background in machine learning; Eric Schmidt, who's the former executive chairman of Google and has done a lot of work with the defense industrial base in the US; and Alexandr Wang, who runs Scale AI, which is a big data labeling company. They wrote a whole paper about these sorts of stabilizing and destabilizing dynamics.
But I think even that paper didn't really, I think, unpick this or try to unpick this question of, do you actually get that compounding advantage? Is there actually sort of a point of no return beyond which whoever happens to be in the lead at that point is sort of then permanently at a huge advantage? So I hope that we can kind of dig into that a little bit more in the coming years rather than just taking it as a given. Yeah.
Well, it's a good reading tip. I'll look up that paper. I mean, it's a fascinating topic, this idea that there could actually be a kind of a takeoff point where it recursively self-improves so quickly that whoever has it at that point will become so powerful that they could effectively smother everything and everyone else working in this area on the planet. And in that case, you could be three days ahead. And if the takeoff is that fast, then it could make all the difference.
It's sci-fi kind of stuff, but that actually segues quite well into the, I suppose, the third act of our conversation here, which is about
risk and safety. And I think this is an area that I find honestly fascinating. It often gets broken down, and there tends to be a bit of back and forth argument, sometimes not very productive, about: are we wasting our time by worrying about those sorts of existential-level risks? And it feels as if the big risk, of AI becoming so powerful that we lose control of it and in some way that has detrimental effects on life on earth, on humankind...
That conversation feels as if it's sort of shrunk a little bit and it's being pushed into the background. Even Bletchley Park, which wasn't that long ago, the big AI safety conference there, tackled those issues. But we haven't had
anything quite like Bletchley Park since then. On the other side, there are the people who argue that, well, it's actually the near-term risks that we need to worry about: disinformation, the ability to do things like bioengineer weapons, that kind of thing. I mean, for my money, disinformation is bad, but it won't make the human race extinct. So I think we need to worry about both. I mean, where do you sit on that spectrum, I suppose, between the near-term and the less likely, but more catastrophic, longer-term risks?
I think we need to be able to handle many different types of problems at once. Like you said, I think that this idea, which definitely has been prevalent of one set of problems is a distraction from another set of problems, I think is really not helpful and is counterproductive. I think there are clear observable problems being caused by AI systems now. We should be looking at those. We should be trying to prevent them. We should be trying to manage them. We should be taking them seriously.
At the same time, I think we should also be taking seriously the fact that some of the best capitalized companies in the world, with hundreds of billions of dollars to spend, with research teams that consist of some of the very best thinkers in the world, that they are racing ahead as fast as they can to build AI systems that they think will be more capable, smarter, more powerful than humans.
And they will tell you if you ask those researchers that they don't know how to control those systems if we build them and they don't know how it will go. So I think we need to kind of, yeah, be able to walk and chew gum at the same time and say, okay, there's stuff for us to deal with now. And also what those people are trying to build and telling us that they're planning to build
sounds like maybe we should be getting ready for that and trying to think about what we would need for that to actually go well. Because again, 10 or 15 years ago this was sort of seen as a bit of a sci-fi kind of concern, but now you have really, like, Nobel Prize-winning computer scientists, you have the top researchers at many of these companies that are building this stuff, who will tell you, oh, I don't know if this goes well for humanity, I'm really not sure, I don't think we have the technical means to be confident one way or another. So I think we need to be
trying to prepare for a wide range of potential scenarios, including potentially very, very dangerous and concerning scenarios. Yeah. I mean, whether it's Geoff Hinton talking, he has a great remark along the lines that history does not have a lot of examples where a less intelligent
entity was able to control a more intelligent entity. I mean, put another way, sheep don't have humans running around in paddocks, for instance, and funnel us down wooden things towards a gigantic chopping machine. I mean, the exception, of course, is cats and dogs, which do have us running around to try and make their lives as good as they can. But yeah, I do think that this gets at an important point, which is
I don't think the concern here is sort of Terminator robots that are seeking revenge and coming to kill us because they want vengeance. I think the concern is much more like if you think about how humans make plans and affect the world, we don't think very much about ants or beetles when we're deciding where to put the next skyscraper or what forest to clear for agricultural land. It just doesn't come into our calculus. And so I think it's a question of
how are AI systems affecting what is going on in the world and what constraints are they under and what are they optimizing for? Because if they're just, I think there's concern that there might be sort of natural pressures towards AI systems optimizing for, for example, a planet that's covered in solar panels and data centers, because that's a great
planet from the perspective of the AI systems. It doesn't actually need them to have any ill will or any emotions whatsoever for that to be not great for humans. Absolutely. And it's conceivable that some basic human rights previously considered untouchable
become more up for grabs, I suppose, in that kind of scenario, even with the complicity of some human beings, say, in leadership positions. It's a fascinating discussion. So are we having the right conversations about this now? And what have you seen as the trajectory of the safety and risk conversation over the past year?
I mean, obviously over the past couple of months, but I suppose even before that, certainly since the Trump administration came to office, we've seen a real diminution of talk of things like safety. I don't even hear AI safety used as an expression anymore. It seems to have been rebadged as
national security, which sort of partially brings some of the safety questions in, but looks at them from a different lens, which is more around geopolitical competition and saying that if the US stays in front, then AI will by definition be safer. J.D. Vance's speech in Paris obviously really was quite contemptuous of ideas like safety. He talked about how hand-wringing won't help us.
Do you think the safety conversation has been muffled now? And do you see any way of sort of reestablishing it as a priority? I'm not sure. We will see. It does seem like under the Trump administration, this idea of safety has been sort of thrown in the same bucket as DEI, which is obviously anathema.
And it's very ironic to me to see that being treated as the same thing because, you know, having been in this space for a number of years at this point, I remember when sort of AI safety was seen as in opposition to or an enemy of sort of more left progressive perspectives on AI. You know, maybe that branding is gone forever, who knows? But I think there's sort of
just more pragmatic concerns that have to resurface. So there's a great paper I love from a researcher called Deb Raji called The Fallacy of AI Functionality. And basically what she's saying is, we have all these conversations about AI ethics, AI responsibility, AI trustworthiness, AI bias. But actually, if you look at a lot of the systems in question, they just don't work very well. Like, it's not a matter of someone didn't do the right ethics training; it's a matter of the system just isn't very good at predicting the kind of thing that it's supposed to predict.
And that generalizes to a lot of reliability problems that we have with AI, issues with robustness. We have challenges around sort of understanding how they work on the inside. Just lots of things that mean that they don't necessarily work the way that we want them to or the way that we would like them to. That's not a sort of quote unquote safety question. That's not a quote unquote ethics question. That's just a matter of does it work?
Likewise, if we do have some of the top scientists in the field saying, actually, this stuff might be quite dangerous and quite bad for civilization, that's not a question of ideology or hand-wringing. That's a question of, are we building technology that is going to help us or not?
So I think the conversations that treat any kind of regulation, any kind of thinking about safety or effectiveness or trustworthiness, as being fundamentally in opposition to progress, to technological advancement, are very mistaken. I think often, certainly, badly designed regulation can be a damper on innovation, can be a very bad thing,
but the idea that we want our systems to work well and we want them to be promoting things that we care about should, I think, be pretty uncontroversial. And then the question is, are we on track to do that? Or are the sort of unbridled competition dynamics that we're seeing at play maybe not going to take us down the best path? And if so, how do we try to steer that a bit better? Yeah. The conflation of safety and
ideology. I have found that fascinating as well. I heard a podcast with Marc Andreessen a couple of months ago where, I mean, Andreessen's a brilliant guy in a lot of ways, but he was talking about two of the so-called godfathers of machine learning, I assume referring to Yoshua Bengio and Geoffrey Hinton, as being, you know, sort of radical leftists.
The third, Yann LeCun, I assume, who he left out of that pile, obviously, is the guy who's saying, look, chill out with AI safety, it's not going to kill us all. But, I mean, I don't know what any of their personal politics are. For all I know, Yann LeCun votes, you know, Democrat far more than the other two do. I mean, it was just a bizarre sort of conflation of, you know, radical left-wing ideology with safety. I just could not see the connection between the two things. So I completely hear what you're saying there.
I'm going to get in trouble for going over time here, but I do want to finish off with the question of what happens if all of this works out just right and we solve all of the world's problems and we can all sort of pursue our lives in material abundance as we hope. Do you think much about
what that actually means for us. We talked about giving guidance to our children earlier and, you know, what on earth do you tell them to study now? Because, you know, what on earth is going to be useful to them by the time they grow up? Do you worry at all about what it means for, you know, a sort of human identity, human sense of meaning?
I think there's two ways of looking at this. One is about transition and one is about sort of where you end up after a transition. I certainly think that transitioning to a world, if we transition to a world where AI is doing all kinds of tasks that we've traditionally thought of as being things only humans can do, that would be very bumpy. That could
you know, really rob people of meaning if they've sort of grown up with one conception of what life is supposed to look like and then they're suddenly thrust into this new world. But longer term, I certainly feel optimistic. I think there's lots of really wonderful ways the future could look in terms of, I think, you know, we can find meaning in
People always talk about art and music and these high-minded things, but I think even just things like building things or playing sports, watching sports, or working in your community, building really strong relationships. I think there's creating things yourself, creating things with other people. I think there's so many possibilities for how to lead wonderful, meaningful lives.
If we can get, you know, not just the safety right, but also the political economy questions right: who has access to resources, do we still have democracies, is democracy even meaningful, what is the geopolitical situation doing? I definitely feel like if we can get all of those tricky questions right, and we find ourselves living in a place of material abundance, and we have a little bit of time, you know, a generation or two, to adapt to what that means,
then I think that could be absolutely wonderful for sure. What exactly it'll look like, I have no idea, but I definitely hope that it's something that we or our kids or maybe their kids do get to experience.
You've summed it up well. And yeah, I mean, it would be tragic if we get everything we've ever wanted handed to us on a platter and we manage to turn it into a terrible thing and destroy ourselves in the process. So it would be the greatest gift never taken. Helen, thank you so much for talking with us. It's been a really fascinating conversation. Hopefully we can have you back sometime soon, but otherwise, thanks and all the best. Thanks very much. It's been a pleasure. Thanks for listening, folks. We'll be back next week. Stay tuned.