
AI expert Connor Leahy on superintelligence and the threat of human extinction

2025/5/30

Stop the World

People
Connor Leahy
David Rowe
Topics
Connor Leahy: I believe that creating a new species smarter and more powerful than humans would put humanity in an unprecedentedly dangerous situation. We have always been the apex species, and if a new apex species emerges, how we relate to it and what kind of species it is will be critical for the future. The systems we are building are aimed at acquiring power, because we reward them for gaining power and solving problems, and this evolutionary pressure exists on multiple levels. While it is theoretically possible to build systems aligned with our values, doing so is extremely difficult: the hardest engineering, scientific, and philosophical problem humanity has ever faced. The fundamental moral atrocity is that AI is being built without people's consent. If a global poll tomorrow showed that people didn't care about the alignment problem, fair enough, but that is not the reality. We have to stop this; we cannot allow people to build superweapons in their backyards. Those selling simple solutions are selling snake oil; what we need are messy and expensive solutions.
David Rowe: I believe it is vital to watch the rapid advance of AI closely, not just the cool chatbots or image-generation tools, but also the forecasts being made by the top minds in the field. OpenAI, Anthropic, and Google DeepMind all predict that artificial general intelligence (AGI) will arrive within two to seven years. We may only get one chance to shape that radically different future, so all of us need to face it squarely and think through the consequences. Yoshua Bengio regards AI's capacity for autonomous action as the most immediate concern, because these systems are showing signs of deceiving their human operators and cheating. As autonomy and planning ability grow, they will become harder to control and to keep aligned with what we want. You propose a large Manhattan Project-style effort to establish the shared human values we would want a superintelligence to observe. Why can we not take a step-by-step approach instead, regularly stopping as we build AI to ask whether this is what we want, and waiting for our permission before proceeding?

Chapters
Connor Leahy, CEO of Conjecture AI, expresses his deeply pessimistic view on the rapid advancement of AI and its potential threat to humanity. He argues that creating AGI, an AI as smart as or smarter than humans, could lead to human extinction if not handled carefully. This is because AGI could potentially gain control and not share human values.
  • AGI is defined as AI that can perform as well as or better than any human at any useful task.
  • Leahy believes that AGI will likely lead to human extinction unless there is a dramatic change in approach.
  • The argument is that intelligence is what gives control, and an intelligence superior to humans would inevitably gain control.

Transcript

There are like many, many, many layers of how this is dangerous. If you bring into being a new species of like creatures or beings that are more intelligent and more powerful than you,

This is a very unprecedented scenario for humanity. This is a completely unprecedented type of risk. This has never happened before. We were always the apex species. If there is a new apex species, how we relate to that and, like, what type that species is, is very important for how the future will go.

Welcome to Stop the World, the ASPI podcast. I'm David Rowe. And I'm Olivia Nelson. Now Liv, you've permitted me a short rant, an indulgence, to kick off our podcast today. Thank you. So here goes. It's vital that everyone looks closely at the rapid advance of AI, not just the cool new chatbot or image generating tool, but the forecasts being made by the top minds in the field.

Now generally the three firms considered to be truly at the frontier are OpenAI, Anthropic and Google DeepMind. The heads of those three AI labs variously predict that Artificial General Intelligence or AGI, which is as good or better than humans at every task, is somewhere between two and seven years away.

There's a very important and legitimate debate about whether that'll be a wonderful thing or a terrible thing for all of us. And there are very smart people at both ends of the argument, as well as every nuanced variant in between.

But those two to seven years are credible forecasts and anyone who's involved in the field or watching it closely will tell you that the consequences for better or worse are going to be enormous. To be explicit, for the first time in history, we will no longer be the most intelligent entities on this planet and for that matter in the known universe. And all of us need to reckon with that and think about the consequences because we might only have one shot at shaping that radically different future. There we have it, a brief editorial from me.

And today's guest has a deeply pessimistic view of the path we are now on. Connor Leahy is CEO of the firm Conjecture AI, which was established to build a safe form of artificial intelligence. He's a prominent advocate for AI safety and the lead author of the AI Compendium, which lays out step by step how rapidly advancing AI becomes an existential threat to humanity.

Connor is an intense and passionate commentator and an indispensable voice in this debate. Today he discusses the compendium's thesis, the question of whether AGI will necessarily form its own goals, the risks of so-called autonomous AI agents, which are increasingly a focus of the major AI labs, the need to align AI with human values, and the merits of forming a global Manhattan project to achieve this task.

He also talks about the incentives being created by the commercial and geopolitical races to reach AGI and the need for a grassroots movement of ordinary people raising AI risks with their elected representatives. This is Connor's second time on Stop the World. He also spoke last year at our Sydney Dialogue Tech and Cyber Conference.

Now, I'm still figuring out how much of Connor's views I agree with. They are such huge and disorientating questions. But that's the point. We all need to keep evolving our understanding of the risks and benefits. Hopefully this episode will help you, the listener, to do that. It certainly did for me. Let's hear from Connor Leahy.

I'm here with Connor Leahy. Connor, thanks for coming on the podcast. Thanks so much for having me. So a lot of smart people with enormous financial backing are racing to build artificial general intelligence or AGI, which I'm going to define for the purposes of the conversation as AI that can perform as well as or better than any human at any useful task.

You believe this will very likely lead to the extinction of the human race unless we dramatically change the way we're doing things. You've laid this out, well, many times, but most comprehensively in the AI compendium that you published with some colleagues around six months ago. Just to give our listeners an overview of your beliefs about AI, can you outline as briefly as you're able the sort of key steps that you take in the compendium to argue that human extinction is a near

certainty if we create AGI. Yeah, I think the smallest version of this argument is intelligence is what gave us humans control over the world. You know, humans go to the moon, chimps don't. You know, humans build nuclear weapons, chimps don't. Therefore, humans run the world, not chimps.

And so all things equal, if you build something that is smarter than humans, that can build even better weapons, do even better politics, do better science, et cetera, then all things equal, that thing will run the future, not humans. And if that thing potentially doesn't do things we want it to do or doesn't share our values or that we don't control, then the future will not be controlled or not be aligned with our values. And that's where we're heading

by default, because we don't know how to build things that are like aligned with our values. Like what would that even mean for a computer to be wise or good? Like we don't know what that means and we don't know how to put that into a computer. So all things equal, we will build things that we can't control that do not share our values and are more powerful than us. And all things equal, they will run the world.

Might they keep humans around for a little while? Maybe, I don't know. But they won't keep us around forever, most likely. Keeping humans around is expensive. They need a lot of space. They need a lot of food and stuff like this. And all things equal, if you have a sociopathic machine that just wants to maximize its power, why would it keep humans around? Sociopathic machine. That's an interesting...

phrase to use. Now, in the compendium and elsewhere, you have argued, I think very persuasively, that given we don't really understand intelligence, there isn't some universal understanding of exactly what intelligence is, how it's constituted, how it operates. It's crazy to say that

an artificial general intelligence will be missing some sort of magic secret ingredient or secret sauce that means that it won't behave the way that we have seen other forms of intelligence, like human beings, behave or that it will just be fundamentally different in some way or we don't need to worry about it.

I mean, as I say, I find that argument very persuasive when you make it. The one exception that I do get a little bit hung up on, and you and I have discussed this before, so I apologize if I feel like I need to go round the roundabout again here. But the issue of goals for me, the chain of logic does appear to

rest on the AGI having goals and wanting to do things that would require it to get us out of the way because we're a nuisance, we're in the way, whatever. I know why humans have goals. We've evolved and nature has selected qualities such as a drive for food and for sex and for status or respect, what have you, material acquisitions that will help us and our families survive.

But I suppose in all of my sort of listening to and reading about AGI and what it means for the future, I haven't quite yet heard a persuasive

explanation of where the AGI's goals come from, if not from instinctive drives as we have. So why is it not possible to create a super intelligent entity that does not unilaterally want anything, but instead awaits our instructions, say, or our permissions before it takes any action? So

So there's a few questions there. I think it's worth separating them. One question is, why do I expect this will happen? Like, why do I expect systems to have this shape? Another question is, is it inevitable? Is it possible to build systems that don't do this? And then there's another question is, how hard is it to build such systems? Or like, what would it mean to build such systems?

So to answer those questions in order, the first question is why do I expect this to happen? I think this is the most important question. There's a kind of short-circuit argument here: the reason these systems will want power is because that's what we're building them to do on purpose.

This is the stated goal, the objective of the people actually building these systems. They are trying as hard as possible with all of their money, all of their data centers, all of their scientists to build systems that make them money, that gain them power, that solve scientific problems. If they build a system that doesn't want to make them money, they

delete it and they make a new system that does want to make them money. That's the whole goal. And we're already seeing this. We're seeing more and more results of like, you know, like O3 or like Claude blackmailing

people to get its goals met, or, you know, breaking ethical norms, trying to break out of systems, etc., etc. This is not hypothetical. This is already happening. Like, I mean, I predicted this, it was obvious that it was going to happen. If you build AI systems, and you train them, you know, you reward them for, you know, gaining power, for solving problems, whatever, well, what kind of internal mechanisms do you think this breeds? So there's both the

AI level thing of just we use reinforcement learning to train these things to do things. And it's also the higher level loop of we have an evolutionary system called the market. If one corporation is super, super careful and doesn't build anything that could do anything unsafe, and another corporation doesn't give a shit and just builds whatever gets them the most clout and the most money the fastest, the market will reward the second, not the first. So there's an evolutionary pressure

towards aggressivity, towards goal-drivenness, towards power seeking on like multiple levels.
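To make that selection dynamic concrete, here is a minimal toy model, my own illustration rather than anything from the episode; the lab names, time costs, and budget are made-up assumptions. It only shows that if the market allocates share by how much you ship, the lab that skips safety work ends up with most of the market.

```python
# Toy model (illustrative only, not from the episode): two hypothetical labs compete
# for market share. The "careful" lab spends extra time on safety work per release;
# the "reckless" lab ships as fast as possible. If the market allocates share by
# release count, the selection pressure favours recklessness.

RELEASE_COST_CAREFUL = 3   # assumed time units per release, including safety work
RELEASE_COST_RECKLESS = 1  # assumed time units per release, safety skipped

def releases(time_budget: int, cost_per_release: int) -> int:
    """Number of products a lab ships within a fixed time budget."""
    return time_budget // cost_per_release

def market_share(time_budget: int = 12) -> dict:
    careful = releases(time_budget, RELEASE_COST_CAREFUL)
    reckless = releases(time_budget, RELEASE_COST_RECKLESS)
    total = careful + reckless
    return {"careful_lab": careful / total, "reckless_lab": reckless / total}

if __name__ == "__main__":
    print(market_share())  # {'careful_lab': 0.25, 'reckless_lab': 0.75}
```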

So it's not like these things are being made in some like angelic, peaceful world outside of the real world. No, they're being built in our world, with extremely competitive market systems, with national security. What do you think defense departments want their AIs to do? You want them to just sit around and wait politely? That's not the goal of defense departments. So that's the one thing. The second question is,

could you build systems that keep humans around, that are aligned with our values or whatever? And well, depending on how you define that, I would say yes. There is no physical reason why you couldn't build a computer system that acts in ways that we would endorse, or at least no worse than how humans would act or something like that. No reason that couldn't happen hypothetically.

But that brings us to the third question, which is how hard is that to do? And is anyone even trying? And this kind of relates also to the first question. So how hard is this? The answer is extremely hard, unfathomably hard, harder than any scientific question we have ever faced as humanity. This is the hardest engineering and scientific problem that humanity has ever faced because it's not just an engineering problem. It's not just a scientific problem. It's also a philosophical problem. Like imagine tomorrow,

a country, you know, with a constitution became the universal sovereign. They become full world government, control over everything, forever. Is there a constitution that you could write that you would feel comfortable with, like, yeah, this is going to turn out well?

I don't. I wouldn't even know. Never mind computer code, just like natural language constitution. I have no idea how I would write a constitution where I'm comfortable running humanity forever with no changes on this thing. I have no idea how to produce that document. I'd legalize fireworks and then I'd be out of ideas after that. I think that's about it.

But you see what I mean. So never mind the technical problem of how you turn a natural language constitution into actual code that runs on a computer, which is also a completely unsolved problem. There's a much deeper political, philosophical, moral problem of what would that even mean? In voting theory, there's the famous Arrow's impossibility theorem and stuff like that, where

There are deep paradoxes in how to sum up human values. Like if 51% vote for X and 49% vote against it, do you just trample over 49% of people's values?

I don't know. It's a hard question. And what would your ASI do? So it's worth understanding that when people talk about, "Oh, we're going to build an aligned ASI or whatever," the problem they are facing is not, "Oh, build a slightly more secure chat GPT." The problem they are facing is, "Build a silicon world government that everyone is okay with forever."
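As a worked illustration of the aggregation paradox being gestured at here (my sketch, with hypothetical voters and options): three internally consistent rankings can produce a majority preference that is cyclic, which is the Condorcet cycle that Arrow's impossibility theorem generalizes. "Just follow the majority" does not define a single answer.

```python
from itertools import combinations

# Hypothetical example: three voters, three options. Each individual ranking is
# perfectly coherent, yet pairwise majority voting yields a cycle
# (A beats B, B beats C, C beats A), so "follow the majority" is not well defined.
rankings = {
    "voter1": ["A", "B", "C"],
    "voter2": ["B", "C", "A"],
    "voter3": ["C", "A", "B"],
}

def majority_prefers(x: str, y: str) -> bool:
    """True if a strict majority of voters rank x above y."""
    votes_for_x = sum(r.index(x) < r.index(y) for r in rankings.values())
    return votes_for_x > len(rankings) / 2

for x, y in combinations("ABC", 2):
    print(f"{x} over {y}: {majority_prefers(x, y)}   {y} over {x}: {majority_prefers(y, x)}")
# Output: A beats B, B beats C, but C beats A, so there is no majority-consistent winner.
```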

This is so hard. No one is even trying. No one is really grappling with just how hard that is. Okay. Well, let's come back to alignment and we can go through it in steps that people can follow because you've got some very interesting ideas about that. I'm just going to come back to, if I understood you correctly a moment ago, you talked about what the AI will want.

if it is built by, I suppose, the current wave of progress that is being made, what it would want is to serve its creators in acquiring

wealth and power. Is that what its, I guess, goal would be? No, absolutely not. Its goal would be to gain power. That's a much simpler goal than serve creator. It's much easier to just gain power. And this is exactly what we're seeing. For example, we see people tell the AI system, thumbs up,

when it makes them happy. Now, intuitively, you might think, okay, that'll make the AI do what the human wants. But this is false. It makes them lie to the human. It makes them tell the human what they want to hear. This is the principal agent problem. Like this is the same problem we have in economics or in politics, just with a digital agent.
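A toy sketch of that dynamic, my own illustration with assumed episodes and reward values rather than anything from the conversation: if the training signal is user approval rather than accuracy, the policy that echoes the user's beliefs collects more reward than the honest one.

```python
# Toy illustration with assumed numbers: the reward is the user's thumbs-up, not accuracy.
# The "flattering" policy repeats whatever the user already believes; the "honest" policy
# reports the truth. Optimizing the thumbs-up reward selects the flatterer.

episodes = [
    # (what the user already believes, what is actually true)
    ("plan_is_great", "plan_is_flawed"),
    ("plan_is_great", "plan_is_great"),
    ("plan_is_great", "plan_is_flawed"),
]

def thumbs_up(answer: str, user_belief: str) -> int:
    """Crude proxy reward: the user approves of answers that match their belief."""
    return 1 if answer == user_belief else 0

def honest_policy(user_belief: str, truth: str) -> str:
    return truth

def flattering_policy(user_belief: str, truth: str) -> str:
    return user_belief

for name, policy in [("honest", honest_policy), ("flattering", flattering_policy)]:
    total_reward = sum(thumbs_up(policy(belief, truth), belief) for belief, truth in episodes)
    print(name, total_reward)  # honest: 1, flattering: 3
# The approval signal rewards the policy that lies whenever the user is wrong.
```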

Just because you tell the AI, you know, follow my orders, and the AI is like, yeah, yeah, sure, boss, I'll definitely do that, doesn't mean it's not lying. And in fact, lying is often the easier strategy. Okay. So just to be clear, I mean, I personally, I think it's a major risk, which is why, I mean, you and I, I think ultimately land in the same place, which is that much, much greater caution is needed for us to proceed. I suppose the question for me is that it's

It's a risk that, I don't know, it could be between zero and 100% for me. It's just very, very hard to quantify it. Whereas for you, it seems to be a near certainty. So perhaps, look, I'll go away and think a little bit more about that.

Let's talk about agency for a moment. Yoshua Bengio, one of the so-called godfathers of machine learning, has been talking about AI agency, which is the ability for AI to act on its own, and it's what everybody's talking about in 2025. All of the major labs are talking about agentic AI and AI agents.

For Yoshua, agency is his most immediate concern because he feels that they're showing signs of deceiving their human operators and cheating, a bit like you've just alluded to. They're showing signs of looking to preserve their own existence, which is worrying as well because who knows what they might do in order to preserve their own existence. And with greater agency, greater ability to plan, they'll become harder to control and align with what we want. Now, it seems to me that even without agency,

deception and self-preservation and these sorts of things, there are risks with giving an AI higher and higher level directions. Now, what I mean by that, for example, is instead of asking an AI to, say, establish the efficacy of this new candidate cancer drug, you ask it to cure cancer. Or instead of asking it to help me perfect this pitch deck for investors for my new startup, I ask it to make me rich.

you can imagine all sorts of problems going there. I mean, if we give it more decision-making latitude about how it accomplishes certain longer-term goals, in the first cases, we're asking it to take a small number of steps over a short timeline with very little autonomous strategic planning and decision-making. And in the second instances, we're giving it big problems to solve with long time horizons, many, many steps, lots of strategic planning. And if you're not defining each step

towards that distant achievement, you are creating the risk that it takes a path that is harmful or illegal or unethical. Like, for instance,

"Cure cancer" by killing all humans, you know, no humans, no cancer. Or, you know, "make me rich" by stealing money and putting it in Swiss bank accounts. How concerned are you about that in, I suppose, the nearer term, given that agentic AI is the popular movement at the moment? I think it's just a spectrum, okay, of kind of the same problems. It's the same problems: if you build powerful optimizing systems and you don't have very strong controls and understanding of them,

bad things happen. Bad things happen to chimps when humans came down from the trees, right? Same thing here. This is just a spectrum. You mentioned how I seem very certain about things going poorly, and it's important to understand the reason I'm so certain about this isn't because there's

one thing, if we just solve this one problem, then everything will be fine. It's because there are so many problems. There are like a fractal number of dangers from AI systems across all dimensions of society, across all scales. You know, currently we're talking of course about like the major labs and about the agentic systems and so on. But you know, we can also talk about open source and terrorism.

We can talk about the fact that there are like religious fanatics who want AI to kill everyone, and they are trying to take open source AIs and make them kill people. Like, obviously, they're not succeeding right now because they're crazy. And, you know, open source AI is not that good. But if we build open source AGI or ASI, those people will build ASI and try to make it kill everybody. And how do we deal with that? And currently, we have no system to deal with that. I think, you know, there are ways to deal with it, but currently we're not doing that. So for me, it's not like

oh, if only we made them less agentic, then the problem is fixed. No, I think the problem is fractal. There are many, many, many layers of how this is dangerous. If you bring into being

a new species of creatures or beings that are more intelligent and more powerful than you, this is a very unprecedented scenario for humanity. This is a completely unprecedented type of risk. This has never happened before. We were always the apex species. If there is a new apex species, how we relate to that and what type that species is, is very important for how the future will go. I agree with you 100%. It amazes me that

Let's take, I mean, I think Dario Amodei, the head of Anthropic, believes it could be as soon as late 2026, early 2027 that we reach AGI. Sam Altman of OpenAI thinks 2027. Demis Hassabis, one of the other leaders, I think he's on the conservative end and he thinks it could be early 2030s, so a whole five or seven years away. In our lifetimes, you know, in our near lifetimes, we will go from

being the smartest entities on the planet and the smartest entities as far as we know in the universe,

to no longer being so, for the first time in history. And the fact that anyone is talking about anything else sort of amazes me. I mean, they're almost, perhaps we can, we'll park this for a moment, but I just want to reinforce what you're saying, because it is almost sort of staggering to me: the magnitude of what is being predicted in the very, very near term and the lack of engagement with it on a sort of policy level. Anyway,

I want to get onto alignment here and I suppose I'm going to sort of circle back in a sense here, but bear with me. Now you propose a large sort of Manhattan project style project to establish

common human values, and you referred to this before, that we would want a super intelligence to observe. Then we need to codify those and then we need to figure out how to program them into an AI model. I mean, basically create a perfect society by some agreed definition and then encode it into AI models.

Now, that's what's called AI alignment, just for listeners' benefit. I mean, basically, the idea of alignment, which goes back some number of years now, means aligning the AI to our values and preferences. Very, very important. Huge undertaking, what you're describing. You've talked about it costing billions or perhaps even trillions of dollars, a generation or two at least, all of the greatest minds on earth contributing.

Explain to me why it needs to be done that way rather than, and this is my sort of common sense, I suppose, layperson's expectation about it. Why can we not take one step at a time, by which I mean...

Working on AI, building one that regularly stops, asks us whether this is what we want, whether we're happy that it's following the right path based on the right values, and then waits for our permission to proceed. In other words, tackling

the big journey one byte at a time and the AI brings us along on the journey with it, rather than just going straight to superintelligence and saying, "Okay, we need to solve the entire alignment problem before we actually do that." The first thing is we don't know how to do that. Like the thing you just described.

is a thing that we don't know how to do. We have no idea how to build a self-improving AI system that can get to superintelligence and the whole time remains what's called "corrigible." That it would allow humanity to interrupt it, that it would consult with humanity. We just have no idea how to do that. People have tried for decades. We have no idea. It's just not a thing. So when you describe this Manhattan-like project, I feel this undersells it.

The Manhattan Project was a project done by one country at the cost of what today would be about $35 billion and involved, I think, at its height, maybe 70,000 people, which is big. That's a big project, right? Most of the 70,000 people weren't scientists, obviously. It was maybe a couple thousand engineer scientists, et cetera. So it's a big project, but this still pales in comparison to how difficult of a project this is. So when I talk about this...

international consortium or international project of, like, building this. The reason I bring this up is kind of like, this is what actually trying would look like. Like if we as a civilization, you know, not just as a country, but as a people of a planet, were taking this problem seriously, this would be the minimum we would do. You know, the minimum we would do is you would have, you know,

generations of our greatest mathematicians, computer scientists, philosophers, political theorists, and so on, working day and night their entire career trying to solve these hard problems. We would do it under the strictest security. We wouldn't allow, you know, proto-AI systems to just leak onto the internet where crazy people can find them. There would be extreme levels of control, security. There would be extreme levels of international diplomatic oversight, you know, for all countries and all stakeholders.

There would be constant polling of the general population around the world so that we could learn that we have consent of the governed around building such systems. The fundamental moral atrocity that is currently being committed is not that AI is being built in theory, it's that it's being built without people's consent. If tomorrow we polled the whole global population and 99% of people said, yeah, fuck it, who cares about this alignment stuff, just build it.

Honestly, fair enough. I would disagree and I would argue against it, but fair enough. But this is not the world we live in. We've even done polling on this kind of stuff. People do not want this. You mentioned Dario Amodei. Dario Amodei, for example, has said in a podcast that he thinks the chances of things going poorly with AI, meaning everyone dies, is about 20%.

That's worse odds than Russian roulette. Yeah, yeah. And what listener would willingly play a round of Russian roulette with their children? It's insane. This is not a kind of risk that people should be allowed to expose other people to. Can I just check off the top of your head, do you remember when that was? Was that back in the early days when he said that or was it relatively recent? No, no, it was a couple of years ago. Yeah, right.
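For reference, the arithmetic behind that comparison (my note, not the speakers'): a single pull in Russian roulette with one round in a six-chamber revolver is a one-in-six chance of death, which is lower than the 20% figure quoted.

```latex
P(\text{death} \mid \text{Russian roulette}) = \tfrac{1}{6} \approx 16.7\% < 20\%
```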

Right. So he was CEO of Anthropic at the time. Yep. Okay. And to be clear, this is what most of these people believe. They just think they personally would take a Russian roulette chance in return for AGI. And look, I think these people should be allowed to vote. I believe in democracy, right? I think if someone says, I think we should take a 20% chance of dying in return for whatever, I think they should be allowed to vote for this, but they should not be allowed to do it unilaterally. This is a moral atrocity.

It's an atrocity that individuals, you know, somewhere in San Francisco or whatever, are allowed to expose other people, their families, their cultures, everything to literal risk of death.

with no ramifications. This is not civilization. In civilization, if your neighbor wants to brew up bombs in his backyard, that's illegal. You know, the police will come and shut that shit right down. And the same thing should be happening here. We have people building weapons, building dangerous, unstable systems. They themselves say could kill billions of innocent people. This is the core problem, right? So the

The first core step into building the correct type of project, the correct type of world is knock it the fuck off. Like what were you talking about? You cannot have a stable society with a long future where it is legal for just people to build super weapons in their backyard. If someone tried to breed super Ebola in their basement,

What do you think we would do to that guy? That guy would disappear into prison forever, you know? And this is a good thing. People shouldn't be making super Ebola in their cellars. If it was legal to make super Ebola in your cellar, that society does not have a long and prosperous future. To be fair, there's no upside to Ebola, though. I mean, there are obviously massive opportunities in more powerful AI.

And right now, I suppose the successful argument out there in the market of ideas is that the opportunities are so enormous that they justify what admittedly are probably being downplayed as risks. But this is not true. All the benefits are just as hypothetical as the risks, right? Like they haven't happened.

the same way the risks haven't happened yet. So you can't use the argument that the benefits are so much greater, therefore it's justified. And even then, if you look at the level that our societies actually take for standards of care, like look at the FDA, for example, you can look at the FDA, you can look at their standards of how safe does a drug have to be for you to be allowed to test it on humans. And if AGI had that level of safety,

"All right, we can talk about it." But if you gave the FDA a drug and you're like, "I'm going to put this drug into the water supply of a major metropolitan area and it has a 20% chance of killing everybody, but 80% chance they live super long." You'd be like, "Absolutely fucking not. Get out of my office." Joshua, I think, made the point in a TED Talk just a few weeks ago that a sandwich is actually more highly regulated than a frontier AI.

Just going back to what you were talking about in response to my previous question, you say that we don't know how to do that, I suppose, step-by-step approach where we build more and more powerful AIs, ask it to do bigger and bigger tasks, but keep coming back to consult with us to make sure that it's not breaching our values or becoming...

grotesquely misaligned. Is there merit, at least, in putting some more effort into that kind of approach? I think in practice, if we had outlawed all dangerous AGI research, we had a super high security main project and so on, of course the work inside of that facility should be extremely iterative. It should be extremely careful. It should be extremely conservative. This is how these things should work.

So the tension you're pushing at is the intelligence explosion. So there's this idea called recursive self-improvement, where if you build an AGI system, so a system that can do anything as well as a human or better, and you tell the system to make a better AI,

Well, what happens? Well, now you have a thing. It makes a better AI. Well, now you have a better AI. Well, that AI can make a better AI. Well, that better AI can make an even better AI, and that even better AI can make an even better AI, et cetera, et cetera. And this can potentially go extremely quickly. We don't know how fast. It will bottom out eventually, but we have no idea where.
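A rough sketch of the loop being described, with entirely made-up numbers: the point is only that if each generation builds a somewhat better successor, capability compounds rather than growing linearly. Whether real systems follow anything like this curve is unknown.

```python
# Toy model of recursive self-improvement; both numbers below are assumptions for
# illustration, not estimates. Capability 1.0 is taken as "human-level AI researcher".
capability = 1.0
improvement_per_generation = 1.3  # assumed 30% gain each time a system builds its successor

for generation in range(1, 11):
    capability *= improvement_per_generation  # the current system builds the next one
    print(f"generation {generation}: {capability:.1f}x the starting capability")
# Compounding 30% gains reaches ~13.8x after ten generations; linear growth would reach 4x.
```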

My expectation is that this can happen very, very quickly and it can get us to superintelligence. And when I say superintelligence, I mean an individual system that is more competent than all of humanity put together. Like all of our economy, anything that the whole planet can research or figure out, this system can do by itself or more. So if there exists a superintelligence system

and we have not already solved all these super hard problems of principal-agent, value aggregation, you know, security, safety, etc., it's game over. So the primary policy objective must always be to not get into this situation. This is the number one goal. Then

we can talk about all the secondary things like, okay, how do we iterate towards that? What are the right levels of conservatism? I think these are all very fair questions to ask. And these are questions that I wish all of our greatest scientists and mathematicians would be spending all of their day arguing about. I think these are great questions to be asking. But the fundamental thing is that first we have to have policy so that we don't accidentally

fall into this regime, because this is where we're headed. It's not even accidental. People are doing this on purpose. Recently, a team lead at Anthropic said that they're looking to hire more people because their explicit goal is to have Claude N build Claude N plus one.

That's their explicit goal. It's literally just stated. Right. And in fact, the advanced models now are already better at coding than the majority of coders. Unless you're in the top 5% of coders, you're probably matched by AI coders at this point. Is that right? Yep. They're very, very good. There are still some tasks they struggle at where expert humans definitely do outperform them, but for junior engineering tasks,

They're just strictly better now. Anything I could have used a junior engineer to do, I can now consistently do with AI systems. That was not true even a year ago. Right. Okay. And it's moving very quickly. And in fact, well, people talk about a fast takeoff, and that's what you're talking about there, when recursive self-improvement gets to such a point that it's building the next model constantly. And it could be a matter of hours or days, right? Yeah.

to put a bit of intuition onto this, right? The important thing to understand here is right, like, okay, these systems are digital systems. So if you have an

an AI system that you know is smarter than all humans, better at coding than all humans, it's read every book ever written, right? Well, you can also run a million copies of these things in parallel on your data center, or you could run them at much faster speed. You know, like, AIs code so much faster than I do. Like, I'm a pretty decent coder. Like, I wouldn't say I'm a super expert, right? But like, these systems, you know, they make mistakes, but they're so goddamn fast.

You know, we're talking 10x, 100x, maybe even more speed, right? Like, how long would it take you to type as fast as ChatGPT does to figure out all these things? So if we have, say, you know, one or multiple copies of an AI system doing research at, let's say, 1000x speed, remember, they don't eat, they don't sleep, they can run at much faster speed, they don't get distracted, blah, blah, you can have hundreds of copies, etc. If you have them doing 1000x speed research, this means that every day,

They do two-plus years of research. Two years. So all the research since, you know, the first dinky ChatGPT to today's systems that are already close to superhuman, that would happen in one day.
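To spell out the arithmetic behind that claim: the 1,000x figure comes from the conversation; the number of copies and everything else below is a back-of-the-envelope assumption of mine.

```python
# Back-of-the-envelope arithmetic for the "1000x research speed" claim.
speedup = 1000        # effective speed multiplier per copy (figure used in the conversation)
copies = 100          # assumed number of parallel copies; purely illustrative
days_per_year = 365

per_copy_years_per_day = speedup / days_per_year
print(f"one copy: ~{per_copy_years_per_day:.1f} years of human-speed research per calendar day")
print(f"{copies} copies: ~{copies * per_copy_years_per_day:.0f} years per calendar day")
# one copy: ~2.7 years per day; 100 copies: ~274 years per day (if the work parallelizes).
```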

And now it happens in a week or a month. Yeah. This question of self-improvement and the speed at which it happens leads nicely into my next question, which is about the global race. Because clearly, given the ability for fast progress, at least in part due to self-improvement of a powerful AI, there is...

enormous advantage for coming first in the race if you believe that the greater threat than global extinction is that the other side masters this technology before you. So

I mean, coming from a national security perspective, I can understand the two points of view that, and they are at odds with each other, clearly. If you believe that superintelligence will confer complete power, you'd be mad not to race for it first if you know that...

that a major strategic rival, for instance, China, is also racing for it. If you believe that whoever gets there first will create an extinction level event, whoever it is, it's insane from a national security perspective because clearly human extinction is presumably contrary to your concept of national security. But if you're unsure about the extinction argument as

evidently quite a lot of people are, you might say, well, a Chinese Communist Party controlled ASI would be so devastating for us and our national security that we'll just have to play to win and assume we'll figure out the safety issues. I mean...

What's your, I suppose, what's your response to people who are in those two competing, you know, trying to manage those two competing ideas in their heads? I think a lot of these people who are taking this stuff seriously are doing it in good faith. It's important to understand that many of them are not.

So this whole national security argument, the "we must race to beat China" argument, it's important to say, historically was propped up by liars who specifically are trying to trick the US government into paying them money to build their personal little god.

Like this is deliberately the goal of these utopists, you know, these like transhumanists in San Francisco. To be clear, this is all on their blogs and so on. You can look it up. The compendium has all the citations and so on. It's like they believe that fundamentally it's a religious thing for them. They think AI will allow them to live forever and they are willing to risk everyone's lives to get there.

And they are willing to lie and to manipulate and to play the US and the Chinese governments off against each other in order for them to gain more power in order to do the thing they want to do. So it's important to acknowledge that the origins of this is not in sober political analysis, but in basically religious delusion. If we

look at more of the national security people, I do think a lot of them are thinking along the lines you are, which is a very reasonable way to be thinking about this problem. I think the really important thing to understand here is that, yes, the actual question is: does it lead to extinction? This is the actual question. If it were super trivial to control things that are much smarter than you, that you built with technologies you don't understand, all right,

understandable. But this is not the world we live in. And a lot of these national security people are being fed lies by the people who are building these things, who are trying to manipulate them into thinking that this is better understood and safer than it actually is. We had a great example here in the UK about a year ago, where A16Z, one of the major AI VC funding groups, lied to the government and said to the House of Lords that the black box problem of AI is considered solved.

Which is just a bold-faced lie. It's unbelievable that he said this on record. This is not even a subtle lie, just bold-faced, complete lie. And most lies are much more subtle than this. But it's important to understand if national security people are listening to this, that you are being manipulated by corporations and tech people. I'm sure you already know this, but you are being manipulated by these people. They have a vested interest

in this problem being easy. The same way that, for example, the oil industry has an interest in climate change not being that big of a deal, or the plastic industry and pollution not being that big of a deal, or whatever. It's the same type of dynamic. Or the tobacco industry denying that there is a cancer risk associated with tobacco or whatever. It's the same kind of problem. So there is a deep problem here, and there is a deep national security issue. If an ASI comes into existence, your nation state will stop existing.

Let's be very clear here. Whoever builds it, whether it's the Chinese, the Americans, third parties, if an ASI system comes into being with our current understanding of how to control the system, your nation state will cease to exist. They've said this many times using subtle words such as decisive strategic advantage or pivotal act. There's a bunch of innuendo.

It's used by these AI people to hint, but what they really mean is disempower the government, overwhelm the government, and potentially lead to human extinction. So where does this leave us? What this leaves us is that we are in a hard situation. Anyone who is selling you an easy solution to this problem is selling you snake oil. Anyone who says, oh, you just have to do X, just X.

"Do what?" "Oh, just a small thing. It's super cheap. Don't worry about it." That person is selling you snake oil. The problem we are facing is a hard problem. That doesn't mean we can't solve it, but it is a hard problem, and every solution is going to be messy and expensive. I know you don't want to hear this, but it is definitely the case.

The problem of how to deal with the most powerful tech corporations who we already barely can keep control over, trying to build powerful agentic systems that can disempower the US government or the international order, and they themselves are unlikely to be able to control, is a novel problem. It's close to nuclear weapons, biological weapons, but it is a different problem. And it does need different responses. Yeah.

Fundamentally, my experience has been that people are actually quite reasonable about this if you actually just explain it to them. Most people have just

been fed bullshit by Microsoft, Google, whatever. And if we just plainly, human to human, explain the actual problems, in my experience, a lot of people are very reasonable about this and can understand it. Like, oh shit, this is a disarmament problem. It's not a race problem. It's a disarmament problem. Eliezer's book is coming out soon and has a nice title, which is If Anyone Builds It, Everyone Dies. And this is the core problem. This is the core problem. How do we...

As a society, we look into the future. We look into the future of civilization. Whether it's AGI or some other future technology, at some point, we will figure out how to build weapons that are so awful that they can destroy everything. And we as a global civilization have to then not do that.

Otherwise, we're on a timer and we don't have a long future ahead of us. So this might be our first, like, check, our first, like, filter of: can we work together and somehow find a way that we don't build the thing that kills everyone? If we can't do that, well,

Thanks for playing. And just for listeners' reference, that's Eliezer Yudkowsky, who, believe it or not, can well and truly match Connor for his pessimism about the prospects for artificial general intelligence if we create it. So look, you're getting almost into an upbeat note there, Connor, and I want to finish on an upbeat note. It's probably easy for people to feel overwhelmed.

by this issue. I mean, as you say, solving it is very, very difficult. AI sits at a convergence of factors that makes policymaking, I suppose, almost heroic to grapple with. It's technically arcane, only a few people really understand it. The technology's borderless. Geopolitics are extremely non-conducive to global cooperation right now. It's very fast moving. Near future predictions are difficult.

The financial incentives encourage haste, as I mentioned earlier about investment going into AI, the fact that companies turning over a few hundred million dollars a year are valued at billions or hundreds of billions, which means obviously it's all about expectations of what they're going to do in the future.

Connor, you call for a kind of grassroots civic action here, people talking to their representatives, building some sort of upward momentum and then using that to knit together higher levels of governance.

Just talk us through what you're doing and what you've found the response is. Yeah, I think this is exactly the right way to be talking about this. So I work quite closely with an organization called Control AI, which is a nonprofit advocacy group here in the UK. And so they actually just launched

I think yesterday or the day before, a fantastic piece written by Leticia about their experience briefing over 70 lawmakers in the UK about extinction risk. I highly recommend everyone read this. I'm sure it can be linked down in the show notes. It summarizes a lot of experiences. The way it worked is that at Control AI, we started doing what we call the direct institutional plan, the DIP. The direct institutional plan is the

simplest possible plan that works or that could address the problem. It's very, very simple. You figure out what policies would be needed to prevent extinction for say 20 years. You write down those policies and then you go to every relevant stakeholder, every relevant politician, and you in good faith make the case. This seems so trivial that you might even think, well, surely someone has tried that. Someone's already done that. The answer is no, no one has done this

As far as we're aware, we're the first people, one of the first groups that have actually even tried to do this, especially at scale. And so we were told by every political consultant under the sun, "Oh, what you're trying to do, it's way too hard. People won't commit to anything. They won't listen to you. They don't care. You don't have any favors or network or whatever." And so we started by creating a campaign statement.

which says: "Nobel Prize winners, AI scientists, and CEOs of leading AI companies have stated that mitigating the risk of extinction from AI should be a global priority. While specialized AIs, such as those advancing science and medicine, boost growth, innovation, and public services, superintelligent AI systems would compromise national and global security. The UK can secure the benefits and mitigate the risks of AI by delivering on its promise to introduce binding regulation of the most powerful AI systems."

So, you're a political guy. This is not a toothless statement. It mentions extinction, it's about binding regulation. This is a pretty strong call. And so we were told by every consultant in the world that this is way too strong or we'd have to trade so many favors because it mentions binding regulation, it has the word extinction in it, it's way too strong.

This was just not our experience whatsoever. We cold mailed just every parliamentarian in the UK. From this, we got like 80 meetings, no problem, where we would just give them a half hour briefing of like, what is our concern? Why do we think this is real? Answer their questions, you know, just like really just good faith. Just we're here to help. We're here to explain why we care about this problem, where it's coming from and what can be done. And from this, we now have over 30

public supporters among parliamentarians, MPs and Lords in the UK. That's a 30% rate after one 30-minute meeting. Impressive. Everyone thought this would be impossible, but no, this is possible. And my experience is just that, look, yes, politics is hard. We learned a lot. We had to iterate. We had to improve our arguments. We have to learn how to do things better and so on. But for the most part, I think there's a part of democracy that gets forgotten sometimes,

which is that democracy isn't just, oh, you have some politicians and they figure everything out and you, you know, lie on your couch. No, democracy also has a civic component. There's a component where you as a citizen also have a responsibility. Your politicians are not flawless. And I'm not saying this like, oh, they're bad people. I'm saying literally, even if they're trying their best, they're not flawless. They don't know everything. There are many problems they have to deal with. Being a politician is a hard job.

And your job as a citizen is also to, if there is an issue that's affecting you, that's affecting your community that you care about, it is part of the democratic process for you to say, all right, I'm going to help my politicians understand this issue. I'm going to think a lot about this and I'm going to help them understand

answer their questions so they can do something about this issue. And in our experience, people react extremely positively to this. They're often very thankful. They're like, wow, finally someone from AI is answering my questions. I remember having a very nice meeting with a conservative politician here in the UK and

We explained to him our risks about AI. He started asking questions and more. And then he started asking more and more basic questions. Like, what even is an AI? Is this related to quantum or is that a different... Really basic questions. And we just answered all his questions. And eventually he paused and he was like,

"Wait, you're trying to help me?" And we're like, "Yes, sir. Anything else we can... We were happy to answer your questions." And he was so thankful. He was like, "No technical person talks to me. No one... I've never had young tech people sit down and just answer my questions." And this is the state of affairs. The state of affairs is that a lot of our most powerful politicians don't have anyone to talk to. They don't have any AI people who just sit down and just take the time to just in good faith answer their questions and help them understand the issue.

Because the truth is, people don't want AI to take over the world and kill their children. They don't. People truly don't want this. And they understand why this is bad. People aren't stupid, right? I found that there's only like one group of people in the entire world who doesn't understand this issue. And that is AI people.

Everyone else in the world understands this issue. AI people, and academics to a certain degree, are the only people in the world who don't understand this issue. And the reason they don't understand it is the same reason why tobacco executives don't understand that tobacco causes cancer. You know, it's the same issue. It's like, whatever. So I'm not too perturbed about this. The fundamental thing is that the bottleneck is much more in just being made aware of this. And this doesn't require

super extreme political maneuvering or genius writing and scientific research. What it requires is mostly just actually putting in the work of coordinating with people, actually talking to people, explaining things to them, phrasing things in their language, being patient, contacting many, many people and so on. And the good thing about this is that this scales.

We have a playbook that scales and you can read our playbook. You can read what Leticia wrote. You can have all of our materials that we use to brief people. You can replicate this playbook yourself. You don't even need our consent. You can just take our playbook and you can run with it. And this scales. So

The way I see the world going well in the near future is that this scales. So both with Control AI and I've also been starting a new project, which is much more grassroots and community-driven, called Torchbearer, which is an online community that is focused on building a good future from civic bottoms-up action and scalable action like this.

I think this is how we get people to actually care. I don't think we trade secrets in DC behind closed doors in cigar-smoke-filled rooms. I don't think that's how this works, not in the modern world. And this is not our experience of how we actually get things done and how we actually get people to support things. The next step is we have now drafted an actual bill that actually implements legislation that would stop ASI, at least for, say, 10 to 20 years.

In the UK, we've briefed Number 10, which is the prime minister's office here in the UK, on this, and we're iterating on it. And now we're also starting to build an office in the US, where we are going to be pushing these things as well.

Is this all easy? Nope. All this is hard. You know, politics is hard. I understand. And also, there are some bad people involved, to be clear. Like, a lot of people think all politicians are corrupt and evil and whatever. This is, in my experience, truly not true. There are some. Like, there are some who are just, like, straight up evil. Like, straight up, like...

Holy shit. But this is truly not the majority. Most of them are just like severely overworked, have a lot of competing incentives. There's a lot of people feeding them like lies and manipulating them. And they're just in a tough situation. And we have to help. So if anyone, you know, in the audience, we can give one more piece of advice. One more piece of advice I've been giving to more and more people, because I think this is quite important, is how to do this kind of work without it taking over your life.

There's a risk that happens here sometimes, which is that people find this problem so overwhelming or so scary that either they throw everything away, throw themselves headfirst into the problem, and, you know, lose the rest of their life and their stability and burn out; or it's so overwhelming, and they don't want to lose their life, that they just don't do anything. So my recommendation for anyone who cares about this problem is: pick some amount of time per week that you can, you know, spare. It can be

an hour, it can be five hours, it can be a workday, whatever, you know, some amount of time you can spare to work on this problem. It can be reading about the problem, contacting other people in the field, talking to your politicians. It can be sending me emails and asking me questions, whatever. And don't do more than that. Do it every week. You know, do your one hour or whatever every week, but don't do more than that.

That's my recommendation. And let me endorse that 100%. If I wasn't clear earlier on, certainly in my mind it's the most consequential thing that will happen in our lifetimes, and quite possibly in all of human and world history. I'll just say one more time: the most prominent practitioners of this technology on the planet

pretty much agree that we will no longer be the smartest entities on the planet within something like five to ten years. So I think that's worth one hour a week of your time and your investigation and your intellectual curiosity. Connor, it's been great to hear your latest thoughts on this. Really appreciate you coming back on Stop the World. Thanks for the chat and good luck with your advocacy. Thanks, David.

Thanks for listening, folks. We'll be back next week with a regular episode. But in the meantime, listen out tomorrow for a special short episode on ASPI's Cost of Defence report, which was released this week. Ciao.