Hello, this episode of Babbage is available to listen for free. But if you want to listen every week, you'll need to become an Economist subscriber. For full details, click the link in the show notes or search online for Economist podcasts.
So we've just arrived at an industrial estate outside of Manchester. We're looking at a sort of low grey industrial building with some huge tanks outside. Looks very unassuming from the outside but there's something pretty special going on inside. Ainsley Johnston is a data journalist and science correspondent for The Economist. Hello Ainsley. Hi, nice to meet you. Hi, I'm Steve. Welcome to the UK Biobank Imaging Centre.
She recently went to visit a brain imaging lab in the north of England. The UK Biobank imaging study, we end up with each participant contributing about 9,000 images. Dawood Dasu is the head of imaging operations at UK Biobank. These are things that tell you about the size, volume, the structure of the brain, but also tells you about brain function as well, so which parts of the brain are active during certain tasks.
And we also have something which gives us a measure of flow of blood in key parts of the brain as well. So each participant contributes, just from brain, around 2,500 variables to the data set that we upload for researchers to use.
The UK Biobank maintains a huge database of biomedical data. It collects everything from genome sequences to information on people's diets. The imaging study that Dawood is talking about here aims to scan everything from the hearts to the bones and abdomens of all the participants. Those scans will help scientists delve into the intricacies of one of the most complicated objects in the entire universe, the human brain.
While participants lie inside an MRI scanner, they're given a quick task to do. It's a snap game. So you get three images, three faces.
They have to match the top face to either the left or the right hand side and that, in my layman's language, it lights up the parts of the brain that are involved in decision making. They'll compare that to what was happening earlier on when the same magnetic fields were being applied but there was no task. We all know that human brains are remarkable.
Somehow, from a tangle of billions of brain cells and a soup of chemical reactions emerges a vast range of skills: language, memory, vision, the ability to process information and even control muscles and much, much more. And the sum is much greater than the parts. Because human brains are also at the centre of what we call intelligence. Human intelligence has driven the success of our species.
Which perhaps makes it odd that we still have so much to learn about what human intelligence, in fact any intelligence, actually is. But understanding human intelligence has to be the starting point if you want to understand the artificial type too. That's our goal in this special four-part series on the science that built the AI revolution. I'm Alok Jha and this is Babbage from The Economist.
In today's show we'll look at the very earliest AI systems and how they took inspiration from the human brain. This is the first of four episodes in which we'll examine the scientific ideas and innovations that have led to the current moment in AI. We're going to get behind the hype, buzzwords and jargon and explore eight ideas that we think you need to know if you want to understand how the generative AI of today came to be.
We'll explore what artificial neural networks really are. The nerve cell will go ba-ba-ba-ba-ba-ba-ba-ba-ba-ba. When we come to think about neurally inspired artificial systems, the fact that it's sending a pulse or it isn't is the critical insight that gives us all the information processing power that we use now. From the earliest attempts to model the human brain in silicon... The systems we were building were so dumb. Ha!
and so weak and so difficult to train. To the technologies that enabled those models to be scaled up. ImageNet was the turning point of AI's history, recognizing how critical it is to use big data. We'll hear why finally, around a decade ago, AI got astonishingly good. All of a sudden things are working and people pay attention to what we do.
We have a number of examples where computer vision systems can beat human experts at their own game. And how those systems just kept on getting better. The change from, say, GPT-2 to GPT-3 was huge. The change from GPT-3 to GPT-4 was huge.
I did not think the large language models would work as well as they do. I just thought, I can't just throw the whole internet at it and get next-word prediction and have that seem like a human. You know what? I was dead wrong. You can. If you want to understand the origins of artificial intelligence, it's best to start with the second of those words. So our first question in this series is this. What is intelligence?
To figure out exactly how the human brain works, let's pick up where we left off with our correspondent Ainsley Johnston. The UK Biobank centre just outside Manchester scans participants' brains seven days a week as it works towards its goal of imaging 100,000 people. Steve Garrett, the imaging programme manager, explained the process that participants undergo. We are in the imaging clinic
We've got participants in here who are coming for around a four to five hour visit. Are they doing tests and things on the computers? We have a touchscreen questionnaire where they give really comprehensive answers to anything about their health and lifestyle, but they also do the cognition tests.
Who wants to have a seat? Oh sure, I'll have a seat, yeah, great. That's Joanne Norris, one of the health research assistants at the clinic. We explain, obviously, that they're going to go through a 25-minute session of games, puzzles and memory tests as well. So have a read in the yellow, and then when you're ready, press that smiley next button. Okay. Okay, the game will have three pairs, right?
Okay, so I can see six cards in front of me and now they've been turned over and so I've got to find... Okay, this one was a man and this one was a man. I think this one was a square. Oh no, no, it's embarrassing. Okay, this one is a kite and this one is a kite. Great, square, square. Okay, so I'm being asked to add the following numbers together. One, two, three, four, five. Okay, so that equals 15.
If Truda's mother's brother is Tim's sister's father, what relation is Truda to Tim? Truda's mother's brother. That's Truda's uncle. It's Tim's sister's father. That's Tim's father. Truda's uncle is Tim's father. Then Truda must be his aunt, I think? Oh, God. At this point, some of our participants asked for a pen and paper. I think I need a pen and paper.
I feel like we've probably got enough of this and I think I'm probably embarrassing myself. These tests are about a lot more than just making fun of journalists though. The scores from each of the tests help to paint a unique picture of participants' cognitive abilities. This is powerful data for researchers, particularly in combination with the biomedical data that's about to be collected.
and then they'll go and get changed. And after they're changed, one of them will go for their brain scan. At the end of a corridor full of warning signs for a strong magnetic field is the brain MRI machine. These machines look like giant doughnuts. The participant lies down on a bed and then their head and shoulders are moved inside the bore of the scanner. We entered the control room next door.
From here, a radiographer controls the scanner, checks the quality of the brain images that are being collected, and makes sure that the participant is happy and comfortable. Task is started now. Angela Emmons, one of the radiographers, took me through the process. It's a half-hour scan of the brain. First 25 minutes you just need to keep nice and still, then there's a task coming up. The task is just to look at the brain when it's actually working. We run an earlier sequence when they are at rest, and then just run two minutes of that when they're undertaking a game of Snap. We show them a series of shapes and we show them a series of faces. It runs about two and a half minutes, and then, when that comes to an end, they've got about another two minutes left in the scanner. While the participant's in the scanner, what can you see in the control room?
Lots of images come up. The images come up in real time. We check the resolution, make sure we've got good images, participants settled, and then just follow that through the sequences.
This gives you an idea of intelligence. That's Dawood Dasu, who we heard at the start of the podcast. You can look at how much of that variation amongst participants is explained by genome data. You can look at our imaging data. You might even be looking at history as well, so lifestyle, job, diet and things like that. You could look at all of that as well. And I'm sure somebody will figure out a way of looking at all of that together.
Using the biobank data, scientists have discovered that having a larger brain, and in particular a larger frontal cortex, is associated with higher intelligence. There are also certain patterns in how different parts of the brain communicate with each other that can predict people's scores on cognitive tests. There's still a lot of variability in intelligence that scientists can't explain using these measures of the brain though. But access to enormous data sets like the UK Biobank
is allowing scientists to pick apart how the tangle of neurons inside our heads has enabled us to develop vaccines, send a man to the moon and even create AI. Lots of researchers from around the world use data from the UK Biobank and other sources to investigate the brain and intelligence. But intelligence in human brains is not something that's easy to pinpoint. There isn't one bit of the brain that's responsible for it, for example.
And the more you get into it, the harder it gets to define what intelligence even is. So let's take a step back and look at how the brain works at a more basic level. To do that I spoke to Daniel Glazer. He's a neuroscientist at the Institute of Philosophy, part of the University of London. He works at the intersection of neuroscience and AI.
We know a lot about how the brain is structured, and we know a lot about how it works in the sense of how the molecular level works. I can tell you in exquisite detail about the structure of the individual neurons, and at the level of the whole brain, I can tell you what the front does and what the back does. What I can't tell you is what difference at the microscopic level...
makes the difference at the macroscopic level. So although I know all of these levels of description of the brain, I can't give you a coherent story that tells you how the overall behaviour derives from all this exquisite detail that I do know about the molecules. Let's go into a bit of that exquisite detail, then. Just describe the anatomy for us,
and how the anatomy functions. So brains are collections of neurons, which are nerve cells. And while nerve cells exist throughout the body, pain detectors and all sorts of things like that, in the brain they're all clumped together in a big wadge. And the principal property that almost all nerve cells have is that they use electricity to send signals over a distance. And there's two things that derive from that. So one is that these cells are often elongated. So most cells in the body are kind of roundy, clumpy. They have a shape like that.
Nerve cells characteristically have got a long extended process which we tend to call an axon and you really can think about this extended process like a wire.
And like a wire, nerve cells send information along this long process using electricity. So nerve cells are signaling devices that get, if you like, information from one bit of the cell to the other bit of the cell along a long bit called the axon, and they use electricity to do that. Just in terms of how that manifests in sensing the world, just explain to me how a network of these cells smells something or learns something.
I think to understand how this works, you can actually go back in evolution around about 70 million years. You could use chemicals to send information. Ooh, there's something nasty there, pull back. And you could retract your feelers. But that only works at very short distances. And for animals and cells to get bigger, organisms to get bigger, they needed to communicate information about smells, about predators, about food over longer distances. And so what evolution did, if we can say it that way, about 70 million years ago, is to use some of these proteins that were being used for signalling within cells...
and wire them up to an electrical signal. And then, at the other end, they turn them back into chemical information, which they then use to set off other cells in the network. And that insight, interestingly, which was about signalling, is paralleled in human terms by the evolution of what we would call telegraphy. If you want to send a signal reliably over long distances, you want to be using some kind of code, for example Morse code. And so the first transatlantic cable used pulses
which could be reliably read out at the other end. And it turns out that 70 million years ago, evolution came up with the same insight. So the critical thing about nerve cells is that they use electricity to signal. But that code is not a kind of more, less, more, less, more. It's not a continuously modulated signal. It's pulses.
And this transmission of information by pulses, it either fires or it doesn't, is a critical thing you need to know about nerve cells. When we come to think about neurally inspired artificial systems, the fact that it's thresholding, it's sending a pulse or it isn't, it's doing a yes-no firing pattern, is the critical insight that gives us all the information processing power that we use now.
And it's kind of, in a very sort of crude way, a kind of a digital signal in that respect. Not crudely at all. It's a digital signal in the sense that the information is a one or a zero. The cell either fires or it doesn't. A good way to think about nerve cells probably is not so much to start with in the brain, but to think about flexing a muscle. You want to send a signal from your spinal cord to your muscle in your arm. If you want the muscle to contract more, I'm going to sound like a neuron for a second, the nerve cell will go ba-ba-ba-ba-ba-ba; if you want to contract a little bit, it will go ba... ba... ba.
Similarly, if you have a pain receptor and something's a bit painful, you'll get that. If something's really, really painful, the nerve cell signals that by going ba-ba-ba-ba-ba-ba-ba-ba-ba-ba.
And so the rate coding, we would say, the rate at which things fire, and there can be more subtle codes, is a yes-no signal that contains information over time rather than an amplitude-modulated smooth signal as you might have in the nuances of your voice. So in your brain, the neurons are very close together. They exist together.
in networks which represent all sorts of functionality and memory, etc. So in your brain, how do the brain cells, the neurons, work together to learn something, whether it's a language or what a predator looks like or whatever else? When neurons are connected to each other, individually or in networks...
there is a strength of the connection. So you don't get the same bang for your buck from each of the cells that connects into a particular cell. So imagine you've got a cell and it's got thousands of other cells connecting into it. Those thousands of other cells are firing.
But each of the pulses from those cells does not give you the same input to the cell that's the target. So we can control the amount of input that you get from a cell by the strength of what we call a synapse, the wire that comes in. And if you like to take an analogy from humankind, you might ask all of your mates for a restaurant recommendation, but you're going to pay more attention to one of your friends who's good with food or likes that kind of cuisine or knows the city than another.
So they're all saying pizza, burger, we should go to that Indian place, we should go to that Asian restaurant, whatever. But listening in, you might say, well, I hear all of those inputs, but I'm going to up-regulate one and down-regulate the other. So that's the strength of connections. In learning...
If it turns out that the restaurant you chose was a good one... From your experience. You go to the restaurant, it was great. You'll say, oh, that restaurant was amazing. And then you say, who was it who recommended that restaurant? You go, Alok. Well, do you know what? Next time I'm looking for a recommendation for a restaurant, I'm going to up-weight...
Alok's signal compared to the other guy who, you know, didn't recommend that. Yeah. So, things that fire together, wire together. When a neuron fires, it says, okay, I got excited. Now I'm asking, what was the input that got me to the place I am? And I'm going to subtly up-regulate those inputs so that, in future, the ones that got me to this good place are more likely to get me going again. That learning, that strengthening of the connection, at a chemical level, what's happening?
At a chemical level, there are neurotransmitters which change, generally speaking, the structure of the dendrites, so there are things called spines, that basically allow each neuron that fires to release more neurotransmitter to that cell. So it changes the neurochemistry and, to a small extent, the neuroanatomy. It really does change the microstructure of the neurons so that you get more input to a particular cell from the cells that fired previously. Let's zoom out.
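That fire-together-wire-together rule can be sketched in a few lines of code. The following is an illustrative toy in Python, not a biophysical model; the function name, threshold and learning rate are invented for the example.

```python
# Toy "fire together, wire together" rule: when the neuron fires,
# strengthen the synapses whose inputs were active at the same time.

def hebbian_step(weights, inputs, threshold=0.5, rate=0.1):
    """Run one time step; return (fired, updated weights)."""
    drive = sum(w * x for w, x in zip(weights, inputs))
    fired = drive >= threshold
    if fired:
        # Up-regulate only the inputs that contributed to the firing.
        weights = [w + rate * x for w, x in zip(weights, inputs)]
    return fired, weights

# Two inputs: the first is active on every step, the second stays silent.
w = [0.6, 0.6]
for _ in range(5):
    fired, w = hebbian_step(w, [1, 0])

# The active input's weight has grown; the silent one is unchanged,
# so the active input is now more likely to fire the neuron in future.
print(w)
```

Real synaptic change involves neurotransmitter release and dendritic spines, as described above; the sketch captures only the credit assignment, where inputs that were active when the cell fired get up-weighted.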
People always ask this question about intelligence and are human brains intelligent? But where does that come from in all of this?
By the way, if you look it up online, I mean, as I would for any respected interview, I looked it up on Wikipedia before I came out this morning. And if you look up on Wikipedia, what is intelligence? It said it's that thing which humans are good at. Right. That's a bit facetious, but there is a sort of circularity. And so, for example, when we look for intelligence in animals or indeed in plants, there's some nice stuff about forests being intelligent. Basically, if you just speed forests up,
then they kind of think things through and they're generous and they look after each other and they feel pain when their fellows are chopped down. When we say that, when we look for intelligence in animals, broadly speaking, we're looking for things that they do that are like things that we do, right? So I can do better than this, but actually as a starting point, intelligence is what we think like. And so just break that down. What does intelligence mean? Even if we can't define it exactly, what are the kind of components of what we think of as intelligence? So intelligence is the ability to think things through.
And the evidence for that is that you can apply it to different domains. You can abstract things to look at something and see their structure, to apply it to other things, to bring knowledge of different domains, to bear on certain things. That requires kind of memory and breadth of reference and understanding. It turns out that language is quite a useful tool in helping one to be intelligent. So it's difficult maybe to imagine a human or a creature that doesn't have any kind of symbolic abstract thought like language and is still intelligent. It seems to be very helpful.
to do that. Although, you know, when we start to look at other organisms like octopuses, they exhibit behaviours which you might think of as intelligent: they solve problems, they learn from experience, they think things through, they try stuff and then try things again differently, and they probably don't have an internal, you know, language of thought. It's interesting, Alok, if you think of any given thing, so for example, the ability to project into the future, to think about a future, you might think of that planning as an intelligent thing. The problem is, as soon as you write down a single thing that's about intelligence,
You can usually find an animal that does that particular thing, right? So if you want planning, go for corvids, crow-like creatures. We call crows intelligent. People say it all the time. Quite so. And that's because they share a thing which we think of as intelligent in ourselves, which is the ability to plan. So, for example, when crows hide stuff, if they're observed hiding a thing by another crow, or sometimes a different species, they'll kind of wander away. And then, when they're sure the one who saw them hide the piece of food is gone, they'll go back and move the food to a hiding place somewhere else. Now, why would you do that...
is because you kind of have thought through that, when your back is turned, if you don't come back soon, the one who saw you hiding it is going to come and move it. So we used to think that only humans could do that. The problem is, once you, as you're encouraging me to do, Alok, once you define a single thing, yeah, do you know what, intelligence is that, I can probably find you an animal that can do something like that. What I can't find you is an animal that can do all the things that we count as intelligent. But that's a bit circular again, because we call them intelligent because we do them.
Yeah. So it is a bit reductive and it's not at all comprehensive in the way that you can define intelligence. But as scientists, you want to try and test hypotheses. You want to try and
measure specific things in this sort of slightly confusing world. So in terms of intelligence in humans, what are the ways that neuroscientists or others would try and measure that or test it? So we can certainly look at what's going on in people's brains when they do things that we would consider intelligent.
And we can also particularly do that in the bits of brains of which we have more than animals that are less intelligent than we are. So we can learn by looking at the bits of the brain which are different in us from monkeys, and we can draw out the circuits which enable us to do that kind of complex thought. I do think that intelligence is something that allows us to manipulate objects.
It's very rare for somebody to be just intelligent without using some kind of external system, even if they've internalised it. So language would be an example of an external system which you put in your head. But actually, smart people use tools well. While we're talking, we've got somebody very friendly in the room who's operating some complex sound recording equipment, and you're using a Mac to structure your thinking and look at the questions. That's
intelligence. We use these prosthetics. And actually, again, when we come to think about large language models and the contemporary developments in AI, one of the things that intelligent people like us do is to make good use of these tools.
Now, we also fool ourselves that they might be intelligent too, but nobody thinks that their phone is intelligent, really. But they use it to enhance their own intelligence, if they're smart. Often it can defeat your intelligence through too much scrolling, but you can use it to extend yourself by judicious use of Wikipedia on the fly, or by storing information in a helpful way. And this ability to use tools is something that we observe in the history of humankind, a marker of intelligence that emerged when these frontal lobes developed, at
a time when our intelligence probably really took off. It's interesting with the phone example, actually, isn't it? A mobile phone that's connected to the internet, basically a small computer, has memory. It has some sorts of reasoning capabilities too. These markers, as you say, of intelligence. But it doesn't have all of the things: it doesn't plan or abstract things in the way that humans do. I guess it's a different type of intelligence in that respect, but we would never call it intelligent. You're right. In general, that's right. I mean, I think the question of ascribing intelligence is worth pondering for a second.
Fast forwarding to LLMs, artificial neural networks like large language models and machine learning: I think our inevitable tendency, which we can't turn off, to see them as intelligent allows us to use these tools more effectively. It doesn't mean they are intelligent.
but treating them like they're intelligent enables us to engage with them in more effective ways. And when we come to ask, as I'm sure you will, Alok, whether these machines are smart or not, we must always beware of this innate capacity of humans to ascribe intelligence to others and to machines, and that will mislead us when we try to make judgments about the new machines that we've built. All right, well, we've talked about the difficulty of defining human intelligence. We've talked about the difficulty of
trying to understand it at all the different levels, from the sort of whole-brain level to the cellular level. There's clearly huge amounts still to learn. But I guess we should try and understand
where all of this knowledge leads into how to do the artificial bit of artificial intelligence. So when we're talking about computer scientists who were looking to intelligence for inspiration, to make artificial versions of it, was it a good idea to try and model artificial intelligences on the human brain? I suppose it's the only model they had, right? Well,
Well, when computer scientists tried to make smarter machines, one of the observations that they made is that maybe what's important about the way that humans think is the wet stuff, is the neurons. And so we can ask, what are the properties of neurons that they lit upon, and how did they implement them?
And actually they did go right back to basics. So to understand a neural network in the sense of computers, that's the way that most machine learning algorithms work, you really just start with a neuron. It's a device which takes inputs from a bunch of other neurons...
Not all the neurons affect it to the same extent; those are called weights. This is true of a tiny little worm: each of the neurons that comes onto another neuron excites it to a different extent. And it works out, on the basis of those inputs, whether it's past a threshold for excitement or not. And if it is, it goes boom, and that ping, that spike, goes to the next one. Taking that core
architecture, and layering upon it a learning rule, which, as we said before, things that fire together, wire together. So by adjusting the weights between the neurons to up-regulate things that tended to make things fire in a good context, those two simple insights give you quite a powerful computational learning machine. Now, when we talk about these neural networks, they're actually being implemented in digital architecture.
So funnily enough, you've got a good old-fashioned digital computer, like the kind that works in your desktop PC or in your phone, but it's running a simulation of these very simple neurons. And again, if you think about the exquisite micro-architecture of human neurons, it would take...
you know, years to describe even a single human neuron. So no, we abstract it into some inputs, some weights, a firing pattern. So this very simplified neuron is at the basis of all of the artificial neural networks that underlie machine learning and current AI. Next, we'll continue that thought and move from human cells to silicon chips and look at the first attempts to create artificial versions of the human brain.
And one of the godfathers of modern AI will tell us about the first time his computer system showed some of the skills that Dan Glazer has been telling us about. That's all coming up.
First, though, just a quick reminder that this is a free episode of Babbage. To continue listening to our special series on AI, you'll need to sign up to Economist Podcast Plus. And now's the perfect time to do so. We've got a sale on. Subscribe for less than $2.50 a month. But hurry, the offer ends on Sunday, the 17th of March.
And as a subscriber, you won't only have access to all of our Specialist Weekly podcasts, you'll be able to join us for Babbage's first ever live event following the conclusion of this very series.
That's going to be held on Thursday, April the 4th, where we're going to answer as many of your questions as we can on the science behind artificial intelligence. Don't miss out. You can submit your questions, check the start time in your region and book your place by going to economist.com slash AI event, all one word. The link is in the show notes.
Today on Babbage we've heard about how the brain works and we're trying to unpack how computer scientists who wanted to build intelligent systems were inspired by what neuroscientists had already found. But rather than building an artificial version of a physical nerve cell, computer scientists wanted to build virtual ones. And that leads us to the next step as we build our understanding of the science behind modern AI. Question 2:
What was the first artificial neuron? To answer this question we travelled across the Atlantic Ocean to Boston in Massachusetts. Main Street, which carries traffic after crossing the Charles River from central Boston, is awash with offices of some of the world's biggest tech firms: Google, Facebook and IBM.
This part of the city has been called the most innovative square mile on the planet. Companies are lured in because the area is dominated by two institutions, Harvard University and the Massachusetts Institute of Technology or MIT.
Few places on the planet have played a more central role in the evolution of modern artificial intelligence. Our quest to mathematically think about intelligence and model our brains goes back to 1943.
when Warren McCulloch and Walter Pitts introduced the concept of neural networks. Daniela Rus is the director of the MIT Computer Science and Artificial Intelligence Laboratory, also known as CSAIL. And they published the first mathematical model that, at that time, was believed to capture what is happening in our brain. If the way that neurons work in the brain can be explained by mathematics,
then the brain's network surely could be replicated using computer code. Professors McCulloch and Pitts thought that machines with brain-like architecture could have a lot of computational power. The early artificial neuron was a very simple mathematical model. You had a computational unit that took as input data from other sources, maybe other units,
The input was weighted by parameters. And then, inside the artificial neuron, the computation was very simple. It was a thresholding computation. Essentially, if the sum total of what came in was larger than a given threshold, the neuron output 1; otherwise, the neuron output 0.
So the computation was discrete and very simple, essentially a step function: you're either above or below a value. Neurons in the human brain also operate discretely, as Dan Glazer mentioned earlier: they either fire or they don't fire. A psychologist at Cornell University called Frank Rosenblatt went on to develop this model to create an artificial neuron, a mathematical function that he called a perceptron.
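The computation described here, a weighted sum fed through a step function, fits in a couple of lines. A minimal Python sketch follows; the weights and threshold are arbitrary illustrative values, not anything Rosenblatt used.

```python
def perceptron(inputs, weights, threshold):
    """Rosenblatt-style unit: output 1 if the weighted sum of the
    inputs clears the threshold, otherwise 0 -- a discrete step."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0

# Two inputs weighted equally; the unit fires only when the
# combined, weighted evidence exceeds the threshold.
print(perceptron([0.9, 0.8], weights=[0.5, 0.5], threshold=0.7))  # 1 (fires)
print(perceptron([0.3, 0.4], weights=[0.5, 0.5], threshold=0.7))  # 0 (silent)
```

Everything that made the perceptron both promising and limited is already visible here: all of the unit's "knowledge" lives in the weights and the threshold.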
At first, the perceptron seemed promising. After learning from some examples, perceptrons could do some basic things, giving a yes-or-no answer to an input that hadn't previously been analysed by the machine. Let's say you fed the model some data about the strength and speed of athletes in a sports team. Learning from those two variables, the model could answer whether or not a new athlete would be likely to be accepted into the team. As the field matured, however,
flaws in the perceptron became clearer. Because a perceptron worked like just a single artificial neuron, it couldn't be trained to recognise more complex patterns. What about, for example, athletes who were neither particularly fast nor strong, but had really good technique?
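Rosenblatt's perceptron came with a simple learning rule: nudge the weights whenever the prediction is wrong. A minimal sketch, with an invented toy dataset standing in for the athletes example:

```python
# Rosenblatt's perceptron learning rule on a toy problem in the spirit of
# the athletes example above. The dataset and numbers are invented for
# illustration; only the update rule itself is the historical algorithm.

def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """Learn two weights and a bias with the classic perceptron update."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(samples, labels):
            pred = 1 if (w[0] * x1 + w[1] * x2 + b) > 0 else 0
            err = y - pred  # -1, 0 or +1: nudge weights only when wrong
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

# Athletes as (strength, speed); 1 = accepted, 0 = rejected.
samples = [(0.9, 0.8), (0.8, 0.9), (0.2, 0.1), (0.1, 0.3)]
labels = [1, 1, 0, 0]
w, b = train_perceptron(samples, labels)

def predict(x1, x2):
    return 1 if (w[0] * x1 + w[1] * x2 + b) > 0 else 0

print([predict(x1, x2) for x1, x2 in samples])  # [1, 1, 0, 0]
```

On data like this, which a straight line can separate, the rule is guaranteed to converge; the trouble starts when no straight line will do.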
In 1969, Marvin Minsky and Seymour Papert co-authored Perceptrons, a book that demonstrated mathematically that if all you have is a single-layer neural network, then you can only compute linear functions. And if you can compute a linear function, then you can have a closed-form solution; there's no need for machine learning.
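That single-layer limitation can be made concrete with XOR, the classic example of a truth table no single threshold unit can reproduce. This sketch brute-forces a coarse grid of candidate weights (the grid itself is an arbitrary choice for the demo):

```python
# The single-layer limit, made concrete: search a coarse grid of weights
# and thresholds and ask whether any single threshold unit reproduces a
# target truth table. AND is easy; XOR is impossible for ANY weights, so
# the grid search is guaranteed to come up empty.
import itertools

def unit(x1, x2, w1, w2, t):
    return 1 if w1 * x1 + w2 * x2 > t else 0

def representable(target):
    """Does any (w1, w2, t) on the grid match `target` on all four inputs?"""
    grid = [i / 2 for i in range(-4, 5)]  # -2.0 to 2.0 in steps of 0.5
    for w1, w2, t in itertools.product(grid, repeat=3):
        if all(unit(x1, x2, w1, w2, t) == target[(x1, x2)]
               for x1, x2 in target):
            return True
    return False

AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
print(representable(AND))  # True
print(representable(XOR))  # False
```

Geometrically, a single unit can only draw one straight line through the input space, and no line puts XOR's two "on" corners on one side and the two "off" corners on the other.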
This work triggered the first AI winter because people lost faith in what would be possible. It became clear that if artificial neural networks were to work, they'd have to have more layers of perceptrons to deal with the complexity of the real world. During the AI winter that Daniela mentioned, funding dissipated and interest in the very idea of creating artificial neural networks dwindled.
There was very little progress until the 1980s. But some researchers did persist during that time, and they found other pathways to creating intelligent machines. Back in 1950, Alan Turing had introduced a very important benchmark for machine intelligence in his paper "Computing Machinery and Intelligence". This is the paper that introduced the Turing test,
the idea that you can judge whether a machine is intelligent by conversing with it: if you cannot tell whether, on the other side, you're talking to a machine or to a person. So there have been many efforts to build intelligent machines that pass the Turing test, in particular at MIT.
One of the earliest systems, called ELIZA, was introduced, and it enchanted people for a long time. ELIZA was one of the first so-called chatterbots, which took the world by storm in 1966. Its creator, Joseph Weizenbaum, put the bot in the role of a psychotherapist. A user would type a message on an electric typewriter and the machine would reply.
Men are all alike. In what way? They're always bugging us about something or other. Can you think of a specific example? Well, my boyfriend made me come here. Your boyfriend made you come here? He says I'm depressed much of the time. I am sorry to hear you are depressed. It's true. I am unhappy.
The early AI systems used a lot of what are called rule-based systems: you need to solve a problem, you identify some rules, and then you write a program that makes decisions according to the set of rules. This is sometimes called good old-fashioned AI. ELIZA didn't use an artificial neural network, and it didn't learn from its inputs. Instead, the program recognised keywords and reflected them back in the form of simple phrases or questions,
supposedly modelling the kind of conversation that you'd expect with a therapist. It was almost like a mirror. ELIZA did not pass the Turing test, which was in fact the point. The researchers behind the bot designed ELIZA to show how superficial the state of human-to-machine conversation really was. But in reality, it had the opposite effect. People became engaged in long, deep conversations with the computer program.
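The keyword-and-reflection trick can be sketched as below. The real ELIZA used ranked keywords and decomposition templates driven by a script; these few rules are invented to show the flavour:

```python
# A tiny sketch of ELIZA's keyword-and-reflection trick. These rules are
# invented for illustration; the real program's scripts were far richer.

REFLECTIONS = {"my": "your", "me": "you", "i": "you", "am": "are"}

def reflect(text):
    # Strip trailing punctuation, lowercase, and swap pronouns word by word.
    words = text.lower().rstrip(".!?").split()
    return " ".join(REFLECTIONS.get(w, w) for w in words)

def respond(text):
    t = text.lower()
    # Rule 1: a keyword triggers a canned, sympathetic reply.
    if "depressed" in t:
        return "I am sorry to hear you are depressed."
    if "unhappy" in t:
        return "I am sorry to hear you are unhappy."
    # Rule 2: statements about "my ..." are mirrored back as questions.
    if t.startswith("my "):
        return reflect(text).capitalize() + "?"
    # Fallback: a generic prompt keeps the conversation going.
    return "Can you think of a specific example?"

print(respond("My boyfriend made me come here."))
# Your boyfriend made you come here?
print(respond("He says I'm depressed much of the time."))
# I am sorry to hear you are depressed.
```

Nothing here understands anything; the mirror effect alone is enough to keep a conversation feeling responsive.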
You know, that's really incredible. It's as if it really understood what I was saying. But it doesn't, of course. It's just a bag of tricks. Oh, I get it. It hasn't the faintest idea what I'm talking about. ELIZA was not an intelligent machine, but it made people stop and think about what the world might be like if artificial intelligence did come along.
It was perhaps also the first time that humans showed how willing we all are to believe that computers could be intelligent if they spoke to us in our own language. It's another example of what Dan Glazer described earlier as the innate desire of humans to anthropomorphise everything in the world around us.
Of course, in the decades since ELIZA, chatterbots became chatbots. And that's not all. These days, our conversations with chatbots easily pass the Turing test. But how did the skills of chatbots that we see today emerge from the primitive AI of the 1960s? What was it that made the theory of artificial neural networks actually work in practice?
At its core, the answer lies in the insight that artificial neurons had to be layered on top of each other, like neurons are in the human brain. And so, at the end of the 1960s, researchers came up with the idea of the deep neural network. The deep-learning revolution that came several decades later happened in no small part thanks to three scientists who would later become known as the godfathers of AI.
The systems we were building were so dumb and so weak and so difficult to train. That's one of the so-called godfathers, Yoshua Bengio. He is a computer scientist at the University of Montreal and he was a key figure in the development of deep learning. What got me really excited when I started reading some of the early neural net papers from the early 80s is the idea that our own intelligence with our brain
could be explained by a few principles, just like think of how physics works. Could it be possible that we would do something similar for understanding intelligence? And of course,
take advantage of those principles to design intelligent machines. And in fact, it goes also in the other direction, because there are experiments we can run in computers that we can't run on real brains. And so the work we've been doing in AI is also informing theories of how the brain works. So it's a two-way street. So that synergy, and that idea that maybe there is an explanation for intelligence that we can communicate as a scientific theory, is really what got me into this field. Talk to us about what the challenge was in trying to model the human brain in silicon. Well, we didn't try to model the human brain in silicon, because that would have seemed too daunting a task. Instead, we looked at the simplest possible models that come from neuroscience and see how we can tweak them. In the early days, when I was doing my PhD,
we were trying to use these systems to classify simple patterns, like the shapes of characters, or phonemes, using a sound recording of me saying "ah, e, oh". Can a neural network, which is this very simplified calculation inspired by neurons in the brain, learn to distinguish between those different categories of objects in the input?
I've been working on this from the mid-80s to the mid-2000s. What were some of the first things you tried to do with the neural networks to prove that they could be useful? So in the 90s, I worked on these pattern recognition tasks, both speech and image classification tasks.
and industrial applications emerged. For example, I worked on a project to use neural nets for classifying amounts on checks to automate the process of making sure that a check you deposit at the bank has the right amount. And that was actually deployed in banks in the 90s and processed a large number of checks. All of the approaches that had been tried before didn't do very well because
there is so much variability between people. We write in different ways. So it was not trivial, and addressing that challenge already had a lot of economic value. Next week, we'll look at exactly how artificial neural networks allowed machines to learn. And we'll also examine the clever maths that allowed all of this to happen.
People realized that if you could insert a middle layer, which is sometimes called a hidden layer, these systems could actually compute many more functions. That's next time on Babbage. Thanks to Daniel Glazer, Daniela Rus, Yoshua Bengio, The Economist's Ainsley Johnston and all of the people she spoke to at the UK Biobank.
And thank you for listening. To follow the next stage of our journey to understand modern AI, subscribe to Economist Podcast Plus. Find out more by clicking the link in the show notes. Babbage is produced by Jason Hoskin and Kunal Patel, with mixing and sound design by Nico Rofast. The executive producer is Hannah Mourinho. I'm Alok Jha, and in London, this is The Economist.
And now, a next-level moment from AT&T business. Say you've sent out a gigantic shipment of pillows, and they need to be there in time for International Sleep Day. You've got AT&T 5G, so you're fully confident. But the vendor isn't responding, and International Sleep Day is tomorrow. Luckily, AT&T 5G lets you deal with any issues with ease, so the pillows will get delivered and everyone can sleep soundly, especially you. AT&T 5G requires a compatible plan and device. Coverage not available everywhere. Learn more at att.com slash 5G network.