Hello, this is a free episode of Babbage. To listen to the show every week, you'll need to be an Economist subscriber. For more information, search online for Economist Podcasts Plus or click the link in the show notes.
Six months from now, you could be running a 5K, booking that dream trip, or seeing thicker, fuller hair every time you look in the mirror. Through HERS, you can get dermatologist-trusted, clinically proven prescriptions with ingredients that go beyond what over-the-counter products offer. Whether you prefer oral or topical treatments, HERS has you covered.
Getting started is simple. Just fill out an intake form online and a licensed provider will recommend a customized plan just for you. The best part? Everything is 100% online. If prescribed, your treatment ships right to your door. No pharmacy trips, no waiting rooms, and no insurance headaches.
Plus, treatments start at just $35 a month. Start your initial free online visit today at forhers.com slash deal. That's F-O-R-H-E-R-S dot com slash deal. Compounded products are not FDA approved or verified for safety, effectiveness, or quality. Prescription required. Price varies based on product and subscription plan. See website for full details, restrictions, and important safety information. The Economist.
The process of breast screening is where two human readers look at the mammograms and then they make a decision on that, and if they disagree, a third human reader is brought in. Gerald Lipp is a radiologist in charge of screening for breast cancer in the northeast of Scotland.
Gerald and his team recently wanted to find out how an artificial intelligence tool could help their service. We had 10,000 women go through the system, and where two humans said there was nothing seen on the mammograms, this is when the AI kicked in. The AI would then give us a reading and say, there's something on this mammogram, and there'll be a circle on the mammogram saying, would you like to look at this area a bit more closely?
One of the very first few patients we did this for was a lady named Barbara. Barbara was reaching her 70s, so this would have been her last time coming to the screening program. We wouldn't have seen her again. And the AI did highlight a very subtle area in the right breast which two human readers had already said was okay, and they weren't worried about it.
We, the expert readers, augmented by the AI, looked at this area again and said, let's give her a chance, let's call her back and take a look at this. I think I was the one actually who did her ultrasound and her biopsy, and found a tiny little thing, probably about five millimeters in size, in the breast. And when the results came back, lo and behold, Barbara was one of 11 women that we managed to pick up using the AI as an extra reader or a fail-safe
where the humans had said this was normal and in her case it was a significant cancer because otherwise it would have been growing in the background without her knowing. You might have heard stories like Barbara's before. In the past five years or so computer vision tools have been shown to spot everything from fractures on x-rays to the early signs of disease from retinal scans. They've even been used to help with surgeries.
That list of uses of AI is set to explode in the coming years.
I know I'm a very good radiologist already because the breast screening program shows that we are good, but the AI gives us that little edge. In screening, we look at a lot of normal mammograms. You're looking at six to eight in a thousand mammograms to find cancer. That's 900 mammograms that are normal. So if I can automate and take away some of those 900 mammograms in a thousand, that actually gives me more time to be patient-facing.
New, bigger AI models promise to make access to healthcare easier, help doctors spot disease faster and transform how medical research is done. But promise is one thing, practice is another. How soon will these technologies really benefit patients? This podcast is the first of two episodes on how the world of healthcare is grappling with the potential of AI.
First, we'll examine where and when patients could benefit from AI models in healthcare. And next week, we'll gaze into the future. How will medical researchers and doctors use the latest AI models to understand and treat disease in completely new ways? I'm Alok Jha, and this is Babbage from The Economist. Today, part one, how AI can improve the experiences of patients. ♪
To talk about the emerging role of artificial intelligence in healthcare is The Economist's health editor, Natasha Loder. She recently put together a special report on the topic. Thanks for joining me, Natasha. Thank you for having me. Now, Natasha, artificial intelligence and healthcare is a topic that, you know, we've covered quite a lot in the past. There have been examples of useful places where it's been applied and people have talked about its potential for a while.
Just right now, what is the specific focus that makes it interesting for you? Well, I suppose it's generative AI in the last year that has been particularly creating quite a buzz around what could be possible in medicine using both the AI tools of the past and the ones that we're going to invent. And, you know, the possibilities lie not just in diagnostics, but also in
tools for giving personalised health advice, the ability to triage lots of patients very quickly, to write patient notes and summarise meetings with patients, and even to discover new insights from medical data and do drug discovery. So there's a sort of really wide range of possible uses of AI. Okay, so...
Just tell me then, in your opinion, if you were looking at this in the broadest brush strokes, what's the biggest problem that you see healthcare workers being able to tackle with AI sort of in the future?
I mean, fundamentally, what AI is able to do is replicate quite sophisticated forms of human knowledge. And so if you think about what are the problems writ large in healthcare, it is that there just aren't enough doctors or nurses, psychiatrists, you name a medical profession, there aren't enough of them to service the needs of the seven or so billion people on the planet.
So it's that access problem, isn't it? That seems to be a really interesting and intriguing way that this might go. If you've got generative AIs that can replace some of the functions of clinical staff, essentially, you just have more people able to access those services. That's what you're saying.
Yeah, and we're in this really difficult situation at the moment where globally our population is getting older, healthcare costs are rising, there's a real burden of chronic diseases, and there's this sort of massive shortage of healthcare workers. And so even in the short to medium term, if we can allow these technologies to be harnessed,
There's actually a lot of potential for improving productivity and meeting human need, even if we don't, in the big picture, solve healthcare for the next 10 to 20 years. So the question is how much can we wring out of this technology and how quickly? OK, well, that's what we'll explore next in the show. Natasha, we'll come back to you a bit later on.
One of the tools that's using computer vision to help diagnose patients is called MIA. It's an AI model built by the company Kheiron Medical and it's a system that Gerald Lipp, who we heard from at the start of the show, has been using to detect cancers from mammograms in Scotland.
Many people don't necessarily know that usually two radiologists are required to look at every single mammogram in breast cancer screening. Peter Kecskemethy is the chief executive of Kheiron. You can imagine it's a large volume, and that large volume has to be looked at by two doctors. Whenever they disagree, a third doctor is brought in.
But even with two doctors, it's estimated and expected that more than one in five cancers will be missed. So obviously that means a lot of improvement can be made.
So what can the AI do? Well, it can increase the cancer detection. We have results that show we can increase cancer detection by up to 13%. With that, we can expect that we can actually halve the number of cancers missed in breast screening. Okay, well, talk to me about the technology then. What kind of AI model is behind the tool that you're describing? How does it work?
Yeah, so MIA is based on a large number of varied AI components.
This means convolutional neural networks primarily. Okay, so the input is the scan or mammogram and the data is passed through layers of the neural network. Each one looks for specific things in the image that might make up parts of a lesion or some other problem. And once that data has passed through all the layers of the neural network, the algorithm can determine whether or not that patient could have cancer.
And MIA passes images through several different networks like this before it comes to a decision. So think about a large group of experts, like, I don't know, hundreds of radiologists sitting in a room looking at every single mammogram, and then they're making a decision collectively in order to have the best possible results. And if you think about it, the AI can find information on its own in a data set, for instance, among the breast cancer screening images,
that humans may not be able to process. Maybe it's different motifs, maybe it's finer tones of grayscale, etc. And from a practical point of view, this means the AI can have a very different capability from the doctors'. And that means we need to use the AI and optimize it to be the best complementary capability to the doctors', rather than trying to compete with the doctors.
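To make the idea concrete for readers, here is a minimal sketch, in PyTorch, of the kind of setup Peter describes: several small convolutional networks each score a mammogram, and their scores are pooled like a panel of readers voting. Everything below, the layer sizes, the number of "readers", the class names, is illustrative only, not MIA's actual architecture.

```python
# Toy sketch of a CNN "reader ensemble" for mammograms; illustrative only.
import torch
import torch.nn as nn

class TinyMammogramCNN(nn.Module):
    """One 'reader': a small convolutional network that outputs a suspicion score."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, 1)

    def forward(self, x):
        h = self.features(x).flatten(1)
        return torch.sigmoid(self.classifier(h))  # probability-like score in [0, 1]

class ReaderEnsemble(nn.Module):
    """A 'room full of readers': average the scores of several independent CNNs."""
    def __init__(self, n_readers=5):
        super().__init__()
        self.readers = nn.ModuleList([TinyMammogramCNN() for _ in range(n_readers)])

    def forward(self, x):
        scores = torch.stack([r(x) for r in self.readers], dim=0)
        return scores.mean(dim=0)

# Usage: a fake single-channel "mammogram" batch of two 256x256 images.
ensemble = ReaderEnsemble()
fake_scans = torch.randn(2, 1, 256, 256)
print(ensemble(fake_scans))  # one suspicion score per scan
```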
Tell me what the model is trained on. How do you make the neural network able to recognise what a breast cancer looks like on a mammogram? So we used the longitudinal data on the whole patient. So for negatives, we wanted to make sure that for multiple years there were no positive signs for the patients. And for positives, we wanted to make sure we account for all the different aspects of a positive cancer.
And it is important that we are training the algorithm on as much of the truth as we have access to, like biopsies and other follow-ups, rather than what the doctors say, because otherwise we would be just approaching what the doctors can do and wouldn't be able to develop complementary capability. So we're talking about images from mammograms, but also data from biopsies that are linked to those images, perhaps.
And all of this is labelled so that the model knows which ones contain potential tumours and which ones don't. Yes, and the labelling process is one of the key elements of machine learning engineering. Really understanding the domain, what constitutes a positive, what's a negative, and to understand when something is certain or not. And typically how much data is going into it? I mean, how many scans, how many biopsies, what sort of order of magnitude are we talking about?
The order of magnitude is a couple of million. It's important for AI that it's not necessarily the number of cases that you need to look at, but you need to optimize for the maximum amount of information that it is trained on. So for instance, if I have an algorithm that is trained on
one particular country in Europe. And if you get more of those cases, that's much less valuable than if you get different ethnicities from across the globe, from Asia, from South America, Africa. I see. So you want as diverse a set as possible around the world? Diverse in a specific way that it gives you the highest information so that you can reduce confounders. Okay. So where is MIA currently in clinical use then?
It is in clinical use in Hungary and in the UK. And just talk me through how a radiologist would be using it. So, you know, a patient comes in, they have their scan, there's a mammogram available. Does that go into the model sort of in parallel with the radiologists looking at the same thing? Yes. Once the patient is scanned, the AI will...
make a determination in a couple of seconds, so even before the patient leaves the room. But those results may be held back in order to support an unbiased process. It's a very good idea for the doctors to look at images without the AI's input first. So they don't get biased on it. Exactly, so that they don't get biased, so that the maximum human capability is used, and then the AI then determines whether an adjustment is to be made, for instance, or not.
And then it could flag up to the lead radiologist, hey, maybe a cancer has been missed or your radiologists are potentially over-diagnosing, and things like that. It's always a human radiologist who makes the final decision. They have the authority. The AI is a tool to flag when the AI is certain.
And so the doctors have a high chance of changing the patient outcome for the better. Peter, well, that's been fascinating. Thank you very much for your time. Thank you so much. There's huge potential for computer vision systems in radiology to detect a specific disease from a medical image. But AI's ability to make life easier for patients goes far beyond that, as Jason Hoskin, one of our producers, has been finding out.
As you walk down the streets of Boston, it's hard not to be reminded of the city's role as a hub of innovation.
Central to scores of bright ideas in biotechnology are two world-leading institutions, the Massachusetts Institute of Technology and Harvard University. My interest is in how we build AI doctors. That's Pranav Rajpurkar. He's an assistant professor at Harvard's medical school. Five years back, if you were someone working in this space, you would typically have a modality specialty and a disease specialty.
So for example, I was interested back in the day in working on chest x-rays, and particularly on chest x-rays to identify signs of pneumonia.
But the way the field has evolved over time, what's been increasingly clear is that you can start to think about what it takes not to solve one problem, but to solve the full range of problems that a radiologist works on. So now my group works on thinking about interpreting chest x-rays and/or CT scans and/or MRI scans.
And we don't think of it from a disease by disease perspective, but really think about it from what we call a generalist medical AI perspective, where you're trying to build these AI models that can do a lot more than just one disease detection. And so why has the field changed? What's the kind of technology that's enabled that? Well, there have been three key techniques that have really created advancements. One of them is self-supervised learning.
The idea of self-supervised learning is to be able to learn without requiring lots of annotations. And we have developed some innovations on that front, along with several other groups in the world. So by annotations, you mean that images are labeled in the training set. So in the breast cancer example, experts would go through and tell the model what a positive case for cancer looks like and what a negative case looks like. And it learns from that.
But self-supervised learning is making that process less onerous. So that's the first thing that's changed. What's the next thing then? The second set of techniques that's been really taking off is multimodal learning.
In multimodal learning, we're interested in learning not just from a single modality. For example, I'm not just interested in learning from chest x-rays. I might be interested in learning from chest x-rays and the associated clinical reports in order to understand, for example, what chest x-ray matches which clinical report.
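For readers who want to see what that image-report matching looks like in code, here is a minimal sketch of contrastive image-text training in that spirit: the only supervision is which report belongs to which x-ray, so no hand-drawn radiologist annotations are needed. The tiny encoders below are stand-ins, not any published model.

```python
# Toy sketch of contrastive image-report training: the "label" is simply
# which report goes with which x-ray in the batch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyImageEncoder(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, dim))
    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

class ToyTextEncoder(nn.Module):
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab, dim)   # bag-of-words stand-in for a real text model
    def forward(self, token_ids):
        return F.normalize(self.embed(token_ids), dim=-1)

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    # Similarity of every x-ray with every report in the batch; the "answer"
    # is the diagonal, i.e. each image's own report.
    logits = img_emb @ txt_emb.t() / temperature
    targets = torch.arange(len(img_emb))
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2

img_enc, txt_enc = ToyImageEncoder(), ToyTextEncoder()
xrays = torch.randn(4, 1, 64, 64)                  # fake chest x-rays
reports = torch.randint(0, 1000, (4, 20))          # fake tokenised reports
loss = contrastive_loss(img_enc(xrays), txt_enc(reports))
loss.backward()                                     # gradients flow to both encoders
print(float(loss))
```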
Or I might have some associated clinical information, like what the patient history is. And how we feed in both the image information and that clinical context information has been something we've just gotten much better at modeling over the last few years. Okay, and what's the final thing? The third bucket is generative AI to generate language
to describe what we see in the image. And this is very close to how radiologists do it. They will write a block of text that goes comprehensively through every organ, every process, commenting on it.
And this is a technology that's really been unlocked in the last couple of years and allowed us to make these much more interesting models that can start to look like what we expect from radiologists. And your system incorporates all three of those things? That's correct. So how well does it work? That's a great question. So for instance, we published a study on an algorithm called CheXzero,
where we were able to achieve the same performance as radiologists on detecting diseases from chest x-rays, but this time using almost no annotations that were explicitly provided by radiologists. One year later, we're at an even better milestone, where recently our group has shown that we can directly output these clinical reports
And when we give those reports to radiologists, and some of the time they come from the model, and some of the time they come from another radiologist, we ask, how often do you prefer the report that was generated by the model? And we're finding out it's about comparable at this point.
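As a rough illustration of the report-writing step Pranav describes, here is a toy sketch in which a text decoder is conditioned on an image feature and writes a report one token at a time. The model is a hypothetical stand-in, not the group's actual system, and a real tokenizer would map the output ids back to words.

```python
# Toy sketch of generating a report conditioned on image features; illustrative only.
import torch
import torch.nn as nn

class ToyReportDecoder(nn.Module):
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def generate(self, image_feature, start_token=1, max_len=30):
        # The image feature becomes the decoder's initial hidden state,
        # so every word the model writes is conditioned on the scan.
        hidden = image_feature.unsqueeze(0)            # (1, batch, dim)
        token = torch.full((image_feature.size(0), 1), start_token)
        report = []
        for _ in range(max_len):
            step, hidden = self.rnn(self.embed(token), hidden)
            token = self.out(step).argmax(dim=-1)      # greedy: pick the likeliest next word
            report.append(token)
        return torch.cat(report, dim=1)                # token ids; a tokenizer would turn these into text

decoder = ToyReportDecoder()
fake_image_features = torch.randn(2, 64)               # e.g. from an image encoder
print(decoder.generate(fake_image_features).shape)     # (2, 30)
```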
So I think some of the stuff that you've been saying so far, you're trying to build AI doctors, the algorithms are getting as good as radiologists. This will cause a bit of fear in people, right? Are the AIs replacing the doctors? Well, I think there is a conventional approach to thinking about this, which is that these are not going to replace radiologists. And I think that's, for the major part, just wrong.
Right now, you just have a big supply shortage. So in India, the volume has gotten so high, chest x-rays are not even reported by radiologists. That means you have someone who is frankly not comfortable reading chest x-rays a lot of the time, and they're being pressured to be able to make these decisions which have big consequences downstream for patients.
So now think about bringing AI into the picture. If you had autonomous radiologists and you deployed those to be able to provide either preliminary reads or service in areas where there weren't experts to be able to do the interpretation, just imagine how much positive benefit that has to society.
We're just not in a situation where there's an alternative. Training of radiologists globally is at a standstill. What you have an exponential curve in is the rise in the imaging demand of an aging population. This is a 5% compound annual growth rate. And so you think about this problem just getting worse and worse and worse.
And the solution that we have on the horizon is going to be AI models that can do these reads. I can see that in areas where it's difficult to access care, an autonomous radiologist would be really useful. But people would definitely be freaked out by the idea of that, particularly when you're thinking about things like trust.
You know, I often ask this in seminars I give. I ask people, how many of you trust your doctor? And it's surprising that not a lot of hands actually go up.
And it's interesting to ask, why do we trust doctors? Well, one is they have a certification that says I'm qualified to be able to be a radiologist and they have to renew it every few years. And so that is a system we have set up in a community that understands how we think about the licensing of human doctors.
So similarly, I think we need principles for licensing these AI medical doctors. Okay, now all of this futuristic technology that you've talked about relies on generative AI. And I think most people's interactions with generative AI will have been playing around with the likes of ChatGPT. You know, maybe people have used them to check out symptoms that they might have. But ChatGPT obviously isn't licensed as a medical device.
Is that sort of general technology useful though? So I was recently trying to find out what I can do for some back pain that I had. And when I looked at Google versus what I was able to understand and collect from ChatGPT, there's a vast difference in terms of the depth of the information that it can provide you.
Now, of course, that comes with the question, well, how will I validate that what it's telling me is true? Well, I can always look up on independent sources what those independent sources have to say about what ChatGPT thinks about my medical concern. But this has actually not been
the exception as much as I thought it would have been. Whether we like it or not, we have been, at least on the patient side, using these systems. The caveat here is also that these
big companies typically don't want to be making medical recommendations. And so they tune their model in order to be able to refrain from making medical recommendations. And that's probably a good thing from a regulatory standpoint. Then they don't have to worry about being classified as a medical device.
But I think it does come with a trade-off. An early version of ChatGPT was actually evaluated in the New England Journal of Medicine for its ability to answer medical questions. And it turns out it actually did pretty well.
That sort of gives you an insight into how powerful these systems are. And frankly, there are a lot of places where having such tools is going to make an incredible difference to the quality of healthcare that can be received. That's fascinating. Pranav, thank you so much for talking to me. Thanks for having me.
You can see why large language models, such as the one that powers the likes of ChatGPT, could help to understand an illness better. After all, they've ingested pretty much all of the data available on the internet and tried to make sense of it. That includes factual websites provided by hospitals or healthcare systems, articles in medical journals and even first-hand accounts posted online by doctors or patients.
That ginormous training set might also include inaccuracies or misinformation, though. What's more, the way that generative AI produces its results makes it hard to fully trust. But can those challenges be overcome? Could the conversational powers of ChatGPT be harnessed to create reliable medical bots? That's coming up.
Six months from now, you could be running a 5K, booking that dream trip, or seeing thicker, fuller hair every time you look in the mirror. Through HERS, you can get dermatologist-trusted, clinically proven prescriptions with ingredients that go beyond what over-the-counter products offer. Whether you prefer oral or topical treatments, HERS has you covered.
Getting started is simple. Just fill out an intake form online and a licensed provider will recommend a customized plan just for you. The best part? Everything is 100% online. If prescribed, your treatment ships right to your door. No pharmacy trips, no waiting rooms, and no insurance headaches.
Plus, treatments start at just $35 a month. Start your initial free online visit today at forhers.com slash deal. That's F-O-R-H-E-R-S dot com slash deal. Compounded products are not FDA approved or verified for safety, effectiveness, or quality. Prescription required. Price varies based on product and subscription plan. See website for full details, restrictions, and important safety information. Today on Babbage, we're asking how medical AI tools can improve the experiences of patients.
Natasha Loder, The Economist's health editor, who we heard from earlier, has been looking into the use of chatbots in healthcare.
Natasha, we just heard that ChatGPT can in fact be a useful tool for medical advice. You're sitting here in the studio with me. So let's test ChatGPT out for some medical problems that you might be having. I have to say I'm sceptical, but you've got ChatGPT on your phone there. And the latest update means that it speaks now. So why don't you give it some symptoms and see if it can work out what it thinks is wrong with you. OK. All right.
Hi. I'm having issues right now. Could not understand what you said. Hi, can you help me with something? Of course. What do you need help with? Well, I've got a medical problem that I'm worried about. I've been quite short of breath today and I have a sharp pain in my chest, just slightly to the left. It sounds like you're experiencing symptoms that could be serious.
It's really important to seek immediate medical attention for symptoms like shortness of breath and sharp chest pain. Can you call a healthcare provider or go to an emergency room? Yeah, I could do that. That sounds like a good plan. It's important to get checked out by a professional as soon as possible with symptoms like those. Ask ChatGPT what he or it actually thinks is wrong with you. What do you actually think is wrong with me though? I kind of need to know.
I'm not a doctor so I can't diagnose what's wrong, but symptoms like shortness of breath and sharp chest pain could be caused by a variety of things, from something less serious like muscle strain to more urgent issues like a heart problem or a lung issue. What kind of heart problem could it be? Chest pain and shortness of breath can be associated with several heart-related issues. Here are a few possibilities:
One, heart attack. This occurs when blood flow to a part of the heart is blocked for a long enough time that it dies. Two, angina. This is a symptom of coronary artery disease and occurs when the heart isn't getting enough blood; the pain can sometimes spread to the jaw or back. Three, pericarditis. This is an inflammation of the lining around the heart. A medical evaluation can determine the exact cause and appropriate treatment.
Now, that's interesting, because there are so many guardrails built into that version of ChatGPT that you're using that clearly it doesn't want to tell you exactly what it thinks is wrong. And it's saying it's not a doctor and it's saying you should get medical advice, which is probably sensible, right? That's exactly the right thing to do. But it took several minutes there before it suggested that you might be having a heart attack, which, I mean, it's not funny. You'd be lying on the floor at that point,
incapable of moving. How useful is that really in medical advice? Well, it's clearly not intended to be useful. Not that version, no. But I mean, we're a long way technologically from being in a situation where we could recommend that anyone having these symptoms could just go and safely have a conversation with a chatbot and that would be the best course of action. I mean, it is good advice.
As far as I can tell. It's all correct. I'm not a doctor, but if you're going to dispense diagnoses, there are rules for how you do this. And there are tests that you have to do. And, you know, there are regulations you have to follow. And that's not what this particular bit of technology has been through. I think that it's unfair to just expect
the publicly available ChatGPT model to be giving us medical advice. I think it's not terrible. I mean, look, I think the reason this is a good test is that we know that people have been Googling their symptoms, asking Dr. Google, as they call it, for years. And we now know that people are not only ChatGPT-ing, if that's a way of putting it, their symptoms, we know that doctors are doing it.
And we also know that people are solving medical mysteries using these tools. But despite the promise then, can you explain to me why a generalized model of the type that powers ChatGPT isn't ever going to be a medical device? The missing bit of information we don't have is we don't know the number of people who are using these tools and getting the wrong answer or getting false reassurance that they're fine when they're not.
And that's the kind of problem that we haven't even begun to solve with these tools. And that is why we can't rely on them. Also, because they're generative, they come up with a different answer each time you ask them. So, you know, who knows after the 10th time of asking that question whether it would give me the same accurate advice. And of course, you know, we've talked about hallucinations before. These generative AIs
hallucinate, they make facts up. It's hard to understand how they come to their conclusions sometimes. So these are all problems for trust. And of course, trust is probably the most important thing when it comes to medicine, because you need to understand how whoever's treating you, giving you advice, comes to their conclusions, and what happens if things go wrong as well. So I suppose the next question is,
Could you make a more specialised healthcare chatbot? And are people trying to do these things? And what benefits would they offer? Well, look, you can get medical chatbots. There's one called Ada Health, which is a text-based chatbot. So you type in your symptoms. It's a symptom checker. It doesn't promise diagnoses, but it's been crafted over many years and has had all the sort of data that it needs
built into it so that it can essentially assess what you type in and ask questions and then kind of weigh up what the symptoms could possibly be and then give you a range of likely answers. And that's a regulated medical device. So we have that.
The problem, of course, is that these kinds of chatbots are a bit clunky; you have to go through a question-and-answer tree. It can be quite a lengthy process. You may have a cold and you may have to sit down answering lots of very different questions, some of which seem quite strange.
But so the question is, can we bridge the two systems? Can we bridge this accurate text-based chatbot with this amazing tool that is conversational ChatGPT? Can we sort of have an interface that interrogates this much more accurate medical chatbot? And that is very much what people are thinking. And so the next
step in the evolution could be where you have a conversation with something that sounds like chat GPT, but kind of under the hood, you have a kind of much more solid, robust, explainable and provable AI chatbot.
I mean, there are ways of making sure that the outputs of a large language model are more accurate and they're not inventing information. I mean, this is a problem across all sectors. And the solutions I've heard of are things like only allowing the large language model to access data from a particular database. So you've picked
papers, or a database of medical information that is correct, and you ask the large language model a question but say to it: only find your answer from that particular database. It's called retrieval-augmented generation.
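As a concrete, minimal sketch of that retrieval-augmented setup: the model is only allowed to answer from passages pulled out of a trusted corpus. The retrieval below is a toy word-overlap ranking, and ask_llm() is a placeholder for whichever language model API you would actually call; the corpus snippets are made up for illustration.

```python
# Toy retrieval-augmented generation: answer only from a trusted corpus.
TRUSTED_CORPUS = [
    "Angina is chest pain caused by reduced blood flow to the heart muscle.",
    "Pericarditis is inflammation of the sac surrounding the heart.",
    "A pulled chest muscle usually causes pain that worsens with movement.",
]

def retrieve(question: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank passages by how many question words they share (real systems use embeddings)."""
    q_words = set(question.lower().split())
    scored = sorted(corpus, key=lambda p: len(q_words & set(p.lower().split())), reverse=True)
    return scored[:k]

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("stand-in for a call to an actual language model")

def answer(question: str) -> str:
    passages = retrieve(question, TRUSTED_CORPUS)
    prompt = (
        "Answer using ONLY the sources below. If they do not contain the answer, say so.\n\n"
        + "\n".join(f"- {p}" for p in passages)
        + f"\n\nQuestion: {question}"
    )
    return ask_llm(prompt)

print(retrieve("What causes sharp chest pain?", TRUSTED_CORPUS))
```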
And then another way of improving the responses from a large language model is to take a generalized large language model and fine-tune it with specific labeled information that is very highly accurate. Let's say medical information, which means that when you ask a question, it's not a generalized LLM anymore. It's got specific talents, knowledge about certain conditions.
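And a minimal sketch of that fine-tuning step, assuming a toy stand-in for the pretrained model: keep training it on vetted, labelled medical question-and-answer pairs so its next-word predictions lean on that curated material rather than on the open internet.

```python
# Toy fine-tuning loop on curated data; the GRU stands in for a real pretrained language model.
import torch
import torch.nn as nn

vocab, dim = 1000, 64
embed = nn.Embedding(vocab, dim)
lm = nn.GRU(dim, dim, batch_first=True)      # stand-in for a big pretrained model
head = nn.Linear(dim, vocab)                 # predicts the next token
params = list(embed.parameters()) + list(lm.parameters()) + list(head.parameters())
optimizer = torch.optim.AdamW(params, lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# Fake tokenised "curated medical Q&A" pairs: input tokens and the tokens that should follow.
inputs = torch.randint(0, vocab, (8, 16))
targets = torch.randint(0, vocab, (8, 16))

for step in range(3):                        # a few fine-tuning steps on the trusted data
    hidden_states, _ = lm(embed(inputs))
    logits = head(hidden_states)             # (batch, seq, vocab)
    loss = loss_fn(logits.reshape(-1, vocab), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"step {step}: loss {loss.item():.3f}")
```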
So it's less likely to give you bad information. Ultimately, this is all going to have to go through regulators. They're going to have to have some oversight over whatever people come up with. But I think there's a long way to go on the technological front before we can say, actually, no, this is too difficult and it can't be done. So the AI of today does seem to have a lot of promise. But what's going on in clinics right now?
I said at the start of the show that we'd look at the experiences of patients in the age of AI. But I highly doubt that what we've discussed so far is your experience when you visit your doctor or go to a hospital. It certainly isn't mine. Producer Jason Hoskin has been investigating the reason why.
If you develop an AI algorithm that can triage something or provide an indicative diagnosis, then the minimum thing that you have to do is run it by the regulators and get a regulatory clearance to allow you to put that device on market. Hugh Harvey used to be the clinical director at Chiron Medical, the company that we heard from earlier that uses AI to detect breast cancer.
Kheiron was the first AI system ever to get regulatory approval, so I did that. It was the first European clearance for a deep learning computer vision system, yeah. Hugh now runs a consultancy called Hardian Health.
They help other AI companies comply with medical device regulations. It's a step that is slightly onerous for first-timers in the industry, because you have to develop a quality management system. You have to show that any software used for a medical purpose is safe, effective and cyber secure, which I'm sure patients would agree are the minimum things they'd want to see from such devices. And how do they do that? How do they show that they're safe and all these things?
So you start off with your hypothesis that your device can do X functionality. You then do a literature review to determine how well that is currently done under standard of care within the hospital system.
You then develop your device in a documented process to show that you've identified risks and you've mitigated against those and to show that you have met the user requirements as you've identified. And then you run some form of clinical investigation to show that your device can do what you say it does. And you present that data to a regulator. So the tools that are being deployed in clinical settings right now, they're quite narrow and rigid in terms of what they can do.
But the new models, they're more generalist. And how much more of a challenge are generative AI models that, by definition, give a unique output every time? Are they more of a bottleneck and a challenge? Just to lay the background context here: not a single large language model has received regulatory approval for use for a medical purpose. And as you say, this can be due to the fact that these systems have a propensity to hallucinate, which, despite all our best efforts, we can't control to a certain extent. And they're non-deterministic, and what that means is that if you give a large language model a certain prompt or a certain input
you will not get the same output every single time. And that is something that the regulators so far have not been happy to give regulatory clearance for. It's accuracy as well. It's very, very difficult to measure the accuracy of a general system. If you have something that has potentially infinite inputs and infinite outputs, we don't know how often it's going to get it wrong. And it's not just how often it gets it wrong, it's how badly it gets something wrong.
And until we can measure that, these things aren't safe to be on market as an intended medical device. It is quite hard to conceptualise an academic framework by which you would evaluate that across its entire generalisability. And I've not seen anyone do that comprehensively yet.
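The non-determinism Hugh mentions is easy to illustrate: language models sample each next word from a probability distribution, so the same prompt can produce a different output on every run. The distribution below is made up purely for demonstration.

```python
# Tiny illustration of sampling-based non-determinism; the probabilities are invented.
import random

next_word_probs = {"angina": 0.4, "pericarditis": 0.3, "muscle strain": 0.2, "anxiety": 0.1}

def sample_next_word(probs: dict[str, float]) -> str:
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights, k=1)[0]

# The same input, run five times, need not give the same answer.
print([sample_next_word(next_word_probs) for _ in range(5)])
```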
Do you know of large language models and generative AIs that want to get regulatory approval? There are many that want to get regulatory approval. Absolutely. I had three inbound emails this morning with people asking how can we get regulatory approval. I was actually at a conference last week with a bunch of regulators and it was almost unanimously decided that current regulatory frameworks and the current evidence being generated by vendors is not sufficient to allow any of these to be used, especially in high risk situations.
You know, if you have a generalist model, you are claiming it can solve pretty much any medical problem. Well, that includes children with cancer or neonatal sepsis, which we all agree is a seriously high risk situation. And we should be focusing on more low risk use cases, perhaps things in the administrative realm, rather than trying to use it directly in the diagnostic realm. Do you think this is something that needs to be overhauled, though? Because clearly they have great potential in medical settings.
No, I would put it the other way. I would actually say the developers of large language models need to actually improve their evidence generation. If you want to make great claims, you need to provide exceptional evidence. And if we are going to, you know, unlock this apparent revolution in healthcare, surely then someone needs to demonstrate these things can be safe, effective and cyber secure, rather than say to the regulators, well, could you just lower the barrier?
Another point I'd like to make is that, you know, if someone invented a drug tomorrow, it would take them 10 years to get it to market. And I don't see any reason why we should be rushing large language models into hospitals, ignoring evidence and the typical timeframes it takes to generate it. That's quite a useful comparison. Something else that's been put to us on the show is perhaps the regulatory and policy framework for AI should be something equivalent to licensing doctors.
Do you think that is maybe a good way of thinking about it? I don't actually. I would disagree with that. I think comparing large language models to fully functioning human doctors is anthropomorphizing the technology, which I think we should really stay away from. I think it's rather dangerous to assume this technology is as capable as a human when it's not.
Yes, large language models are broader in capability than some of the more linear models that came before it, but it's by no means at the intellectual level of a human. And so to say, well, we should use the same kind of regulatory structure that we evaluate humans on is, I think, a dangerous thing to do. Large language models don't have moral accountability and they don't have, you know, decades of experience like doctors do.
Then again, if we were going to regulate it like we do doctors, well, it takes five to six years for a doctor to go through medical school. Then it takes them, I don't know, seven to 10 years training to become a consultant. So, okay, if you want to do that for large language models, let's wait 10 to 13 years and see your evidence at the end of it.
And look, I'm coming across as quite pessimistic and I realise that, but I think we'd all agree that actually what we want to do is take our time and get the safest, most effective products to market. And that is the reason why we have these regulations in the first place. Hugh, thank you very much. Thank you. Despite the challenges around implementing generative AI systems such as large language models into healthcare,
more of the narrower, more specific tools are getting into clinics, like MIA, the cancer detection model we explored earlier. So what does all that mean for patients in the future? To round off this episode, I'm once again joined by The Economist's Natasha Loder. Natasha, we've talked about diagnosis and advice and things. Where else could chatbots and...
these sorts of artificial intelligence models be useful in looking after patients then, as they're more integrated with the health system? So really, it depends on your timescale as to sort of what the answer is. Ultimately, you know, they could be a sort of first line service, 24-7 giving advice, triage, even diagnostics.
But really, we're going to start with low risk activities such as advice and symptom checking.
Where you will see AIs, more advanced AIs, more ambitious AIs appear earlier is in places where you have no care: rural communities, developing countries, particularly where you're supporting patients that have chronic health needs in countries with very few doctors. And in those sorts of settings, the risk of having some sort of AI system supporting patients has to be set against the fact that otherwise they're getting no care at all.
That's how healthcare often views any kind of innovation. It's sort of, you know, what are the risks? What are the benefits? And so in countries that have no healthcare or little healthcare available for certain populations, it may be that AI tools are really going to be a sort of quick answer. And then in the longer term for you and me, perhaps,
We'll be using these as personal doctors. They'll have our own personal medical information and be able to give us quite sophisticated answers to questions that we might have. Yeah, so they'll be able to take your many years of medical information, blood tests, anything else, any drugs you might have had, and give you quite specific advice in a way that you just can't do with a human doctor right now. And even then, it's limited to whatever that doctor specifically has available in terms of
information or memory in terms of how they treated you. I mean, this is a classic problem. You can go and see your doctor and they may have a few moments to scan your recent medical history. They're not going to go back through years of notes. And yet the information in there could be highly salient to the problem you're having today. And so AIs have the potential to
make medicine much more sophisticated, to reduce the amount of error, the misdiagnoses. It's extraordinary to me how we could really start to meet the needs of so many more people with this technology. Natasha, we've covered a lot of ground in this episode and looked at different ways that different types of AI models could be used in healthcare.
You've done a lot of reporting on this just recently, and I'm just curious where you land on this. You know, we don't want to get too excited about things just because new technology arrives, but then there's so much potential for, as you say, more access to healthcare, better healthcare. Give me a scale of how excited I should be about all of this.
Well, it's all about timescale and looking forward in the next 10, 20 years, it's transformative. It's going to revolutionise diagnostics, treatment, patient care, kind of almost every aspect, I would think, of healthcare, including foundational models, which are going to essentially absorb biology and medicine and make it more understandable and explainable. But in the short to medium term, it's less clear to me
how rapidly these technologies will disseminate. And that's because there are a whole bunch of kind of ethical, regulatory and technical challenges that need to be overcome to ensure that it's deployed safely.
And that's going to require regulators to actually spend some more money. They need to move more quickly. They need to put more effort into oversight of these technologies, because if we're going to introduce them safely, that's what it's going to take. But yeah, I'm very excited. I can't wait to see what's coming down the road.
I think that you've managed to make that answer a very sensible one, because isn't it the case that loads of AI people always talk about how
these sorts of technologies are overestimated in the short term and underestimated in the long term. And in case anyone doesn't know what a foundation model is, we'll be talking about that in the next episode of this two-part series on AI in healthcare. It's one of the most important and interesting uses of AI in healthcare. And I look forward to speaking to you again next week to look at how all of this is going to change the medical research side of things. Sounds good. The story of AI in healthcare doesn't end there.
To catch the second part of this series of Babbage and much, much more, you'll need to become a subscriber to Economist Podcasts Plus. Just follow the link in the show notes. Thanks to Gerald Lipp, Peter Kecskemethy, Pranav Rajpurkar, Hugh Harvey and The Economist's Natasha Loder. You can read Natasha's full 10-page report on the promise of AI in healthcare on The Economist's app. It's really worth it to get into the details of her reporting. There's a link in the show notes.
Thank you for listening. Babbage This Week was produced by Jason Hoskin and Lawrence Knight. Mixing and sound design is by Nico Rofast. The executive producer is Hannah Marino. I'm Alok Jha, and in London, this is The Economist.
With the Wealthfront Cash Account, you can earn 4% annual percentage yield from partner banks on your cash until you're ready to invest. The Cash Account grows your money with no account maintenance fees and free instant withdrawals whenever you need it. Money works better here. Go to Wealthfront.com to start saving and investing today.
Cash account offered by Wealthfront Brokerage LLC member FINRA SIPC. Wealthfront is not a bank. The APY on cash deposits as of December 27, 2024 is representative, subject to change and requires no minimum. Funds in the cash account are swept to partner banks where they earn the variable APY.