This episode is brought to you by Shopify.
Forget the frustration of picking commerce platforms when you switch your business to Shopify, the global commerce platform that supercharges your selling wherever you sell. With Shopify, you'll harness the same intuitive features, trusted apps, and powerful analytics used by the world's leading brands. Sign up today for your $1 per month trial period at shopify.com slash tech, all lowercase. That's shopify.com slash tech. This is On Point. I'm Meghna Chakrabarty. Thank you.
Artificial intelligence is changing everything, even your visits to your doctor. There are a bunch of tools that are already starting to be used that...
help automate parts of the doctor's job. Things like, oh, language models can help me write my notes better or write emails better. I think there are increasingly even tools that start looking at images, for example, ultrasound images of the heart and pre-populating a report that a cardiologist would write about that.
Dr. Ziad Obermeyer is an associate professor of health policy and management at the University of California, Berkeley School of Public Health. There's a second type of tool that I find even more exciting than that, which is tools that help doctors do things that doctors don't know how to do today.
And so one example of that is my colleague Regina Barzilay at MIT has done an enormous amount of work on developing AI tools to read mammograms in ways that are actually better than what any radiologist can do. And so this algorithm looks at a bunch of mammograms and looks at the outcomes that patients have over the next five years. Do they get a biopsy? Do they end up having progression in their breast cancer? And it learns from those data
and then delivers a prediction, based on how the mammogram looks today, about what's going to happen to that patient's cancer. And by making cancer screening more accurate, Dr. Obermeyer says these tools can also lower the cost of treatment, because there would be fewer invasive procedures for patients to endure and pay for. What Regina finds is humans basically make two kinds of mistakes when they're looking at a patient's imaging tests.
One kind of mistake is that doctors sometimes do too much. They look at the mammogram and they're like, oh no, this is really high risk. This person needs a biopsy and an MRI. And it turns out all that stuff is negative and that was all
unnecessary testing. But sometimes doctors miss really high-risk cases that the algorithm sees, but the doctors don't. And those patients that the doctors say, yep, your mammogram's fine, come back in however many years for your next one. But those patients actually had a really bad cancer on their mammogram. And the promise of AI in those settings is that it can target the care to the people who actually need it.
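To make those two error types concrete, here is a minimal sketch in Python, with hypothetical record fields and a made-up risk threshold; it only illustrates the bookkeeping of over- and under-treatment, not the actual algorithm Barzilay's team built:

    # Hypothetical sketch: count the two kinds of screening mistakes.
    # Each record holds a clinician's referral decision, the model's
    # predicted risk, and the observed five-year outcome.

    def count_screening_errors(records, risk_threshold=0.5):
        unnecessary_workups = 0   # doctor escalated, outcome negative, model low-risk
        missed_high_risk = 0      # doctor cleared, outcome positive, model high-risk
        for r in records:
            model_high = r["model_risk"] >= risk_threshold
            if r["doctor_referred"] and not r["cancer_within_5y"] and not model_high:
                unnecessary_workups += 1
            if not r["doctor_referred"] and r["cancer_within_5y"] and model_high:
                missed_high_risk += 1
        return unnecessary_workups, missed_high_risk

    # Example: one over-treated case and one missed high-risk case.
    records = [
        {"doctor_referred": True,  "model_risk": 0.1, "cancer_within_5y": False},
        {"doctor_referred": False, "model_risk": 0.9, "cancer_within_5y": True},
    ]
    print(count_screening_errors(records))  # (1, 1)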
In his own research, Dr. Obermeyer works with AI tools that predict sudden cardiac arrest in patients. And he finds that these tools not only have the potential to better treat patients and lower their costs, but also to help healthcare providers learn why those difficult-to-spot cases even happen. Based on the way that electrocardiogram waveform looks, this very high-risk waveform that I missed, and that everyone has missed until the AI can show us what it looks like,
we can actually start forming hypotheses about what's going on there because we know a lot about how the heart produces these waveforms. And so we can actually tie those new insights from AI back into laboratory studies that help us understand the mechanisms of this problem and then develop new treatments and diagnostics for it. So I think that's the really exciting part is not just improvements in the clinic, but also breakthroughs in the lab driven by AI.
By the way, Dr. Obermeyer was featured in an award-winning series we did a couple of years ago called Smarter Health. It was all about how AI is transforming healthcare. And you can find it in our podcast feed or at onpointradio.org. Now, back then, as right now, in fact, a huge question looms over AI and healthcare.
How do we know that these tools are safe to use? And what systems and regulations can help ensure that these tools won't accidentally harm people instead?
Well, last November, Dr. Robert Califf, who then headed the U.S. Food and Drug Administration in the Biden administration, said this. This field will be so big, we couldn't possibly hire enough people at the FDA to regulate it all on our own. And so it's very important that we get the clinical community, the health systems, the professions of the people that you see for your healthcare to be very involved in developing what we would call an ecosystem of regulation, where these algorithms can be constantly evaluated.
Now, understanding AI is so important for us all that we here at On Point are going to keep coming back to it, following up on questions that we couldn't answer before to see how the technology, its impact and regulation are changing. So today, when policymakers talk about an ecosystem of regulation in AI and health care, what exactly do they mean?
Who would create it? Who would be a part of it? How would it ensure patient safety? And there are always pitfalls when industries are essentially asked to regulate themselves. Can those pitfalls be avoided with AI and healthcare in the U.S., especially since so much of healthcare delivery in this country is for profit?
So for that, we are turning today to Dr. Brian Anderson. He's the CEO and co-founder of the Coalition for Health AI, or CHAI, and he joins us in the studio. Dr. Anderson, welcome to On Point. Hi, Meghna. It's so great to be here. Please call me Brian. Okay, Brian. So I have to admit that, as I said, we're constantly looking for up-to-date, intelligent, and thoughtful ways to talk about AI and civilization, and I ran into this article about you in Politico.
from earlier this year, from the beginning of January. And here's what it said about how much he can do in terms of helping the government struggle with oversight of this rapidly evolving technology. Big shoes to fill. So, I mean, like Politico is in the business of making people want to read more. I get that. But how can you help? What are you doing to help with this oversight of AI regulation and health care? Yeah, no, great question.
So part of the work that we're doing in CHAI is bringing together private-sector stakeholders to help build consensus on what responsible, good AI looks like in health.
And importantly, I would call out two or three critical stakeholders in this effort. So you have the vendors, the technology companies that are making the models. You have the health systems and the hospitals that are essentially the customers of these models: they're purchasing them, they're buying them, they're deploying them. And you have patients that are obviously at the center of this. We want to build AI models that treat them well, treat them safely.
And in that context, what we find when we've brought together this community of over 3,000 organizations is a very loud group of customers that want to have a better understanding of how these models actually perform
on patients like the ones they themselves, as doctors and nurses, would be using the tool on if they end up deciding to buy it. The challenge is, those doctors, those administrators that are making these decisions oftentimes don't have the kinds of data that really inform how these models perform. How do they actually work on a patient population like, say, here in Boston, if I'm a health system looking to use a model?
And so what we're trying to do is build a network of, essentially, quality assurance resources, or labs, that would be able to do those kinds of tests on behalf of a health system that might not have the ability to do that kind of test itself. And so bringing greater transparency to how these models actually work. Okay. So I'm going to put a pin in that.
And come back to it, because that is the heart of what you're offering and what you're trying to develop here. But for context, for listeners, let me get a sense from you of how big you think the gap currently is between the AI tools for healthcare that are available right now, let alone a year from now or five years from now, and our understanding of the technology:
what they do, how they do it, and the potential impact on patients. All the questions that regulators would usually try to answer. Yeah. So I use the example of going to a grocery store. When you or I or any of our listeners go to a grocery store, oftentimes, if we're interested in what the ingredients are in a can of soup or in a box of cereal, we turn it over and we look at the label, and we get a clear understanding of what the ingredients are: how many carbohydrates, fats, sugars, things like that.
In AI, we don't have anything like that. We don't have a standard way of describing what it is that goes into making these models do the things that they do. We don't have any standard way of sharing: What was our training methodology? What were the data sets used to train these models? How do they perform? What are their indications? And, perhaps most importantly, what are their limitations?
And so right now, doctors that are using these tools when they see patients lack that kind of information on the specific kinds of patients they have in front of them. So one of the things that we're focusing on within CHAI is creating that AI nutrition label, so that doctors, so that health systems, can actually have that information. Right now we rely on PowerPoint presentations and very, I would say, high-level descriptions from very well-intentioned vendors.
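As a rough picture of what a machine-readable version of that label could hold, here is a minimal sketch in Python; the field names and the example model are hypothetical illustrations, not a CHAI standard:

    # Hypothetical sketch of a machine-readable "AI nutrition label".
    from dataclasses import dataclass, field

    @dataclass
    class ModelCard:
        name: str
        intended_use: str               # indications
        training_data: str              # sources and date ranges
        training_methodology: str
        performance: dict = field(default_factory=dict)   # metric name -> value
        limitations: list = field(default_factory=list)   # known failure modes

    card = ModelCard(
        name="mammo-risk-v2",                       # made-up model name
        intended_use="5-year breast cancer risk from screening mammograms",
        training_data="De-identified mammograms, 2012-2020, four health systems",
        training_methodology="Image model trained on biopsy-confirmed outcomes",
        performance={"AUROC": 0.84, "AUROC_age_under_40": 0.78},
        limitations=["Not validated on patients with breast implants"],
    )
    print(card.limitations)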
Do you know what's so strange to me about AI and healthcare? It's that healthcare is possibly one of the most highly regulated industries in the United States, right? I mean, obviously, we know all the steps that a drug has to go through in order to be approved for patient use. We know all the steps that medical devices have to go through, right? All the things that we don't exactly know about AI in terms of its impact, we know about healthcare.
A, you know, a heart monitor, for example, or a pacemaker, because all that actually has to be proven and understood before the government says, yes, you can use that in the hospital. So to me, if we keep talking about AI as tools, they're not just tools on the computer, right? They are tools that are actually having an impact on the treatment and care that people receive. Right? Like, how did that happen?
That AI has gotten so far into the healthcare industry already without any kind of regulatory pause. Well, so part of the problem is it's moving so quickly. So we have what I would call traditional AI, or predictive AI. And in the FDA regulatory landscape, there are, you know, 800, 900 different models that have already been approved.
Now, generative AI, which I think is what's gotten a lot of us excited, a lot of the online, you know, large language models, whether it's Gemini or ChatGPT or Bing or whatever, that we use, that have potentially a profound level of impact on how we both practice healthcare as doctors and, as patients, how we understand it and navigate it as well. There is not a single model in the generative AI space that's been approved by the FDA. So part of the challenge is
just keeping up with and trying to get your arms around a regulatory framework for the generative AI space, which the FDA has not done yet. The other challenge is... But these tools are being used, though? Oh, yeah. Okay. A great example, I think, that would resonate with many people: LLMs have the ability, in a doctor-patient relationship...
to help doctors better connect with their patients by summarizing the whole conversation in a note for them, rather than having the doctor spend 10 minutes typing that note out. Okay. So before we get more into exactly what CHAI, or the Coalition for Health AI, is doing in those QA labs that you talked about, when we come back from the break, I want to
pursue this a little bit more about why AI is so fully already embedded in healthcare and the gap between that and regulation. So we'll do that when we come back. This is On Point.
Support for this podcast comes from On Air Fest. WBUR is a media partner of On Air Fest, the festival for sound and storytelling happening February 19th through 21st in Brooklyn. This year's lineup features SNL's James Austin Johnson, Anna Sale of Death, Sex, and Money, and over 200 other creators.
Support for On Point comes from Indeed. You just realized that your business needed to hire someone yesterday. How can you find amazing candidates fast? Easy, just use Indeed. There's no need to wait. You can speed up your hiring with Indeed.
and On Point listeners will get a $75 sponsored job credit to get your jobs more visibility at Indeed.com slash On Point. Just go to Indeed.com slash On Point right now and support the show by saying you heard about Indeed on this podcast. Indeed.com slash On Point. Terms and conditions apply. Hiring? Indeed is all you need.
At Radiolab, we love nothing more than nerding out about science, neuroscience, chemistry. But, but, we do also like to get into other kinds of stories. Stories about policing, or politics, country music, hockey, sex.
Of bugs. Regardless of whether we're looking at science or not science, we bring a rigorous curiosity to get you the answers. And hopefully make you see the world anew. Radiolab, adventures on the edge of what we think we know. Wherever you get your podcasts.
You're back with On Point. I'm Meghna Chakrabarty, and Dr. Brian Anderson is with us today. He's the chief executive officer and co-founder of the Coalition for Health AI. And it's a group, quite a large and influential group, that's come together to try and figure out ways
to make AI more responsible, more transparent in how it's being deployed in healthcare. And the big question for us all here is, you know, how can that translate into better regulation for AI and healthcare? And Dr. Anderson, so...
Again, I'm actually going off on this little bit of a tangent here because I have to be perfectly frank. AI and the speed that it's developing is something that has maybe thankfully rendered me into essentially a five-year-old in my understanding of things. So a lot of my questions are really at the basic level because I think they're the same questions that everybody has. So getting back to this gap, the already extant gap in regulation—
Or, yeah, let's call it a chasm more than anything. I was really thinking, like, OK, well, any new procedure, even a surgical procedure that comes up, goes through these, like, years of proof-of-concept testing. It goes through laboratory models, testing on animals. Then there has to be a huge amount of transparency and informed consent on the part of the first people who are willing to have that surgical innovation tried on them. There's a really well-established...
Yeah. For improvements in medicine. But correct me if I'm wrong. I just don't see AI as having been put through that rigor. It's just sort of jumped in. Now I go to my doctor's office and I don't know how the AI is influencing the kinds of decisions my doctor's making, or the diagnoses she's making, or the treatments that she's recommending. It seems like we just skipped that whole aspect of developing new medical tools.
Yeah, no, I think you're right. So first, back to the sense of being a five-year-old: I think it is important for all of us to recognize that there is so much that we don't know in this space, even those of us that spend tons of time in it. And so there's a degree of humility with which we have to approach this, not fully understanding how these models do the things they do, or the emergent capabilities that we sometimes see.
Now, as it relates to how these models get through the regulatory process today, I think you're right to call out that there's a degree of, I think, lack of clarity and uncertainty about how these models actually go from
being submitted for regulatory approval and then ultimately getting regulatory approval. An example: so you bring up clinical trials, right? A clinical trial is a robust process that usually involves, you know, thousands of people who are ideally representative of the individuals that would ultimately be using the drug once it's potentially approved. In AI, part of the challenge, particularly for regulatory bodies like the FDA,
is how do you independently assess or test that model separate from any of the submissions that the vendor might submit to you? And so one of the challenges, just to be very frank, for agencies like the FDA is where can I go to get a data set to test a model separately to actually see if the claims that the sponsor, the vendor, is offering are actually true? Are they valid?
And that's the issue, that's why Dr. Califf in that opening segment was making the comment that we, society, need to come together and work with health systems, which oftentimes are the places where this kind of data resides, where it can be made secure and private and used to test these models. We don't yet have that kind of capability within agencies like the FDA. So can you make that even more concrete for me?
If I understand what you're saying, it's like the FDA is sort of behind the eight ball to begin with, because in putting a potential AI model through its paces, they don't even have, what, healthcare data or patient data to test it on? Or can't they use hypothetical data, like a pretend patient, to test it? Sure, one could use synthetic data, but I think many studies have shown that synthetic data is not real data, and so there are going to be a lot of confounding variables that could sneak in there. The ideal testing data set is real data that is fit for the purpose of the model, to test its output.
And that's a real challenge because there's so many AI models out there. And as Dr. Califf was mentioning, it's going to be impossible for the FDA to hire enough people to meet that demand. Okay. So for an AI tool in healthcare, though, what would be your control group? What are you testing against?
So for an AI, you'd be testing on real people, and depending on the use case, that would be your control group. So if Regina is trying to build a computer vision algorithm to detect breast cancer in mammograms, you'd have individuals that don't have cancer have a mammogram, and you'd have individuals that do have cancer and do have a mammogram. And then you tease out and train, or test, the model's predictive capability of identifying, somewhere on that mammogram, a concerning lesion for breast cancer. I see. So that one's pretty straightforward. But again, in thinking of
one of the basic truths about medicine is that you'll never see the same case twice, right? Like, because the body is very, very complex. And so as AI gets more sophisticated and can do more things, would it even be possible to do that kind of testing against a control?
That raises a great question. I think particularly in the generative AI space: what are the controls in a non-deterministic model, like these AI scribe models that I was referencing earlier? We don't know. We don't have agreement on how you build the testing and evaluation frameworks, particularly for the generative AI space. That's why there is such an urgent need, as we look to expand the use of these generative AI models in increasingly consequential use cases, cases in the ICU, like Ziad's work using AI models to identify people that will die suddenly from an arrhythmia. Those are pretty high-stakes cases. The challenge is, particularly when you bring together a group of leaders in the AI technical field, we don't have agreement on how you measure
efficacy, how you measure bias. These are some of the real challenges in the emerging field of generative AI. Okay. So let's go back to Dr. Robert Califf, who is the former head of the U.S. Food and Drug Administration. He served in that position in the Biden administration, and he spoke with NPR in November of last year. And here's how he described how the FDA currently evaluates AI-enabled devices. If it's an AI algorithm which is embedded in a high-risk device, let's say a cardiac defibrillator, that's a really good
example. If your heart suddenly stops beating, you want a shock to get your heart started again. You want to make sure that algorithm is right. That is very carefully regulated and you got to do clinical studies typically in people that show that the algorithm actually really works. You have administrative uses of AI. Those aren't regulated at all. That's administration and finance. And in the middle, we have this vast area of decision support. So there's a spectrum.
So that's Dr. Robert Califf. And you know, it makes a lot of sense when he says if the AI is actually attached to a device...
It has to go through FDA's normal regulatory process. Trials. But the decision support side. Okay. So let's talk about that. Now, the Coalition for Health AI, CHAI, the group that you founded and run: let's get down into the nitty-gritty of what you're trying to offer here. These quality assurance labs that can do the kind of testing that Califf says the FDA currently cannot do.
So do you have, like, a real-world example, or even a hypothetical one, of how that would work? Yeah. So in the opening segment, you had mentioned Ziad, Dr. Obermeyer. Ziad and I were classmates. A lot of what we're doing in CHAI is inspired by some of the work that Ziad is doing currently. A real-life example: a company he founded, Dandelion Health, does very similar kinds of work to this. So
If you bring together a group of health systems, they obviously have a lot of health data. That data can then be de-identified, made secure, made private, harmonized in such a way that you can have a technology vendor
that wants to train their model, or validate an existing model's performance. These labs have the ability to accelerate that kind of innovation. Groups like Dandelion, or Mayo Clinic's Validate, or Beekeeper AI are just some of the quality assurance resources and labs that we're working with. One of the exciting things that they can do
is make data accessible and activated in such a way that they're lowering the threshold and the challenges for startups or tech companies to get access to that kind of data to train on. And it's robust, diverse sets of data, so you can have, ideally, potent, powerful models. Okay, I'm going to jump in here again. My five-year-old's understanding of AI is kicking in again. Yeah.
Let's put some concrete parameters around an example of how CHAI would work. Say I am a representative of, like, MD Anderson Cancer Center in Texas. A really big organization, important, treats a lot of people. And we're interested in using a new AI tool that would, say, help in the decision-making process following an ovarian cancer diagnosis. Great. New tool. CHAI.
We want to understand how it works. We want to understand the pitfalls or the advantages. How would CHAI help with that? Yeah. So right now, what we're hearing from hospitals like those, like MD Anderson and the broader University of Texas health systems, is that they're just being inundated by vendors wanting to sell models and tools that potentially could really help them.
The challenge, of course, is sifting through all these solicitations. And I think the burden on the health system is understanding what's hype and what's real. And particularly in the oncology space, you want to really have a clear understanding of what's working and what's not in the ovarian cancer use case. So in that use case, MD Anderson would potentially have some kind of solicitation out there for an ovarian cancer clinical decision support tool.
vendors would respond and perhaps share their model card, their AI nutrition label. MD Anderson, theoretically, could say, "Great, we've down-selected to two or three vendors. We'd like to have them send their model to a lab." And MD Anderson and the vendor would work in partnership on an agreeable lab, whichever one they want to select.
And then they would choose a testing data set that would be representative of the patients that come to MD Anderson. And so you would have a representative sample of the kinds of patients that the model would be used on, on real patients, if MD Anderson chose to go with it. The lab would then run a kind of simulation test using that testing data.
And they would issue an evaluation report. It would involve, like a report card, a bunch of different grades. And then MD Anderson would get that, and they would be able to make a more informed decision. So it's not regulatory, in that the FDA is not using this data to make a regulatory decision. You have, in the private sector, a purchaser trying to get more data about which vendor they want to move forward with. But it's a way to test for efficacy, safety, all the things that we're concerned about as consumers.
You got it. Practitioners and patients. Okay. So the sort of sample data set, does that come from patients who've already been to MD Anderson? Where does that come from? So, I mean, this is the really exciting thing about this era in AI. So for the past 30 years in healthcare, we've been investing billions of dollars in bringing the data that doctors used to write in chicken scratch on a notebook
that only they could read, into the computer world, where now it's digitized, it's on electronic health records, we can share it. This term, interoperability, means that if you go from one clinic across the street to another, your data theoretically can easily go with you.
That took a lot of effort to make that data be able to be harmonized and work on different systems. Because of that, now you have organizations like Dandelion or these other quality assurance labs that are able to then bring together data sets from different health systems, harmonize it, and make it available for that kind of testing. So in the MD Anderson example...
you know, a quality assurance lab could say, okay, we need to then create a subset data set that includes patients with ovarian cancer, and we're going to test this model just on those patients. And we're going to aggregate that data, or use a federated network that comes from perhaps, you know, 30-plus health systems across the U.S. So that's how they create that data set. And is it the hospital system, or whoever the client is, not the vendor, but the hospital system, who's deciding: I want you to grade the product on these following criteria?
Yeah, so ideally it's the vendor and the customer coming into agreement on what goes into that test. What we in CHAI are trying to do is create this network of quality assurance labs that is trusted by both the vendor community and the customer community. Because it's only in that space that you can have these independent entities
really issue an evaluation report that the vendors are going to want and the customers are going to want. And who's paying for this? So the vendors of the model are likely going to be the ones entering into business relationships with the labs and being charged a fee to do that test. Now, it is very likely that the vendors would just, you know, pass that cost down to the customer. And so there would be a fee associated with that. The hope that we have in CHAI
is that this marketplace of labs is going to be competitive. So it's not just one lab, it's multiple labs. And so that ultimate price point will be determined by free market enterprise. Okay. So to be clear, when we're speaking about vendors, I mean the developers of these AI tools. When we're speaking about customers in this context, we're talking about hospitals, healthcare systems, et cetera. Yep.
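To picture what one of those lab evaluations boils down to, here is a minimal sketch in Python; the metrics, thresholds, and pass/flag grading are hypothetical stand-ins for whatever criteria the vendor and the customer agree on:

    # Hypothetical sketch of a lab-side evaluation: run a candidate model
    # over a representative, de-identified test set and grade the result.

    def evaluate(model, test_cases, sens_min=0.90, spec_min=0.80):
        """test_cases: list of (features, true_label) pairs drawn to match
        the purchasing health system's patient population."""
        tp = fp = tn = fn = 0
        for features, truth in test_cases:
            pred = model(features)              # vendor model under test
            if pred and truth:
                tp += 1
            elif pred and not truth:
                fp += 1
            elif not pred and truth:
                fn += 1
            else:
                tn += 1
        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        grade = "pass" if sensitivity >= sens_min and specificity >= spec_min else "flag"
        return {"sensitivity": sensitivity, "specificity": specificity, "grade": grade}

    # Example with a trivial stand-in model that flags everyone.
    report = evaluate(lambda x: True, [({}, True), ({}, False)])
    print(report)   # sensitivity 1.0, specificity 0.0 -> "flag"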
So, I mean, this is on the face of it, Dr. Anderson, this is a very exciting idea. However, I have to say my journalistic skepticism is also raised a little bit when I hear that a toolmaker is paying for someone to test their tools, because I automatically think the first words that come to mind are conflict of interest.
I mean, how can you guarantee that they're not going to influence the outcome of the testing? It's a great question, and it's a really important area that we're focused on in CHAI. So what we're doing as CHAI is creating a certification framework that essentially looks to mitigate what you just described, that conflict of interest. Because if a vendor is paying someone, you know, there is that absolute risk that that lab is going to somehow be pressured to issue a favorable report.
And so that certification framework involves a whole bunch of steps, involves bringing in independent auditing agencies. There's an international standard that we leverage, ISO 17025, that really is a tried and trusted manner of ensuring that these labs are not allowed, and they're audited for this, to have commercial engagements with any of the vendors that they're working with.
And it's a real challenge, right? Because many of the health systems today, I think, recognize they have a very valuable asset in their health data. And so many of those health systems are already partnering with technology vendors: you know, some of the big names that are recognizable, and many small startups.
And so ensuring that any lab that's chosen by a vendor can't have that commercial engagement, can't have that kind of entanglement, because, to your point, they'd be conflicted. But commercial engagement isn't the only issue, though, right? It's simply: you're paying me to do these tests, and so therefore you can exert some pressure on what the outcome of the test would be. It's a fair point. I would say, absent funding from outside sources, this is the only path we have forward right now. Why not have the customers, i.e. the hospitals, pay?
So we could have that as a model. Right now, I think what we've heard is the community of customers would like to move forward with the vendors paying; they don't want to pay. So I'm going to occupy the seat of the patient advocate and keep questioning you about these things. We're going to take a quick break here. We're talking with Dr. Brian Anderson. He's CEO of CHAI, the Coalition for Health AI, and he's got this big idea, working with major groups and companies across the United States, to help answer the question: are these new AI tools that are getting everywhere in health care safe and effective? More in a moment. This is On Point.
Support for AI coverage in On Point comes from MathWorks, creator of MATLAB and Simulink software for technical computing and model-based design. MathWorks, accelerating the pace of discovery in engineering and science. Learn more at mathworks.com.
and from Olin College of Engineering, committed to introducing students to the ethical implications of artificial intelligence in engineering through classes like AI and Society, olin.edu. Still getting around to that fix on your car? You got this. On eBay, you'll find millions of parts guaranteed to fit. Doesn't matter if it's a major engine repair or your first time swapping your windshield wipers.
eBay has that part you need ready to click perfectly into place for changes big and small, loud or quiet. Find all the parts you need at prices you'll love. Guaranteed to fit every time. But you already know that. eBay. Things. People. Love. Eligible items only. Exclusions apply. You're back with On Point. I'm Meghna Chakrabarty. And before we return to our really interesting conversation about how you ensure quality and safety
when it comes to AI tools and healthcare. We will get back to that in a second. I just want to let everyone know that later this spring, we are going to have a very important and large series about boys and education because there's a ton of evidence. And in fact, the amount of evidence is growing that there's quite a different set of outcomes for boys and girls after they graduate from high school in modern day America. So we're going to look at
K through 12 and try to understand what kinds of experiences boys are having in school in the United States these days. So your experience as parents and educators is going to be very, very important for this series. And we
want to hear from you. About your younger boys, for example: how do they talk about school? How do you see them learning and being engaged? Or how do you see them not being engaged in school? Do they feel like they belong there? And also, when it comes to behavior and discipline:
Both parents and educators, do you see a difference between boys and girls? And for teachers in the classroom, do you have to manage boys and girls differently? What are the issues and concerns you have around behavior and discipline for American boys? So what I'd like you to do is get the OnPoint VoxPop app. If you don't already have it, just look for OnPoint VoxPop wherever you get your apps.
And that is the best way to send us your thoughts, again, as parents, as family members, as teachers and educators, about the experience that boys are having in American schools. So do that. Or if getting another app on your phone is not your jam, you can also call us at 617-353-0683. Again, this is for our series on boys and education this spring. And we really want as many stories and as much first-person experience from listeners as we can get. So...
Dr. Brian Anderson, head of CHAI, the Coalition for Health AI. Let me turn back to you here, about healthcare and AI. I do see the power in this idea of having these vetted, high-quality AI quality assurance labs, as you've described them. But as you know, you have your critics. The very concept has its critics here. So let me just offer you one of the major blanket criticisms that I've heard, which is
that by doing this, you could actually end up giving hospitals, let's say, a ticket to introduce AI tools much faster into their treatment than they otherwise would. And that could potentially put the interests, specifically the financial interests of hospitals, ahead of patients. Interesting. I haven't heard that one yet. Yeah.
The pace of AI innovation is breathtaking. And hospitals, I think, are right to want to identify solutions for the myriad challenges that they have: financial, care delivery, reaching underserved populations of people. And so I think it's appropriate for them to look to potential solutions like those in the AI space. Now, you're right to call out
you know, how do we keep pace with this innovation with the kinds of guidelines and guardrails that would enable patients to have that degree of trust and providers to have that degree of trust.
I don't have all the answers. I can tell you that part of what we're trying to do in CHAI is to create that kind of self-regulatory space where health systems and vendors can come together and build that consensus set of frameworks that are going to be those guidelines and guardrails, and that will ideally be able to keep up with the pace of innovation. I think one of the really exciting things about AI is its ability to extend the capabilities of doctors and nurses into places where we can't reach patients. You know, a lot of my family lives on the Southern Ute Reservation in Colorado. There aren't many doctors there.
And you have clinics that are underserved, that don't have many physicians with available slots to see people. Imagine a future where you have an AI tool that's deployed in those kinds of situations and is able to help people. AI models don't get tired. They don't get impatient. That's a really exciting opportunity in the future
in a rural community, to meet people where they're at. And so I think that's the kind of accelerated adoption that I'm hoping to see in these clinics and hospitals. It's so interesting that you bring that up, because, as you well know, there's already been a lot of concern raised about biases built into these tools: for, you know, perhaps offering different diagnoses for Indigenous people, people of color. Would that be part of the kind of QA testing? I think so. I mean, you want to make sure that these models are safe and effective for everyone, or at least for the people that they're serving and being used on. And so I think it's appropriate,
you know, regardless of what ethnicity and background you come from, that you want that model to be working well on someone like you. And so often we're finding in research that the social determinants of health are highly influential. And, you know, Ziad Obermeyer, I think, has done a lot of research on that. Many have.
And so, yeah, we want to have a robust ecosystem of labs that can, if it's appropriate, test a model on a specific subpopulation of people. Okay, a couple more technical questions here, and then we'll get to the political aspect of all of this. The other thing about AI which makes it uniquely, distinctly different from pharmaceuticals or medical devices is that it's nearly constantly being updated or constantly being modified, self-improving, whatever you want to call it. So a one-time test is just that. It is a one-time test.
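One way to picture the alternative to a one-time test is a monitor that compares a model's live performance against its validation-time baseline; here is a minimal sketch in Python, with made-up window sizes and thresholds:

    # Hypothetical sketch of post-deployment drift monitoring.
    from collections import deque

    class DriftMonitor:
        def __init__(self, baseline_accuracy, window=500, tolerance=0.05):
            self.baseline = baseline_accuracy
            self.tolerance = tolerance
            self.recent = deque(maxlen=window)   # recent correct/incorrect flags

        def record(self, prediction, actual):
            self.recent.append(prediction == actual)

        def needs_retest(self):
            if len(self.recent) < self.recent.maxlen:
                return False                     # not enough live evidence yet
            live_accuracy = sum(self.recent) / len(self.recent)
            return (self.baseline - live_accuracy) > self.tolerance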
How would you manage that? Meghna, you're clearly better than a five-year-old in this space. I don't know, some five-year-olds out there know a lot more. So yeah, the post-deployment monitoring is critical. Concepts like drift and degradation in performance are really important to be managing and monitoring for. And so one of the things that these labs are also capable of is
especially if you have a network of them across 30 different health systems, and in addition we have over 300 health systems that are part of CHAI: if you create a monitoring network of those health systems that have those models deployed, where they can generate real-world evidence about how that model is actually working on patients there, and then constantly update that over time, you can have the kind of surveillance network that the FDA, the CDC, and other agencies already have for drugs, vaccines, medical devices, biologics. We already have those for those kinds of therapies. We don't have that for AI yet, and we desperately need it, exactly because of what you said. And so that's what we're hoping to do: establish essentially
a network of health systems that will be tasked with monitoring how these models actually perform and sharing that transparently. You mentioned the patient, right? We want to be able to ensure that patients have access to that kind of data so that they know if I'm going to go see my doctor, chances are they might use this tool
How does it perform on someone like me right now? Okay, so transparent, you said. Does that mean that the results from the quality assurance testing that would go on in these places would be available to more groups than just the hospital system that's interested, or that commissioned the testing? Yes. Our intent is that to be a CHAI-certified quality assurance lab, you need to be able to share that result immediately.
Publicly. Publicly. Okay. So anyone can get a hold of it. If you can pull that off, that would clear a major hurdle, right, that healthcare and AI have, partially because of the law. We want to protect people's privacy, I get it. But also because we hear vendors or developers all the time say: we're not going to give you the data set that we tested on, we're not going to tell you exactly how it works, here's just the results. And that is one of the things, especially when it comes to our health, that I think gives a lot of people pause. So you're making a commitment to requiring transparency in these results? Absolutely. If you were to ask me what is probably the singular most important
principle in responsible AI, it is transparency. We have a huge trust problem: poll after poll, survey after survey shows that the majority of Americans do not trust AI, particularly in consequential use cases like health.
If we want to address that and build that kind of trust, it starts with transparency. So that gets to, for example, the kinds of people or groups that are coming together for CHAI. I'm seeing here, and correct me if I'm wrong, this is from Politico's reporting, that nearly 3,000 industry partners have joined your effort. That's right. So that's a lot.
Obviously, a lot of them are healthcare systems, but I'm going to focus on the actual tech side of things, including Microsoft, Amazon, Google. Any other big names we should know of? OpenAI, I think, is one. Anthropic, others. Okay. So, again...
I have a high degree of skepticism about the motivations of those companies, because first and foremost, they are for-profit companies. They have the smartest minds in tech in them. That is not what I have an issue with. It's their motivations, right? OpenAI is a really interesting one. I mean, Sam Altman's all over the place in terms of what he says he thinks should be private, what should be public. Is OpenAI for-profit? Is it non-profit? I mean, it's hard to understand what the company actually wants to do. And because of that, how can patients trust
that their best interests are being represented in this group of 3,000? Let me put it a different way. On the board, for example, who's sitting at the table whose job is to advocate on behalf of the 330 million patients in the United States? Great question. So on our board, Jennifer Goldsack, who's the CEO of the Digital Medicine Society, is the board representative for patient community advocacy.
It's really important in CHAI because we're driven by our membership. We have a bunch of different stakeholders, as you described. We wanted to ensure that the board was representative of those important stakeholders. And the patient community stakeholder group is the most important, if you ask me. And so that's Jennifer's job.
I think you bring up a really good point though, which is the challenge when you have incentives that aren't exactly aligned. And I think that's the reality that we live in in today's healthcare system. We have for-profit pharmaceutical companies, for-profit medical device manufacturers, for-profit technology companies. Hospital systems. Hospital systems, all delivering tools, technology, care itself to our patients. How we align that
in a way that ensures that we're delivering high-quality care to our patients and building trust is part of the challenge. And that's why transparency is so important here. You bring up this concept of, like: how did you train your model? What data did you use to train it on? You're right. Many of the technology vendors, and I think rightfully so,
are hesitant to share that, because that's oftentimes where their IP is, right? Like, how did they train that model? What data did they exactly use? If I were to be completely transparent and share all of that, what's stopping you or someone else from just doing the same thing? And so it's a balancing act between vendors that want to protect their IP, and health systems and patients that legitimately want to understand
how that model was trained because that informs how it might actually perform on someone like me.
And so coming to agreement on the level of detail that goes into that AI nutrition label, the one that describes how you trained that model and what data sets you used: that's part of the challenge. But that's also where we've landed in creating that label: the agreed-upon details on the training and the data that the vendors are okay with, and that the patients and the health systems that are part of CHAI are okay with. Okay. So I understand that last fall you went to Capitol Hill and presented a lot of these ideas to a bipartisan crowd, in fact. Well, it feels like it's been a thousand years since November of 2024. Now here we are in February of 2025, with the new administration.
It's hard to get a real bead on exactly how President Trump looks at AI and health care. But from what I understand, there's some concern from Republican members of Congress about this idea. For example, I'm seeing here Representative Jay Obernolte from California. He's a
Republican, and apparently he has written, or at least wrote to members of the Biden administration, formally warning that your assurance labs could actually, and this is interesting, it has nothing to do with patients, disadvantage startups, because of the cost of, I guess, going through the QA process with CHAI. Yeah, I think there was a lot of misunderstanding in the early days of CHAI about what we actually do and what we're trying to enable.
With the quality assurance resources, if you were to ask any startup in the tech space, particularly in health tech, what the number one challenge is, aside from raising capital, it's getting access to high quality data.
To train an AI model in the healthcare space, you need access to high-quality data that is fundamentally private, sensitive, and not shared widely. So how do you do that? It's a real challenge. What these labs enable is an ecosystem of 30-plus health systems all signing up to say, essentially,
We are here and we want to partner with the technology companies to train models in a robust, efficient, fast way. So quite the opposite. And when I met with Representative Obernolte, I shared this. These labs are intended to accelerate the innovation from startups.
And it's squarely a private-sector effort. So, I mean, we were very excited when the Biden administration expressed interest in joining CHAI, and we offered them seats, and they were part of, you know, every one of our working groups that we launched. I've shared the same with the Trump team: you know, we believe in a public-private partnership. This is private-sector led.
And, you know, the new administration is more than welcome to participate in our working groups. Yeah. You know, it's clear, and I do not dispute at all, that the technology is moving so quickly that it's completely unrealistic and unwise to presume that the federal government could be the primary, let's say, tester or regulatory overseer. So having some kind of partnership makes a lot of sense to me. But
Maybe we'll have you back in the future to see how things are going because the regulatory capture question isn't going to exit my brain. But we've only got about a minute here, Dr. Anderson. I'm so sorry. We could have talked for another hour here. I had mentioned at the top of the show that you were also a family physician yourself for underserved communities, in fact. So take the last 30 seconds to describe to me why that experience leads you to believe in the importance of AI in health care.
So many doctors and nurses today are challenged in being able to see all of the patients that they need to see. In, you know, a single day, you might see in excess of 20 patients. That burns out providers and nurses profoundly. It is not sustainable. I think a sobering fact is that we have to graduate over two classes of medical school students just to make up for the physicians that commit suicide each year. And part of that is the burnout issue.
When you have physicians that are pouring their heart and soul into serving underserved populations, like those that are here in Boston and in the greater Lowell Lawrence area where I worked, it is important that we equip them with tools,
to enable that kind of service that they want to offer. And AI, I believe, has an incredible potential to help us. You know, I'm thinking also of many rural communities across the United States that are seeing their hospitals or clinics close or consolidate. It's just getting harder and harder to access primary care even.
Well, Dr. Brian Anderson, he's CEO and co-founder of the Coalition for Health AI, or CHAI. Thank you so much for joining us today. Really great being here, Meghna. I'm Meghna Chakrabarty. This is On Point.