You're listening to Shortwave from NPR.
Hey, Short Wavers. Regina Barber here. It seems like artificial intelligence is everywhere in our virtual lives. It's in our search results, our phones. It's trying to read my emails. But NPR science correspondent Geoff Brumfiel has noticed that AI isn't just showing up online anymore. It's starting to creep into reality. Yep. I don't know if you tuned in for Tesla's big marketing event last year, Regina. No. But AI was there. Speaking of robots...
Tesla is obviously a car company, but Elon Musk, Tesla's CEO, made a big part of the event about a humanoid robot powered by AI called Optimus. The software, the AI inference computer, it all actually applies to a humanoid robot.
And Google just unveiled another humanoid robot that operates using AI. We're bringing Gemini 2.0's intelligence to general-purpose robotic agents in the physical world.
Okay, Geoff, but even before AI came along, people and companies have been making, like, big claims about robots. They have. They have. And the robots, as I'm sure you know, Gina, have always disappointed compared to the vision. Yeah, that's true. And that's why I set out to understand the truth about AI and robotics. The truth. And I think I kind of found it in a bowl of trail mix. Today on the show, what happens when artificial intelligence moves out of the chat and into the real world?
We're looking at how AI could maybe revolutionize robotics. You're listening to Shortwave, the science podcast from NPR.
Okay, so Geoff, you were interested in finding out more about how AI works in robots. Where did you start? Well, I didn't go to Tesla or Google, but I did drive right by them on my way to Stanford University. Okay. And specifically the IRIS laboratory, which stands for Intelligence Through Robotic Interaction at Scale.
I got a tour from a graduate student named Moojin Kim. Moojin works on a new kind of robot powered by AI similar to the AI used in chatbots. It's one step in the direction of like ChatGPT for robotics, but still a lot of work to do.
Okay. All right. Well, you want to show me what it can do? Yeah, for sure. So, Geoff, what did the robot look like? Well, this wasn't some humanoid robot that the big tech companies are rolling out. It's just a pair of mechanical arms with pincers. Okay. But what made it interesting was that it's powered by an AI model called OpenVLA.
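(For readers curious what "ChatGPT for robotics" looks like in practice, here's a minimal, hypothetical sketch of the control loop a vision-language-action model runs. The function and numbers below are illustrative stand-ins, not OpenVLA's actual API.)

```python
# Hypothetical control loop for a vision-language-action (VLA) model:
# a typed instruction plus the current camera image go in, and a
# low-level motor action comes out, over and over until the task ends.

def vla_model(image, instruction):
    # stand-in for the neural network; a real VLA outputs motor commands
    return [0.0] * 7                         # e.g., 7 joint/gripper values

instruction = "scoop some green ones with the nuts into the bowl"
for step in range(200):                      # run the loop a few times a second
    image = [[0.0] * 64 for _ in range(64)]  # stand-in for a camera frame
    action = vla_model(image, instruction)
    # a real system would send `action` to the arm's motor controllers here
```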
So first, we should probably just say quickly, you know, a regular robot must be very, very carefully programmed. An engineer has to write detailed instructions for every task you want it to perform. Yeah. And AI is supposed to change that. Exactly. And that's what's going on here. This robot is powered by a teachable AI neural network.
The neural network operates kind of how scientists think the human brain might work. Basically, there are these mathematical nodes in the network that have billions of connections to each other, in a way similar to how neurons in the brain are connected together. And so when you go to program this sort of thing, it's simply about reinforcing the connections between the nodes that matter and weakening the ones that don't.
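(A toy numerical sketch of what "reinforcing the connections that matter" means, assuming nothing about the lab's actual software: a small matrix of connection weights gets nudged up or down to reduce the error on each try.)

```python
import numpy as np

# Toy network: 4 input nodes connected to 2 output nodes by weights.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))              # connection strengths

def forward(x):
    return np.tanh(x @ W)                # activations flow through the weights

x = np.array([0.5, -1.0, 0.2, 0.8])      # an input, e.g. sensor readings
target = np.array([0.9, 0.1])            # the output we want

for _ in range(100):
    y = forward(x)
    error = y - target
    grad = np.outer(x, error * (1 - y**2))  # backprop through tanh
    W -= 0.1 * grad     # strengthen useful connections, weaken the rest
```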
So in practice, this means Moojin can just teach OpenVLA a task by showing it. So basically, whatever task you want to do, you just keep doing it over and over, maybe like 50 times or 100 times. The robot's AI neural network becomes tuned to that task, and then it can do it by itself. Yeah, it makes me think of this like smiling robot story we did, and that robot just watched like a lot of videos of people smiling, and then it learned how to do it.
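(That show-it-over-and-over recipe is known as learning from demonstrations, or behavioral cloning. Here's a stripped-down sketch under simplifying assumptions; the random data and the linear fit are stand-ins for real demonstrations and a large neural network.)

```python
import numpy as np

def collect_demo():
    # stand-in for one human demonstration: observations paired with actions
    obs = np.random.rand(20, 8)          # 20 timesteps of 8 sensor values
    act = np.random.rand(20, 2)          # 20 timesteps of 2 motor commands
    return obs, act

demos = [collect_demo() for _ in range(50)]   # "do it maybe 50 times"
X = np.concatenate([o for o, _ in demos])
Y = np.concatenate([a for _, a in demos])

# Fit a simple linear policy (action = obs @ P) by least squares.
P, *_ = np.linalg.lstsq(X, Y, rcond=None)

new_obs = np.random.rand(8)
predicted_action = new_obs @ P           # the robot now acts on its own
```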
Yeah, it's exactly the same thing, except instead of just smiling, this robot's actually doing stuff. So to show me, Moojin brought out a tray of different kinds of trail mix, and I typed in what I wanted it to do. Okay, so scoop some green ones with the nuts into the bowl.
Oh, my gosh. See what happens. Okay, so, Geoff, personally, I've been waiting for something like AI in robotics, because you can teach it to do something, or ask it to do something, like, make me an ice cream sundae, without any fancy programming or special knowledge. That's exactly it, you know? And this really is the dream of the researcher who runs this laboratory. Her name is Chelsea Finn. So in the long term, we want to develop software that would allow the robots to operate intelligently in any situation.
And by intelligently, she means the robot could understand a simple command, like scoop some green ones into a bowl or make me a sundae, and then execute it in the real world. Even just to do very basic things like being able to make a sandwich or being able to clean a kitchen or being able to restock grocery store shelves. These are simple tasks that could help humans do their jobs or do tasks at home.
Now, Chelsea has also co-founded a startup called Physical Intelligence. It recently demonstrated a mobile robot that could take laundry out of a dryer and fold it. Again, this robot was taught by humans training its powerful AI program. OK, so ice cream sundaes, is that too advanced? Is folding an easier start? I mean, I'd actually argue, Gina, that folding is harder. OK. Let me show you a video.
Okay, it's going to the dryer. It's pulling stuff out, putting it in a basket.
It has the concentration I have when I'm going to do laundry. It almost looks, like, annoyed with folding like I do. Oh, my God, it's doing really well, actually. Yes, it is, right? Wow. And this is a complicated task. It's got to pull these clothes out. It's got to figure out what they are. It doesn't even have a head, but I'm, like, giving it personality. It looks like it's like, oh, I just got to fold another one. Okay, so is it really as simple as, like, just teaching a robot, like, what to do? Because...
If it was, wouldn't these robots be everywhere? Yeah. I mean, right. It looks cool on the video. The truth is that, you know, when you get out and these robots are trying to do these tasks over and over again, they get confused. They misunderstand. They make mistakes and they just get stuck.
So, you know, it might be able to fold laundry 90% of the time or 75% of the time, but the rest of the time it's going to make a big mess that then a human has to get in there and clean up. Got it. Okay. I spoke to Ken Goldberg, a professor at the University of California, Berkeley, and he's pretty emphatic that AI-powered robots aren't here yet. Right.
Robots are not going to suddenly become the science fiction dream overnight. Okay, so tell me why. Because AI chatbots have gotten way better super fast. So why are these robots getting stuck?
Okay, so it's true that AI has improved massively over the past couple years, but that's because chatbots have a huge amount of data to learn from. They've taken basically the entire internet to train themselves how to write sentences and draw pictures. But Ken says... For robotics, there's nothing. We don't have anything to start with, right? There's no examples of...
of robot commands being generated in response to robot inputs. And if robots really need as much training data as their virtual chatbot friends, then having humans teach them one task at a time is going to take a really long time. You know, at this current rate, we're going to take 100,000 years to get that much data. What?
OK, that's so long. Like, are there any alternatives? There must be. Yeah. Well, scientists are exploring them right now. And one might be to let the AI brain of the robot learn in a simulation. A researcher who's trying this is a guy named Pulkit Agrawal. He's at the Massachusetts Institute of Technology. The power of simulation is that we can collect, you know, very large amounts of data.
For example, in three hours' worth of simulation, we can collect 100 days' worth of data. So this is a really promising approach for some things, but it's much more of a challenge for others. So for example, let's talk about walking. When you're just dealing with the earth and your body, the physics of walking around is actually kind of simple. When you're doing locomotion, you know, you're mostly on earth. You know, there's no amount of force you can apply which will make the earth move.
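(For scale: 100 days of experience in three hours of wall-clock time works out to roughly an 800-fold speedup, which simulators reach by running many robots in parallel, faster than real time. The loop below is a hypothetical sketch with stand-in names throughout.)

```python
# The arithmetic behind the claim:
experience_hours = 100 * 24                  # 100 days of robot experience
wall_clock_hours = 3
print(experience_hours / wall_clock_hours)   # 800.0x real time

# Hypothetical data collection across parallel simulated robots:
def step_all(envs, actions):
    # stand-in for a physics simulator advancing every environment one tick
    return [f"obs_{i}" for i in range(len(envs))]

envs = list(range(256))                      # 256 simulated robots at once
actions = [0.0] * len(envs)
dataset = []
for t in range(1000):
    observations = step_all(envs, actions)
    dataset.extend(zip(observations, actions))
```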
And so the simulation can do that reasonably well. But if you want your robot to, say, try and pick up a mug off a desk or something, that's a lot more complicated. You know, if you apply the wrong forces, these objects can fly away very quickly.
Basically, your robot will fling things across the room if it doesn't understand the weight and the size of what it's carrying. And there's more. You know, if your robot encounters anything that you haven't simulated 100 percent perfectly, then it won't know what to do. It'll just break. OK, so it sounds like these simulations have limits, and real-world training is going to take, like, a while. I can begin to see why robots aren't going to, like, be here tomorrow.
Exactly. And some researchers think there are even deeper problems, actually, with trying to put AI into robotics. One of them is Matthew Johnson-Roberson at Carnegie Mellon University in Pittsburgh. In my mind, the question is not, do we have enough data? It is more, what is the framing of the problem? So getting back to AI chatbots for a minute.
Matt says for all their incredible skills, the task we're asking them to do is actually relatively simple. You know, you look at what a human user types and then try to predict the next words that user wants to see. Robots have so much more that they're going to have to do than just compose a sentence. Right. Next best word prediction works really well.
And it's a very simple problem because you're just predicting the next word. And it is not clear right now I can take 20 hours of GoPro footage and then produce anything sensible with respect to how a robot moves around in the world.
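(Next-word prediction really is that narrow a loop. A minimal sketch; the scoring function below is a random stand-in for the huge trained network a real chatbot uses.)

```python
import numpy as np

vocab = ["the", "robot", "folds", "laundry", "slowly", "."]
rng = np.random.default_rng(1)

def score_next(context):
    # a real chatbot scores candidates with a trained neural network;
    # here the scores are random, just to show the shape of the loop
    return rng.random(len(vocab))

sentence = ["the", "robot"]
for _ in range(4):
    scores = score_next(sentence)
    sentence.append(vocab[int(np.argmax(scores))])  # pick the likeliest word
print(" ".join(sentence))
```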
So in other words, the sci-fi tasks that we want our robots to do are so complicated compared to sentence writing that no amount of data may be enough unless researchers can find the right way to teach the robots. Or have the robots teach the robots. Yes. That's also an option. They can teach themselves. Okay. So, Geoff, you've taken me from, like, optimist to pessimist. It's the, you know, the road I take every day. I'm starting to think that AI is, like, never going to work that well in robots, or it's going to be a really long time. You know, I'm sorry if I've, like, turned you into a pessimist here, Gina. It happens. And I'm going to have to sort of whipsaw you back because...
AI is already finding its way into robotics in ways that are really interesting. So, for example, Ken Goldberg has co-founded a package-sorting company. And just this year, they started using AI image recognition to pick the best points for their robots to grab the packages. Ooh, okay. Yeah, and it's working really well, he told me. And I think we're going to see a lot of that. AI being used for parts of the robotic problem, you know, walking or vision or whatever, it's going to make big progress. It just may not arrive everywhere all at once.
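(Grasp-point picking is a nice example of AI solving one piece of the problem. A hypothetical sketch: an image model scores every camera pixel for "graspability," and the robot grabs at the highest-scoring point. The heatmap here is random stand-in data, not the company's actual system.)

```python
import numpy as np

heatmap = np.random.rand(480, 640)   # one graspability score per camera pixel
row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
print(f"best grasp point: pixel ({row}, {col})")
```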
And to really end on a high note here, let's get back to that Stanford lab. Remember, I asked it to grab some trail mix, right? Yeah. So the robot correctly identified the right bin, to Moojin Kim's relief. Usually that spot right there where it identifies the object and goes to it, that's the part where we hold our breath. And then very, very slowly and kind of hesitantly, it reached out with its claw and picked up the scoop. Oh!
It's doing it. Moojin, did I just program a robot? You did. Looks like it's working. And to my mind, it's incredible. Like, remember, nobody really programmed the robot exactly. This is all the neural network learning how to move the claws and respond to the commands on its own. And to me, it's pretty wild that that works at all. And I think it's going to lead to some very cool developments. I'm excited to hear more, Geoff. Thank you so much for bringing this reporting to us. Thank you very much.
We'll link Geoff's full story, which has robot videos, in our episode notes. This episode was produced by Berly McCoy, edited by our showrunner Rebecca Ramirez, and fact-checked by Tyler Jones. Jimmy Keeley was the audio engineer. Beth Donovan is our senior director, and Collin Campbell is our senior vice president of podcasting strategy. I'm Geoff Brumfiel. I'm Regina Barber. Thank you for listening to Shortwave from NPR.