Welcome to a new deep dive from AI Unraveled, the podcast created by Etienne Noumen, who's a senior engineer and also a passionate soccer dad up in Canada. Hey, everyone. If you're enjoying these sessions and find them valuable, please do take a second to like and subscribe on Apple Podcasts. It genuinely helps us out a lot. It really does. And also, if you're thinking about upgrading your productivity tools, maybe exploring some AI features,
check out the show notes. We've got a referral link and a discount code for Google Workspace. Yeah, that gets you Gemini Pro,
NotebookLM, Teams, a whole lot of useful stuff. Exactly. And one more quick mention for anyone tackling those tough tech certifications. Etienne's AI-powered Djamgatech app is designed specifically for that. It covers 50-plus certs, including the PBQ- and simulation-heavy ones. Definitely worth a look. Right. So welcome back to the Deep Dive. The idea, as always, is we take the sources you're following, pull out the key bits, and hopefully give you a clear picture quickly. Yep.
And today we're jumping into a mix of AI news and developments from May 7th, 2025. It's quite a range. We've got robots learning to, well, feel things, potential shifts in major AI partnerships, all sorts of things. Should be some interesting connections and maybe a few surprises. Definitely.
Definitely. Okay, where should we start? Maybe with Amazon's warehouses. Sounds good. They've got a new robot, Vulcan. That's the one. And the really interesting part about Vulcan is, well, it has a sense of touch. A robot that can feel. Oh. Okay, that sounds like a pretty big leap from just moving things around. How does that work? It uses force feedback sensors. Right. And the AI behind it has been trained on just
tons of data about physical interactions so it can handle way more different kinds of items, handle them precisely, and crucially, not damage them. So it's not just grabbing, it's sensing the pressure needed. Exactly. It knows how much force to use. And while it's huge for warehouses, think about other areas, maybe elder care or even surgery down the line where that kind of delicate touch is key.
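To make that force-feedback idea a bit more concrete, here's a toy sketch in Python. This is purely illustrative, not Amazon's actual Vulcan control code: the sensor model, function names, and every number here are made up. The core idea is just ramping grip force while watching the sensor, and stopping before a damage threshold.

```python
# Toy force-feedback grasp loop (illustrative only, not Vulcan's real code).
# read_force is a hypothetical fingertip sensor; all thresholds are invented.

def grasp_with_feedback(read_force, max_force=10.0, step=0.5, secure_force=3.0):
    """Ramp applied gripper force until the object is held securely,
    stopping before the safe-force ceiling is exceeded."""
    applied = 0.0
    while applied < max_force:
        applied += step
        sensed = read_force(applied)   # force reported back by the sensor
        if sensed >= secure_force:     # enough grip to hold the item
            return applied
    raise RuntimeError("could not grasp item within safe force limit")

# Toy sensor model: sensed force tracks applied force once contact is made.
def toy_sensor(applied, contact_at=1.0):
    return max(0.0, applied - contact_at)

force = grasp_with_feedback(toy_sensor)   # settles at a moderate force
```

The point of the loop is the same trade-off described above: enough force to hold the item, never enough to crush it.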
That's a good point. And it raises questions, too, about jobs needing fine motor skills. But for now, in the warehouse, it works alongside people, right? Precisely. The idea is Vulcan takes over the tasks that are ergonomically difficult for humans, you know, constantly reaching way up high or bending down low. So it's about efficiency and safety. That's the goal. Improve safety, improve efficiency in the fulfillment centers. Is it actually running now? Yep. It's operational. Currently in some select Amazon facilities, I think, in Washington state and in Germany. Okay.
And you mentioned it handles a lot of different items. Yeah. The claim is it's designed to pick and place roughly three quarters of all the product types they stock. Tasks that, you know, were almost entirely done by humans before. Three quarters. Wow. Okay.
That's a really significant chunk of the work. It absolutely is. And stepping back, adding a reliable sense of touch to automation like this is a major advancement. It just broadens the scope of what robots can do safely and effectively, moving past simple repetition to more nuanced tasks. Okay, so robots getting more dexterous. Let's shift gears a bit maybe to the business side. The relationships between the big AI players, OpenAI.
And Microsoft. Right. Yeah. There was a report from The Information suggesting OpenAI might be planning to adjust its revenue-sharing deal with Microsoft significantly, actually. Adjust how? Microsoft invested a lot in OpenAI, didn't they? Tens of billions. Yeah. Yeah.
The current deal reportedly gives Microsoft 20 percent of OpenAI's top-line revenue, running until 2030. 20 percent is substantial. It is. But according to the financial documents The Information saw, OpenAI is looking to reduce that cut for partners down to 10 percent by 2030. And the current deal involves more than just revenue share. Oh, yeah. It covers shared profits, IP rights, the fact that OpenAI's API runs exclusively on Microsoft Azure. It's a deep partnership. So why the potential change? Is OpenAI feeling more independent now? That seems likely. You know, their scale is growing incredibly fast. This might reflect a push for more financial autonomy as their tech gets embedded everywhere. Makes sense. But how does Microsoft feel about a lower return on that massive investment? Well, that's the big question, isn't it? It definitely impacts the long-term financial picture for Microsoft's investment. It could signal a shift in the power dynamic there. And isn't OpenAI also restructuring itself? They are proposing a new structure as a public benefit corporation, yeah. But reports suggest Microsoft still needs to approve that, probably to make sure their financial stake is protected through the transition. Okay, lots of moving parts there. It really shows how these big tech partnerships are always evolving.
Speaking of which, Apple seems to be rethinking things too, especially around search. Yeah, that's another interesting one. Apple is apparently exploring AI-powered search partners for Safari. Why now? Did something trigger this? Well, Apple's Eddy Cue actually testified in court recently, and he revealed that for the first time ever, Google search usage declined in Safari last month. Whoa.
He said that in court. And did he say why? Yep. He directly attributed it to people shifting towards using AI tools instead of traditional search. That's a huge admission. So what's Apple doing about it? They're actively looking at partnerships. Mentioned names include OpenAI, Perplexity, and Anthropic. The idea is to offer alternative search options right inside Safari. So could Google actually lose its default spot on iPhones? That multi-billion-dollar deal?
It suddenly looks like a real possibility, doesn't it? You've got declining usage, plus that ongoing regulatory case that's threatening the Google deal anyway. Right. The antitrust stuff. Exactly. So changing user habits plus regulatory pressure. It looks like Apple is seriously considering a major strategic shift for Safari, moving beyond just Google. Everyone's jockeying for position.
And OpenAI isn't just dealing with partners. They're looking globally now, too. This OpenAI for Countries thing. That's right. A new initiative where they plan to partner with national governments around the world. The goal is to help them build sovereign AI infrastructure. Sovereign AI infrastructure. OK, what does that actually mean? Like data centers? Yeah. Data centers, yes, but potentially more. It seems coordinated with the U.S. government, maybe like an international version of their Stargate project concept. OpenAI is offering technical help, customized AI models tailored to local languages, local needs, health care, education. So a country gets its own tailored AI running locally. That's the pitch.
And crucially, it implies more national control over the data, the algorithms, maybe even the ethical rules governing AI within their borders. That's ambitious and expensive. Who pays? The plan is for it to be co-financed. OpenAI and the partner country would both invest.
And what's OpenAI's angle here? What's the bigger goal? They're framing it as promoting democratic AI, ensuring the tech develops in line with democratic values, human rights, that sort of thing. So there's a philosophical layer, too. Absolutely. Strategically, you can see OpenAI positioning itself as the global partner for national AI development. It promotes their tech, their way of doing things, their democratic AI rails, as they might put it. But it could also create dependencies, right?
For sure. It fosters a global ecosystem built around OpenAI's models and principles. It's a very significant strategic move. Definitely one to watch. OK, let's get back to the tech itself. Google's been updating Gemini, right? There's a new version. Yes, they released an early preview, an I/O edition of Gemini 2.5 Pro just last week, May 6th actually.
And reports suggest it's showing some really strong improvements. Improvements where, specifically? Particularly in coding and web development, it seems. Okay, how do we know? Are there benchmarks? Yep. Almost immediately after release, it apparently shot to the top of the leaderboards,
Both the WebDev Arena, that's where humans rate AI-generated web apps, and the general Chatbot Arena. Wow, number one on both. Did it beat the other top models? Reportedly, yes. It surpassed models like Claude 3.7 Sonnet and even OpenAI's o3 model, which was a previous leader. So real measurable gains, especially for developers. Looks like it. Enhanced performance for front-end UI stuff, transforming code, editing code, building more complex agentic workflows. Agentic workflows.
Like AI doing multi-step tasks. Exactly. And it also has new video understanding capabilities. They mention things like turning video content into interactive learning apps. That's cool. And overall, it's number one on the LM Arena leaderboard, beating OpenAI's latest. That's what the reports indicate. Yeah. Across all categories. It really shows Google is pushing hard on refining Gemini and achieving state-of-the-art results, at least according to these human preference benchmarks. The competition is just fierce. No kidding. The pace is incredible. Okay, from the model's brain to its, well, face. Fair. AI avatars are getting more realistic too. HeyGen? Absolutely. HeyGen updated their avatar tech. Avatar 3.0 and Avatar IV are the new ones. And the big focus is making them more emotionally expressive. Emotional AI avatars.
Sounds a bit sci-fi. How do they do that? The system looks at a text script or listens to audio and then generates the facial expressions, the gestures, the voice tone, even the body language to match.
The idea is to make video presentations using these avatars feel more natural and engaging. So it's analyzing the meaning or feeling of the words. Seems like it. They have a new audio-to-expression engine, apparently inspired by diffusion models. It analyzes the voice to create really photorealistic facial movements, even micro-expressions and hand gestures. Wow. And what does it need to create one? Just a single reference image and a voice script, they say.
And it apparently works with different subjects, even pets or anime characters, and different angles. Avatar IV also does portrait,
half-body, and full-body now. So much more dynamic. What kind of videos are they pitching this for? They're highlighting things like, you know, influencer-style videos, singing avatars, characters for games, maybe even visual podcasts like this one, but more expressive. Interesting. What's the broader implication here? More lifelike digital humans. Pretty much. It's a step towards making those interactions feel less robotic, more natural. That could be huge for marketing, customer service, education, entertainment.
Anywhere you want that human connection. From fancy avatars to practical tools, AI for personal finance using Zapier. Yeah, there's a guide on how to use Zapier agents. That's their AI automation thing to build your own personal financial assistant. And the key is you don't need to code. OK, so I could set up an AI system.
To like track my spending automatically. How does that work? Essentially, yeah. You connect the apps you already use, maybe Google Sheets, your accounting software, whatever. Then you just tell the Zapier agent in plain English what you want it to do. Like track my expenses or summarize my spending. Exactly. Or check if this invoice got paid or remind me to pay this bill. Stuff like that.
How complicated is it to set up? What are the steps? The guide makes it sound pretty straightforward. Create a new agent, tell it what to do. Like when a new invoice appears in this Google Drive folder, then you add the tools it needs, maybe Google Drive to get the file, ChatGPT to read the invoice details, Google Sheets to log the info. Then you test it, make sure it works and turn it on. That actually sounds doable for a lot of people. What's the big takeaway?
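For readers who want to see the logic behind that no-code flow, here's a minimal sketch in plain Python. Everything here is a hypothetical stand-in: fetch_new_invoices, extract_fields, and append_row represent the Google Drive trigger, the LLM extraction step, and the Google Sheets logging step. This is not Zapier's actual API, just the trigger-then-tools pattern described above.

```python
# Sketch of the trigger -> tools -> log pattern the guide describes.
# All function arguments are hypothetical stand-ins for the real services.

def run_agent(fetch_new_invoices, extract_fields, append_row):
    """For each new invoice: extract details, log a row, return the count."""
    logged = 0
    for invoice_text in fetch_new_invoices():   # trigger: new file appears
        fields = extract_fields(invoice_text)   # tool: parse with an LLM
        append_row([fields["vendor"], fields["amount"], fields["due"]])
        logged += 1
    return logged

# Tiny stand-ins to show the flow end to end.
rows = []
count = run_agent(
    fetch_new_invoices=lambda: ["INV-1 Acme $120 due 2025-06-01"],
    extract_fields=lambda text: {"vendor": "Acme", "amount": "$120",
                                 "due": "2025-06-01"},
    append_row=rows.append,
)
```

The Zapier agent hides all of this behind plain-English instructions; the sketch just shows why the setup steps come in that order.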
It's really about empowerment, right? Giving non-coders the ability to build custom AI tools for their own needs. Linking apps together, automating annoying tasks, all just by talking to the AI. It makes automation much more accessible. Making powerful tools easier to use. That seems to be a theme. And speaking of accessibility, Lightricks open-sourcing their AI video model.
Sounds like a big deal for developers. It really is quite significant. Lightricks, they make apps like Facetune and Videoleap, released their LTX Video model family. That includes LTXV-13B, a 13-billion-parameter model. 13 billion. That's pretty large, isn't it? It's substantial, yeah. And they've put it out under an open-source license. It's free for smaller entities, anyone under $10 million in revenue. You can find it on Hugging Face.
GitHub. What does it do? Just text to video? Text to video, but also image to video. They're highlighting this new technique they call multi-scale rendering. Supposedly makes it fast and high quality. Multi-scale rendering. How does that work? The way they describe it, it's sort of like building the video in layers of detail.
Think rough sketch first, then adding finer details. Helps with smoothness and consistency, they claim. And the big news is it runs on regular computers. That's a key point, yeah. They say it can run on consumer-grade GPUs. That lowers the barrier to entry massively. Usually these big models need serious, expensive hardware. Right. Any other cool features? They mention precise camera control, keyframe editing, tools for sequencing multiple shots,
It sounds like they're aiming for fairly sophisticated video creation. And they partnered for training data. Yeah, with Getty Images and Shutterstock, which is important for the quality and legality of the output. So why open source it? What's the impact? It should really accelerate innovation in AI video. Making advanced tools like this accessible just lets more people experiment, build new things, compete. It could really stir up the generative video space. More tools, more creators, more innovation.
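The coarse-to-fine idea behind that multi-scale rendering description can be sketched in a few lines. To be clear, this is a toy illustration of the general technique (draft at low resolution, then upsample and refine), not Lightricks' actual pipeline; the refine step here is a placeholder.

```python
# Toy coarse-to-fine renderer: draft a rough low-resolution frame, then
# repeatedly upsample and refine. Illustrative only, not LTX Video's code.

def upsample(frame, factor=2):
    """Nearest-neighbour upsampling of a 2D frame (a list of pixel rows)."""
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in frame for _ in range(factor)
    ]

def render_multiscale(draft, refine, levels=2):
    """Start from a coarse draft; at each level, upsample then add detail."""
    frame = draft
    for _ in range(levels):
        frame = refine(upsample(frame))   # finer detail at each scale
    return frame

coarse = [[0, 1], [1, 0]]                 # 2x2 rough "sketch"
final = render_multiscale(coarse, refine=lambda f: f, levels=2)
# each pass doubles both dimensions, so the 2x2 draft becomes 8x8
```

Working from a cheap draft first is also why the approach can be fast: most of the computation happens at low resolution.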
Makes sense. Okay, let's shift to some really critical applications. Using AI drones for medical deliveries. A drone lifeline. Yeah, this is incredibly impactful stuff. AI is making drones much smarter and more capable for delivering vital medical supplies. How does AI help?
What does it enable the drones to do? Well, it allows for autonomous flight, for one, but also optimizing the route, considering weather, terrain. It helps them avoid obstacles dynamically, and it assists with the whole logistics management side, too. And these are carrying things like vaccines, blood. Exactly. Vaccines, blood, medicines.
Critical items going to places that are hard to reach, remote areas, disaster zones, places with poor infrastructure. Cutting down delivery times must make a huge difference. A massive difference. There are projects already running in parts of Africa and India showing real life-saving potential. It's about improving health care access dramatically by overcoming those logistical hurdles. That's amazing. Truly AI for good. Now, for something completely different and maybe a bit controversial,
AI in a U.S. courtroom. This was definitely a first of its kind situation. Yeah. In Arizona, during a sentencing hearing for a fatal road rage case, the family of the victim, Christopher Pelkey, used AI to create a video of him delivering a victim impact statement. Wait, they generated a video of the deceased victim speaking?
How? They used AI tools combined with existing photos and videos of him and a script they wrote from his perspective. Apparently, the message was one of forgiveness towards the person being sentenced. Wow. How did the court handle that? What did the judge say?
The judge acknowledged the emotional weight of it. But as you can imagine, it sparked a lot of discussion. I bet. The ethical questions, legal questions. Yeah. Authenticity, manipulation. Exactly. It's a really novel use of AI in a legal setting. It raises incredibly complex issues about the role of this kind of technology in the justice system that we're
only just starting to grapple with. Definitely uncharted territory. Okay, moving back towards research. Anthropic has a new program for scientists. Yes, they launched AI for Science. The goal is pretty clear: use AI to speed up scientific discovery, especially in biology and life sciences. How are they doing that? Are they giving away free AI access? Essentially, yeah. They're offering selected researchers free API credits, reportedly up to $20,000 worth, to use Anthropic models like Claude. What kind of research would that support?
Things like analyzing huge data sets, generating new hypotheses for experiments, helping design those experiments. They do mention a biosecurity review as part of the process, though. Makes sense. So they're actively trying to get their AI used for scientific good. That's the idea. By putting their tools in the hands of researchers, they hope to help accelerate breakthroughs in really complex fields. It seems like a positive initiative. Now,
Uh, online platforms. Reddit is trying to crack down on AI bots. That's right. They announced plans for stricter user verification. This comes after some controversy about an unauthorized AI experiment running on the platform recently. Ah, okay. So what's the plan? How will they verify users more strictly?
They haven't laid out all the specifics, but the aim is to get better at detecting and blocking those AI bots that try to mimic human users. They might use third-party services, but they also say they want to try and preserve user anonymity as much as possible. That's a tough balancing act, isn't it?
Spotting bots without compromising privacy. It's a huge challenge for all platforms now. As AI gets better at sounding human, the defenses have to get better too, just to maintain trust and stop manipulation. A constant battle. Yeah. Okay, one more research item, WebThinker. An AI agent for research. Yeah, this sounds pretty advanced. It's an AI agent framework from Renmin University, BAAI, and Huawei.
It's designed to make large reasoning models, LRMs, better at complex research. How does it do that? What's different about it? It allows the AI agent to autonomously browse the web, navigate websites, pull out information,
and even draft reports, all as part of its reasoning process. So it's not just retrieving facts, it's actively exploring and synthesizing. Exactly. The goal is to go beyond standard RAG, retrieval augmented generation, where the AI just fetches info and uses it. WebThinker aims for a deeper integration of web interaction into the reasoning itself for those really knowledge-heavy questions. Sounds like a step towards AI that can genuinely conduct research on its own. That seems to be the direction, yeah.
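That difference from plain RAG can be sketched as a tiny loop. This is a hedged illustration of the interleaved browse-and-reason idea, not WebThinker's actual framework; the search and think functions are hypothetical stand-ins.

```python
# Minimal sketch of interleaved browse-and-reason (vs. one-shot RAG).
# search and think are hypothetical stand-ins, not WebThinker's real API.

def research_agent(question, search, think, max_steps=5):
    """Loop: reason over notes, then either browse more or answer."""
    notes = []
    for _ in range(max_steps):
        action, payload = think(question, notes)
        if action == "answer":          # confident enough to draft a report
            return payload
        notes.append(search(payload))   # otherwise fetch more evidence
    return think(question, notes)[1]    # best-effort answer at the limit

# Toy policy: browse once, then answer using what was found.
def toy_think(question, notes):
    if not notes:
        return ("search", question)
    return ("answer", f"Based on {len(notes)} source(s): {notes[0]}")

answer = research_agent("query", search=lambda q: "evidence", think=toy_think)
```

The key structural point is that retrieval happens inside the reasoning loop rather than once, up front.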
More autonomous agents capable of deep exploration and reporting. Okay, wow. We covered a lot.
Before we wrap, there were a few other quick news hits from May 7th worth mentioning. Yeah, just a few rapid-fire ones. OpenAI is reportedly buying Windsurf, which used to be Codeium, a coding platform, for a huge $3 billion. That'd be their biggest acquisition ever. $3 billion for a coding platform? Yeah. They're serious about AI for developers. What else? Google launched AI Max features in Search, specifically for advertisers, helping them optimize campaigns.
Elon Musk's lawyer fired back at OpenAI's restructuring plan, basically calling it window dressing. Still tension there. Yeah. And Microsoft had concerns, too. Reports suggest Microsoft is indeed looking for assurances that its, what, $13.75 billion investment is safe under OpenAI's new public benefit corporation structure. Understandable. Anything else?
Oura, the smart ring company, added new AI features for logging food and monitoring glucose. And a company called FutureHouse put an AI agent named Finch into closed beta. It's specifically for analyzing biology data. Biology data analysis, another specialized AI tool. It really just shows how AI is touching almost every field imaginable now. Absolutely. This snapshot from just one day, May 7th,
really paints a picture of incredible speed and diversity in AI innovation. You've got robots getting physical senses. Right, the tactile thing. Huge strategic shifts in partnerships and potential search defaults. Yeah, the Apple-Google dynamic. New ways individuals can use AI for finance or get more realistic avatars. And these really complex ethical questions popping up, like in that courtroom case. It's honestly hard to keep track. The tactile robot really stuck with me. That feels like a fundamental shift.
And the whole Apple potentially moving away from Google search, that could reshape things significantly. Plus just the constant march towards more human-like AI, both in ability and appearance. And I think things like WebThinker, the autonomous research agent, really hint at how AI might change knowledge work itself. And that Arizona court case, it just forces us to think about AI's role in society in completely new ways. Definitely. So maybe a final thought for you listening.
Considering everything we've just talked about, the robots, the search changes, the science tools, the ethical dilemmas, what are maybe the most unexpected ways you think AI might start showing up in your daily life, maybe sooner than you think? Yeah, look beyond the obvious. What are the surprising ripples, both the good ones and maybe the challenging ones? Something to chew on. Thanks again for diving deep with us into the world of AI. Until next time.