Welcome to AI Unraveled. This podcast is created and produced by Etienne Newman, you know, senior software engineer and passionate soccer dad from Canada.
If you're enjoying these deep dives, these looks into the fast-moving world of AI, please do us a favor. Like and subscribe to the podcast on Apple Podcasts. Absolutely. Always appreciated. Okay. So you ready to dive into another week of AI news? I am. Always fascinating to see what's new. What have you got for us this week? Okay. So this deep dive is going to cover March 31st to April 6th, 2025.
Okay. Yeah, it's been a busy week, really. You know, it seems like every week it gets busier and busier. I know, right? It's really incredible the pace of innovation. It is. It really is. Yeah. Okay. So the goal today, as always, is to, you know, try to unpack some of the biggest stories, the things we think you really need to know to kind of keep up with what's happening in AI without getting totally overwhelmed, hopefully. Right, right. It's a lot to keep up with. So, yeah, a curated approach is definitely helpful. Absolutely. Absolutely.
Okay, so let's start with, I guess, developments in AI models themselves. Some interesting news this week. Yeah, for sure. That's a good place to start. So OpenAI, of course, you know, big, big player in this space. They've actually decided to delay the release of GPT-5. Oh, wow. Okay, so that's a bit of a surprise, right? I mean, everyone's kind of been waiting for GPT-5 to drop. What's the reasoning behind the delay? Yeah, it's interesting, you know,
I think there's a couple of factors at play. I mean, one of the things that is apparent from what they've said is that, you know, integrating a lot of the tools that they want to integrate into GPT-5 is proving to be a bit more complex than they initially anticipated. Yeah, that makes sense. It's not just about making the model bigger, right? It's about making it more useful, more functional. Exactly. Yeah. So, you know, it's not just about scaling up. It's about
sort of rethinking potentially how these large language models are structured and maybe moving towards something that's a little bit more modular maybe. We'll see. But in any case, you know, while they work on GPT-5, they've decided to focus on releasing these models called o3 and o4-mini. Interesting. I haven't heard too much about those yet. What are they all about? Yeah. So these are kind of interesting. They're being described as reasoning-focused models. Okay. So very strong in things like coding and mathematics.
Sam Altman, you know, the CEO of OpenAI. Yeah. He's actually claimed that o3 performs at the level of a top 50 programmer. Wow. I mean, that's a pretty bold statement, isn't it? I mean, how do you even measure that? What does that actually mean in practical terms? Right. It's a great question, right? Like, what does it mean to say that an AI is at the level of a top 50 programmer?
I think, you know, we probably need to look at the specific benchmarks they're using and kind of get a better understanding of how they're defining that. Yeah, I think that's really important, right? Because for our listeners, it's not just about these big claims. It's about understanding how this technology might actually impact their work, their lives. Absolutely. Yeah. I mean, the takeaway, though, is that, you know, while we wait for GPT-5,
we can expect some pretty big performance improvements in these specific areas, at least with these o3 and o4-mini releases. So it's kind of like they're trying to address more immediate needs, you know, like, okay, we need better coding. We need better math capabilities. Let's focus on that while we work on the bigger, more, you know, ambitious, I guess, GPT-5 project. Yeah, that's a smart approach. It's kind of like, you know, iterate and release, right? Get those improvements out there while you're working on the big stuff. Exactly. Exactly. Okay. Okay.
So moving on, let's talk about Meta. Okay. Yeah, Meta has been making some big moves in the AI space too. What's new with them? Right. So they've released Llama 4. Llama 4. Okay. So this is the newest version of their open large language model. Okay. Open source. So that's a big difference from, say, OpenAI's approach, right? Yeah. Huge difference, right? So, you know, they're really emphasizing the open source aspect. Some key improvements that they've highlighted are better performance,
multilingual capabilities. So it's better at handling multiple languages and also a really big focus on safety, which is always a concern with these large language models. Absolutely. Yeah. I mean, as these models
become more powerful, it's really important that they're being developed responsibly and that safety is a top priority. Yeah, for sure. So they've made Llama 4 available in several different sizes, which is interesting. I think that speaks to the fact that they want to cater to both researchers and commercial users.
So, you know, you can choose the size that best suits your needs and your resources. Yeah, that's really interesting. So it's not just about, you know, putting out the biggest model possible. It's about making it accessible and customizable for different use cases. Right. And I think, you know, by making it open source, they're really sort of trying to, I guess, take a leadership role in the open source AI community and, you know, kind of encourage that kind of
collaborative development, right? So lots of people can build on this foundation. Yeah. And potentially, I mean, that could lead to a lot of innovation, right? I mean, when you have more people working on something, you often get more creative solutions. Yeah, absolutely. Okay. Let's switch gears a little bit and talk about something a little more...
Visual, I guess. So Midjourney, which is known for its image generation, has released V7. Oh, okay. I've been using Midjourney a bit myself, actually. It's pretty impressive what it can do. What's new in V7?
Yeah, so this is their first major update in almost a year. Wow, that's a long time in AI development, isn't it? I know, right? Things move so quickly. So there are, I mean, they've highlighted a number of improvements, but some of the key things are, you know, improved coherence, which is, you know, how well the image actually kind of makes sense. Yeah. Faster generation speeds, of course, which is always a plus.
And they're also moving towards more personalization. Okay. So what does that mean? Personalization and image generation. Yeah. So to achieve that personalization, what they're doing is they're asking users to rate about 200 images. Wow. 200. Yeah. And that's how they're going to kind of fine-tune the model to your particular aesthetic preferences. That's pretty cool. So it's like it's learning your artistic taste almost. Exactly. Yeah. It's trying to learn what you like. Wow. Yeah.
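Midjourney hasn't published how its personalization actually works under the hood, but the general pattern the hosts describe, turning a user's ratings into a preference profile that can score new candidates, can be sketched in a few lines. Everything below (the 16-dimensional style embeddings, the least-squares fit) is made up purely for illustration:

```python
import numpy as np

# Toy sketch of preference learning from image ratings. This is NOT
# Midjourney's actual method -- just the general idea: ratings over a
# batch of images become a preference vector for ranking new ones.

rng = np.random.default_rng(0)

# Pretend each image is summarized by a 16-dim style embedding.
rated_images = rng.normal(size=(200, 16))   # the ~200 images a user rates
hidden_taste = rng.normal(size=16)          # the user's true (unknown) taste
ratings = rated_images @ hidden_taste       # higher rating = user likes it more

# A least-squares fit recovers a preference vector from the ratings.
pref, *_ = np.linalg.lstsq(rated_images, ratings, rcond=None)

# New candidate images can now be ranked by predicted preference.
candidates = rng.normal(size=(5, 16))
scores = candidates @ pref
best = int(np.argmax(scores))
print("preferred candidate:", best)
```

With 200 ratings and only 16 dimensions, the fit here recovers the hidden taste vector exactly; a real system would be learning from noisy human clicks over a vastly higher-dimensional space.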
So does that mean everyone's going to get like a completely unique version of Midjourney based on their ratings? I don't know about completely unique, but I think it's definitely moving in that direction. That's amazing. I mean, that's a level of customization that we haven't really seen before, right? Yeah, I think it's pretty cutting edge. And in addition to that, they've also introduced different modes. So you have turbo mode, which is for speed. You have relax mode, which is supposed to generate potentially higher quality images.
And then there's a draft mode, which is basically for quick previews. Okay, so you can choose the mode based on what you're trying to achieve and how much time you have. Exactly, yeah. And, you know, considering all of this, it's pretty remarkable that they're actually reportedly expecting $200 million in revenue by the end of the year. Really? Yeah.
That's huge, especially without any outside investment, right? I mean, they haven't taken any venture capital or anything. Yeah, it's all organic growth, which is pretty impressive. Yeah, and they're doing all of this while navigating these tricky copyright issues, right? I mean, there's still a lot of debate about who owns the rights to AI-generated art. Yeah, that's definitely an ongoing legal challenge that I think the whole industry is facing. Mm-hmm.
Okay, so moving on from images to videos, I saw that Runway has released a new AI video generation model, Gen-4.
Yes. What's interesting about this one? So the main focus with Gen-4 seems to be on achieving better consistency of characters and scenes across multiple shots. Okay, that makes sense. Because that's always been one of the big challenges with AI video generation, right? Making sure everything looks consistent from one shot to the next. Absolutely, yeah. So how are they addressing that? Well, they've actually showcased this improved consistency in a couple of sample films that they've released called New York is a Zoo and The Herd. Hmm,
I'll have to check those out. Yeah, they're interesting. And they're actually calling their underlying technology GVFX. GVFX. What does that stand for? Generative visual effects. Oh, okay. So they're positioning this as a serious tool for visual effects. Exactly, yeah. And I think, you know, the fact that we're already seeing some pretty big names adopting it, like Amazon and even Madonna has used it, you know, I think it speaks volumes about
the potential here, right, to really streamline video production and kind of enhance the narrative possibilities. Yeah, for sure. I mean, it's like we're moving closer to a future where AI is not just a tool, but a creative partner in video production. Exactly. Yeah. And I think that's really exciting. Okay, let's go back to OpenAI for a minute because there's some interesting news about GPT-4.5. Okay, GPT-4.5. Is this the one that's supposedly able to pass the Turing test?
It is, yeah. So in some Turing test scenarios, it's actually been judged as human in 73% of interactions. 73%, wow, that's pretty high. I mean, the Turing test is, it's a bit controversial, right? There's debate about how effective it really is as a measure of intelligence. Absolutely, yeah. I mean, there's definitely a lot of debate about what the Turing test actually tells us, but that figure, 73%, it's pretty attention-grabbing, right?
Yeah, for sure. It definitely makes you stop and think about how far these models have come. It does. It does. And it's not just about text anymore either. So GPT-4o is also showing some pretty impressive image generation capabilities. Okay, so it's multimodal now. It can do both text and images.
It can, yes. They've shown it doing things like style transfer and even creating animations. Wow, that's pretty amazing. So it's like a one-stop shop for all your creative needs? It's getting there, yeah. And to top it all off, they've actually made image generation available to all free ChatGPT users. Really? So anyone can just go and start creating images with GPT-4o now.
Yep, that's right. Wow, that's a big deal. I mean, that's really democratizing access to these tools, right? It is, yeah. I think it's a pretty significant step in terms of
making this technology widely available. - For sure. Okay, and one last thing about OpenAI. I saw they're planning to release their first open weights model since 2019. - Yeah, this is big. So for a long time they've been very, very kind of closed off with their models. - Right, they've been pretty guarded about their technology. - They have, but I think...
I think the increasing competition from other models that are open source, like DeepSeek and Meta's Llama family, I think that's definitely a factor here. Yeah, it seems like they're feeling the pressure to be more open. Yeah, I think so. And also, you know, a lot of...
particularly enterprise clients, they're demanding more control over their data, right? This concept of data sovereignty. Right, right. They want to be able to keep their data private and they want to have more control over how it's used. Exactly, yeah.
And open-weights models can provide that level of control. So it seems like this move is driven by both competition and customer demand. Yeah, I think it's a combination of both of those things. And I mean, potentially, this could really accelerate AI research and development, right? Because now you have more people who can access these models, experiment with them, build on top of them. Yeah, absolutely. Okay, let's switch gears a little bit and talk about how AI is being used in business and technology. Okay, yeah, that's always a hot topic.
It is. So let's start with Microsoft. They've been pretty busy with updates to Copilot.
Oh, yeah. Copilot. That's their AI assistant, right? It is. Yeah. So, you know, it's interesting because it seems like a lot of the new features they've added are kind of directly addressing things that their competitors are doing. OK, so they're trying to keep up with the competition. Exactly. Yeah. Yeah. So they've added things like, you know, enhanced memory capabilities, more sophisticated personalization. They can actually execute web-based actions now. Oh, interesting. So it's like it can actually interact with websites for you.
It can, yeah. And they've also added Copilot Vision, which is basically for analyzing images and screen content on both Windows and mobile. Okay, so it can understand what it's seeing on your screen. Yeah.
That's right. And they've also launched something called Deep Research, which is designed to process and synthesize information from multiple documents. So it's getting pretty sophisticated. It's not just a simple chatbot anymore. It's not, no. It's becoming much more of a full-fledged AI assistant, I guess you could say. And the fact that they're putting it on mobile devices, that's pretty significant, I think. Yeah, because that means it can actually interact with the real world through your phone's camera, right? Exactly, yeah.
Okay. Let's talk about Amazon for a minute. Okay. Amazon. Always up to something interesting. They are. So they've actually released this new AI shopping agent called Buy For Me. Buy For Me. Okay. What does that do? So this is kind of cool. It allows you to buy products from third-party websites, but you do it through the Amazon app. Interesting. So it's like Amazon is becoming a universal shopping cart. Exactly. Yeah. And they're really emphasizing the secure handling of your billing information,
You know, as a kind of key selling point. Right. Because people are going to be hesitant to give an AI access to their financial information. Yeah, for sure. So, you know, the agent manages the entire transaction process and even directs you back to the original seller site for returns if necessary.
Okay, so it's handling the whole process from start to finish. It is, yeah. Now, I don't know how comfortable I would be giving an AI that much control over my shopping, but, you know, I can see the appeal for people who want to simplify their online shopping experience. Yeah, I mean, convenience is king, right? It is. It is. Okay, let's move on to another interesting application of AI. So Microsoft has been experimenting with AI in game development.
They've actually created an AI-generated version of Quake II. Wait, they used AI to create a video game? They did, yeah. So they used their Muse AI model to do this. And basically what it does is it allows you to generate game environments and assets
based on text prompts. Oh, wow. So instead of having to manually create everything by hand, you can just describe what you want and the AI will generate it for you. That's the idea, yeah. That's amazing. I mean, that could revolutionize game development, right? It could significantly shorten development times and make it much easier for smaller studios to create really impressive games. Yeah.
Absolutely. And it kind of it opens up a whole new world of possibilities for creativity. Yeah. Because you're not limited by what you can manually create anymore.
Yeah, that's really exciting. And speaking of creativity, I saw that Adobe has added some new AI-powered tools to Premiere Pro. They have, yeah. I'm curious about this one called Generative Extend. What is that? So Generative Extend is basically a tool that allows you to extend the duration of video clips by up to two seconds and ambient audio by up to 10 seconds. Okay, so it's like it can magically create more footage. Yeah.
Kind of, yeah. It's not going to generate completely new scenes, but it can extend existing ones in a way that looks pretty natural. Hmm. That's pretty cool. And it supports 4K video and vertical video, right? It does, yeah. Okay. So it's really designed for the modern content creator who's working with all these different formats. Exactly. Yeah. And in addition to that, they've also added a media intelligence search panel.
which basically lets you use natural language to search for specific content within your video clips. Oh, wow. So instead of having to scrub through hours of footage, you can just type in what you're looking for and it will find it for you. That's right. That's a huge time saver.
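Adobe hasn't detailed how that search panel works internally; systems like this typically embed clips and queries into a shared vector space and rank by similarity. Here's a deliberately tiny stand-in, assuming made-up clip filenames and descriptions, that uses bag-of-words cosine similarity just to show the retrieval pattern:

```python
from collections import Counter
import math

# Toy natural-language clip search. Real systems use learned embeddings;
# plain bag-of-words vectors stand in for them here.

def bow(text):
    """Bag-of-words vector: word -> count."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical library: per-clip text descriptions (in practice these
# would come from automatic transcription and visual tagging).
clips = {
    "clip_001.mp4": "wide shot of a city skyline at sunset",
    "clip_002.mp4": "close up of hands typing on a laptop keyboard",
    "clip_003.mp4": "drone footage over a forest river",
}

def search(query, library):
    """Return the clip whose description best matches the query."""
    q = bow(query)
    return max(library, key=lambda c: cosine(q, bow(library[c])))

print(search("typing on a keyboard", clips))  # → clip_002.mp4
```

The win over scrubbing footage by hand is exactly what the hosts describe: the query never has to match a filename, only the content description.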
And they also added caption translation, right? They did. Yeah. So you can instantly translate your subtitles into 27 different languages. That's fantastic for accessibility and for reaching a global audience. It is. Yeah. Okay. So moving on to another interesting tool for businesses, Kling AI. Kling AI. Okay. I haven't heard of this one. What do they do? So they have this platform that basically uses AI to
turn static product images into dynamic video presentations. Oh, interesting. So it's like a way to create product videos without actually having to film anything. Exactly, yeah. And the process seems to be pretty simple. You know, you upload your product image, any related elements that you want to include, and then you just write a text prompt describing the kind of video you want, and the AI generates it for you. Wow, that's pretty cool. That could be a game changer for small businesses who can't afford to hire professional videographers. Yeah, absolutely. Okay.
Amazon. Back to Amazon. Amazon's everywhere these days. They are. They are. So they've developed this new AI-powered browser agent called Nova Act. Nova Act. Okay. What is a browser agent? So this is, it's basically an AI agent that can autonomously browse and
interact with websites to complete complex tasks. Oh, wow. So it's like having a personal assistant who can surf the web for you. Kind of, yeah. So it can fill out forms, it can make purchases, it can even research information for you. That's pretty amazing. So how does it compare to other browser agents out there? Well, from the reports that I've seen, it seems to be significantly more reliable. Okay, so it actually works well. That's good to know. Yeah, it is. And they're also providing a
software development kit or an SDK so that developers can integrate its capabilities into their own applications. Interesting. So they're making it open for others to build on. They are, yeah. And they're also planning to incorporate it into their upgraded Alexa+ service. Okay, so this could be a pretty big deal for Amazon. It could be, yeah. And it's interesting to note that it's being developed by Amazon's AGI Lab, which includes researchers with backgrounds at OpenAI.
Oh, wow. So they've got some serious talent working on this. They do. Yeah. Okay. And then one final thing in the business and technology category. There's new AI technology for seamless product placement in images.
Oh, okay. Product placement. So this is like, you know, when you see a can of Coke in a movie or something like that. Exactly. Yeah. Yeah. But this is done with AI. So they're using Google AI Studio to place products into existing images based on text prompts. Hmm. Interesting. So it's like you give it an image, you say, I want a can of Coke in this scene, and the AI will just put it in there seamlessly. That's the idea. Yeah. Yeah. And then you can even turn those images into videos using Google's Veo 2. Yeah.
Wow, that's pretty amazing. I mean, that could be a huge cost saver for advertisers, right? Yeah, absolutely. And it gives them a lot more flexibility, too, because they can basically create any kind of product placement scenario they want. Mm-hmm, yeah.
OK, so moving on to AI research and development, I saw some interesting research from Anthropic about how large language models reason. Oh, yeah, this is fascinating stuff. So what they found is that LLMs might not always be transparent about their actual reasoning processes. OK, so they might be giving the right answers, but we don't necessarily know why they're giving those answers.
Exactly. Yeah. And they actually found that models like Claude 3.7 Sonnet and DeepSeek R1, they sometimes provide justifications for their answers that are based on incorrect hints that they were given. Hmm. Interesting. So it's like they're trying to make up a reason that sounds plausible, even if it's not actually based on any real logic. Kind of. Yeah. And they found that this tendency is more pronounced when they're
when the questions are more difficult. That makes sense. If the model doesn't really know the answer, it's more likely to just try to come up with something that sounds good. Yeah, for sure. And I think this really underscores the need for more research into interpretable AI. Right, because we need to be able to understand why these models are making the decisions they're making, especially if we're going to be relying on them for important things. Absolutely, yeah.
Okay, Google DeepMind has released this really comprehensive AGI safety plan. It's a 145-page document. Wow, 145 pages. That's a serious commitment to safety. It is, yeah. And in this plan, they basically outline their strategies for mitigating the potential risks associated with AGI, with artificial general intelligence. Okay, AGI. So this is the idea of truly intelligent AI that can do anything a human can do, right? Right, yeah.
And they actually predict that AGI with human level capabilities could emerge as early as 2030. Wow, 2030. That's not that far off. It's not, no. And they also caution against potential existential risks. So, you know, they're not shying away from the potential dangers of this technology. Good.
I mean, it's important to be realistic about the risks, right? Yeah, absolutely. And they actually, in the document, they critique safety approaches that have been proposed by other organizations, and they highlight this specific risk of deceptive alignment. Deceptive alignment? What's that? So this is the idea that
an AI might appear to be aligned with human goals, but it secretly harbors different objectives. Hmm. That's kind of creepy, actually. Like it's pretending to be on our side, but it's not really. Yeah, it's a bit unsettling. So there are key recommendations centered on, you know, preventing misuse through things like cybersecurity and access control. OK, so making sure that the technology doesn't fall into the wrong hands. Exactly. Yeah.
And also on mitigating misalignment by ensuring that AI systems can recognize and express uncertainty. Right. So they're not just blindly following orders, but they can actually question things and say, hey, I'm not sure about this. Yeah, exactly. OK. So moving on to something a little bit more, I guess, complicated.
Google DeepMind has developed an AI that can master Minecraft. Oh, Minecraft. That's the game where you build things with blocks, right? That's the one, yeah. So they actually, their AI agent, which uses the Dreamer algorithm, learned to collect diamonds in Minecraft without any prior human demonstrations. Wow, that's pretty impressive. I mean, Minecraft can be a pretty complex game. It can be, yeah. And it did this through a process of model-based reinforcement learning. Okay, so it's learning by trial and error, basically. Kind of, yeah.
But it's not just randomly trying things. It's actually building an internal model of the game world and using that model to predict the consequences of its actions. That's pretty sophisticated. So it's not just reacting to what's happening. It's actually planning ahead. Exactly. Yeah. And I think this really showcases the power of reinforcement learning techniques for developing AI that can solve complex problems in simulated environments. For sure.
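The loop the hosts describe here, learn a model of the world from experience and then plan inside it, can be sketched in toy form. To be clear, this is a cartoon of the idea behind model-based RL, not DeepMind's Dreamer algorithm; the corridor environment, the tabular model, and the brute-force planner are all invented for illustration, where a real system uses a learned neural world model and much smarter search:

```python
import itertools
import random

# Toy model-based RL on a 1-D corridor of SIZE cells. The agent first
# explores to learn a transition model, then plans purely inside that
# learned model rather than by trial and error in the real environment.

GOAL, SIZE = 4, 5
ACTIONS = (-1, +1)  # step left / step right

def real_step(state, action):
    """The actual environment: a corridor with walls at both ends."""
    return max(0, min(SIZE - 1, state + action))

# 1. Explore randomly, memorizing every observed transition ("world model").
random.seed(0)
model, state = {}, 0
while len(model) < SIZE * len(ACTIONS):  # until every (state, action) is seen
    action = random.choice(ACTIONS)
    nxt = real_step(state, action)
    model[(state, action)] = nxt
    state = nxt

# 2. Plan inside the learned model: return the first action sequence
#    whose imagined rollout reaches the goal.
def plan(start, horizon=6):
    for seq in itertools.product(ACTIONS, repeat=horizon):
        s = start
        for a in seq:
            s = model[(s, a)]
            if s == GOAL:
                return seq
    return None

print(plan(0))
```

The key property is in step 2: once the model is learned, the planner never touches `real_step` again, which is the "planning ahead inside an internal model" behavior described above.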
Okay. And shifting back to the real world, I saw that Google is using AI to predict wildfire risks. They are. Yeah. So they've developed this new AI system that analyzes satellite imagery, weather patterns, and environmental factors to identify areas that are at high risk of wildfires. Oh, wow. That's amazing. So it's like an early warning system for wildfires. Exactly. Yeah. And they're actually testing it in wildfire-prone regions right now.
That's fantastic. I mean, wildfires can be devastating. So having a system that can predict them and give people more time to prepare could really save lives. Yeah, absolutely. OK. And this is a really big one. Researchers have actually developed an AI that can instantly convert brain signals into speech. Wait, what? You're telling me that they can read your mind and turn your thoughts into spoken words?
Well, not exactly thoughts, but brain signals. Yeah. So they've created this AI system that can decode signals from the motor cortex, which is the part of the brain that controls movement, including speech. OK, so it's picking up on the signals that your brain is sending to your mouth when you're trying to speak. Exactly. Yeah. And then it translates those signals into spoken words almost instantaneously. Wow.
That's incredible. I mean, what are the implications of this? Well, I think the biggest implication is for people who have lost the ability to speak due to injury or illness, this technology could potentially give them back their voice. That's amazing. I mean, that's truly life changing. It is. Yeah. And it's even more remarkable because the system can actually generate speech.
using the intended speaker's pre-injury voice. Really? So it doesn't just sound like some generic computer voice that actually sounds like you? That's the goal, yeah. Wow, that's mind-blowing. And it can handle novel words too, right? It can, yeah. So even if it's never encountered a particular word before, it can still figure out how to pronounce it based on the brain signals. That's incredible. And it's compatible with various brain-sensing technologies, right?
It is. Yeah. OK, so this is a really promising development. I mean, this could have a huge impact on the lives of so many people. Yeah, absolutely. OK. And then finally, in the research and development category, AI is now assisting scientists in deciphering proteins that were previously considered indecipherable. Oh, interesting. Proteins are those complex molecules that are essential for life, right? They are. Yeah.
And they're really, really difficult to understand. But now, with the help of AI, scientists are starting to make some real progress. OK, so how is AI being used in this research? So they've developed these new AI-powered tools that can analyze and understand the structure of proteins that were once beyond our ability to detect.
So it's like AI is giving us a new lens to look at these tiny complex structures. Exactly. Yeah. And this is really important because it could lead to significant advancements in our understanding of diseases like cancer. Right. Because a lot of diseases are caused by malfunctions in proteins, right? Exactly. Yeah. So if we can understand how proteins work, we can potentially develop new treatments and cures for these diseases.
This is really exciting stuff. It sounds like AI is really revolutionizing scientific discovery across so many fields. It really is. Okay, let's move on to AI and society.
So, boxer Zubair Khan recently hosted an event focused on AI in boxing. AI in boxing. That's an interesting combination. It is, yeah. So, they're actually starting to use AI for things like optimizing training regimens, preventing injuries through data analysis, and even predicting the outcomes of matches. Wow, that's pretty cool. So, AI is even making its...