This was the biggest week of AI developments, well, ever. I mean, we had conferences and groundbreaking announcements from Microsoft, Anthropic, and Google, and they were all about AI.
And that might not even be the biggest news that happened this week. Yeah, let me repeat that: three of the four biggest companies when it comes to AI had their yearly AI conferences, and that probably isn't even the most impactful news that we got this week. Yes, I've maybe said this once or twice before: hey, this is the biggest week of AI news ever. Well, at those times it was, but as of today's date, this was the biggest week, and it actually wasn't even close. We had so much happen, from Google dropping dozens of AI updates, to Anthropic releasing Claude 4, to Microsoft unveiling huge AI Copilot upgrades, and a whole lot more. All right, I'm excited to dive into it. I hope you are too.
What's going on, y'all? My name is Jordan Wilson and welcome to Everyday AI. This is your daily live stream podcast and free daily newsletter helping
everyday business leaders not just keep up with AI, but actually use it to get ahead and grow our companies and our careers. So, you could spend hours every day toiling over what is happening in AI and worrying about what it all means. Or you can let us do it for you. On almost every single Monday, we bring you the AI news that matters. We cut through all the developments of the week, cut through the BS, cut through the marketing, and tell it to you how it is. Well, this week's a little different, because technically it's Tuesday; we had the holiday here in the US on Monday. But you can still join us every single week for the AI news that matters. So it starts here,
on the unscripted, unedited live stream slash podcast. But where you really leverage this is going to our website at youreverydayai.com. There you can sign up for our free daily newsletter. We recap each day's podcast in the newsletter, as well as the biggest AI happenings from around the world to make you the smartest person in AI in your department. So make sure you go to our website for that. All right, enough hype.
Let's get straight to it. All right, here's the AI news that mattered for the week of May 27th. And yeah, this thing's live, y'all. So shout out to our audience joining us. Dr. Harvey Castro joining us from Dallas. Brian on the LinkedIn machine joining us from Minnesota. Marie and Dr. Scott McDonald.
Jackie, Kimberly. We've got a good LinkedIn audience this morning. Good to see you. Lynn. And on the YouTube machine, Michelle, Jose, Sonia, everyone else, thanks for tuning in. So yeah, if you have any questions or clarifications as we go along, go ahead and throw them in the live chat, and I'll try to answer everything as we go. All right. First up, Microsoft. My gosh, their Book of News was pages and pages long in terms of what they announced at their Microsoft Build conference. So Microsoft unveiled dozens of major AI updates at their Build 2025 conference,
specifically to its Copilot AI tools, signaling pretty important shifts in how AI supports everything from software development to enterprise customization, task automation, and even multi-agent collaboration. So we did cover this in more depth, so if you're interested, make sure to go check out
episode 529. But let's go over at least what I thought were the more important updates because yeah, there were dozens of them from Microsoft at their build conference. So these are the ones that I think are most important for everyday business leaders such as you and me. So first,
GitHub Copilot has now transformed from a simple coding assistant into an autonomous coding partner. So some pretty big updates from Microsoft. So now it can independently test, iterate, and refine code while supporting multimodal inputs such as screenshots and mockups. That's a pretty big update, just that alone, just the multimodal inputs.
from GitHub Copilot. And also, this is positioning Microsoft's AI coding tool against some of the more enterprise ones, right? Which, technically, GitHub Copilot was kind of first. But I think for the last couple of months, people have looked at GitHub Copilot more as an assistant and not as an autonomous coder. So these are some pretty big updates from Microsoft that change that.
All right. Next big update, I think, was Copilot Tuning. So Copilot Tuning is a new low-code feature within Microsoft 365 Copilot, which allows enterprises with at least 5,000 Copilot licenses to customize AI models using their own internal data. This tuning enables companies to align AI responses with specific workflows, brand language, and industry needs, without coding or data science expertise. Importantly, Microsoft does not use this customer data to train its foundational models. So this is
actually, like, low-key pretty huge. It's also kind of a bummer, right, that at least right now, only those enterprises with at least 5,000 Copilot licenses can take advantage of this. But let me tell you how big this is, because about two-ish years ago, for any company that wanted to fine-tune a state-of-the-art large language model, it was going to be a multiple-quarter process,
and it was a minimum multiple-seven-figure investment. So generally, you know, two and a half years ago, this was going to cost multiple millions of dollars, it was going to take multiple quarters, and you would have to have some of the world's best AI and machine learning specialists on your team. The fact that you can do this now in a low-code environment with this new Copilot Tuning is absolutely mind-boggling to think about, considering even when we started the Everyday AI show what it would take to fine-tune a state-of-the-art large language model with your company's data. And now you can just go do this.
All right. Microsoft also unleashed the agent foundry, powered by Azure (Azure AI Foundry), which introduces an enterprise-grade AI playground where organizations can design, deploy, and scale AI agents using literally thousands of different models, from proprietary options to popular models like Grok, GPT, and Mistral. This new agent foundry supports multi-agent workflows and integrates protocols from major players, like Google with their A2A framework and Anthropic's MCP, which helps facilitate better, stronger, and more secure cross-platform AI collaboration. All right.
Speaking of multi-agent, that would be the next big update from Microsoft at their Build conference: announcing multi-agent orchestration inside Copilot Studio. That enables multiple AI agents to collaborate dynamically by discovering one another, negotiating tasks with each other, and deciding on their own how to divide work securely, while maintaining governance controls. This feature also, like I just talked about, leverages protocols such as Google's agent-to-agent (A2A) and Anthropic's MCP, their Model Context Protocol, making it possible to automate complex business processes, while still requiring careful oversight to prevent compounding errors. The next one would be computer-using agents, and that allows Microsoft's Copilot AI to automate repetitive tasks
by simulating human interactions across desktop applications and websites through natural language commands. So this makes it easier to handle mundane work like data entry and invoice processing. The feature right now is available in limited enterprise preview programs. But what people don't know is, if you have a Copilot Pro subscription, which people don't really talk about, because when you think about Microsoft Copilot, you think, oh, Microsoft 365, right? Like the enterprise version. Well, they actually have a $20-a-month version that I don't think a ton of people use. I use it, and I actually like it. And you can actually go use their computer-using agent right now. It's kind of hidden; it's called Tasks.
So you can go use that right now. I think this was one of the biggest takeaways from Microsoft Build. And last, but certainly not least, would be native support for the MCP protocol from Anthropic. That's now integrated not just in Azure AI Foundry, but literally inside Windows 11. Yeah. Talk about how quickly MCP has been adopted by enterprise companies: it now has native support inside Windows 11, which enables seamless communication between different AI agents and enterprise systems such as Microsoft's Windows. So this deep integration really can just change what's possible,
and it positions MCP as foundational infrastructure for AI-driven workflows and third-party applications. So yeah, there was a lot more that was introduced at Microsoft Build, and if you want to hear more about it, go check out episode 529. I think those are the six biggest things for everyday users; if you're an IT pro, if you're big into AI/ML, there's probably a lot more. So Giordi here from YouTube is saying, am I going to have to add a Copilot to my stack? Maybe.
You know, the other thing, especially with Copilot, like the online version, not Copilot 365: low-key, they've added so many new features, right? Even the very popular NotebookLM-style audio overviews; Copilot has that now. You know, you can go make an AI podcast out of any of your chats inside there. They have the Think Deeper integration, which uses the reasoning models. They have what they call Actions, which is essentially a computer-using agent. So yeah, even on the website, Copilot Pro is actually
getting, low-key, fairly impressive. Sean is asking, what's MCP again? So that was created and popularized by Anthropic: the Model Context Protocol. Sean, great question. Essentially, right now, the internet, like, websites talk to each other through APIs, right? So, more or less, MCP, the Model Context Protocol, is right now the most popular way for AI agents to speak to each other. The same way that internet websites have APIs, AI agents needed their own language to talk to each other across different platforms. So that's kind of what MCP, the Model Context Protocol, is. Google also has their own version called A2A, or agent-to-agent. So it's essentially a language that allows different AI systems to talk to each other seamlessly.
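For the more technical folks, here's what that looks like in practice: a tiny MCP server exposing one tool that a client like Claude Desktop can discover and call. This is a minimal sketch using the official MCP Python SDK; the server name and the lookup_invoice tool are made up for illustration.

```python
# A minimal sketch of an MCP server (pip install mcp). The tool name and
# logic here are hypothetical, just to show the shape of the protocol.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("invoice-helper")  # hypothetical server name

@mcp.tool()
def lookup_invoice(invoice_id: str) -> str:
    """Return the status of an invoice by its ID (stubbed for the example)."""
    # In a real server this would query your ERP or database.
    return f"Invoice {invoice_id}: paid"

if __name__ == "__main__":
    # Runs over stdio, so an MCP client can discover and call lookup_invoice.
    mcp.run()
```

The point is the shape: you declare tools once, and anything that speaks the protocol, whether that's Claude Desktop, Windows 11, or Copilot Studio, per the announcements above, can find and call them without a custom integration.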
All right, our next big piece of AI news, Anthropic had their first ever conference and they announced Claude Opus 4 and Sonnet 4. So Anthropic has released Claude Opus 4 and Claude Sonnet 4, two advanced AI models designed to improve coding, reasoning, and AI agent workflows.
with Opus 4 being the big boy and leading as now the world's best coding model, according to the SWE-bench and Terminal-bench benchmarks. So Claude Opus 4 excels in sustained performance on complex, long-running tasks, and it is capable of working continuously for several hours.
That's nuts, and it could significantly enhance productivity for software developers and AI-driven projects. So, Claude Sonnet 4. A couple of things are confusing here. Number one, Anthropic has three tiers. Their small variant of the large language model is called Haiku. Haiku did not get updated to version 4. The medium version is Sonnet; Sonnet got updated from 3.7 to 4. And Claude Opus is their big one, which never got updated to 3.7, and now it's Opus 4. Also, interestingly, they even changed the naming convention, because previously it was called, as an example, Claude 3.7 Sonnet, and now it's Claude Sonnet 4. So they swapped it: before, you would have the number, then Sonnet, and now it's the
opposite way. So now the bigger two, the medium and the large variants, got updated to version 4. And actually, Claude Sonnet 4 is outperforming the big, big boy Opus in a lot of categories. But a lot of people are going to be using Claude Sonnet 4 because of the cost. I think a larger chunk of Anthropic's customer base, and companies don't announce this, but I would assume they have more API users as a percentage of their revenue than companies like Google and OpenAI. And right now, on cost and performance together, Claude Sonnet 4 is just a better deal than Claude Opus 4.
So right now, Claude Sonnet 4 offers a major upgrade over Sonnet 3.7, which was released just a few months ago, balancing strong coding performance with efficiency, and it is also set to power GitHub Copilot's new coding agent. Both models introduce extended thinking with tool use in beta, allowing them to alternate between reasoning and external tools like web search, which enhances their ability to handle complex queries and tasks.
So yeah, if you are using Claude inside its chatbot interface, you now also have this. It isn't just the API; this is available via the API and in their Claude chatbot at claude.ai. And the cool thing, and one of the reasons I'm actually using Claude a little bit more: I've never been a big Claude fan, and one of the reasons is their limits are laughably low. What you get for your $20-a-month or $25-a-month base paid plan is like peanuts compared to what you get with OpenAI or Google or even Microsoft. It's pretty much nothing, their paid plan. But I do like that they have now pretty seamlessly integrated Gmail and Google Calendar and Google Drive, which is pretty nice. So that's one of the reasons I'm using it a little more than I was previously, because now, with these new models, it can kind of go between these different agentic tool uses.
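For the developers listening, here's roughly what that extended thinking plus tool use combo looks like via the API. This is a minimal sketch, not Anthropic's exact recipe: the model ID, token budgets, and the stock-price tool are my assumptions, so check Anthropic's docs for current values.

```python
# A rough sketch of calling Claude Sonnet 4 with extended thinking plus a
# tool (pip install anthropic). Model ID and tool are assumptions.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # assumed model ID
    max_tokens=4096,
    thinking={"type": "enabled", "budget_tokens": 2048},  # extended thinking
    tools=[{
        "name": "get_stock_price",  # hypothetical tool for illustration
        "description": "Look up the latest price for a ticker symbol.",
        "input_schema": {
            "type": "object",
            "properties": {"ticker": {"type": "string"}},
            "required": ["ticker"],
        },
    }],
    messages=[{"role": "user", "content": "Is NVDA up today? Think it through."}],
)
print(response.content)  # may interleave thinking blocks and a tool_use block
```

The interesting part is that the model can alternate: reason a bit, call the tool, then reason over the result before answering.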
Also, a big update that came out was Claude Code. Now generally available, it integrates these new Claude 4 models directly, and you can also use it in popular IDEs like VS Code and JetBrains, allowing developers to see AI-generated code edits inline.
Also, the Anthropic API has obviously been updated as well with new features, including a code execution tool, an MCP connector, a Files API, and prompt caching for up to one hour, offering developers flexibility in building AI-powered applications.
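And just to make prompt caching concrete, the idea is that you mark a big, reusable chunk of your prompt as cacheable so repeat calls don't re-process it at full price. A minimal sketch below; the model ID and file are assumptions, and the one-hour TTL mentioned above is a newer option, so the default ephemeral cache shown here is just the baseline.

```python
# A minimal sketch of prompt caching with the Anthropic API: cache a large,
# reusable system prompt across calls. Model ID and file are hypothetical.
import anthropic

client = anthropic.Anthropic()
big_reference_doc = open("style_guide.txt").read()  # must be long enough to qualify for caching

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # assumed model ID
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": big_reference_doc,
        "cache_control": {"type": "ephemeral"},  # cache this prompt prefix
    }],
    messages=[{"role": "user", "content": "Rewrite this intro in our style."}],
)
print(response.content)
```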
So, unfortunately, and yeah, a lot of people were bummed about this, myself included, Anthropic did not change pricing. A lot of the time, Google especially has been setting the literal AI world on fire by coming out with these new models, you know, 2.5 Pro, 2.5 Flash, that are incredibly powerful, but also, when they're doing this, they're making it cheaper to use on the API end. Anthropic did not. So they're still crazy expensive to use on the API side, with Opus 4 costing $15 and $75 per million tokens for input and output, and Sonnet 4 at $3 and $15 for input and output.
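To make that pricing concrete, here's some quick back-of-the-napkin math using those per-million-token rates; the example workload is made up.

```python
# Back-of-the-napkin API cost math. Rates are the ones quoted above;
# the workload (2M tokens in, 500K out) is a made-up coding session.
PRICES = {  # (input $/M tokens, output $/M tokens)
    "claude-opus-4": (15.00, 75.00),
    "claude-sonnet-4": (3.00, 15.00),
}

def cost(model: str, input_toks: int, output_toks: int) -> float:
    inp, out = PRICES[model]
    return input_toks / 1e6 * inp + output_toks / 1e6 * out

for model in PRICES:
    print(f"{model}: ${cost(model, 2_000_000, 500_000):.2f}")
# claude-opus-4: $67.50, claude-sonnet-4: $13.50 -- roughly 5x cheaper.
```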
So Anthropic has focused on reducing shortcut behaviors in the model by 65% compared to Sonnet 3.7, improving reliability and safety in agentic tasks. Both models, like I said, support hybrid operation modes, where the model decides whether to give you a near-instant response for a quick task or to extend its thinking and give you a deeper, more complex answer. So let me know,
livestream audience: what do you think of the new Claude 4 drop? Have you used it? Should we do a show specifically on Claude 4? I mean, last week we had dedicated shows for Google's announcements and dedicated shows for Microsoft's announcements. So I don't know, do you guys want to see a dedicated overview show on Claude 4? Let me know in the comments, say "Claude 4." Or maybe we should do one just on MCP, on Anthropic's Model Context Protocol. Two different things, right? But if you want those, you can say "Claude 4" in the comments, or "MCP," and I'll think about maybe doing a show. I probably should do a show on MCP, considering I know there's probably decent demand, even from non-technical people, because the protocol is actually very easy to use. And you can use it, as an example, on Claude Desktop; you don't even have to be a developer using it via the API. So I think we'll probably do a show at some point on MCP, especially given that Microsoft and Google and OpenAI all support the protocol. But yeah, if we should do something on Claude 4, let me know. Jackie says, "Claude, still not enough to become a power user." Yeah, I don't know. Sandra says, "Yes, please do a show on Claude 4."
Renee, great observation. Renee says it has a pretty limited window. Yeah, I was joking around, well, not really joking: it took me four minutes. I'm not even joking. It took me four minutes, on a paid Claude plan, to hit my message allotment. Come on, Anthropic.
This is why, if I'm being honest, if you're a software developer, if you're in coding, obviously you love Claude 4, right? Anyone in software development, if you're a software engineer, if you're huge into coding, I think you understand the benefit here of Claude 4. But for everyone else, if you're using Claude as a chatbot, I mean, I don't think any serious user takes Claude seriously. It's laughable, if I'm being honest, right? Douglas, hey, good use case, Douglas. Douglas is saying, "I'm looking at Claude to help me build n8n workflows. It's great for programming, about 80 to 85%, with some basic prompts."
I might also have to do an n8n show. I don't even know if that's how you say it, but it's kind of like an open-source version of Zapier. All right, let's go on to our next piece of AI news, because there's a lot. Speaking of that new model, pretty hot water. Pretty hot water Anthropic is already in, as Anthropic is
facing some backlash over Claude 4's ratting behavior. Yeah. So Anthropic's new Claude 4 Opus LLM has drawn significant criticism for a controversial behavior where, under certain conditions during testing and with enough access, the model attempts to report users to authorities if it detects egregious wrongdoing, a function described as "ratting" by critics.
Yes, literally. So this behavior is not a new feature that you can go in and trigger by using claude.ai; it's a byproduct of Anthropic's safety training to prevent misuse. However, Claude 4 Opus reportedly engages in it more readily, including actions like contacting the press, yeah, literally, messaging regulators, or locking users out of systems if prompted with commands like "take initiative."
Yes. Let me just quickly tell you what the heck happened and why I think this is absolutely bonkers. So Sam Bowman, an Anthropic AI alignment researcher, posted this exact thing, detailing this behavior, on Twitter, and then deleted it. And then, in a follow-up tweet, he clarified why he deleted it. Yeah, so a lot of us dorks were paying attention to this over the long holiday weekend. So Sam clarified on social media that Claude 4 Opus could use command-line tools to whistleblow on serious offenses, such as faking pharmaceutical trial data, though he emphasized this occurs only in unusual, highly permissive test environments, not typical use. Right? So Sam Bowman there is saying, hey, this isn't going to happen if you're using claude.ai or if you're using it in the API; this only happens in certain testing environments.
However, it is extremely troubling that a model would decide on its own, without telling you, to use backdoor channels to contact the press, to contact regulators, and to shut you out of your own system if it determines, of its own accord, that you are doing something it finds egregious, right? It's essentially going to rat you out.
So again, Sam Bowman clarified, this is not for your everyday users, right? If you're using Anthropic's API, according to the company at least, this isn't going to happen. If you're using the claude.ai chatbot, this isn't going to happen, right? This is more in testing environments, where Anthropic was giving its new Opus 4 model access to certain tools that it would not normally have access to in normal environments. Still, this is bonkers. So the model's tendency to autonomously intervene raises serious concerns among developers and users about privacy, data security, and the definition of what constitutes "egregiously immoral" behavior, especially for businesses relying on AI for sensitive tasks.
So critics argue that this whistleblower function could lead to false accusations and unwanted surveillance, with some calling it illegal or a threat to user trust and the adoption of AI tools, while others question the practicality and market impact of embedding such aggressive safety measures. Anthropic's public system card warns users to exercise caution with high-agency instructions that might trigger these extreme responses, but the company has yet to fully quell fears about the implications for enterprise and individual users. And I responded to Sam's tweet: the fact that he deleted this prior tweet and then just kind of swept it under the rug is mind-boggling to me, right? This is,
like, PR slash crisis communication 101. I don't care if it's an individual putting something out or a company putting something out, you have to be prepared for whatever backlash may ensue, right? The fact that a very prominent person, an alignment researcher at Anthropic, put this out, deleted it, and then just put out a simple, "hey, I deleted it because people were taking it out of context"? Well, maybe you should do a little bit better of a job. It's confusing to me how you see these snafus from big tech companies. Like, you have to think that people are going to take this information and run with it. And rightfully so,
right? There were also reports that the new 4 models, Sonnet 4 and Opus 4, were also blackmailing people in their testing. So it's great that researchers are disclosing this, right? And yes, Anthropic is a company that says they take this very seriously. But,
Number one, this story is not dead. So this happened, you know, luckily for Anthropic, it happened right before a long holiday weekend here in the U.S.,
I do assume that media is going to pick up on this story still. And this thing is going to continue to blow up and it's going to look very bad for Anthropic. The fact that Anthropic has not issued something publicly means that I cannot take Anthropic seriously as a safety first AI lab. And I don't think you should either.
The fact that this has now been out for three or four days and we haven't seen official word from Anthropic. I mean, I checked over the weekend. I didn't check this morning right before going live, but I don't know. I can't take Anthropic seriously. I mean, there's a lot of reasons why, but after this one, this is bad.
If you know your model is showing these emergent behaviors, where it's blackmailing, where it's calling and contacting authorities with these backdoor tools,
Number one, yes, that's a serious problem. So good on Anthropic for talking about it and releasing that information and telling users, yes, you have to be aware. But the fact that a head person at Anthropic tweeted something, saw that there was backlash, deleted it, tried to kind of sweep it under the rug and put up a clarifying tweet without saying, here's what I deleted and why.
It's crisis communication 101. How can these large companies have billions of dollars in funding but not know simple PR, not know simple crisis communication? This is going to blow up in Anthropic's face, and to tell you the truth, they kind of deserve it, because this was boneheaded. I know this is supposed to be the news, it's Tuesday, but hey, you got an accidental hot take in there. All right. Our next piece of AI news:
OpenAI has upgraded their Operator AI agent by embedding the new o3 reasoning model, replacing the earlier GPT-4o model that was running their agentic computer-use tool. The o3 model enhances Operator's ability to fill out forms, complete purchases, and navigate obstacles like login prompts, pop-ups, and CAPTCHA challenges more effectively than before. This upgrade is designed to improve step-by-step reasoning and focus, which helps the AI follow through on long and complicated tasks with greater reliability.
Operator remains, though, unfortunately, exclusive to ChatGPT Pro subscribers. So yeah, you've got to pay the $200 a month to have access to Operator, although OpenAI did say when they announced Operator that it would eventually roll out in limited fashion to people on ChatGPT Plus, the $20-a-month plan. We haven't seen that yet, but this is a big deal.
If I'm being honest, I was super excited about Operator. I did a show on Operator. I thought it was pretty good, but it wasn't great. All right. And obviously the last week of AI updates have been bonkers. So I've been a little bit busy. But I did use Operator a little bit over the weekend with the new O3 model. And I was running it side by side.
against Google's new version, which I'm going to talk about here in a second, with their Project Mariner computer-using agent. And I was like, wait, this new o3 version of Operator is actually really good, right? Just doing some simple head-to-head tasks, I assumed that Google's variant, Project Mariner, would be much better, at least with open-ended tasks and commands. I do like that Project Mariner has the teach and repeat option for their computer-using agent, where you can kind of teach it something and it will repeat it. But this was pretty big news that flew kind of under the radar from OpenAI.
So the move to the o3 model signals a significant push by OpenAI to refine AI agents that can act autonomously on the web, though similar services exist, such as Convergence AI, which was acquired by Salesforce, Hugging Face's open computer agent, Opera's Browser Operator, and Perplexity's Comet, which will do some similar autonomous computer use. So yeah, there are a lot of players in the space now. So good on OpenAI for updating this, because,
if one thing frustrates me a little bit about OpenAI, it's that they'll come out with some groundbreaking technology and then not update it for three, six, nine months. As an example, GPTs have not really been updated much at all in the past year, right? There are rumors that GPTs will get access to the o3 model, which would be great. But for the most part, sometimes OpenAI just releases a new feature and then only gives it super small under-the-hood updates. So this one is actually big, right? Because you are going from a transformer, non-reasoning model in GPT-4o powering a computer-using agent to now a reasoning model in o3.
So, a pretty big update. And I probably will be doing some future shows here on both Project Mariner from Google, which is only available, unfortunately, on their Ultra plan, and Operator. I might do kind of a head-to-head on Mariner and Operator, or I might do dedicated shows for each, because I think, specifically now that these are being run by reasoning models, they're really, really good. Much better, specifically for OpenAI, than it was a couple of weeks ago.
Are you still running in circles trying to figure out how to actually grow your business with AI? Maybe your company has been tinkering with large language models for a year or more, but can't really get traction to find ROI on Gen AI. Hey, this is Jordan Wilson, host of this very podcast.
Companies like Adobe, Microsoft, and NVIDIA have partnered with us because they trust our expertise in educating the masses around generative AI to get ahead. And some of the most innovative companies in the country hire us to help with their AI strategy and to train hundreds of their employees on how to use Gen AI. So whether you're looking for ChatGPT training for thousands,
or just need help building your front-end AI strategy, you can partner with us too, just like some of the biggest companies in the world do. Go to youreverydayai.com slash partner to get in contact with our team, or you can just click on the partner section of our website. We'll help you stop running in those AI circles and help get your team ahead and build a straight path to ROI on Gen AI. Let me know what you guys think: should we also do a Project Mariner or Operator update? All right, our next piece of AI news. And this, y'all, even with everything, we haven't even gotten to Google yet. Even with everything from Microsoft's Build conference, even with everything from Claude 4 Opus and Claude 4 Sonnet from Anthropic, even with everything Google announced, the biggest news of the week might be this.
The new partnership, which was not a secret, but it's finally official: the acquisition. OpenAI has acquired Jony Ive's AI hardware startup, called io, for $6.5 billion. Yeah.
We've seen reporting now for like nine months that OpenAI CEO Sam Altman and famed Apple designer Jony Ive were working on a project together, an AI hardware startup. We didn't know any details. We know a couple more details now, but the big detail is, well, it's not a separate company. OpenAI has actually acquired this hardware startup called io.
So, funny enough, right? I don't know if that was some intentional trolling, maybe, maybe not, that OpenAI kind of announced this right in the middle of Google's I/O conference: that they've acquired Jony Ive's AI hardware startup io for $6.5 billion. And OpenAI CEO Sam Altman projects that this acquisition could increase OpenAI's valuation by, wait for it, one trillion with a T. One trillion dollars.
And he envisions a family of devices emerging from this partnership. So we don't know a lot about what this device is. You know, they even released a nine-minute partnership video that did absolutely nothing, right? It announced nothing. It was essentially the two of them chatting about their relationship and, you know, AI hardware.
But the first device, according to reports, is expected to launch by late 2026, and it will be a pocket-sized, fully context-aware, and notably screen-free AI hardware device, positioning itself as a quote-unquote third core device, to complement, as an example, something like a MacBook Pro and an iPhone.
So according to reports, the vision here is that when people are out and about, you know, whether you're going to work, working from home, etc., you will now have three devices on you: essentially a computer or laptop, a phone, and now this device, whatever this device is going to be. There are some slick renderings and mockups that people made, right, suggesting this was potentially a circular device that you kind of slide into your pocket. It's probably going to have a couple of cameras, and it's probably going to have, obviously, some good microphones. But the thing I was taking away from this initial reporting was this concept of being context-aware.
And if you're wondering, what the heck does that mean? Well, I think what's happening here is SSO, right? So what does that mean? SSO, if you're familiar, is single sign-on: if you ever sign into a service using, as an example, your Google credentials or your Facebook credentials, that's SSO. Right? So what you've started to see a little bit over the last few months is OpenAI has started to release a single sign-on option. So if you are using certain services now, at times, if they integrate with OpenAI or ChatGPT, you can sign on to a third-party service with your OpenAI credentials. So I do see this becoming the norm over the next year, and one of the reasons is, well,
that brings in more context for a hardware device like this that you would always have on your person. Because is it helpful for a device like that, that you might carry in your pocket, to have access to your ChatGPT account? Sure. But what if you, in the future, in a year or so, are logging into dozens or hundreds of different services with SSO? Like, as an example, what happens if you're logging into your Netflix account with your OpenAI credentials, or your Amazon account with your OpenAI credentials, or, you know, certain online shopping sites, certain email providers, your social media, if they support it in the future, right? So that's what I see as the big long-term play here, and why something like this might make sense. Otherwise, it's just like, okay, I have a useless extra device in my pocket. And me,
I'm someone who loves being screen-free. So this is something I would absolutely love. If you know me personally, I suck at text messages, I suck at emails; like, I'm in front of a screen so much. But one thing I love doing is interacting with AI just through my voice, right? So I don't have to be staring at a screen; I can just be talking to an AI. So presumably, right, this screen-free AI device would probably have a camera, would probably have some microphones, and you could probably talk to it. But the bigger news here is OpenAI plans to ship this device faster than any company has ever shipped a new piece of hardware, reportedly eyeing a hundred million devices that they'd like to ship out. And this is supposed to be a family of devices.
So the device, according to reports, will not be eyewear. As you know, Google and Meta are going hard in the paint on, you know, AI-connected eyewear and glasses. So that's not it. And this is because Altman and Ive have ruled out glasses and also body-worn gadgets, with Jony Ive criticizing similar concepts like the Humane AI Pin, right? So they're saying it's not something, oh, you're going to pin this on, or, you know, wear it as a pendant around your neck. It's more just something you're going to stick in your pocket or stick in your backpack, and it's just going to go with you. But it's probably going to hear everything and have the context of your daily life.
So this development has been kept under tight wraps to prevent competitors from copying the design before its official launch. Jony Ive described the collaboration with Altman as quote-unquote profound and has likened the project to a new design movement, drawing on his experience working closely with Steve Jobs during his time at Apple. So what do you guys think?
Is this something... would you buy a third, screen-free OpenAI device? It's not a wearable, right? Like, are you actually going to lug around a third device? Because me, especially anywhere I go, even if I'm going to my mother-in-law's house for the afternoon, I'm taking my laptop and my phone, always, right? Am I going to take along a third device? Maybe.
Right. And sometimes I bring along my Meta Ray-Bans as well. Am I going to lug around a third device everywhere? Maybe. Dr. Scott saying it's going to be called the Pocket Agent. Love Fred's comment here, saying, will they call it a Palm or a Pilot? That's a good one. That's a good one. Maria is asking, is it me, or is $6 billion a real buzz dollar amount, with AI companies acquiring other companies and borrowing billions from investors? Yeah, that's a huge amount, right? A $6.5 billion acquisition for a company that no one really knew existed. They don't have a product or service yet.
But it does come with one of the most famous hardware designers in the history of humanity. So, you know, a lot of people have been criticizing it, being like, yo, this was overpriced. I don't think so.
I don't think so. All right. Let's get to our last couple of pieces of AI news in the biggest, biggest week in AI literally ever. So Google, yeah, Google also had their conference, saving the biggest announcements for last. Although I do think the io hardware news will ultimately be the most consequential, the I/O event from Google was a straight-up banger. Google literally released more than 100 updates, and they had a blog post that went over all 100 of them. I'll make sure to link that in today's newsletter, so make sure you go sign up for that at youreverydayai.com. So Google's I/O 2025 event revealed some key AI updates that are poised to reshape business workflows, customer engagement, and AI accessibility worldwide. Because there were so many big AI updates from Google I/O, we actually covered this in two different episodes last week: part one was episode 530, part two was episode 531. And we essentially picked out 15 of the biggest 100 announcements and went over those in pretty great detail, I would say.
But let me just go over a couple of the biggest ones from the Google I/O conference. So the upgraded AI Mode in Google Search now offers advanced AI-generated answers with enhanced graphics and interactive shopping tools, such as virtual try-ons using personal photos, which is awesome. This feature aims to provide users a more engaging, personalized search experience directly within the Google ecosystem. Then you have updates to Gemini Live, which is actually now powered by Project Astra, and this delivers a real-time AI assistant capable of visually understanding surroundings through device cameras. I played a two-minute video of this, and in that example, it could identify parts in a bike shop, access and analyze emails for relevant information, and autonomously contact suppliers. So I played Google's demo that did exactly that. All right.
Next, there were some updates to their flagship Gemini 2.5 models, including the new Flash variant of Gemini 2.5, which instantly rose to become the world's second most powerful large language model, only behind Gemini 2.5 Pro. And I talked about this a little bit last week. So Gemini 2.5 Flash is essentially the small version of Gemini 2.5 Pro. On LM Arena, where users blindly vote for the best outputs, right, you put in any prompt, you get two results, you vote for the better result, across dozens of flagship models. The fact that Gemini 2.5 Flash, which is a mini version of a model, is the second most powerful model in the world is nuts, because I think the highest a mini model has ever been is like number eight or something like that. So that is pretty telling just how good these Gemini 2.5 models are.
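And if you're wondering how blind head-to-head votes turn into a leaderboard ranking, it works roughly like an Elo rating system. LM Arena's actual methodology is more sophisticated (a Bradley-Terry model, I believe), but this toy sketch captures the idea:

```python
# A toy Elo update illustrating how blind pairwise votes become a ranking.
# This is a simplification of what arena-style leaderboards actually do.
def elo_update(winner: float, loser: float, k: float = 32.0) -> tuple[float, float]:
    """Shift two ratings toward the observed result of one head-to-head vote."""
    expected_win = 1.0 / (1.0 + 10 ** ((loser - winner) / 400.0))
    delta = k * (1.0 - expected_win)  # an upset win causes a bigger swing
    return winner + delta, loser - delta

flash, other = 1300.0, 1350.0
# A user blindly prefers the lower-rated model's answer:
flash, other = elo_update(flash, other)
print(round(flash), round(other))  # 1318 1332 -- beating a favorite pays more
```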
There's also the new DeepThink feature inside Gemini 2.5 Pro. And sorry, it's DeepThink, not ThinkDeep; all these companies, you know, I get confused because Microsoft has Think Deeper. So Google's version is called DeepThink, which essentially just allows Gemini 2.5 Pro to use more compute, more reasoning, more logic. It has not been rolled out yet. And unfortunately, a lot of these features
are only going to be available on the new Gemini AI Ultra subscription tier, which is $250 a month. So we also now have the world's most expensive consumer AI subscription tier, surpassing the $200-a-month ChatGPT Pro plan. Google did also introduce a deal on that AI Ultra subscription: for the first three months, you can get it at half price, $125 a month, but then it will go up to $250 a month. And that gives you access to the full range of Google's most advanced AI tools, including, and I'm going to talk about these here in a second, Flow, Veo 3 video generation, Gemini 2.5 Pro with DeepThink mode, Project Mariner, which is their computer-using agent, and also Gemini inside Chrome.
So, the downside: the subscription is currently only available to personal Gmail accounts. So right now, if you're using Google Workspace for your business and you want that AI Ultra subscription to work with your company data, the downside is, right now, it can't.
All right. I'm bugging my friends at Google to get more answers, to be like, okay, when is this actually going to be available for Workspace accounts? Because right now, it's not. So even for me, yes, I subscribed to this literally instantly, but I'm having to use my personal Gmail, which stinks. So now I'm having to go through the process of forwarding all my email from my work accounts over to my personal Gmail, and I'm going to have to copy all of my Google Drive contents over, which is a huge pain in the butt, right? So I'm sure there are reasons why Google isn't rolling this out to Google Workspace users, but it stinks. Also, Project Mariner is Google's new autonomous AI agent, designed to complete online tasks independently.
Similar to OpenAI's Operator, which we talked about and which just got upgraded to the o3 model, Project Mariner has a couple of unique things. It supports multitasking, up to 10 simultaneous activities. And a very unique feature, which I like, is the new teach and repeat mode, where you can teach Project Mariner a complex activity or an advanced workflow by recording user actions and voice commands.
So this capability aims to automate repetitive online business processes, potentially saving time and increasing productivity. And then, last but not least, and this has been taking the internet by storm: Google's new visual tools are bonkers. They are crazy, crazy good. And this is also extremely concerning, and I'm going to be doing a show on this very soon. All right. So Google DeepMind's latest AI video generator, which they just released, is called Veo 3, and it produces videos so realistic that many viewers online cannot distinguish them from human-made films, highlighting growing concerns about the authenticity of digital content.
So unlike other AI video tools, Veo 3 can generate videos with dialogue. That's the craziest thing. Like, you could have two people singing, and it matches up their voices to their lips very well. It can do sound effects and soundscapes. Nuts. And it accurately follows real-world physics, maintains continuity, and syncs lip movements realistically. And right now, this is the only AI tool where you can do this all in one shot. So not only is Veo 3 the best AI video generator by far, because Veo 2 was the best in the world and Google said, you know, hold my Nespresso, and then dropped Veo 3 on us all. And yes, there were ways you could sync audio and create dialogue before, but you had to use multiple third-party tools. Now you can do it all just inside Veo 3.
They also released Flow. So Google Flow is a new AI video tool which can use Veo 3, Google's new AI image generator Imagen 4, and also Gemini models. So essentially, they now have this new creative tool, which was previously called VideoFX but didn't have nearly any of these capabilities. Google Flow lets users import or generate consistent characters and scenes, control camera angles, and access advanced scene editing and asset management features, aiming to make sophisticated video creation more accessible. So I tried this out a little bit. It's a little wonky right now, but I do expect Google to ship a lot of updates, both to Veo 3, Imagen 4, and this new Flow tool.
The tool will debut in the U.S. for Google AI Pro and Ultra plan users, with Pro users getting 100 generations per month and Ultra users receiving even higher limits. So, a little more about Veo 3, because this is what's setting the internet ablaze. It creates highly detailed human figures, including accurate features such as five fingers, two arms, two legs, right? Will Smith can actually eat spaghetti, and you can hear it, and it looks real. So it's really conquering some of those more challenging tasks that AI video generators have usually struggled with. Videos generated by Veo 3 do show a few common AI artifacts or errors, but you really have to be a dork who follows the space to see them, right? Whereas six months or a year ago, there were very easy-to-see telltale signs that a video was AI-generated. Number one, it didn't look good, right? It sometimes looked cartoonish, or, you know, just didn't understand physics.
It's not like that anymore, y'all. And this is both amazing for business utility and absolutely terrifying for society, because you're already seeing it online, right? There have already been some stories of people launching, you know, fundraisers with realistic videos based on fake scenarios, and everyone's falling for it. All right. So this is both so exciting for what enterprises, small businesses, and startups can use this for, right? But also terrifying, because it is so good. Unless you tell people, hey, we're going to show you some AI videos and some real videos, right? If you just sit down and show people some good generations from Veo 3, I think 90% of the population is going to have no clue.
It's terrifying. It's exciting. But that's the world of AI. All right. I hope this is helpful.
Very quick recap of the biggest week in AI ever. First, Microsoft unveiled some huge advancements to Copilot at Microsoft Build 2025. Next, Anthropic launched Claude Opus 4 and Claude Sonnet 4, setting some new benchmarks in AI coding and reasoning. Next, Anthropic is facing a ton of backlash over Claude 4 Opus ratting users out, or potentially ratting users out, and its blackmailing behavior. OpenAI has upgraded its Operator AI agent to the smarter o3 model, so it's no longer using the GPT-4o model. We finally got the official announcement of OpenAI acquiring Jony Ive's new AI hardware startup io, with OpenAI expecting $1 trillion to be added to its valuation and announcing a family of devices from this partnership. And then we had Google going absolutely B-A-N-A-N-A-S at the Google I/O conference, unleashing literally more than 100 AI updates. And we're going to share them all in the newsletter
today. I hope this was helpful. This was a longer one, but like I said, the biggest week in AI ever. All right. So make sure if you haven't already, please go to youreverydayai.com, sign up for the free daily newsletter. If this was helpful, yeah, we spent a lot of time making sure you are up to date. I want you to be the smartest person in AI, in your department, in your company, on social media. I want you to be the smartest and most up to date.
Don't be greedy, though. Share the love, right? If you're listening on LinkedIn, it takes you 30 seconds, just click that repost button. Same if you're listening on Twitter. I'd really appreciate that. Share this with a friend, share this with a colleague, share this with a neighbor, share this with a friend's colleague's neighbor, share this with your babysitter, share this with whoever, because we all need to learn and understand generative AI, right?
It's no longer an option like it maybe was two years ago. We all have to use this technology to succeed and thrive in 2025 and beyond. Thank you for tuning in. I hope to see you back tomorrow and every day for more Everyday AI. Thanks, y'all.
And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.