This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. One of the largest companies in the world in Microsoft just released an autonomous AI agent that can use a computer and use apps and browse websites just like a human can.
Yet somehow that probably wasn't even a top five AI news story of the week. That's because we got five new large language models from OpenAI. Google released a ton, and they're giving away their Gemini AI products for free to tens of millions of people. And even Anthropic, the maker of Claude, came out and finally made some sizable updates that make you now have to consider Claude. Yeah, a lot going on, as in any week in the world of AI news. And we're going to be covering all of it today on Everyday AI with our weekly installment of the AI News That Matters.
What's going on, y'all? My name is Jordan Wilson, and I'm the host of Everyday AI, and this thing, it is for you. This is our daily live stream podcast and free daily newsletter, helping us all not just keep up with what's happening in the AI world, but how we can use all this information to get ahead, grow our companies, and grow our careers. So if you are trying to be the smartest person in your company or department when it comes to generative AI,
This is your new home. Your second home is our website at youreverydayai.com. So there on our website, you can sign up for our free daily newsletter. So each and every day, yeah, we have the live stream and the podcast at 7:30 AM Central Standard Time. But then we recap all of the most important insights in our free daily newsletter, as well as keeping you up to date with everything else that's happening in the world of AI.
Speaking of everything else that's happening in the world of AI, there's literally too much to keep up with. And I get it, right? So that's why almost every single Monday, we do our little special installment of the AI news that matters. So we cut through all of the most important updates of the week, all of the fluff, all of the good stuff, and we just give it to you straight.
No bias. Just give you the bullet points in our take. So that's what we're going to do right now by going over the AI news that matters for the week of April 21st.
All right. I'm excited, y'all. Livestream audience. How y'all doing? It's good. Good to see some of you. Aiden in the house rooting for the Hoosiers. Good to see you, Aiden. Gene and Douglas, everyone else. Christopher, Fred, holding it down for my people here in Chicago. Rolando, Rolando, Kyle, Sandra, everyone else. Thanks for joining.
Love doing this live, right? People are always like, oh, Jordan, you should, you know, maybe prerecord this thing and edit it so you don't say the word um so many times. I love doing this live because then I get to hang out with you all and learn together. So let's get straight into it with our first big AI news story of the week.
An acquisition, a multiple billion dollar acquisition from OpenAI? Okay, maybe. So according to reports, OpenAI is in advanced talks to acquire AI coding company Windsurf for $3 billion as the company seeks to secure a major stake in the rapidly growing code generation market.
So despite OpenAI's prior investments in Anysphere, the creator of the popular coding assistant Cursor, the company's acquisition discussions with Anysphere reportedly failed twice, as confirmed by CNBC.
Cursor is currently generating around $200 million in annualized recurring revenue, while Windsurf brings in about $40 million in ARR, highlighting the intense competition among AI coding startups. So after OpenAI's acquisition talks with Cursor's parent company collapsed, that's when the talks to acquire Windsurf picked up. So OpenAI's pursuit of Windsurf, despite its own recent launch of the Codex CLI coding tool, signals a sense of urgency to capture market share and not wait for in-house products to gain adoption.
This move underscores how competitive the AI-powered coding assistant sector has become with several startups, including Codium, vying for leadership as developers increasingly rely on generative AI to speed up software creation.
So yeah, this one is pretty interesting. I wasn't shocked when I saw it, but I was like, huh, right? Because yeah, as we just said there, OpenAI has invested in Anysphere, the parent company of Cursor, and apparently was trying to acquire Cursor before those talks fell through. Yeah, apparently Cursor got too popular too quickly, already bringing in $200 million in annualized recurring revenue. So now it looks like OpenAI's focus has turned to Windsurf. So yeah, I would say those are the 1A and the 1B in the AI IDE, or coding AI, realm right now, with Cursor in the lead and then Windsurf right behind. And then you have more specialized tools like Lovable and Bolt. And, you know, obviously I don't know why more people aren't using Microsoft's GitHub Copilot. It's an amazing tool, very similar to some of these others. But it looks like, for whatever reason, Cursor and Windsurf really just took off online pretty quickly. And I think a lot of that online buzz led to users in the hundreds of thousands flocking to these new IDE tools. It was pretty interesting as well: I got to talk to Windsurf's leadership team at the Google Next conference. Yeah, I know. Crazy, right? Like, I talk to people that don't end up on this show, but, you know, some pretty cool things that they were cooking up there that I got to talk to them about.
All right, our next piece of AI news. Google has launched Gemini 2.5 Flash, a new AI model that allows developers to set a thinking budget controlling how much computational reasoning the model uses. So pricing also reflects this flexibility with output costs ranging from 60 cents per million output tokens with reasoning off
to $3.50 with reasoning on. So yeah, if you keep that thinking mode enabled on the new Gemini 2.5 Flash, you're looking at a more than 5x, almost 6x, increase in output price.
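To put that pricing gap in concrete terms, here's a quick back-of-the-envelope sketch using the reported output rates (the token counts below are illustrative, not real usage figures):

```python
# Reported Gemini 2.5 Flash output pricing: $0.60 per million output tokens
# with reasoning off, $3.50 per million with reasoning on.
PRICE_OFF = 0.60  # USD per million output tokens, thinking disabled
PRICE_ON = 3.50   # USD per million output tokens, thinking enabled

def output_cost(tokens: int, thinking: bool) -> float:
    """Estimated output cost in USD for a given number of output tokens."""
    rate = PRICE_ON if thinking else PRICE_OFF
    return tokens / 1_000_000 * rate

multiplier = PRICE_ON / PRICE_OFF
print(f"Thinking on costs about {multiplier:.1f}x more per output token")
print(f"10M output tokens, thinking off: ${output_cost(10_000_000, False):.2f}")
print(f"10M output tokens, thinking on:  ${output_cost(10_000_000, True):.2f}")
```

So at 10 million output tokens, the same workload goes from roughly $6 to $35, which is why being able to budget that thinking per task matters.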
So the new model from Google, Gemini 2.5 Flash, automatically adjusts its reasoning budget based on task complexity, aiming to help businesses save money on simple queries and invest more for complex problem solving. And early benchmarks show that Gemini 2.5 Flash, even though it is the much smaller brother of the world-leading Gemini 2.5 Pro, is already outperforming key competitors like Anthropic's Claude 3.7 Sonnet and DeepSeek R1, while even coming close to OpenAI's new o4-mini in reasoning tasks.
So right now it's available in Google's AI Studio to play around with for free, right? But obviously that training data inside Google's free AI Studio goes to Google, but you can also pay for it and start using it on the backend in your products in Vertex AI. You can also use it in the Gemini app.
And this release is part of Google's broader AI strategy, including some things that we're going to be talking about here in a minute.
So pretty impressive benchmarks, obviously, for Gemini 2.5 Flash. I mean, you just saw this. This is their small model, right? Not a small language model, but the small version of their large language model, Gemini 2.5 Pro. And it already, the small one, is getting better marks on benchmarks than Anthropic's Claude 3.7 Sonnet, which is also a thinking model. So obviously pretty impressive results here from Google. I would not want to be Anthropic in this position, right? When your two biggest competitors, OpenAI and Google, both in the same week come out with smaller versions of their models that cost a fraction of what it costs to use Anthropic's Claude on the backend, and those small models are blowing their big boy out of the water. All right.
So, what do y'all think? You know, I'm going to get some comments and some thoughts here from our live stream audience as I sip on my very strong coffee. Yeah, Trevor, what's up, Trevor?
Trevor's great with the live stream stuff on LinkedIn. Trevor says, hard keeping all these versions straight. Yeah, absolutely is, right? I think some people have asked for this. I think I'm going to create a graph, essentially, that has the latest models, what they're good at, because even OpenAI, I said, they just released five new models, right? So it's like, okay, models that you were probably using last week, like O3 Mini High,
are gone, right? GPT-4.5 is leaving, at least in the API. And now you have this alphabet soup of all these other new models, including Gemini 2.5 Flash. But it is extremely impressive. So yeah, you would probably not want to use Gemini 2.5 Flash on the front end, right? So if you're on that $20 a month Google AI plan, there's really no need to use that model, right? Because you get full access to Gemini 2.5 Pro. This is really for developers who are building with this on the back end, right? So if you're using Google's API to build your own product, or to create a smaller version of Gemini 2.5 Pro with their Flash model, impressive results so far. All right, speaking of impressive.
It's not even close. Google's AI video generator Veo 2 is by far the best and most capable AI video model. And now it is rolling out to Gemini Advanced subscribers.
So yes, Google has finally unveiled Veo 2, their industry-leading text-to-video AI model, for Gemini Advanced subscribers, letting users generate 8-second 720p videos from just prompts.
So users can now create videos that can be shared directly to social media, though with monthly limits on how many can be made. Also, right now it's inside Google Gemini's front-end chatbot; that's where you can go use Veo 2. Now, it is a slow rollout. I didn't have it on any of my Gemini Advanced accounts yet, so you will just have to keep checking back. But right now it is just text prompts. So if you do want the full power of Veo 2, you still might have to use either their Vertex AI platform, or it is available as well inside Google's AI Studio.
But right now, Veo 2 is touted for its improved realism and understanding of physics and human motion, producing more lifelike content. All videos, though, do feature the SynthID digital watermark for transparency about their AI origin. Google also introduced Whisk Animate, a tool that turns images into short videos, available globally to Google One AI Premium subscribers. Yeah, so more on Google One AI Premium in a minute, but what that means is essentially you're paying $20 a month for a little bit of everything: you get some of Google's normal non-AI tools and features, as well as all of their AI offerings.
So, I don't know. Livestream audience, has anyone seen this pop up in their Gemini Advanced plan so far? I think I have three or four different accounts with that $20 a month Gemini Advanced plan. I didn't see it pop up in any of them over the weekend yet,
but I would assume probably in the next week or so. But right now, I've been using Google Veo 2 on the back end inside Google AI Studio. It is a little bit more flexible, and you get some of these new features that aren't yet available when you're using the Gemini chatbot.
So Kimberly says she tried Veo 2, it's nice. Yeah, it is extremely nice. I still think I'll probably use OpenAI's Sora for some instances. I think there are some UI/UX features inside Sora that I really like, the ability to kind of string together multiple clips at once and create more of a short with multiple of these AI-generated clips. But if you're just looking for one clip, or if you're just looking for overall quality, I still think Veo 2 cannot be matched, at least not right now. Obviously, these AI video tools are being updated, just like large language models, almost on the daily, but I mean, if you haven't already, you gotta go check out Veo 2. It's nice. It is really, really nice.
Yeah, Kyle said it didn't pop up in his either, but he's liking Whisk. Yeah, Whisk is just a super fun tool to use.
All right. Our next piece of AI news. The Trump administration here in the U.S. is considering new restrictions on Chinese AI lab DeepSeek, potentially limiting its access to NVIDIA's AI chips and also banning Americans from using it. So this move follows the White House's recent tightening of rules restricting NVIDIA's AI chip sales to China, expanding on measures first introduced by the Biden administration. So DeepSeek has rapidly gained popularity among U.S. developers due to its competitive pricing, prompting Silicon Valley to lower the cost of its own advanced AI models.
The Trump administration's actions are part of a broader U.S. effort to slow China's progress in artificial intelligence and also protect American technology and consumer markets. There are ongoing concerns about DeepSeek's business practices as OpenAI has accused the Chinese lab of distilling its models in ways that may violate intellectual property rights and OpenAI's terms of use.
For individuals and companies, these restrictions could mean fewer low-cost AI options and increased pressure on U.S. firms to innovate and protect their own intellectual property. So according to the New York Times, the decisions made could reshape the competitive landscape for AI development and access, especially for startups and smaller businesses relying on affordable, cutting-edge AI tools. All right, so I've covered this plenty.
I'm going to try not to accidentally go into a hot take Tuesday here on this piece of news. I will say this: I covered the DeepSeek saga a couple of months ago. So if you want the truth with receipts, go read that. But there's a reason why the US government is potentially looking to ban DeepSeek. It's because, whether you know it or not, if you are using DeepSeek's API directly, or if you are using DeepSeek's chat on the front end, all of your data goes directly to the Chinese government.
So I know people don't like to talk about geopolitics a lot, and I'm not going to dive in too deeply, right? But here's the reality. I'm from the US, right? So artificial intelligence, whether you want to admit it or not, is about so much more than technology. It is about global power, right?
Let's just call it what it is, right? Right now, compute AI chips and large language models are the new oil. They are the new gold. They are the new currency, right?
Essentially, when it comes to geopolitical tensions, I think it's important to call that out. This is about so much more than just, oh, large language models or, you know, chip exports. No, right? I think we've already seen it for the past year and a half, and I think we're going to continue to see it even more: just tighter restrictions around the most powerful technology. But y'all, you have to be smart.
And this is why I told you all this, right? I didn't jump on the bandwagon. It's funny, you have these quote-unquote AI influencers on social media, and when DeepSeek came out, almost every single one of them was like, go use DeepSeek, it's so cheap. Okay, that's because you were sending your data to China, right?
Whether you want to send your company's proprietary confidential data to China is ultimately up to you, right? But DeepSeek does not work in the same way as if you go and log on to, you know, ChatGPT or Google Gemini or Anthropic's Claude, right? There are built-in data protections with those companies being based here in the US. So yeah, if you have been using DeepSeek's API directly, not through third-party service providers,
who essentially go through and they make this safer and they take out some of the built-in biases. But if you've been using DeepSeek's API directly, if you've been using DeepSeek directly on the web, anything that you've uploaded has been sent and is being used by the Chinese government. So maybe you're fine with that. That's okay. But just important to call that out. And that's why we're probably going to see these talks on a DeepSeek ban continuing. All right.
Our next piece of AI news. Yeah, I started the show off on this: this isn't even a top five AI news story of the week, which is silly, right? Because this is huge. So Microsoft has launched a new computer use agent inside Copilot Studio, enabling AI agents to automate actions on websites and apps as if they were actual human users, right? And you can get this set up essentially no-code or low-code. So you can go into Microsoft's Copilot Studio, if your organization has given you full access to Microsoft 365 Copilot and Copilot Studio, and you can go right now and get a computer use agent up and running.
So this is significant because it lets AI agents handle tasks even when there are no APIs or built-in integrations, which could dramatically expand automation possibilities for businesses.
So the feature allows agents to click, type, and navigate, essentially performing any activity a person could do online, such as filling out reports, logging into secure sites, or even managing customer service requests.
So Microsoft executives have emphasized that if a person can use an app, so can the AI agent, making automation accessible for a wider range of business processes. So the update builds on Microsoft's earlier
actions features, but is designed for more advanced, business-scale automation rather than just personal use. So the technology is also able to adapt to changing websites and app layouts, making it more reliable for ongoing real-world automation needs like invoice processing, data entry, or even more complex research that maybe some of these research tools can't get to. So this development follows similar efforts by OpenAI's Operator and reflects a broader industry push to streamline repetitive tasks and free up time for more valuable work.
Yeah, it could run your LinkedIn activity 24/7, sure. You know, that's actually one thing I tried to get OpenAI's Operator to do, and it didn't work very well. One of the main reasons is not actually Operator's fault: if you're wanting one of these AI agents to go use LinkedIn, one of the reasons it doesn't work very well is because the LinkedIn interface stinks, right? So what I was trying to have Operator do is go through my DMs, not reply to them, right? But just mark anything that's important, because, I don't know, 50% of what I get on LinkedIn is spam.
And it's hard for me because I obviously get legitimate people like you all reaching out, right? Like, if you all see a story that breaks, or when people are building new products and they want Everyday AI to cover them, right? I get a lot of important DMs, but it's hard for me to go through them all because I have probably a couple thousand unread over the years, right? So I was trying to get Operator to go through and read them, and it did okay, but it was more of an interface bug, because when you're doing this infinite scroll thing on the LinkedIn inbox, one little pixel difference is all it takes, right? So yeah, maybe I'll have to try out the new computer use agent from Copilot Studio, see if that does any better. All right, let's keep it going. Google just gave away
its most powerful AI for free
to like 20 million people. So Google has announced that all US college students with a valid .edu email address can now get a full year of free access to Gemini Advanced. So this move is part of CEO Sundar Pichai's strategy to reach 500 million Gemini users by the end of 2025.
So right now, eligible students can sign up for the Google One AI Premium plan, which is normally $20 a month. And that includes the advanced Gemini Pro models, unlimited Deep Research usage, the Veo 2 video generator, NotebookLM Plus, plus Gemini Live, as well as two terabytes of Google Drive storage. So yeah, if you are a college student and you have a valid .edu email address for a university here in the US, yeah, you are going to get a free year of Google's
best AI offering. So the offer is immediately available and runs until June 30th. So yeah, you do have to sign up by this June 30th, but the free access actually lasts through the spring semester of 2026. So yeah, if you sign up today, as an example, you could get, what is that, like 13 or 14 months of free Google Gemini. So that's, you know, close to $300 of free value.
So Google's definition of student, though, is broad. It says anyone with a .edu email address qualifies, even if they are not currently enrolled in classes. So this strategy could obviously help students and recent graduates. So, you know, students are preparing for finals right now. There you go: you can prepare with NotebookLM Plus, which I would highly encourage you to do. My gosh,
as well as recent graduates, right? If you're looking to land a job, you know, there are some great tools inside Google's Gemini that can help you do that. For Google, though, the promotion represents a very calculated effort to build user loyalty among younger adults and future professionals, even at the cost of short-term revenue. You know what? This is where it's like, again, one of those things: I don't want to be Anthropic, right?
Because OpenAI has said that they're losing money on their more premium tiers, such as the $200 a month Pro tier, right? It's been widely reported that OpenAI is losing billions of dollars a year. Here we go. Now Google is following suit, just being like, eh, we don't really care about short-term revenue, we care about users, right? So I think it's an extremely smart move from Google. And this follows after OpenAI essentially announced two months of free access to its ChatGPT Plus $20 a month plan for students. So Google essentially said, yeah, OpenAI, we'll see that two months and we'll raise you an entire year. So yes, this is a race for users. It's a race for eyeballs. And again, I don't want to be Anthropic in this one. Speaking of Anthropic, finally, they upped their relevancy.
A little bit. All right. So Anthropic has rolled out Google Workspace integration for its Claude AI chatbot, letting users pull information directly from Gmail, Docs, and Calendar, according to the company.
So the integration is available to all paid Anthropic Claude users. However, if you are on a Team or Enterprise plan, administrators must first enable access before individual users can connect their Gmail, Docs, or Calendar accounts.
So, yeah, I'm curious. Has anyone tried this? I have. I have some mixed thoughts on it. But one other new update here from Claude: they did start rolling out their new research tool, which automatically searches the web and workspace documents to answer questions.
But right now, that is only available on the super pricey Max plan, which is $100 or $200 a month, or on Team or Enterprise plans, in select countries. So when it comes to the new research tool, which is very similar to all these other deep research tools that we have from OpenAI, from Google, Perplexity, Grok, everyone else, you know, Claude is here a little late to the party. But the difference is it can also look at your workspace information when pulling these deep research reports together, or as they call it, just their new research tool. So I did test this out. I tested out the Gmail integration because
That's something that I think is unique. So OpenAI already rolled this out a couple of weeks ago to their Teams users, which I think works very well. But one thing they don't have is the ability to go through Gmail. So I was testing it out a little bit over the weekend.
I was actually in line for something for like an hour, so I was just on my phone testing this out, and it was okay, right? Like, my use case, you know, speaking of my LinkedIn DMs being full of spam, my email is probably worse, right? So a lot of times I get
companies reaching out and they want to advertise with everyday AI, or maybe they want to hire me to speak at their conference, train their employees, et cetera. But my email inbox is disastrous because I also get pitched dozens of times a week for people that come on the show and just a bunch of spam. So I'm trying to use this new feature from Claude and I tried it with
thinking. So with 3.7 Sonnet, with thinking enabled and with it disabled, and it's okay. It's not great. It did an okay job. But I don't think it's something that I would necessarily be like, okay, this is a game-changing feature, or even a reason to stay on a paid plan from Anthropic, right? Funny enough, I did a show last week, I believe it was on Tuesday, where I'm like, y'all, Anthropic's in trouble unless they come out with some meaningful updates.
Coincidentally, these updates dropped hours later after that very live stream, right? Funny enough. I don't know if this is enough, right? I am going to give this a little more of a look right now from Anthropic, but I don't know. First impressions, it didn't do a good job of going through my email, right?
like, at least judging by the questions that I asked, right? I'm like, hey, go through and find people that have reached out about sponsorship, or hiring me to train their teams or to speak at their events. It really looks like it only went through the first couple of pages, even when I encouraged it to go deeper, or if I said, okay, pick up where you left off. So I did a lot of tinkering around, and it still looked like this feature was only able to go through the first pages of my emails, right? When it's like, I don't delete emails. I feel people are either inbox zero or inbox a trillion. That's me, I'm part of the latter. I just let the email stay in the inbox. So I have tens of thousands of emails. Literally, I think that email inbox probably has, I don't know, 50,000 emails in it, right? So it didn't do a very good job of going past maybe page five or six, because I knew, right, I'm kind of doing these needle-in-the-haystack tests. It's like, oh, I know this company reached out two months ago, I forgot to get back to them, my bad. So I was
kind of seeing if Claude could pick up on it, and it didn't do a good job. So, you know, you might be wondering, okay, well, couldn't you just go and search your email and type in the word partnership or sponsorship or advertisement? Yes, right. But that's the whole point of large language models: this kind of natural language processing, right? Because people might not always use those same keywords, right? They might use a different set of words. So that's the whole point of having a large language model that can connect
to your live data. But at least my early testing of this, not super impressed. Livestream audience, if anyone else did this, let me know if you found better results than I did.
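To make that keyword-versus-intent point concrete, here's a toy sketch (the messages are made up) showing how a literal keyword search misses emails that express the same intent in different words, which is exactly the gap natural-language search over your inbox is supposed to close:

```python
# Toy inbox: all three messages are sponsorship or speaking inquiries,
# but none of them contain the literal word "sponsorship".
emails = [
    "We'd love to sponsor an upcoming episode of your show.",
    "Interested in a paid partnership for your newsletter?",
    "Could you keynote our conference and train our team in May?",
]

# A plain substring search for the keyword finds nothing.
keyword_hits = [msg for msg in emails if "sponsorship" in msg.lower()]
print(keyword_hits)  # -> []
```

A model with actual language understanding can match all three to a query like "find people who reached out about sponsoring or hiring me," even though the exact words differ.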
Are you still running in circles trying to figure out how to actually grow your business with AI? Maybe your company has been tinkering with large language models for a year or more, but can't really get traction to find ROI on Gen AI. Hey, this is Jordan Wilson, host of this very podcast.
Companies like Adobe, Microsoft, and NVIDIA have partnered with us because they trust our expertise in educating the masses around generative AI to get ahead. And some of the most innovative companies in the country hire us to help with their AI strategy and to train hundreds of their employees on how to use Gen AI. So whether you're looking for ChatGPT training for thousands,
or just need help building your front-end AI strategy, you can partner with us too, just like some of the biggest companies in the world do. Go to youreverydayai.com slash partner to get in contact with our team, or you can just click on the partner section of our website. We'll help you stop running in those AI circles and help get your team ahead and build a straight path to ROI on Gen AI. Speaking of the big AI companies trying to compete in a new space,
OpenAI is quietly testing a social media platform that could reshape how AI and online communities interact, according to reports. So OpenAI's reported new social media network mimics X, formerly Twitter,
and centers on ChatGPT's new and extremely viral image generation feature. So CEO Sam Altman has reportedly been gathering private feedback on the social media project, hinting at serious interest in actually launching it.
So this move follows Twitter or X's successful integration of Grok AI, which competitors reportedly envy for fueling viral posts.
Meta as well has tried adding AI features to Facebook and Instagram, but faced scrutiny and mixed results. So yeah, the big social media companies have already been tinkering on how they can better integrate AI into their offering. So here we have the reverse approach.
the most popular AI company in the world when it comes to users, when it comes to monthly active users, thinking about taking the reverse approach by saying, ah, we have all the AI users. Maybe we should start rolling out a social media network.
Livestream audience, what do you think about this? I have my thoughts, but right now it is unclear if the new platform would be part of ChatGPT or a standalone app. And there's obviously no official word on whether this will become an actual product, or if it'll even get launched. But for businesses and creators, this could mean new tools for audience engagement, but also challenges around data use and content authenticity.
Here's what it comes down to: data, right? Data. So that's huge. One of the biggest, I guess I wouldn't call it features, one of the biggest advantages that companies like xAI, with their Grok in the X slash Twitter social media network, as well as Meta's Llama, with its integration into Facebook, Instagram, WhatsApp, etc., have, is that they can train their model on all of that data, right? Which can be a good or a bad thing, right? Personally, if I'm a business, I would not want to touch xAI and Grok, because, you know, from reports, from what we've heard, a large percentage of its training data is just X posts. So I think there's a good side of this and a potential bad side of this. But what OpenAI wants is more training data, right? They obviously want users more engaged, right? Because the more engaged they are, obviously the more ways that they can monetize this in the long run, aside from just a, you know, $20 or $200 a month subscription to their more premium services.
So Joe says AI plus social media, not a good idea. All right, so yeah, some people are not fans. Sandra says, how do you control misinformation?
That's a big one, right? Yeah, because like I said with xAI and Grok, you know, I can say this: studies have shown that X is by far the social media platform in the U.S. with the most misinformation and disinformation. So yeah, you hope these companies that are using this live, real-time data for their AI models, right,
can decipher between the real information and the misinformation and disinformation. But the reality is, it's probably hard to keep up, and it's difficult to do that.
I don't know. I'm not a big social media person myself. Obviously, this live stream goes out to LinkedIn. I'll look at Twitter for AI news, but everything else, I'm not following people on social media. I'm not posting things, right? At least about my personal life. So I guess I use social media a little bit professionally for everyday AI, but that's about it. But I would assume,
that OpenAI has bigger plans, if they do release a social media network, for it to be much more than just sharing your latest AI image generation with their impressive new 4o image generation tool. All right, our next piece of AI news: GPT-4.1. Yeah, we have a new model, but it's not available for everyone.
So OpenAI has launched GPT-4.1, touting a 1 million token context window and dramatic API price cuts. This new model, technically a much
smaller version, is not yet available on the front end, and it may never be. So if you go to chatgpt.com wanting to use this new upgraded GPT-4.1, you will not find it, because right now it is a developer-only model. API pricing for GPT-4.1 is now much lower than competitors, with input at
$2 per million tokens, output at $8 per million tokens, plus a 75% caching discount that rewards prompt reuse.
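To put those rates in perspective, here's a rough back-of-the-envelope cost sketch. The token counts in the example are made up purely for illustration; the rates and the 75% cached-input discount are the ones mentioned above:

```python
# Rough cost estimate at GPT-4.1's announced API rates:
# $2 per 1M input tokens, $8 per 1M output tokens,
# with a 75% discount on cached (reused) input tokens.
INPUT_RATE = 2.00 / 1_000_000   # dollars per input token
OUTPUT_RATE = 8.00 / 1_000_000  # dollars per output token
CACHE_DISCOUNT = 0.75           # cached input billed at 25% of the rate

def request_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the dollar cost of one API request."""
    fresh = input_tokens - cached_tokens
    cost = fresh * INPUT_RATE
    cost += cached_tokens * INPUT_RATE * (1 - CACHE_DISCOUNT)
    cost += output_tokens * OUTPUT_RATE
    return cost

# Example: 10,000 input tokens (8,000 of them cached), 2,000 output tokens.
print(round(request_cost(10_000, 2_000, cached_tokens=8_000), 6))  # → 0.024
```

So a fairly large request like that comes out to about two and a half cents, which is why the prompt-reuse discount matters so much for apps that resend the same long system prompt over and over.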
So pretty impressive, especially given that this new GPT-4.1 model beats Anthropic's Claude 3.7 Sonnet in both coding benchmarks and real-world GitHub code reviews, making it a strong contender for coding applications. And actually, the GPT-4.1 mini model
has gotten rave reviews for both its power on benchmarks and its price as well.
So competitors like Anthropic and Google are now facing increased pressure, as their pricing is higher, and Gemini's complex tiers and lack of billing safeguards have drawn criticism. But obviously, Google and Gemini responded to this days later with Gemini 2.5 Flash.
So again, Anthropic's Claude has kind of been the coding sweetheart over the years, but gosh, I would not want to be Anthropic's Claude right now with these new updates. GPT-4.1, a developer-only model, is extremely adept on the software development and coding side,
as well as being extremely affordable. And then, like we said, Gemini 2.5 Flash is a hybrid model. So pretty interesting. But at least for right now, GPT-4.1 will not be available inside of ChatGPT; it is not a front-end model. Also, OpenAI did announce that they will be getting rid of the GPT-4.5 model
on the backend. So they're not going to be supporting it, I believe, past the summer on the API side. They did not announce if it will be going away from chatgpt.com. I'm assuming it might; still looking for a little clarity from OpenAI on that one.
All right, our last piece of AI news, and it's probably the biggest. OpenAI has launched new models: a full version of o3, an extremely impressive thinking model, and o4-mini.
So OpenAI has released its most advanced AI models yet in o3 full and o4-mini, giving users faster, smarter, and more flexible tools for tackling complex questions.
These new models can search the web, analyze images and files, write code, and generate charts, all in one conversation. Yeah, that is the biggest new feature. Previously, if you were using a model like OpenAI's o3-mini-high, which was actually my workhorse model, well,
now it's gone. But the difference is, o3 and o4-mini are reasoning models, yet they can still use all of these other tools under the hood, right? Which is impressive, because previously they could not do all of that.
As an example, o3 full is probably the most capable model, which I know is super confusing, because if you are on a pro plan like myself, you now have three different generations of these o models, right? These models use this kind of chain of thought,
or reasoning, under the hood; they can think and plan ahead and adapt, and sometimes they'll think for three, five, ten, fifteen minutes before providing you a response. But right now, if you're on a pro plan, you have o1 pro, which is a model I still use for a ton of things. But now you also have this new o3
full, which is different, because previously we had o3-mini and o3-mini-high. So now we have o3 full, and then we have o4-mini and o4-mini-high. So we don't have a full version of o4; we have a full version of o3. And y'all, using the full version of o3, it feels criminal.
It is so, so good. And I'm still exploring this myself; it just came out a couple of days ago, so I'm still trying to find the time to fully investigate it. I've probably only spent about three hours so far using these new o3 and o4-mini models, which is not a lot for me. Normally I'm at eight hours the minute something comes out. I haven't had a ton of time yet, but o3
is so good, it feels criminal to use it. And I think one of the reasons why is you get access to all of the tools. So not only do you have this model that thinks step by step, can reason, and can plan ahead, but in doing so, it can also use multiple tools, and it can go back and forth, right?
As an example, it can use Canvas, it can use ChatGPT Search, it can use Python. That right there, the combination of all of these different tools, is pretty amazing, right? So one way I tested it: I had a screenshot, and I did this all on my phone, a screenshot of some of the top AI tools. There were probably like 30 or 40 of them.
So on my phone, because again, I was in a line for something for an hour this weekend, I just uploaded that screenshot to o3 and said, hey, give me pricing for all of these tools. Because I knew them all, but I didn't know pricing for all of them; there are like 30 or 40 of them, and I was subscribed to most of them, but not all of them.
Not only could this o3 model use computer vision and see them all, it went and used the web, and I could see it as it went. It was using the web and using Python interchangeably, right? Because what it was ultimately doing was putting together
a graphic for me and a table of all of these different AI tools. It was sorting them and categorizing them, and it was going and doing research, going back and forth multiple times. So I just kind of watched it in awe, and I'm like, this changes things completely, because essentially it can chain together multiple commands that use these tools, right? And one of the biggest things that
separates a general large language model from AI workflows and from agentic AI is the ability for agentic tool use. So what that means is the large language model or AI workflow can decide on its own: hey,
I need to go query the web for this; hey, I need to use computer vision for this; hey, I need to put this in a table; hey, I need to run some Python code for this. And it can make that choice on its own; the user does not have to tell it to. So to be able to have a model as powerful as o3 string those things together, y'all,
I was just jaw-on-the-floor the first couple of times I used it. A couple more stats on this: OpenAI says o3 is now its top-performing model for reasoning, coding, math, science, and understanding visuals,
making 20% fewer major mistakes than the previous version. That said, the hallucination rate is still pretty high on this one, so you do always have to start any chat with more data and keep your expertise in the loop. However, it does build on the previous generations of these thinking models.
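To make that "agentic tool use" idea from a moment ago concrete, here's a toy sketch of the kind of loop such a model runs: decide on a tool, call it, look at the result, and repeat until it's ready to answer. Everything here is made up for illustration; the tool stubs and the `pick_action` lookup stand in for decisions the real model makes through its trained reasoning, not OpenAI's actual implementation:

```python
# A toy agentic tool-use loop: the "model" decides which tool to call
# at each step, observes the result, and keeps going until it's done.

def search_web(query):
    # Stand-in for ChatGPT Search.
    return f"results for {query!r}"

def run_python(code):
    # Stand-in for the built-in Python tool.
    return eval(code)

TOOLS = {"search_web": search_web, "run_python": run_python}

def pick_action(task, history):
    """Stub for the model's own decision: which tool next, or finish?"""
    if not history:
        return ("search_web", task)     # step 1: go research the task
    if len(history) == 1:
        return ("run_python", "2 + 2")  # step 2: compute something with code
    return None                         # done: time to write the final answer

def agent(task):
    history = []
    while (action := pick_action(task, history)) is not None:
        tool, arg = action
        history.append((tool, TOOLS[tool](arg)))  # call the tool, record result
    return history

print(agent("pricing for 30 AI tools"))
```

The point of the sketch is that the user never names a tool; the loop itself chooses search, then code, then stops, which is exactly what made watching o3 bounce between the web and Python feel so different from a normal chat.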
A little bit on the o4-mini model: it's designed for speed and cost effectiveness, hitting a 99.5% pass rate on a major math competition when given code access. So both o3 and o4-mini can figure out when and how to use different tools to answer tough
multi-step questions, adapting as they go. And for the first time, users can upload images, like photos of whiteboards or diagrams, and the models can think with those visuals to help solve problems. So yeah, that is kind of my example: I had a screenshot that I could kick off that whole flow with, which was
a little mind-boggling, if I'm being honest. So OpenAI has reportedly rebuilt its safety features for these models to better handle sensitive topics and reduce risks. The new models are available right now for ChatGPT Plus and business users, with o4-mini also open to free users if you click the Think
option. So if you are on the $20-a-month Plus account, there are limits on these new models. If you are on the pro plan, I believe it is essentially unlimited, is what I read. So keep that in mind. But even if you're on the free plan on ChatGPT, you can click that Think option and use o4-mini, although it is very limited.
So developers and companies will benefit from smarter, faster responses that could help them save time and money on everyday tasks. Livestream audience, have any of y'all used the new o3 or o4-mini? I was personally surprised,
a little flabbergasted, right? I use AI way more than the average person, even way more than the average power user, and I was like, wow, this can really change workflows, right? It's also starting to bridge the gap between traditional large language model use and AI workflows, which is extremely important to keep in mind as well.
Yeah, Michael here from YouTube says, can't believe how much is happening every week. Still, I don't think it will ever stop.
Yes. So Sandra is asking, what are the best applications for it? Great question, Sandra. I might do a dedicated show on this new o3 model if you all would like. I think the best application for it is an example of kind of what I did, right? When you have to work between multiple modalities, you need research, and you also need maybe an output that requires more than text.
So a simple example: starting with a whiteboard. Maybe your team is just ideating or brainstorming things for an upcoming product launch. Take a picture of that whiteboard, combine it with some of your data, and then the new o3 model can essentially go agentically research the web; it can go use
Python and other built-in tools. And when you're done with it, you can use the Canvas mode to kind of work iteratively with o3. So yeah, I think the best applications for it are when you need to do research, maybe when you're starting with an image
as an input, as well as if you need a little bit of code, if you need ChatGPT to essentially go through and categorize something, organize, or do a bunch of research and then categorize that research as well.
So a ton of potential use cases for this new model. All right, that's a wrap, y'all. Let me know in the comments, live stream audience. I know Mondays are always long shows, but let me know what you want to hear more of. It should be a very interesting week in AI, but let's do the very, very quick recap. Here's the AI news that matters for the week of April 21st.
So first we started with OpenAI pursuing a $3 billion acquisition of Windsurf, which, hey, I didn't even mention this, but that is probably, or might have been, the reason why, when we saw the announcement of GPT-4.1,
Windsurf users got free usage. So there's a little nugget. All right, our next piece of AI news: Google launched Gemini 2.5 Flash, an extremely capable and impressive model. They also unveiled Veo 2, their AI video tool for Gemini Advanced subscribers. The Trump administration is reportedly weighing a ban on DeepSeek, which I personally think is a good idea. Microsoft has launched
a computer-using agent inside of Copilot Studio. Google is giving away a year-plus of Gemini Advanced to all U.S. college students with a valid .edu email address. Anthropic released a Google Workspace integration, as well as a new research tool that everyone else already has, but
it's only available on its higher-priced plans. OpenAI is reportedly eyeing a social media platform, potentially competing in a different way with X/Twitter and Meta. OpenAI released an
API-only model in GPT-4.1, with a 1 million token context window and cheaper pricing. And then, last but not least, OpenAI also released o3, o4-mini, and o4-mini-high, some extremely capable reasoning models with agentic tool use. My gosh, it was a lot. What do you guys want to hear more of this week?
I might have an open slot or two on the show this week. So if you want a dedicated show on any of these, let me know. I'll probably put a poll out in the newsletter as well. So if you haven't already, please make sure you go to youreverydayai.com, sign up for
for that free daily newsletter. Also, if this was helpful, y'all, please, I'd super appreciate this. Number one, if you're listening on the podcast, please subscribe, follow the podcast, leave us a rating if you could. I'd really appreciate that. As well as if you're listening on social media,
Don't keep Everyday AI your little secret. That's rude. Share it with the world. Share it with your coworkers. Share it with your neighbors, best friends, mothers, babysitters, dog walkers, right? Everyone needs to learn AI, and it is extremely hard to keep up. That's why we do this AI News That Matters almost every single Monday: to cut through the marketing, cut through the BS, cut through the noise, and tell you what
really matters. I hope this was helpful. Thank you for tuning in. Hope to see you back tomorrow and every day for more Everyday AI. Thanks, y'all. And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating; it helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up for our daily newsletter so you don't get left behind. Go break some barriers, and we'll see you next time.