This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Google has come a long way in a very short period, which seems weird saying that about one of the biggest companies in the world. But when it comes to the AI race,
Let's be honest, about 15 months ago, I don't even think Google was in the top three, right? When you look at Microsoft and OpenAI and Anthropic, I think about 15 months ago, Google was actually in fourth place. But now, without a doubt, Google is the absolute best.
leader in the generative AI and large language model landscape. And what they just announced at their I/O conference is nutty. And I think if nothing else, it really just cements Google's place, at least right now, as the leader of the pack. We'll see how and when everyone else responds. But at least for right now, Google
is just cooking when it comes to AI. And they released dozens of notable AI updates. And on today's show, well, on today and tomorrow's show, we're going to be breaking down what I think are the top 15 most useful. So yeah, we're going to have a part one, which is today, and a part two, which is tomorrow. But we're going to be going over the top
15 most useful AI updates out of the Google I/O conference for everyday business leaders such as yourself. All right.
I'm excited to dive in. I hope you are too. If you're new here, what's going on, y'all? My name is Jordan Wilson. I'm the host of Everyday AI. And this thing, it's for you. This is your daily live stream, podcast, and free daily newsletter, helping us not just keep up with AI, which is very hard, but actually use it to grow our careers and our companies. So, let's get started.
Is that you? Did that hit home? If so, well, you're in the right place. This is your home. It starts here on the unedited, unscripted live stream and podcast. This is where you learn, but where you're actually going to leverage this and put it to use is on our website at youreverydayai.com. Because once you're there, you can sign up for our free daily newsletter. We're going to be recapping today's show, but we also keep you up to date with everything else happening in the world of AI. And yeah, even though Google is sweeping the headlines, there's still a lot more happening.
And then also on our website, you can go and listen, for free and sorted by category, to more than 500 past episodes. Whatever you're trying to learn, we've already spoken to the experts. It's all there already. All right. So,
normally we start out each live stream with the daily news, but let's be honest, Google is the AI news today. All right, so I'm excited for today's show. What's up, live stream fam? It's good to see you. Yeah, if you listen on the podcast, maybe sometime drop by at 7:30 a.m. Central Standard Time. You know, when we have guests on, what other place can you go and ask questions live to the smartest people in the world on AI? Today, it's just me, sorry. But
what's up, livestream fam? So Christian joined in on YouTube. Good to see you. Brian and Michelle, Dr. Harvey Castro, Big Bogey, everyone else. Daddy, good to see everyone. Let's just not tease you anymore. Here's at least the first half of our top 15 AI updates from the Google I/O conference for everyday business leaders such as yourself. Here we go with 15 through 8.
Number 15, Imagen 4. 14, Chrome with Gemini integration. 13, personalization in email. 12, NotebookLM updates. I can't believe that didn't make the top 10. 11, Gemini Diffusion, a whole new type of large language model. 10, real-time translation in Google Meet. Nine, Gemini app updates. And eight, Gemma 3n. Yeah, that's a lot. And y'all, I didn't miss anything. We still have our number seven through one. But here's something that didn't even make the list, right? And if you've been following the AI news over the past, I don't know, 12 to 20 hours, you know.
These are big advancements that didn't even make our top 15 list. All right: Gemini Code Assist, SynthID Detector, Lyria 2, the virtual try-on in shopping, Google Beam, which is
enormous news even by itself, formerly called Project Starline. Jules, the new autonomous coding agent. The A2A agent-to-agent enhancements. So yeah, when I say that there were dozens, I literally had to scratch my head and look at my list of like 50 and say, what are the top 15? Right? So,
very hard to do. Very hard to do. All right. So there's probably some big things you're like, wait, where are some of these big ones? Well, those are tomorrow, right? You'll notice I didn't even say the words Gemini 2.5 there. A lot of updates there. Or Veo 3. Yes, Veo 3, which is shocking. All right. So we're going to be going over those and a lot more tomorrow. All right. But let's stick on our top 15 for today. Hopefully a
concise show, or a more concise show than normal for you all, instead of doing an hour and a half show or something like that. We'll try to keep this one short. All right, first: Imagen 4. So this is Google's updated text-to-image model, Imagen 4.
It's really good. For our live stream audience, you can probably see it. If you're listening to the podcast, nothing overly visual or overly instructive today, but maybe you want to check out what's on the screen. You can always do that by checking out your show notes and going to our website to watch the video. But look at this image.
This looks beyond real, right? So this is a young girl here, a young woman, looks like in a dorm room, maybe with pink hair and earrings and kind of a grungy t-shirt with light filtering in through the window. It looks like an amazing photo that was captured with a high-end DSLR. This does not look AI-generated in the least bit. Let's just start there. It is somewhat...
I don't talk too much about my background here. I just realized, y'all, I just realized I don't even have my mic plugged in. This is how much work I was doing and maybe how sleep deprived I am. So live stream audience, give me a second. Let me know if you can hear me now.
Hopefully you can. Can I get a thumbs up from the live stream audience? I didn't have my mic plugged in, but it must have been picking up somewhere else on my computer. All right. So thanks to my computer for still delivering some type of audio, even though my mic wasn't plugged in. All right. Hopefully, hopefully y'all can hear me. All right. Let's keep it going. So this is good. So, Imagen 4, let's talk a little bit about what's new.
Thank you. Thank you, Marie and Laura, for letting me know. You can hear me. Appreciate that. Okay. So.
Here's what's new in Imagen 4, and what it is if you haven't heard of it. So maybe you've heard of Midjourney. There's the new, kind of viral GPT-4o image gen from OpenAI. There's a lot of these AI photo generators: Stable Diffusion, Flux. There's a good five to 10 pretty good ones.
I'm going to be interested to see where Imagen 4 lands on the benchmarks. So in the same way we talk about LMArena, which is kind of a blind taste test for large language models, they have that for image and video models as well. So I'll be interested to see where Imagen 4 lands on the list. But
from early eye tests, and as someone who was a photographer in my earlier life, I've probably taken more than a million, yes, more than a million photos with a DSLR. So I would say my eye is a little more trained than the average eye when it comes to looking at things like photorealism, or even being able to decipher what's real and what's not. And I will tell you, Imagen 4 images
are otherworldly good. In the same way, Midjourney V7, very good. But geez, these Imagen 4 photos, so good, so good. All right, a little bit about what Imagen 4 is, what's new, when it's rolling out, all that good stuff. So this is Google's latest and most capable image generation model, with improved detail and text rendering within images. That's a big thing. Midjourney can't render text, and they kind of said, yeah, we don't really care about
that. This is good, the ability to render text. Yes, GPT-4o image gen does great at rendering text for whatever reason you may want, right? So maybe you want this person's t-shirt to say, you know, University of Illinois or something like that, or Chicago, right? Some AI image generators struggle with that. Imagen 4 so far does a really good job, like GPT-4o image gen does. But in terms of photorealism and
quality, Imagen 4 is pretty good. And by pretty good, I mean it might be the best out there. Time will tell. So it's rolling out now in the Gemini app. Also, and this is pretty interesting, it's coming to all of Google's different products: Google Docs,
Slides, and other Workspace apps. So yeah, I don't really use Google Slides, but now I'm like, okay, there might be some use cases where I might want to, or maybe might need to in some instances, right? So this is going to be included in the Google AI Pro and Ultra subscriptions. We're going to be talking a little bit more about
that tomorrow. But for these things to make sense, you have to know that previously Google had a couple of tiers, right? There was a free tier, and then there was Gemini Advanced. And in typical Google fashion, they're confusing us all. So now there's still, obviously, a free tier. The new $20 a month plan is called Gemini... or sorry, Google AI Pro. I'm already getting confused. Google AI Pro is the base $20 a month plan. And now you have Ultra, which is ultra expensive, at $250 a month, technically $249.99. And I think for the first three months it's like half off, but, you know, the base plan is going to be $250. So this is already rolling out to those people who have either of those two subscriptions.
So like I said, some of the standout features here: significantly better text rendering in images, enhanced photorealism, improved handling of complex prompts, so prompt adherence. There's inpainting and outpainting capabilities. So if you want to change something inside the photo, you can do that very easily. If you want to extend a photo, right, so whether it's a photo you start with or a photo that you create inside Imagen 4, you can outpaint or extend it
to bring in more of the scene that was never actually captured originally. And also, this supports a range of aspect ratios up to 2K resolution. And there's a coming soon for this: a faster version. I don't know if they're going to call it Turbo, but apparently it's going to get 10 times faster fairly soon.
What the heck could you use this for to grow your business? Well, first, get rid of those ugly stock photos on your website. They look horrible, right? Also, if you're creating any videos for social media, anything like that, start with an Imagen 4 image, right? Yes, start with images. If you're doing AI video, it turns out better when you start from an image. But there's no shortage of ways that companies can just use visuals. Chances are everything you're using, whether it's for internal or external purposes, is either extremely old, extremely boring, or a combination of both. All right. Number 14:
Chrome with Gemini integration. All right. So what this is, well, the Chrome browser is finally going to get a little smarter. All right. So I can't pretend that this is some groundbreaking new feature. It's more of a, oh, about time, because let's call a spade a spade here, card players.
Microsoft and their Edge browser, which is actually freaking fantastic. It's based on Chromium, right? So all your Chrome extensions, everything like that, will sync over. Microsoft Edge has had this for like a year. Not all the capabilities, but they've had a built-in Copilot for more than a year. And that's why I use Edge a ton. But
it's about time we're getting Chrome with Gemini integration, more than just being able to summarize web pages and things like that; it helps you with web browser tasks too. So for this, you're also going to have to be on a paid plan. It can summarize web pages, help you explain complex information, and answer questions about page content. And eventually, and here's the eventually, and maybe why it's for paid subscribers and not available for everyone for free: eventually it will be able to help you navigate websites autonomously, which is pretty big. That's been a big shift over even the last month or two. A lot of companies: the Dia browser from The Browser Company, Perplexity coming out with its Comet browser,
even Microsoft Edge with their built-in vision feature that can see web pages. So the ability for browsers to perform tasks by default is not some future sci-fi. It's already available, and it's been wildly popular the past three months. So this Chrome with Gemini integration will eventually be able to do that, at least Google says so.
What are some business use cases for this? Well, pretty straightforward. Number one, it's gonna help you summarize web content faster, right? Which, if you haven't been doing that already, you could just do in Microsoft Edge. I told you about it, I don't know, a year and a half ago, and I'm like, start doing this. So nothing super new there. But obviously the ability for Chrome to perform actions on your behalf without having to launch a separate agent, pretty big in terms of time savings, winning back time, all that good stuff.
McDonald said this is very impressive. All right, oh, talking about Imagen 4. So he says: art director for 20 years, and things like Imagen 4, very impressive. Yeah, I agree. Like I said, I've taken more than a million photos with a DSLR, getting paid to do so. And it's really, really good.
Are you still running in circles trying to figure out how to actually grow your business with AI? Maybe your company has been tinkering with large language models for a year or more, but can't really get traction to find ROI on Gen AI. Hey, this is Jordan Wilson, host of this very podcast.
Companies like Adobe, Microsoft, and NVIDIA have partnered with us because they trust our expertise in educating the masses around generative AI to get ahead. And some of the most innovative companies in the country hire us to help with their AI strategy and to train hundreds of their employees on how to use Gen AI. So whether you're looking for ChatGPT training for thousands,
or just need help building your front-end AI strategy, you can partner with us too, just like some of the biggest companies in the world do. Go to youreverydayai.com slash partner to get in contact with our team, or you can just click on the partner section of our website. We'll help you stop running in those AI circles and help get your team ahead and build a straight path to ROI on Gen AI. All right, that's number 14. Let's go to number 13, personalization in email.
So this, not what's on my screen, but this personalization in email was one of the things that Google CEO Sundar Pichai actually talked about during his keynote, which I found interesting, because when there are literally dozens of updates
that are huge, personalization in email at first, I'm like, okay, this is no big deal. But when you look at some of the marketing materials... again, there's obviously a huge gap between what's being marketed, what's being promised, and what actually happens, right? And Google is getting much better, although their original track record on this a year and a half ago, two years ago: no bueno.
Now they're just shipping, right? So I actually do have a high degree of confidence that a lot of these things are going to be shipped on time. But the personalization in email is something, again, Sundar Pichai mentioned in his keynote address with his limited time on stage.
So for our live stream audience, you kind of see an example here. So there's kind of this blue area that's shaded, a green area that's shaded, and then a yellow area that's shaded. And it's showing you how Google and Gemini are going to be able to use personalization
based on your context, right? So it's not just those auto replies, right? Which have been in Gmail for a long time. And I don't really use them because I don't think they're good. This, when and if it gets released, will be actually really good. So as an example, the things in blue: it's basing part of an email reply on
your own writing style. So it goes and it sees how you respond to emails: the type of words that you use, the format, is it long, is it short, et cetera, right? So it bases it, number one, on your writing style. Number two, it pulls in context from your past emails, which is obviously important, right? We want AI to be smarter.
And then also, the yellow portion there, for our live stream audience, is based on files in Google Drive. That's the part where I'm like, holy frick, this is really good. So in this example, right, someone's asking about a package or a service this company offers. And it says: our pampering packages range from $90 to $230, depending on your dog's size and the specific services you were looking for. So
it's pulling in that information based on a Google Drive file, according to what Google released here. So that right there: extremely impressive. Personalizing emails based on your writing style, based on past emails, based on files in your Google Drive. When and if this happens, I'm going to love it. I'm embarrassed to do this live, but I'm going to tell you guys the truth. All right.
I get just bombarded with emails. Somehow people find my personal emails, the emails for the podcast. Mainly, it's just a bunch of people wanting to push their sometimes-garbage AI products and services to you all. And I say no to many of them, but there are some great people that land in the email. But, you know, already today I have dozens of emails, and most of them are unread, because right now the Google Gemini abilities to reply to emails are not good. So when this happens... oh yeah, I was going to look. So I have 2,328 unread emails. I hate email. I hate it, right? I get too many emails. It takes too long to respond, because
I have to do these three things, right? Number one, I have to write it in my own style, right? I don't want people to think I'm using AI, even though I will end up using AI, right? Number two, I need to pull in context from past emails. And number three, in many instances, people are asking, hey, I want to sponsor the podcast, I want to do this and this, will you come speak at our event, right? I have all that information in different Google Drive files, but sometimes I forget it. So it takes a lot of time to go and do those three things. So this personalization piece will be huge.
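Google hasn't published how this works under the hood, but conceptually it's retrieval plus prompting across those three sources. Here's a minimal sketch of that idea in Python, with all the data stubbed out; the helper function and the example strings are mine, not anything from Google.

```python
# Conceptual sketch only -- Google hasn't published how this works internally.
# It just illustrates grounding a draft in the three sources described above:
# your writing style, past emails, and Drive files. All data here is stubbed;
# a real system would retrieve it, then hand the prompt to a language model.

STYLE_EXAMPLES = [
    "Hey! Thanks for reaching out. Short answer: yes. Longer answer below.",
    "Appreciate the note -- let's find 15 minutes next week.",
]

PAST_EMAIL_SNIPPETS = [
    "Last month you asked about sponsorship tiers; we settled on the Q3 package.",
]

DRIVE_FACTS = [
    "Pampering packages range from $90 to $230 depending on dog size and services.",
]

def build_reply_prompt(incoming_email: str) -> str:
    """Assemble a prompt that grounds the draft in all three context sources."""
    sections = [
        "Draft a reply in my voice to the email below.",
        "My writing style (examples):",
        *STYLE_EXAMPLES,
        "Relevant context from my past emails:",
        *PAST_EMAIL_SNIPPETS,
        "Facts from my Drive files (use these for specifics like pricing):",
        *DRIVE_FACTS,
        "Incoming email:",
        incoming_email,
    ]
    return "\n".join(sections)

print(build_reply_prompt("Hi! How much do your dog pampering packages cost?"))
```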
This is launching in Google Labs. So you have to sign up for Google Labs; it's a free program. That's essentially where you get beta access to certain tools and features. Right now it's saying it's launching via Google Labs in July of this year. Initially, it's going to be on the web only, so you can't use this inside different apps, and it's going to be English only at first.
So I'm excited for that. And the business use cases for that are obviously off the charts. Cecilia, I'm 100% with what Cecilia says. Cecilia says email is the bane of every professional, so anything that helps is more than welcome. Absolutely. Absolutely. And I do know, spending a little time
on Twitter last night, looking at all the new releases and everything else: Logan Kilpatrick, who I've had on the show a couple of times, he's lead of product for Google AI Studio. And he did mention that this email feature's priority is extremely high, because someone's like, yo, is this actually going to happen? And he's like, yes.
It is going to happen. So, you know, vote of confidence there from Chicago's own Logan, who's been on the show a couple of times. So yeah, I'm really looking forward to this one. Hopefully it does come out in July. Heck, Google, I'll even take 2025. Please give us a working version of this in 2025, and the business world will be crying tears of joy. Next.
Speaking of tears of joy: if you're a NotebookLM user, you're going to like these updates. It's actually crazy that this didn't make our top seven for tomorrow's show. But here's what's new in NotebookLM. And if you don't know NotebookLM, it won our 2024 AI Tool of the Year award. And it wasn't even close.
NotebookLM is an amazing piece of technology. It is powered by Gemini 2.5 now, whereas previously it wasn't. That just rolled out at Google Cloud Next about six weeks ago. So if you haven't used NotebookLM recently, you should go use it now, because it uses a hybrid thinking model. So it's even better than it was before,
but it is grounded in your data. So as an example, let's say I load it up, which I literally did for this show, with a bunch of information about Google I/O updates, and I ask it about deep-dish pizza. It's going to be like: can't respond, don't know. So it is grounded in your data. It only works with what you give it, which is huge for trust, transparency, and being able to use something with accuracy, knowing that there are
likely not going to be any hallucinations. So one of the cool things is that video is going to be coming out, which is going to be wildly fun. All right. So not a ton of updates yet, but there are kind of these multimedia features. One is the audio overview, which is essentially a deep-dive podcast. It makes a podcast between two hosts that sound very real, right? And many of you probably feel like you even know those two AI hosts, right? Because you listen to them all the time if you're like me. So you are going to be able to set the default podcast
length to either five minutes, 10 minutes, or 20 minutes. So the default is 10 minutes. If you click shorter, when you go to customize the audio overview, that's about five minutes. If you click longer, it's about 20 minutes. So that's great. I was able to already kind of do this with some simple, quote-unquote, prompt engineering, which is just instructing it over and over for a length, or giving it a more complex request when asking it to customize, to get it longer anyway. So yeah, there's going to be some simple video generation based on your files, which I'm excited to see what that looks like. And then, like I said, the ability to choose 5, 10, or 20 minutes for the audio overview. So
also, they did update this to 50 languages a couple of weeks ago. FYI, I don't think the video overviews are going to be Veo 3 quality, right? Something that you would produce and go say, okay, this is going to be our new explainer video for our business. I don't think that's what we're looking at here. What we are looking at is more of a fun
and kind of cutesy video. At least the examples that they showed were more, I would say, animated, right? Like more retro-esque graphics, which is fine, but great for explaining more complex topics, which is something I use NotebookLM for anyway. So yes, when you're talking about business use cases, this probably isn't something that you're going to export and go put on the front page of your website. But I don't know, maybe it will be, or at least something that you might put on social media. I could see that as well. So yeah, a couple of new updates there.
Also, there are obviously much higher limits for Google AI Pro and Ultra subscribers, although I think even the limits for most people on a free plan are more than enough. All right. Our next one, this one's interesting: the Gemini Diffusion model. Okay. This is pretty big. This is pretty big. So
this is not a traditional transformer-style large language model. So diffusion, how do I explain this? It's almost like a live denoising process. All right. So Gemini and most large language models are, quote-unquote, traditional autoregressive models, right? A very advanced next-token predictor, you could say, in theory, working from left to right, where a diffusion model kind of just starts with noise and then updates the whole thing at once. All right, this is a very non-technical description, right? But this is an experimental text model using diffusion techniques, which, like I said, are inspired by image generation methods. And it's designed to refine answers with exceptional speed. So
I have the example up here, and what Google's kind of going to be releasing this for initially is things that are more finite, right? Things like math and coding, because that's what diffusion models might be better at. And you might be saying, okay, why do we need a diffusion model? Well, how about for speed? Google says their early testing shows it's four to five times faster on math and coding tasks compared to comparable non-diffusion models. So this is a completely new technology, but if you do use a large language model for coding, STEM, specifically math tasks, I think it's going to be great. Right now it's in limited preview and there's a waitlist. So like I said, this is a very novel approach,
applying diffusion-based methods to language models, which really haven't been used in production before. And it's focused on solving complex reasoning problems. So this is less about creating long-form blog posts and more about working in areas that usually have more of a right or wrong answer, and less about areas where there's a ton of gray, if that makes sense. So like I said, some business use cases: if you're in anything with coding or math, and if you're already finding a ton of utility using Google Gemini or other large language models, but maybe you need more speed,
this could be it, right? So it's a completely new technology: diffusion for text-based large language models. Diffusion technology has been out there and been wildly popular for image models, right? It's essentially denoising. So if you ever watch an AI image be generated in real time, which many of us do, whether you're using GPT-4o image gen, or Imagen, or Midjourney, right? You can watch it go live. So whether it takes five seconds or a minute, you see it kind of transform: it starts with this blurry, noisy outline, like a bunch of blobs, and then slowly it comes into focus. So that's kind of what a diffusion model does, versus kind of going left to right, next-token prediction on steroids. So pretty interesting here with the Google Gemini Diffusion model.
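To make that left-to-right versus denoise-everything contrast concrete, here's a toy Python sketch. It's purely illustrative: the "model" is just a hardcoded target string, nothing like Gemini Diffusion's actual training or sampling procedure.

```python
# Toy contrast between autoregressive decoding and diffusion-style decoding.
# Purely illustrative: the "model" here is a hardcoded target string, not
# anything like Gemini Diffusion's real method.
import random

random.seed(0)
TARGET = list("the answer is 42")
ALPHABET = "abcdefghijklmnopqrstuvwxyz 0123456789"

def autoregressive() -> str:
    # Commit one token at a time, strictly left to right: one pass per token,
    # so 16 sequential passes for this 16-character string.
    out = []
    for token in TARGET:
        out.append(token)
    return "".join(out)

def diffusion(steps: int = 4) -> str:
    # Start from pure noise, then refine EVERY position on each pass:
    # only 4 passes total, which is where the speed advantage comes from.
    out = [random.choice(ALPHABET) for _ in TARGET]
    for step in range(steps):
        for i in range(len(out)):
            # Each pass, each position has a growing chance of being "denoised".
            if random.random() < (step + 1) / steps:
                out[i] = TARGET[i]
        print(f"step {step + 1}: {''.join(out)}")
    return "".join(out)

print("autoregressive:", autoregressive())
print("diffusion:     ", diffusion())
```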
All right. We have a couple more here in part one of our top 15 features. Okay. So, real-time translation in Google Meet. This is really cool. And like I said, it works.
This is technically nothing groundbreaking. Microsoft Copilot has already had this for certain users, right? So Microsoft Copilot has had a version of this for their Teams meetings, but you did have to have a certain Copilot+ PC. So you had to be able to do this locally on your device. Google is bringing this to the cloud. So what is this? Well, it's very limited right now, but very cool. It is live speech translation during video calls that works like having a human interpreter present.
So in terms of availability, initially it's only going to be available for people on the $20 a month Pro or $250 a month Ultra plan. And at least right now, it's only going to be Spanish and English.
But Google did say there are more languages coming soon. So essentially, it translates in near real time with natural voice synthesis. So if I was talking to some of my wife's family in, you know, Bolivia or Chile, we could talk to each other, right? I would speak in English, and it would use a voice that kind of sounds like mine to translate, in real time, what I'm saying into Spanish, and then translate what they're saying from Spanish to English. And at least from the demos they showed, there's not a huge delay, right? It literally sounds like a world-class human translator or interpreter, right?
Like you can't really tell any lag, right? So it's not like you say a full sentence and then, 10 seconds later, the translated version comes. It's milliseconds. It's almost instantaneous, right? So again, that's the demo. We'll see what actually happens when this rolls out, and specifically how it rolls out, because there are things I'm wondering about, and I am going to be following up with my Google contacts to get answers. So if you do have questions on this, let me know in the comments, because I will track down the answers. But one thing I'm wondering is: okay, do both users need to have a Pro plan, right? Or can just one person be on that $20 a month? Because if both people have to have a Pro plan, I think that really limits the talking you can do and having this be great. But think of what this does for business. This is absolutely nutty, right? Once this does roll out to more countries and more languages, and I do assume that Google will be trying
to update this, and I'm guessing this would be in the latter part of 2025, to the 50 languages that NotebookLM supports. That would be my guess. I don't have that on authority, but Google did say they're working on more languages, and it probably makes sense to work on the 50 languages that they've already incorporated into NotebookLM, which are among the most widely spoken languages in the world. So this is huge, even if just for right now.
Think if you have business in Latin America, South America: the language barrier is gone. Yeah, you might have to, you know, if both users need to have a $20 a month plan, like, who cares, right? Imagine being able to talk to your colleagues from another country without a language barrier.
This is huge. This opens up so many new business possibilities, especially when you look beyond where we're at now, right? And like I said, this has already been out with Microsoft for many more languages, but the downside is you had to have it running on your local device. So you had to have a newer Copilot+ PC that was essentially running a language model locally on your own device. So if Google can pull this off and expand it to 50 languages,
This completely changes how you can do business, right? Maybe you've only been a domestic business for now, and maybe the language barrier is one of the biggest reasons why, right? This is huge. This is huge. All right.
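For the curious, the pattern Google described, speech in, translation, synthesized voice out, conceptually looks like the loop below. To be clear, this is my own rough sketch, not Google's implementation; stt, translate, and tts are hypothetical placeholders standing in for real speech-recognition, machine-translation, and voice-synthesis models.

```python
# My own rough sketch of a live-translation loop, NOT Google's implementation.
# stt, translate, and tts are hypothetical placeholders standing in for real
# speech-recognition, machine-translation, and voice-synthesis models.
import time

def stt(audio_chunk: bytes) -> str:
    return "hola, como estas"            # placeholder speech-to-text

def translate(text: str, src: str, dst: str) -> str:
    return "hi, how are you"             # placeholder machine translation

def tts(text: str, voice_profile: str) -> bytes:
    return text.encode()                 # placeholder synthesis in "your" voice

def live_translate(audio_chunks, src="es", dst="en", voice="sounds_like_me"):
    # The latency trick is processing small chunks as they arrive instead of
    # waiting for a full sentence -- that's what makes it feel instantaneous.
    for chunk in audio_chunks:
        start = time.perf_counter()
        heard = stt(chunk)
        said = translate(heard, src, dst)
        _audio_out = tts(said, voice)
        elapsed_ms = (time.perf_counter() - start) * 1000
        print(f"{heard!r} -> {said!r} ({elapsed_ms:.2f} ms)")

live_translate([b"fake-audio-chunk"])
```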
One or two more here as we wrap up. So, number nine: Gemini app updates. A ton here. So there have been a lot of enhancements to both the Gemini mobile app and the Gemini web app. In the coming weeks, we'll probably have a lot of dedicated episodes covering this. And we're going to be covering this a little bit more tomorrow when we talk about Gemini Live.
So a lot of new updates there, but some of the Gemini app updates are rolling out now to both iOS and Android users. You get a lot of the core features free, with some of the more premium capabilities for people on those subscription plans. So some of the ones that I think are worth noting, specifically within Gemini Deep Research, right? Anyone out there using Deep Research every single day like I am, I'm excited for this: you can now start Deep Research by uploading PDFs or images, which is huge in terms of personalizing your deep research. And like I said, I think a month ago,
OpenAI was in a league of their own with their deep research. But now I think Google Gemini is probably slightly ahead, because they did change how their deep research worked when they upgraded it to their Gemini 2.5 model. So it uses more thinking and reasoning and planning, right? But if you don't know anything about deep research: essentially, you give it a query, right?
And it'll go off and spend anywhere from two to 20 minutes researching anywhere from a dozen to hundreds of websites. But now what makes it better inside Google Gemini with 2.5 is it uses this thinking model. It plans it step by step. And a lot of times, it will make a turn, right? It'll start going down one path, and then in its research it finds out, oh, I was wrong about that, so I should probably not go look at another 100 web pages if I found out I was wrong about my original plan. So then it'll deviate and pivot, right? Which is what OpenAI's version of deep research has always done. But now Google Gemini's version does that as well.
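If you're wondering what that plan-then-pivot loop looks like in code terms, here's a very rough sketch. It's conceptual only, not Google's agent; ask_model and search are hypothetical stand-ins for an LLM call and a web-search tool.

```python
# Conceptual sketch of the plan -> research -> revise loop described above.
# This is NOT Google's agent: ask_model and search are hypothetical stand-ins
# for an LLM call and a web-search tool.

def ask_model(prompt: str) -> str:
    return f"(model output for: {prompt[:40]}...)"   # placeholder LLM call

def search(query: str) -> list[str]:
    return [f"source found for '{query}'"]           # placeholder web search

def deep_research(question: str, max_rounds: int = 3) -> list[str]:
    plan = ask_model(f"Write a step-by-step research plan for: {question}")
    findings: list[str] = []
    for round_num in range(1, max_rounds + 1):
        findings += search(f"{question} (round {round_num})")
        # The key 2.5-era behavior: re-check the plan against what was found
        # and pivot, instead of blindly reading another 100 pages.
        plan = ask_model(f"Given findings {findings}, revise this plan: {plan}")
    return findings

print(deep_research("What did Google announce at I/O?"))
```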
The new thing here, at least with Deep Research, is being able to start by uploading a PDF or an image, which is huge. There are also a lot of new updates for Canvas; we're probably going to have multiple shows in the very near future just looking at Gemini 2.5 Canvas and all of these new updates. You can now create infographics, interactive quizzes, and then there's everything with Gemini Live that we're going to be going over a little bit tomorrow. So,
I mean, just improved response quality through personal context, more natural voice interactions with emotion detection in the voice features. And when you talk about business use cases, I mean, there's a ton, right? This is really where I think a lot of knowledge workers should be starting their day, right? Whether it's ChatGPT, Google Gemini, Copilot, right? You should be starting so many of your tasks in a large language model. Not in the middle, not at the end, but start with idea, strategy, research, et cetera. So a lot of these app updates, they're more than quality of life. They're changing what's possible. And then, speaking of changing what's possible, and this is last on today's list, but not least: Gemma 3n. So this is Google's latest fast, efficient,
open-source multimodal model designed for on-device AI applications. So what the heck does this mean, Gemma 3n? Well, first of all, it's scary good. This is a small language model, 4 billion parameters. So what does that mean? Well, without getting too technical, a small language model, a 4 billion parameter model,
can fit on a phone, can fit on today's smartphones, right? So edge AI and small language models, I've been saying this for years: this is the future of large language models. Because what's one of the number one reasons that most enterprise companies or even individuals don't
work with large language models? Well, they're like, okay, well, data security, you know, all those things. Okay, sure, makes sense. I don't want to send my stuff to the cloud. Even though you already have everything in the cloud and it doesn't matter, it's the same thing regardless, for those that haven't made that connection and figured out that one plus one equals two. I don't know with the new math, the Common Core math, if one plus one still equals two, but one plus one still equals two here, because
when you talk about edge AI, that takes out all those data security concerns, because you're not sending any information to the cloud. You can shut off your internet and use Gemma 3n on a local device.
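Quick back-of-the-envelope math on why 4 billion parameters can fit on a phone. These numbers are my own rough arithmetic about weight storage at different quantization levels, not Google's published specs:

```python
# My own back-of-the-envelope arithmetic (not Google's published specs) on why
# a 4-billion-parameter model fits in a modern smartphone's memory.
params = 4_000_000_000

for bits, label in [(16, "fp16"), (8, "int8"), (4, "int4")]:
    gigabytes = params * bits / 8 / 1e9   # bits -> bytes -> GB of raw weights
    print(f"{label}: ~{gigabytes:.0f} GB of weights")

# fp16: ~8 GB, int8: ~4 GB, int4: ~2 GB -- a 4-bit quantized copy sits
# comfortably in phone RAM alongside the OS.
```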
Right. And the performance is absolutely nutty. Okay, Claude 3.7 Sonnet is one of the world's most powerful proprietary models. Obviously, you have to use it in the cloud, right? Because it is enormous. We don't know how big it is, but chances are it's a couple trillion parameters, or at the very least hundreds of billions of parameters, which just means size, right? Think of it like gigabytes of storage or something like that.
Gemma 3n is a fraction, I would say less than 5%, of the size of Claude 3.7 Sonnet. Yet on Chatbot Arena Elo scores, so side-by-side blind comparisons, it is essentially the same, right? There's only a four-point difference. So that means humans basically can't tell the difference.
And, you know, I'm not a huge Claude fan, FYI, but Claude's latest model, although there are rumors that they might be releasing a Claude 4 Sonnet or Claude 4 Opus any day now, but at least against their most powerful proprietary model, this itty-bitty model that you can download, that you can fork, that you can do whatever you want with, is just as powerful. Just as powerful.
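Quick aside on what a four-point Elo gap actually means. Elo's expected-score formula converts a rating difference into a win probability, and the ratings below are made-up illustrative numbers; only the difference between them matters:

```python
# What does a four-point Elo gap actually mean? Elo's expected-score formula
# converts a rating difference into a win probability. The ratings below are
# made-up illustrative numbers -- only the difference between them matters.
def win_probability(rating_a: float, rating_b: float) -> float:
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

p = win_probability(1304, 1300)  # the higher-rated model, 4 points ahead
print(f"{p:.3f}")                # ~0.506 -- essentially a coin flip
```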
So for availability: a preview of this is available now via Google AI Studio and also Google AI Edge. It's free for developers. You can download it, fork it, fine-tune it with your company's data, et cetera. It's engineered to run smoothly on phones, laptops, and tablets with minimal resource requirements. And it is multimodal as well: it can handle audio, text, image, and video inputs seamlessly.
That is amazing. So it's optimized for resource-constrained environments while maintaining strong capabilities across modalities. Also, speed: it's fast, right? You're not having to send something to the cloud and wait for the inference to happen there. It's happening on device, so it's faster, and it's more secure. And I've been saying this for a long time, ever since we saw the first version of Gemma 3 a couple of months ago. I said, don't sleep on Gemma 3. It is wildly powerful, y'all.
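If you want to kick the tires yourself, here's a minimal local-inference sketch using the Hugging Face transformers library. One assumption to flag: the checkpoint name below is what I'd expect the instruction-tuned Gemma 3n preview to be published as on the Hub, so verify it before running; everything else is the standard pipeline API.

```python
# Minimal local-inference sketch with Hugging Face transformers. Assumption to
# flag: the checkpoint name is what I'd expect the instruction-tuned Gemma 3n
# preview to be published as on the Hub -- verify it before running. Once the
# weights are downloaded, nothing here touches the cloud.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-3n-E4B-it",  # assumed checkpoint name -- check the Hub
    device_map="auto",               # CPU, GPU, or Apple Silicon -- all local
)

out = generator(
    "In two sentences, why do on-device models help with data security?",
    max_new_tokens=80,
)
print(out[0]["generated_text"])
```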
This completely changes how we're going to work in the future, because what this signals, what this signals, is that this is going to force the other big companies, OpenAI, Anthropic, et cetera, those companies that don't have an open model yet, to go open.
Because if you have a Gemma model, right? And also, you know, there are good open-ish models from Mistral, and from Meta with their Llama models as well. But I mean, right now Gemma 3n is benching off the charts for how small it is, right? This is going to force big companies that are only doing proprietary models to offer open models, right? OpenAI CEO Sam Altman did say that they're going to be releasing something. But this is huge, because this means
that probably within, I don't know, a year or two, most new computers, well, I won't speak for Apple, since they're still operating in the 1990s when it comes to artificial intelligence, but you would have to think even Apple is going to have to catch up, most computers are going to come with a state-of-the-art-level large language model that can run everything locally. So you won't even have to worry about data security, because nothing's leaving your hard drive. Working with a model like Gemma 3n is the same thing as saving a file to your local device. So this is huge. And that's our quick recap of what's new, at least the first half.
If you want some related episodes, y'all, I've had some recent ones. So I was at Google Cloud Next a couple of weeks ago and covered what was new there with Logan Kilpatrick. Already mentioned that. So if you want to go listen to that, that's in episode 501.
Also, there's been a lot of new updates just with Gemini 2.5 Pro. And we're going to be talking about some of those even newer updates tomorrow. So if you want to go get caught up, go listen to episodes 494 and 495 as we do a two-part series on Gemini 2.5. You don't got to wait for anything. It's live. It's there. It's free on our website. Go listen to it. So a very quick rundown as we wrap things up.
our part one. Here we go. Number 15, Imagen 4. 14, Chrome with Gemini integration. 13, personalization in email. 12, NotebookLM updates, which I'm extremely excited about. 11, Gemini Diffusion, a brand-new type of large language model. 10, real-time translation in Google Meet, only English and Spanish now, but more coming soon. Nine, Gemini app updates. And eight, Gemma 3n, the world's most powerful small language model. It is insanely good. I can't wait for tomorrow. Make sure you tune in for part two. I'm telling you, some of these things that we saw: mind-boggling. I don't even know how I'm going to verbalize it with words, even though that's all I do. So thank you for tuning in. If you haven't already, please go to youreverydayai.com and sign up for the free daily newsletter. Please make sure you join us tomorrow and every day for more Everyday AI. Thanks, y'all.
And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.