EP: 482 Google’s surprise AI releases: What’s new and how it changes the LLM race. Gemini 2.0 Flash Thinking, Deep Research 2.0, Gemma 3 and more.

2025/3/14

Everyday AI Podcast – An AI and ChatGPT Podcast

People
Jordan Wilson
An experienced digital strategist and host of the Everyday AI podcast, focused on helping everyday people advance their careers with AI.
Topics
Jordan Wilson: Google recently released a series of surprising AI updates covering its Gemini models as well as other small language models and robotics, which changes the competitive landscape for large language models. Gemini 2.0 received multiple updates, including an upgrade to Flash Thinking, personalization updates, robotics integration, and Notebook LM updates, along with the release of the small language model Gemma 3. Large language models can be divided into two kinds, traditional Transformer models and reasoning models; Gemini 2.0 Flash Thinking is an updated reasoning model with improved performance and reasoning ability. New features of Gemini 2.0 Flash Thinking include file upload, improved performance, better reasoning and faster speed, plus inline image generation, and it can be used for free in Google AI Studio or on the front end by paid Gemini users. Google Deep Research 2.0 uses Gemini 2.0 Flash Thinking, takes a step-by-step reasoning approach, improves search and information synthesis, and delivers higher-quality report generation. Google Gemini's personalization mode uses a user's Google search history to improve context and responses, with future integration of Google Photos and YouTube. Google Gemini Robotics, built on Gemini 2.0, combines multimodal reasoning with physical action, enabling robots to understand natural language instructions and interact seamlessly with people and their environments. Notebook LM is now powered by Gemini 2.0 Thinking, improving accuracy and comprehension, and adds citations in notes and customizable sources for audio overviews. Gemma 3 is a lightweight, high-performance open-source small language model that can run on a wide range of devices; its ELO scores put it near the top of the leaderboard, changing the LLM race and opening new possibilities for running large language models locally.

Chapters
Google surprised the AI world with a flurry of new AI updates in March. This episode explores these unexpected releases, focusing on the Gemini models and other advancements like new small language models and robotics.
  • Google released numerous AI updates in March.
  • Updates included Gemini model improvements, new small language models, and robotics advancements.
  • The releases were unexpected and significant in the LLM race.

Shownotes Transcript


This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Out of nowhere, Google just said, hey, it's March. Let's bring some AI madness. Because out of nowhere...

Google just had a huge drop of new AI updates, both to its Gemini models and even outside of that with some new small language models and robotics. Like, where did this all come from? So today on the show, we're going to be looking at Google's surprise AI releases, what's new and how it changes the LLM race.

All right. I'm excited for this one. I hope you are too. But what's going on, y'all? My name is Jordan Wilson, and this is Everyday AI. Welcome. This thing is for you. It is your daily live stream podcast and free daily newsletter helping everyday people like you and me not just learn AI, but how we can actually leverage it to grow our companies and our careers.

If that sounds like you, welcome. This is your new home. You're in the right place. The other home that you need is, well, our website, youreverydayai.com. Because yeah, you can learn a lot from the podcast and the live streams and the great guests that we bring on, but how you actually leverage it

That's what happens in our newsletter. So make sure you go sign up at youreverydayai.com. Also on there, you can go listen for free to now like 470 some back episodes from some of the leading experts in the world and myself.

uh, on whatever you want to learn about, right? Whether it's, it's marketing, uh, communications, legal ethics, it's all on our website, sort of by category. Uh, all right. Uh, if, if you want the daily news, uh, go check that in the newsletter. Sometimes we, uh, you know, spit that out right before, uh, we start our podcast episode, but there's so many Google updates today. I'm like, I don't want this to accidentally turn into a, you know, 50 minute show. I know some of y'all are, uh,

walking on the treadmill while listening to this live. So, uh, I won't, I won't keep you too long.

All right. Livestream audience, thank you for joining us. Love to see it. What questions do you have? I'll see if I have time to tackle them at the end. And have you tried any of these new updates? And if so, what do you think about them? Maybe if you have a good take, we'll put your take in our newsletter for today. So thanks for joining us. The YouTube crew coming strong. Yeah. So this is

If you didn't know, if you mainly just listen to the podcast, this is an unedited, unscripted, the realest thing in artificial intelligence. So, yeah, we do this live. So thanks to live stream audience, Dr. Harvey Castro, Christian and...

the AI physician on YouTube. Love to see it. Michael, big bogey face. Sandra, Marie joining from LinkedIn with Douglas and Christopher and Denny and Brian. Thank you all for tuning in. Woozy, good to see you. All right, let's get into Google's surprise AI releases. Y'all, this came from nowhere. Like the amount of new AI updates that Google has released over the last like three days, it's like

their Super Bowl. So we all know back in December that OpenAI and Google kind of had this back-to-back AI release fight. It was huge. OpenAI had their 12 days of OpenAI and Google kind of out of nowhere, I think, came in and stole the show maybe. But generally,

You know when these AI updates are coming, right? The company might say something, they might hold an event, a release, right? This was out of nowhere. I was not expecting this. And I think most people, you know, unless you worked at Google, were not expecting everything that Google just updated over the past couple of days. And, you know, we've been recapping them in our newsletter. And I was like, wait, like,

This is so much, like, what is going on? All I know is, especially if you are a fan of Google's models, or if you use Google Workspace, you know, for your organization, there are some huge things here. I'm not even going to be able to cover it all because there's that much. All right.

Also, FYI, is anyone else going to be at the GTC conference at NVIDIA? I'm going to be there next week, actually starting Sunday, but I'll be there from the 17th to the 19th. The conference goes through the 20th.

So hey, if you are going to be at the NVIDIA GTC conference, make sure to holler at me. So I'm excited to be partnering with NVIDIA to bring you all a lot of exclusive insights. We actually have something fun that we're working on with the NVIDIA Inception program, right? They have thousands of some of the brightest startups in the world in the NVIDIA Inception program. So I'll have more details on that for you guys next week. But I'm extremely excited.

And if you don't know, essentially, NVIDIA powers AI across the world. Most of the biggest companies are using NVIDIA's GPUs to create their AI or create the AI that we all love and use. So there are going to be some exciting updates coming there. All right. Here's what's new.

So there is an updated version of Google's Gemini 2.0 Flash Thinking. Google's deep research got Gemini 2.0 Flash Thinking. All right. So it got upgraded and updated to the latest model that Google just released. Now we have Gemini with personalization updates, which some people might not like. I personally love it.

Uh, then we have Gemini 2.0 robotics. We have Notebook LM updates and Gemma 3, which I think is probably, of all these things, even though I'm not going to be using Gemma 3 a lot personally, aside from some testing, I think Gemma 3, which is Google's kind of small language model, might be the biggest deal out of all of these, right? Um,

And we'll get into that. I'm going to cover that at the end. But this is huge. This is huge. So I'm excited to talk about a lot of these also because Gemma is open source as well. So a lot to cover here. All right, let's get started. And hey, yeah, live stream audience, welcome.

Let me know. So Big Bogey says Gemini will be the wrapper for all your Google products. Yeah, that's a good point. Douglas is saying Gemma 3 is interesting, but I'm seeing a decent amount of reports of users having quality issues on Ollama solutions for local hosting. Yeah, I use Ollama as well. I did download one of the smaller versions of Gemma because I can't do the 27B version on my computer just yet.

Yeah. Now Michael is saying Google doesn't tease like OpenAI. They just deliver. Yeah. Obviously, Google, in the whole LLM race, they came out straight up stumbling, right? I mean, their transition from Bard to Gemini, you know, their initial Gemini release snafu, right? They came out with a marketing video essentially when they released Gemini, right?

you know, in December of 23, I believe it was. And essentially they showed all these capabilities that weren't possible, right? So Google, like, had the absolute worst rollout possible. I think they were extremely far behind until mid-2024, right around September. And then, you know, in 20, or sorry, and then in December, like I just talked about,

I think Google went from like, oh, okay, you know, they're number two, number three, right? Probably around September of 2024. Until that, I'm like, all right, they're barely top three. But now it's 1A, 1B with OpenAI and Google. And I think they're constantly flip-flopping those spots.

All right, so let's talk about what's new in Gemini 2.0 Flash Thinking. Okay, so without going into too much detail, right, if you listen to this show, you know, but there's essentially now two kinds, or two very distinct flavors of large language models, right? You have your kind of quote-unquote old school transformer GPT type models, right? And that's your normal Gemini 2.0 and Gemini 2.0 Pro.

And then you have your reasoning models, right? These are models that kind of use, you know, chain of thought. They use this step-by-step thinking. They kind of do a lot of the work that humans would do under the hood. So they use more compute, you know, more inference. They take longer.

But generally, they give you much better results. So this Gemini 2.0 Flash Thinking is the updated version of that. So a couple of things that are new. Number one, file upload, improved performance, better reasoning capabilities, improved speed. Also, you can try this for free in Google AI Studio. FYI, on Google AI Studio,

It's free. You get to use everything. It's amazing. There's no data protection. FYI, right? On the front end, if you are using Google Gemini as a paid user on the front end, there's

Great data protection, right? It's enterprise grade data protection. Your data is not being shared with Google or anywhere else, right? So if you're using Gemini on the front end, Google AI Studio is more of a sandbox, right? It's not necessarily something where you or your company would go to get work done, right?

If that makes sense. It's more of a sandbox, but I know a lot of people are using it as their main model, which I wouldn't necessarily, but you can. So you can use this new Gemini 2.0 Flash Thinking for free in Google's AI Studio, or it's available for paid users on the front end of Gemini. And paid users now get a 1 million token context window. That's amazing.

Love to see it on a reasoning model. Right. Uh, I, I haven't talked about this a lot yet, but you know, I, I, I'd say in, you know, mid 2023 to 2024, you know, all the, the rage was around, you know, RAG pipelines, you know, so this retrieval augmented generation, uh, you know, I think the long context windows are going to make RAG a little less, uh, important. Right. Uh, I, I still think, uh,

RAG definitely has its benefits in terms of accuracy, but I don't know if a year or two from now, we're going to be as concerned or talking as much about RAG pipelines just because these context windows. The fact that we have a million token context window right now, that means you can dump in hundreds of documents, hundreds of pages worth of documents for free,

or sorry, you know, for basic paid users, Google is going to be able to, Google Gemini is going to be able to remember it all in this new version of 2.0 Flash Thinking. All right. But I think one of the biggest things, and this is what's on my screen now, that people are going to be talking about is inline image generation, right? Using Google's Imagen model. And it's really good. So I just did this example. I was playing around with it a little bit last night.

So I said, and this is inside Google AI Studio, this is just a simple prompt, y'all. This isn't how I actually use models, but it's easier for screenshots, right? So I said, write a long form blog post about why tourists should visit Chicago, make it very detailed and create photos of the most historical stops, right? So this is something that

People might do, right? So you might be writing, I don't know, a blog post for your company or updating something, you know, on your website that's super old and it needs visuals. But OK, so not only. And again, I'm accessing Gemini 2.0 Flash Thinking in this

instance through Google AI Studio, and the output format, you'll see, you can set to images and text, right? So this is amazing. So then look at this, uh, this output. So it literally writes me a blog post about the top places to visit in Chicago. Then it creates AI images of those historical sites in order, right? Which is so good. It's so good, right?
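Side note for the tinkerers: if you'd rather script that same text-plus-images output instead of clicking around in AI Studio, here's a minimal sketch using the google-genai Python SDK. Treat the model ID and the response_modalities option as assumptions about how the experimental image-output model is exposed, and double-check the current Gemini API docs before relying on it.

```python
# Minimal sketch: ask Gemini for a blog post with inline images via the API.
# Assumptions: the google-genai SDK is installed, your key comes from Google AI
# Studio, and the image-capable model is exposed under an ID like
# "gemini-2.0-flash-exp" with a response_modalities config -- verify both.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

prompt = (
    "Write a long form blog post about why tourists should visit Chicago. "
    "Make it very detailed and create photos of the most historical stops."
)

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",  # assumed model ID
    contents=prompt,
    config=types.GenerateContentConfig(
        response_modalities=["TEXT", "IMAGE"],  # request interleaved text + images
    ),
)

# The response interleaves text parts and image parts; print the text and
# save each returned image to disk.
for i, part in enumerate(response.candidates[0].content.parts):
    if part.text:
        print(part.text)
    elif part.inline_data:
        with open(f"chicago_stop_{i}.png", "wb") as f:
            f.write(part.inline_data.data)
```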

This is one of those small features that when I used it, I was like, huh? Right.

Kind of at a loss of words, you know, just because this is something I used to do, right? I've been, you know, obviously I have a background in marketing and content writing and as a journalist, right? And this is stuff I had to do all the time, you know, spend a long time writing a, you know, good blog post, you know, go find photos that you can legally use, right? Sometimes this is a process that would take a day or longer, right?

Gemini 2.0 Flash Thinking just did it in a couple of seconds. And at a very, very high quality, right? You can't see it maybe too well on my screen here because I just have a screenshot, but these AI images were really good, right? And it went along with the content. So think about what that means and can mean for your business, right? I mean, number one, there's no need for you anymore to have those old, terrible stock photos. Like, sorry, you know,

you shouldn't, right? But I mean, also you shouldn't just copy and paste anything, you know, out of a large language model and slap it on your website. You know, we, uh, the internet is, uh, sloppy enough as it is, right? So you would probably want to do a little bit better job, take advantage of that context window, put in a bunch of, uh, information about your company or, you know, your old blog posts, maybe if you want it updated, uh, right. So you,

you do want to do some, some good human in the loop work before and during and after, and you probably want to iterate, but yet this would take a process that would literally take multiple hours, sometimes a day or more and get it down to, you know, maybe 10 minutes, right. And you're probably going to have something better, especially if you spend the time sharing the context and the, the information about your company to improve this. And that's huge. And,

Yes, right now this is available in Google AI Studio. Yes, Google AI Studio doesn't have any data protection, but y'all, anything you put out on the internet, they all know anyways, right? So that's why I'm saying like, this is great for something like company blog posts. If it's, you know, documents that you use internally that don't have sensitive or proprietary information, right?

I would do it, right? Yeah. Google's going to use it for training, but if it's on the internet anyways, guess what? Google and OpenAI and Microsoft and Meta and everyone else has already gobbled it up. All right. So that's, that's a big one. That's a big one. All right. Next.

And this one, even though I think Gemma 3 might be the biggest update, uh, the one I'm most excited about might be, uh, deep research. All right. Because now we have kind of this deep research 2.0, because it got, uh, Gemini 2.0 Flash Thinking, and the deep research product is completely different now. All right. So first, you know, everyone's like, oh, you know, uh,

who was first with this deep research? So technically Google was the first to release a deep research product. And then throughout, you know, January, February, March, everyone and their mama released a deep research product, right? It's all the rage, right? So obviously OpenAI's is still the best. We got deep research from Perplexity. We got deep search, which is the same thing, from Grok, which when it first came out, I'm like, this is terrible. It's actually improved a lot. The Grok deep search product

is actually pretty good now. If you listen to the show, I'm sometimes hard on Twitter slash X slash Grok. The Grok deep search is actually pretty good now. So Google's deep research was the first, even though if you go back and look at reports, OpenAI was reportedly working on this back in May,

So, you know, they were the first company to be tied to this deep research, although Google was the first to market. Their first version of deep research works completely differently. So I've only spent maybe about an hour with this so far because it's brand new, but it works differently. So the first version of deep research

essentially used cached versions of pages and it just kind of ingested it all at once in one giant step, right? Uh, which was great, right? I think there's pros and cons to that approach. Uh, so the depth and detail, uh,

wasn't always there, but it did do a great job at very quickly, um, kind of synthesizing the information that humans would go out and want to do anyways. Right. Uh, but it didn't take this kind of step-by-step thinking approach, which is what it does now. Right. Which is great because this is more of what,

you know, Grok and Perplexity and OpenAI's deep research do is they take this reasoning approach. They go step by step because what happens, right? If they're looking at some of the most authoritative sources first, right? So let's say, you know, you ask it about, you know, the latest updates, right? This is just an easy one. The latest updates from Google,

If you'd ask the old version of Google's deep research, it would just look at a cache of hundreds or maybe a thousand pages all at once, not knowing that maybe 95% of those sources might not make sense now because Google over the last three days has just released all of these new updates. So now it's going to start in kind of a step-by-step fashion. So it's first going to do some high-level research and it's going to find out, wait, oh,

Google just had a bunch of new updates. So, you know, if the user is asking about new updates from Google, we should then focus our, uh, the rest of the search just on those things. Right. So that's extremely important and a huge, uh, a huge update from Google. Right. Uh, so I haven't used this enough to test it, to see if it's going to be on the level of, uh, OpenAI's deep research. Um,

I did a whole show on that. So if you're very interested, you can go back and listen to it. Right. Perplexity's was pretty bad. Hallucinations were off the charts. The old Google deep research was pretty good. You know, had some hallucinations. Grok's, which we didn't compare at the time because it wasn't out. Grok's is actually pretty good. Yeah.

And then OpenAI's was just in a league of its own. So, uh, I'll have to see, uh, where the new Google, uh, deep research, kind of 2.0 with Flash Thinking, uh, where it goes. But, uh, it's great. And the thing I like as well is it's available, um, as a kind of like normal

model, if that makes sense. So, you know, in, in LLMs, you have like kind of models and modes and everyone works a little bit differently. Right. Uh, so previously you would have to choose, um, deep research as a dropdown in the menu. Right. So now if you're on a paid account, um, on

Gemini's front end. So you're using Gemini as a chat bot, right? So you can just now click, you can still do it as a model, but then there is also a mode. So there's a new icon where you would chat that says deep research, which I like because then you can start another conversation in a different mode or a different model. And then you can click the deep research button and work within the context of that same window. So a big kind of, you know,

So a couple of the things, let's just go over the bullet points. Number one is there is free access to deep research. So it's very limited, right?

We'll double check in the newsletter how many queries you get. But even if you have a free account, you do get some access to deep research. Like I said, it is definitely enhanced with the new Gemini 2.0 Flash Thinking. So essentially, it uses reasoning that improves the planning, the searching, and its ability to synthesize, which ultimately gives you better, faster, deeper report generation that's just...

I mean, the quality is going to be much better. The other thing, which is great, is you can see the reasoning process, right? So you can click and see where it's going. One of the biggest hacks, which I think people don't do, which some of these I talk about on the show, some of them I don't because they're my secrets, right? Maybe I'll do a show on that one day. But go always do a deep research twice. Never do it once, right? Because you should go and look at its research.

So you should go and look as an example. Oh, here's what Gemini Deep Research did when I asked it about the newest Google AI. Oh, I can see it started by searching just deep research or just Gemini 2.0. So maybe you can...

or what you should be doing is you should be reading and learning by looking at that, uh, the reasoning process. And then I always do this. I manually do this. I take notes. Uh, you know, I say my original query, I see what went wrong, what went right, what could be improved. You know, I literally look at its research process and I'm like, yo, it maybe didn't start out right. Or, you know, it got a little sidetracked, you know, halfway through, right. So much of, um,

I think so much of the improvement or what's left to be desired from large language models isn't because the models aren't good. It's because the human's instructions aren't clear enough, right? So that's why I always say take a second stab at anything deep research. But I think, you know, if you're looking for immediate ROI, right?

Deep research products, everyone should be using them, right? And Google's new one here, I think, you know, it was kind of in its own category because it was the first one and it worked a little bit differently. And it's like, oh, okay, it's great. But compared to everything else, not that good. But yeah, now it's instantly back on the map. All right, next. And this one's going to be a mixture for people, all right? It's going to be a mixture. You don't have to use it.

But Google did just release a version of Google Gemini, so a mode. So if you're a paid user, you can click the dropdown and you should see this now. The other good thing, I mean, bless up. Finally, some of these new models came to my workspace account. So you always heard me say like, oh, I have a paid account for my personal Gmail address.

Are you still running in circles trying to figure out how to actually grow your business with AI? Maybe your company has been tinkering with large language models for a year or more, but can't really get traction to find ROI on Gen AI. Hey, this is Jordan Wilson, host of this very podcast.

Companies like Adobe, Microsoft, and NVIDIA have partnered with us because they trust our expertise in educating the masses around generative AI to get ahead. And some of the most innovative companies in the country hire us to help with their AI strategy and to train hundreds of their employees on how to use Gen AI. So whether you're looking for ChatGPT training for thousands,

or just need help building your front-end AI strategy, you can partner with us too, just like some of the biggest companies in the world do. Go to youreverydayai.com slash partner to get in contact with our team, or you can just click on the partner section of our website. We'll help you stop running in those AI circles and help get your team ahead and build a straight path to ROI on Gen AI.

And then I have a paid Gemini account for my work, right? So we use Google Workspace, right? Which used to be called G Suite and they've had like 30 other names for, you know, for whatever Workspace is now, you know, in the past five years, right? But I'm finally seeing some of these new modes available in my paid Workspace account, which is great.

because I'm like, what good does all of this do, right? All these front-end Gemini updates if I can't use them with my work data, right? So now many of these updates are available inside my back end.

paid workspace account. So keep that in mind. You should probably go check for yourself, but not all of them. So as an example, the only one I don't have available in my paid workspace account is this one, the personalization. So that's only in my personal Gmail. So maybe they will be rolling this out to workspace accounts. But if you have a paid Google Gemini on your

you know, on your personal Gmail. So that's just like at gmail.com, right? You will have this, or you should have this personalization. So it probably goes without saying like what this does, but it literally uses your Google search history

to improve and to personalize your Gemini queries, right? So essentially now Gemini can connect with users' search history to improve the context and responses. And there's going to be future integration with Google Photos and YouTube. All right. So yeah, a lot of people are going to look at this and be like, oh, this is a privacy, right? A privacy issue.

I don't want Gemini, you know, having my Google search history. I don't care. Take it right. I'm one of those people. And like, I feel there's this, this is a very polarizing issue.

I don't care, right? Meta, take my data, even though I don't use Facebook or Instagram or WhatsApp, right? But I use Llama. I use meta.ai, right? Meta, take my data. Google, take my data. Microsoft, take my data. I don't care, right? I would love to see more personalized ads, right? I can't wait. Why does my Google TV...

Why does my Google TV, right? That's what I use for cable or whatever streaming. I don't know. Uh, but, or my, my YouTube TV, gosh, am I 90? Uh, what, but why does my YouTube TV not have this feature? Right. I want my YouTube TV to just show me like ads for AI or ads for, you know, I don't know. Like I love North Carolina basketball and Chicago stuff. Right. Um,

So I'm fine and I'm looking forward to trying out this new personalized version of Gemini. But the problem is, like I said, it's only available on my personal account. And for the most part, when I'm using Google, when I'm doing Google searches, I'm doing it under my workspace, my work account. So that doesn't have the feature right now. So I very rarely use my personal Gmail for any searching.

So this also does use the 2.0 flash thinking model. Also users, so there are some transparency tools. So users can view how Gemini uses their data sources, including past chats and search history. And privacy controls do allow users to disconnect, edit, or manage linked data at any time.

Uh, also some other small things in there. Uh, Google Gems was also updated and that's, uh, now available for free users, uh, which is pretty cool. I've never been a big fan of Gems. Uh, I'm going to have to go in and read. Look, one of the biggest problems was, uh, it didn't always do a great job at accurately grabbing the data out of my workspace account. Uh, so I'm going to go in and, you know, I'll see if, uh, you know, they didn't really talk a lot about the new Google Gems updates. So, uh, if you don't know, uh,

right, so you have, as an example, OpenAI has GPTs, right? Essentially you can create a small, uh, specialized version of ChatGPT, right, based on your own data, uh, and you give it kind of custom instructions. That's what Google Gems is. Uh, it took Google forever, right, uh, to, you know, they announced, uh, Google Gems and then it was like nine months before they were actually released. So I think they kind of missed the boat on that, right? Their go-to-market kind of on that one stunk.

It's improved, right? Like I said, now Google, I love what they're doing now. No shiny announcements. They just come out of nowhere. Like I said, it's March. They choose madness. They bring all these updates to us. But Gems are available for free users, which is cool. So even if you're not a paid user, you can go use Gems.

And, you know, gems are also available on workspace accounts as well. Although the data sharing, right, and being able to connect with your Gmail, your YouTube, your calendar, not always as robust or accurate in a workspace account, which I don't understand why. But on the personal account, it does a pretty good job.

All right. There's more, y'all. All right. This might not be for all of us, but I think the implications on this are pretty huge because Google did announce Gemini for the physical world or Gemini robotics based on Gemini 2.0. All right. So essentially...

Google's like, yeah, this is good. You know, Gemini 2.0, here's a version of it for robots. And this is really going to impact the physical world. All right. So Gemini Robotics integrates Gemini 2.0's multimodal reasoning, right? So text, images, audio, and video, with physical actions, which enables robots to understand natural language commands, adapt to changes in real time and interact seamlessly with humans and environments. So, yeah,

It's pretty big, right? Because this, according to my knowledge, this is the first publicly available. And when I say publicly available, it's commercially available, right? But this is the first time one of the big AI labs, right? So if you're saying like, okay, that's Microsoft, Google, OpenAI,

Anthropic. You could throw Mistral in there, maybe Cohere. As far as I know, this is the first time a company's like, yes, here is a model for robots. Generally, a lot of these are proprietary and they must be good because even, as an example, Figure, which is one of the biggest, probably top three to four names in humanoid AI robots, they dropped OpenAI's model and they are now using their own. I believe that's called Helix. So pretty big news here from Google. So

Some of the features and updates. So it enhances dexterity and manipulation. So robots, with this new Gemini 2.0 Robotics update, can now perform complex multi-step tasks requiring fine motor skills, like folding origami, packing lunchboxes, or handling delicate objects like coffee mugs, which is what's in my hand.

So maybe in the future, I could have a Google Gemini 2.0 robot as I'm doing this show. It can just pick up my coffee mug and put it in my mouth so I don't have to take a break and I can keep typing and moving my mouse. Right. A couple other things.

It has embodied reasoning. All right. So that's Gemini Robotics ER. So it's a new model. So Gemini Robotics ER adds advanced spatial understanding and coding capabilities, allowing robots to plan, detect, and interact with objects.

So very cool there. So, yeah, even though, you know, we don't go hard in the robotics paint here on the Everyday AI Show, this is going to be something that impacts all of us, right? Whether you know it or want it doesn't matter. You know, if you listen to our 2025 AI Roadmaps and Prediction Series, which I love.

So I suggest you all go back and listen to it, right? I said embodied AI. So not just humanoid, you know, humanoid AI robots, but I said embodied AI in general is going to be a huge thing in 2025. I think 2024 was too early. But I mean, here we go. Google's getting in the game.

And the biggest thing is now, you know, people might be like, oh, okay, does this mean Google's going to, you know, have all these robots? No, not necessarily. I think this is a big step from Google to compete with NVIDIA for data, right? So what I think many companies are shifting toward, right, as they try to make AI more useful is they are shifting toward world models, right?

AI is great. Large language models are great. Generative AI is great, right? But ultimately, you know, as companies are in a foot race or a robot foot race toward AGI, artificial general intelligence, ASI, which I don't necessarily want, but we're racing there anyways, artificial super intelligence, right? You know, the big labs and the big AI companies now understand we need as much data as we can with AI in the real world, right? Essentially, right now,

large language models are kind of confined to what we do as knowledge workers in front of a computer, right? How we think in front of a computer, how we create content, how we synthesize information, right? So one of the biggest next frontiers of AI in the real world is something like this. It's this data. Think of now all the data that Google is going to be able to

uh, you know, have now, uh, right. And, and reportedly they're, uh, uh, working with groups like Boston Dynamics, uh, Agility Robotics, um, which I think we might be, uh,

bringing you all an interview with Agility. I think I'm going to have to check my schedule for, uh, NVIDIA GTC, right. But now Google is going to have all of this real world data. And that even makes the AI that we use today so much more useful. Yeah. It improves creative tools, right? Like, uh,

you know, text to video, right? Like, you know, Sora and Veo 2, right? Uh, because as you get a better understanding of the actual physical world, uh, that improves things like, you know, AI video generation. It improves, obviously, you know, humanoid, uh, robots. But it also just improves, um, how applicable and useful AI and generative AI is in the real world. Because right now

Large language models, for the most part, don't understand how we interact with the physical world. So actually pretty big announcement, even if you don't care about humanoids or anything like that. All right. Art Tech here from YouTube just said, I want more robots.

Hey, Big Bogey, I think I'm with Big Bogey here. He said, I'm not sure I want my nice, clean, expensive robot washing dirty dishes. Yeah, that's a good point. But also what if it's converse, like on the flip side? What if your robot's super dirty and your dishes are super clean and expensive? I don't know.

Um, all right, a couple more things from Google. Yeah, I told y'all there was a lot. Another small one that Google kind of snuck in. I don't even know if there's a blog post for this necessarily.

So this is just from Josh Woodward. He put out a tweet on the Twitter machine. Also, side note, the Notebook LM team, they've been crushing it. They've been crushing it. They've been putting out great updates. But hey, Notebook LM got that 2.0 love as well, which I think is really going to change how we use Notebook LM. If you listen to the show, Notebook LM was one of our top

AI tools or features of 2024, and it wasn't even close. I'm a huge Notebook LM user. I use it every single day. It is, I think, one of the most underrated still. It is one of the most underrated, underutilized, least talked about, most useful AI tools out there. All right, so some new updates from Notebook LM. Well, the biggest one is now powered by a new model. It's now powered by Gemini 2.0 Thinking.

That's huge, right? Any thinking model, yes, it takes a little longer. I mean, but if you look at any benchmark, thinking models always outperform. They're always going to give you better answers with more nuance, better understanding, higher accuracy, lower hallucination levels. So the fact that Notebook LM, which is grounded in your data, right? So that means

It's not like Gemini or ChatGPT or Claude where you can just go in and start asking questions. If you go in and start asking questions of Notebook LM, it's like, yo, I don't know anything. You got to give me data. And it only works with the data that you give it. So if I upload a bunch of data about everyday AI and I say, hey, explain how to make pancakes, unless I've talked about that on the Everyday AI show,

Notebook LM is like, yo, I don't know, go figure that out yourself, I don't have that, right? So it doesn't just make things up to try to be helpful, uh, which is huge. Uh, also side note, I've gotten real heavy into my pancake making game, making them from scratch the past couple of months, and I'm like, why, why have I been making boxed pancakes for, you know, 25-ish years, right?

From scratch pancakes, gotta love it. All right, couple other things new in Notebook LM. So now there's citations inside of your notes. Let me tell you what that means. Uh, so if you're using Notebook LM, right, uh, you upload all your sources. That can be YouTube videos, it can be Google Docs, copy and paste, um, you know, certain URLs, although that doesn't always work well because a lot of them are blocked, right? And then you can chat with Notebook LM, and then you can save what's called notes

Right. But the downside, like, so in the chat window, everything is sourced. Right. So let's say I upload 500 transcripts of Everyday AI and I'm like, yo, when did I talk about, you know, Midjourney? And it's like, okay, here in episode, you know, 320, you know, Rory Flynn came in and gave these five tips for Midjourney. Right. So if I save that for a note,

And then I go back later and look at that note. The citations previously were not there. Okay. So now when you create new notes, it keeps those citations. So previously you could only click, right, and go in and it would show you the source, right? Which is huge. So, you know, unfortunately there were some downsides previously to using these notes

inside Notebook LM, but now the citations carry over. That's big. And then also in audio overviews, you can customize the sources. So that's great as well. I used to kind of do that manually. I would get one big notebook with all my sources, then I would duplicate everything, delete sources. So now it's just better. So the Deep Dive AI podcast with the two hosts, which are great,

There's the interactive feature, which is not new, but it's still fantastic. Uh, so now you have a little bit more customization, uh, with the sources for audio overviews. Uh, Douglas says, I like this from-scratch pancake recipe from Alton Brown, highly recommend. Hey, Douglas, what about the, uh, the from-scratch pancake recipe from Jordan Wilson? Uh, honestly, I just use ChatGPT, uh, so

I can't claim anything. All right. Hey, maybe episode 500 should be Jordan's secrets to box and boxed pancakes. All right. We'll see. We'll see, Angie. All right. Last but not least, I kept you guys long enough. Gemma 3, which I think...

Even though most of us may not be using Gemma 3, right? I know a lot of us are. We have some tinkerers in the house. But Gemma is Google's small language model. It's open source, which means you can download it. You can fork it. You can fine tune, right? You can do a lot of things with open source models.

But generally, because you're working with them offline, you either, one, have to have an incredibly powerful computer, or two, you're working with a very small variation of this. So Gemma 3.

Big news. All right. So it is lightweight, high performance. I'll say it's a state-of-the-art small language model now designed to run directly on devices, ranging from phones to workstations. So here's the different model sizes: 1B, 4B, 12B, 27B. Those are billions of parameters. So as an example, right, and I'll just, I'll just speak in generalities here.

All right. So let's say you have the newest smartphone. All right. I'd say the newest smartphones could usually run about a 4B model. All right. So that's billions of parameters. All right. So again, what's the, like, why does this matter? Why are you talking about this weirdo? So all the AI that we use right now, for the most part, goes to the cloud.

Right. So number one, that means it's slower. Number two, it in theory means it's less secure. Although I personally think, and maybe this is because I'm, I'm very laissez-faire with, with, with data, right? Um, when you use, if you have the paid version of ChatGPT, paid version of Gemini, paid version of Claude, paid version of Copilot, uh,

you have nothing to be concerned about when it comes to your data, because you can turn off model training. You know, it doesn't show up randomly on the internet. It doesn't go to like, I honestly don't understand why people don't understand, like why people don't choose to educate themselves.

on data privacy and protection when you're using a paid version of a large language model, right? People are like, oh, well, I would never put, yeah, yeah, we have an enterprise version of, you know, Copilot 365 or Gemini or whatever, but I would never put my data in there. It's like, oh, okay, what are you using for cloud storage?

Oh, the same company. Guess what? It's the same data protection, which I don't understand anyways. Right. So these small language models like Gemma 3 are great because they're edge AI. It's, it's local models. It's offline, right? So you don't even need an internet connection. You can download them. You can go in and fine tune them. You can create your own version of them or your company can create its own version of them. And then you can run them locally on device, which is huge.
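If you want to kick the tires on that locally, one easy path is Ollama, which a few folks in the chat mentioned. Here's a rough sketch with the Ollama Python client; the "gemma3:4b" tag is an assumption about how the Gemma 3 sizes get published, so check the Ollama model library for the exact tags and pick the biggest size your hardware can hold.

```python
# Rough sketch: run a small Gemma 3 variant fully locally through Ollama.
# Assumes the Ollama daemon is installed and running, and that Gemma 3 is
# published under a tag like "gemma3:4b" (an assumption -- verify the tag).
import ollama

MODEL = "gemma3:4b"  # hypothetical tag; swap for whatever size your machine can run

ollama.pull(MODEL)  # downloads the weights once; after this, no internet needed

response = ollama.chat(
    model=MODEL,
    messages=[
        {"role": "user", "content": "In two sentences, what is a small language model?"}
    ],
)
print(response["message"]["content"])
```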

Number one, even though I know a lot of people don't care about this or think about this, it's better for the environment. That's one thing I think people are overlooking when we talk about small language models. I think people talk about speed. It's faster. It's more private and it's more secure because you're not sending all of this information to the cloud. But yo, how about the environment?

Can we clap for small language models, on-device AI, edge AI? It's better for the environment, y'all. I know people get all up in a tizzy, right? Like, oh, a ChatGPT search takes 10 times more power or consumes 10 times more electricity or 10 times more than a traditional Google search, which...

I will offer the flip point. Yo, like I have to do 20 Google searches to get what I get from one ChatGPT query. So is it, does it consume more power? Absolutely. Are you doing fewer Google searches? Yes. Right. I barely do a traditional Google search, uh, anymore, which gosh, Google dropped so much. I didn't even mention Google's new AI mode, uh, that they just released. So yeah, they, they literally chose AI violence by releasing all this. Um,

So getting back to the small models, so hopefully you can kind of understand the difference. It's better for the environment. It's safer. It's faster.

But right now, you know, the biggest version, the best version, the 27B, for the most part, no one can run that on a single PC, right? If you're the IT administrator, you know, and you guys have some compute, right? You have a server rack. Yeah, you can run a 27B model. But for the most part, the average person can't yet, right?
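Here's the rough back-of-the-envelope math on why, just to make it concrete: the weights alone take roughly the parameter count times the bytes per weight, before you even count the KV cache and runtime overhead. These numbers are illustrative floor estimates, not exact requirements.

```python
# Illustrative floor estimate for how much memory a model's weights need.
# Real-world usage adds KV cache and runtime overhead on top of this.
def approx_weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    bytes_per_weight = bits_per_weight / 8
    return params_billions * bytes_per_weight  # billions of params * bytes each = GB

for bits in (16, 8, 4):
    print(f"27B at {bits}-bit weights: ~{approx_weight_memory_gb(27, bits):.1f} GB")
# ~54 GB at 16-bit, ~27 GB at 8-bit, ~13.5 GB at 4-bit -- which is why the 27B
# model wants serious hardware today, while a 4B variant fits on a recent phone or laptop.
```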

Right. So I do think at this time in like, you know, two years, even our phones will be able to run like a 27B model. Right. Because the chips that we're all using. So the GPUs are improving. They're getting faster and cheaper. The NPUs, the neural processing units, which are like AI chips.

TPUs, right? So all of these kind of AI chips that we use to create and use AI, they're getting more capable. They're getting faster. They're getting smaller in physical size. So that's why things like this Gemma 3 are actually wildly important because I think, right, like I said, in probably two years, we're all going to have the possibility or the option to run a state-of-the-art large language model

locally on our device. Let me show you a graph here why I think this is important. This is the best small language model that has ever been released. If we look at ELO scores, which I know I talk about a little bit,

But an ELO score, if you go to the LM arena, which I think is one of the most important things, we always look at benchmarks. I think benchmarks are great in some regards, but they're also, they can be a little deceiving because companies can essentially overfit or overtrain their models to perform well on benchmarks. But then it's like humans hate them.

Right. So ELO scores. So if you go to the LM Arena, I always say it's like the blind Pepsi taste test. You put in one prompt, you get two outputs. You have no clue which models they are. You choose which output is better. And that gives you what's called an ELO score, like chess, right? So the higher the score, the better the model. And Gemma 3, I still can't comprehend this. This is the 27 billion parameter version.

It is a top 10 model on the ELO scores, which is nuts, right? Yeah, there's dozens and dozens of models, but it is a 27 billion parameter model. So as an example, DeepSeek V3

is a 671 billion parameter model, right? Everyone's going crazy over DeepSeek. And yeah, by the way, they didn't tell the truth about their training and how much it costs, right? Uh, I love this. Google put together a little chart that said NVIDIA H100 GPUs required to, to train these models. And they're like, nah, DeepSeek, you didn't do this for, you know, $5 million in your backyard. You needed like a huge cluster of GPUs, right? Like

a lot of the reports said, right? I think DeepSeek was intentionally miscommunicated or under-communicated, right? And then everyone else in the world called them out and was like, yeah, that's not how it works. Anyways, Gemma 3 is a 27 billion parameter model. So it is significantly smaller than all of these other models and it did a better, it had a higher ELO score. So this is human preferences, right?

humans preferred it over, as an example, DeepSeek V3. That is a 671 billion parameter model. So I'm not great at math, but that is roughly 25

times larger. So humans preferred Gemma 3, which is a downloadable 27 billion parameter model. They preferred the responses from Gemma 3 over DeepSeek V3, which is roughly 25 times larger. Okay. We're not talking small multiples, like it's 5% smaller or 20% smaller. No, roughly 25x.
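And for anyone wondering what an ELO score actually is under the hood, here's a tiny illustrative sketch of the classic Elo update behind those blind head-to-head votes. The starting ratings and K-factor are arbitrary example numbers, and the real LM Arena leaderboard's methodology is more involved, so treat this as the intuition, not their exact math.

```python
# Illustrative Elo update from blind pairwise votes (the "Pepsi taste test").
def expected_score(r_a: float, r_b: float) -> float:
    """Predicted probability that model A's answer beats model B's."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a: float, r_b: float, a_won: bool, k: float = 32.0):
    """Move both ratings after one vote; A gains exactly what B loses."""
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_won else 0.0
    delta = k * (s_a - e_a)
    return r_a + delta, r_b - delta

# Example: a smaller model starts 100 points below a bigger one; every time
# voters prefer its output, the gap closes a little.
small, big = 1200.0, 1300.0
for _ in range(10):
    small, big = elo_update(small, big, a_won=True)
print(round(small), round(big))  # ratings converge as the underdog keeps winning
```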

Gemma 3 is one of the most, I think, exciting advancements that we've seen in small language models potentially ever. Because I think this really changes not just the LLM race, but it changes what's possible. Because there's a lot of things...

When we talk about the future of working with large language models locally that we thought maybe were three, four, five years out, nope.

It's today, right? Because as an example, yeah, 27 billion parameter, you got to have some juice to run that. But I believe that would run on a single NVIDIA Digits. All right. So NVIDIA Digits is kind of a new like supercomputer from NVIDIA. But for $3,000, you have a supercomputer in NVIDIA Digits and you can run now,

on one NVIDIA Digits, which you can use as its own computer, or you can hook it up to your existing computer, right? For $3,000, you can run locally Gemma 3, a state-of-the-art small language model. It's going to be fast. Humans prefer it over things like DeepSeek V3, Llama 3, o3-mini, Mistral Large, right? So it performs very, very well. So y'all, three years ago,

If you would have said, hey, how much is it going to cost in 2025 to run a state-of-the-art model that's one of the top 10 preferred models in the world on your own device, I would have said a couple million dollars. And I think most people would have said a couple million dollars. You can do $3,000.

This changes the race completely. So although I'm not going to be using Gemma 3 every day, and I think a lot of our audience isn't going to be using it as well, right? Like I said, unless you're a developer, unless you are someone making decisions on the IT side, and if you're more technical, you might be using Gemma immediately. But y'all, that is completely changing. I think it's like not just changing the LLM race, but

Like we're on a different track now, right? It's not just like, oh, we have like a new, you know, oh, Gemma 3 is now leading the pack. They're leading the race. No, it's a new race. It's a new race. I mean, just because a model that small performing that well in human preferences completely changes how we look at AI and its usefulness and how and where and why we use it. All right, y'all. I hope this was helpful.

If so, let someone know about it. All right. Yeah. Google surprised us all. Uh, I hope you were delightfully surprised with, you know, getting a little bit of value out of today's show. Uh, if you did, if you're listening on the podcast, uh, appreciate it. Check out the show notes, please. Uh, you know, maybe, Hey, I always put my, uh, our email. Uh, I put my, uh,

my LinkedIn, reach out, let me know what's your, what's your secret to pancakes or, uh, let me know what's your, uh, you know, favorite release from Google, or do you not want to use any of them? Uh, I love hearing from y'all. Uh, my, my responses are going to be a little delayed since I'm going to be at NVIDIA for a couple of days and probably be behind on some of my normal day to day work. Uh, so make sure you tune in next week. We're going to have special shows, uh,

I'm going to be reporting live at NVIDIA, be talking with NVIDIA leaders on what they announced at the keynote. I'm going to be talking with other partners. I think I have some...

interviews lined up with some startups, with some leaders at enterprise tech companies. So it's going to be an exciting week. So make sure you tune in for that. Also make sure you go to youreverydayai.com, sign up for the free daily newsletter. If this was helpful, yes, subscribe on the podcast, leave us a rating and review. I'd appreciate that. If you're listening here live on the Twitter machine, on the LinkedIn machine, please repost this. I know I've been told that Everyday AI is your cheat code.

But please, that doesn't pay the bills. If you keep this to yourself, share this with someone, share this with your team. If you're giving a presentation, I'd love it if you throw the podcast in your presentation. That's what I wanna do. I wanna keep AI education free. I wanna keep it unbiased. I wanna keep it accessible. And hopefully keep it, I don't know, a little fun, maybe a little less boring than reading a bunch of research papers. All right, thank you all for tuning in. Hope to see you back.

Later for more Everyday AI. Thanks, y'all. And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.