
Ep 488: NVIDIA's big AI advancements, Claude gets the internet, ChatGPT gets new voice models, Gemini goes Canvas and more AI news that matters

2025/3/24

Everyday AI Podcast – An AI and ChatGPT Podcast

People
Jordan Wilson
An experienced digital strategist and host of the Everyday AI podcast, focused on helping everyday people advance their careers through AI.
Topics
I covered this week's most important AI news, including NVIDIA's announcements at the GTC conference of the DGX Spark and DGX Station AI supercomputers, plus their GPU roadmap through 2028. The supercomputers are designed to give developers local AI compute, while the GPU roadmap shows NVIDIA's continued investment in AI computing. I also discussed Claude finally getting internet access, which will significantly improve its functionality and usefulness, even though the feature arrives years behind competitors. ChatGPT also received new voice models, further strengthening its text-to-speech and speech-to-text capabilities. Gemini added two new features: Canvas and Audio Overviews. Canvas lets users collaborate with Gemini in real time to create and edit documents, while Audio Overviews converts documents into easy-to-listen, podcast-style audio clips. I also touched on other important AI news, such as Pennsylvania's AI pilot program, Adobe and Microsoft's collaboration on AI tools, OpenAI's Sora AI video tool, and the US federal government's new generative AI chatbot. Together, these stories reflect how rapidly AI technology is developing and being adopted across sectors.


Chapters
NVIDIA's GTC conference brought exciting updates to AI computing. New AI desktop supercomputers (DGX Spark and DGX Station) were introduced, along with a GPU roadmap extending to 2028, promising significant performance boosts and advancements in compute density.
  • Announcement of DGX Spark and DGX Station AI desktop supercomputers
  • NVIDIA's GPU roadmap through 2028, including Rubin and Blackwell platforms
  • Significant performance improvements and advancements in compute density

Transcript


This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life.

Nvidia changed the future of AI computing at their GTC conference. I was there. I'll tell you what that means. Claude finally gets the internet, but is it too little too late? The federal government is rolling out their own AI chatbot. I'll tell you what that means.

AI services are going all in on MCP. ChatGPT got new voice models, even though it already had the leading voice models. And Gemini added two pretty small new features that I think are actually going to be really big. There was a ton of AI news this week, as there is just about every week. But

I don't want you spending hours every single day, like going down rabbit holes and being like, what, what is all this? What does it mean for my company, for my career? Don't do that. I do it for you. All right. So welcome to Everyday AI. What's going on y'all? My name is Jordan Wilson and I'm the host of Everyday AI. This is your daily live stream podcast and free daily newsletter, helping us all not just keep up with

AI, but how we can use it to get ahead, to grow our companies and our careers. So if that sounds like what you're trying to do, you are in the right place. Almost every single Monday, we bring you the AI news that matters. Yeah, I do this every single day. So on Monday, I'm like, y'all, here's what you need to pay attention to. Here's what doesn't matter, right? And then hopefully give you some good advice that you can take that

back to work and be the smartest person in your company or in your department when it comes to AI. So I'm excited to go over some of these huge stories. Love to see our live stream audience in the house. If you have any questions, get them in. I'll try to get some at the end if you do have any. But thanks for joining, you know, Samuel and Sandra from YouTube. Brian joining us from Minnesota. Parimi, thanks for joining from India. Sandra, Renee, Marie,

Fred, holding it down in Chicago, just like me. All right. Love to see it. All right, guys. So as a reminder, if you haven't already, please go to youreverydayai.com. I don't know if you knew this, but the podcast and the live stream, it's one thing. That's where you learn what's going on. If you want to leverage this, you do that in our newsletter. So make sure you go to our website and sign up for that, as well as you can go and listen to like 400, I don't know,

85 now episodes of Everyday AI, all for free. You can go watch it. You can go listen to it. You can go read about it all on our website, sorted by category. So no matter where you're at in your AI journey, our website is going to be your best friend, your BFF. All right.

And as a reminder, I am going to be talking a little bit about NVIDIA to start off the show, actually. But make sure you check out our newsletter. Even though the GTC conference has wrapped up, you can still access everything online for free

for a limited time. So I'm going to have that link in today's newsletter. So make sure you go check that out. So yeah, thanks again to NVIDIA for partnering with the Everyday AI Show. We're actually going to have a couple more fantastic interviews this week. Yeah, had so many that we could even do them all last week. So with a telecom leader, a health leader, a

Dell leader, you know, so many good and new shows coming to you that were recorded from GTC. I'm still putting them together, actually. And we might have a kind of a March Madness style AI startup tournament. It should be pretty cool. So live stream audience, let me know: do you want to see something like that? I talked to, uh,

eight different AI startups. And I was thinking of, you know, I just recorded their little five minute pitches. I think it's all kind of tools and services that many of you all could use. So let me know, yes or no. Should I bring a tournament style kind of AI startup pitch competition and have you all vote for the winners? So let me know. All right, enough chit chat. Let's get to the AI news that matters for the week of March 24th.

A lot to go over today, y'all. Let's get to it. So NVIDIA has announced a groundbreaking new suite of AI desktop supercomputers, the DGX Spark and the DGX Station. So these were announced at their NVIDIA GTC conference, and they are designed to empower developers, researchers, and data scientists with local AI capabilities.

So during his keynote, NVIDIA CEO Jensen Huang introduced two new personal AI supercomputers: the DGX Spark and the DGX Station. And both are powered by the Grace Blackwell platform.

So these systems are tailored for running neural networks and AI applications locally. So the DGX Spark, this one was actually just kind of an upgrade and a refresh because previously this was the Digits system. So now Digits is DGX Spark.

And that features the GB10 Grace Blackwell Superchip, delivering up to, let me get this right, 1,000 trillion operations per second. So yeah, 1,000 TOPS for AI tasks, making it a compact yet powerful tool for prototyping and refining AI models and running local AI models.

And then you also have the big boy, the bigger version of this, which is brand new and was just announced. That is called the DGX Station, and that is a more advanced system. So, you know, essentially the DGX Spark, it's like a big hockey puck, right? If you know the Mac Mini or something like that, it is about that size. The DGX Station is something more, kind of a modular, plug-and-play system, but you can get the full desktop version of this thing as well. And this thing, ready y'all, has 784 gigabytes of coherent memory.

It's wild, right? Like I thought a couple of years ago, you know, when I got a laptop with like 16 gigabytes of memory, I'm like, oh my gosh, I'm in the future. No, this thing has 784 gigabytes of memory. So yeah, like in terms of what you can run locally, right? Because that's what this is all about.

This is all about giving, you know, power users and everyday people, I think as well, the ability to run, you know, state-of-the-art open source models on your computer, right? Because proprietary models like, you know, ChatGPT and Gemini and Anthropic's Claude, you can't download those and run those, right? But there are some fantastic open source models like

Meta's Llama, uh, Google actually has an open source model that I think is very impressive in their Gemma 3, uh, Mistral's models, uh, even, uh, NVIDIA's own, uh, Nemotron, based on Llama. There are so many very capable models now. Uh, right. So, um,

As well as, you know, DeepSeek, for those of you DeepSeek fans; I'm not a big fan, right? But the gap between, you know, proprietary cloud models and open source models, it's down to like almost nothing, right? So the fact that you can now run these locally is huge. The DGX Spark has a price tag of $3,000. I don't believe we have pricing yet

on the DGX Station, although that could drop in the next couple of hours. And I will double check that when we put out today's newsletter. But I mean, y'all, this allows anyone in your organization, right? I think one of the biggest reasons some companies haven't gone all in on large language models, the AI bandwagon, is because they don't understand, number one, data privacy and security.

Again, I'm not going to go off on a rant, but it's like, hey, if you use cloud storage, right, that's more or less the exact same thing as if you're uploading documents to a large language model. Anyways, this allows anyone to run

AI models locally, very powerful ones, at speeds you can actually use. So on my not-so-powerful computers, I'll download and run Llama models, but they're smaller versions, and it's kind of slow, but still workable.
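As a quick illustration: if you want to try running an open source model locally yourself today, one common route is Ollama. Here's a minimal sketch, assuming the `ollama` Python client and a model already pulled with `ollama pull llama3`; the model name and prompt are just placeholders, not anything from the show.

```python
# Minimal sketch: chat with a locally running open-source model via Ollama.
# Assumes the Ollama app is installed and serving, and the model has been
# pulled beforehand (e.g. `ollama pull llama3`).
import ollama

response = ollama.chat(
    model="llama3",  # any locally pulled open-source model works here
    messages=[
        {"role": "user", "content": "In one sentence, what is coherent memory?"},
    ],
)
print(response["message"]["content"])
```

The pitch of hardware like the DGX Spark and Station is that much larger versions of these same open source models, ones that would crawl or simply not fit on a typical laptop, can run at usable speeds.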

Y'all, like me, like being able to even run a Llama model on an airplane when there's no Wi-Fi, right? Like it's something extremely powerful to have. So I think this is pretty big. As well as, you know, major PC manufacturers, including ASUS, Dell, HP, Lenovo, and Supermicro,

will produce and sell these systems as well. So it's not just, right, oh, you just have to buy it from NVIDIA. All of the major players are gonna be coming out with DGX Spark and Station equipped PCs. I actually have a great interview coming up with a leader from Dell talking a little bit more about what this announcement and what these capabilities actually mean, specifically for enterprise users, right? Y'all, like, I can't get over

the amount of power and how small these things are, right? Even three or four years ago, I mean, you couldn't fit all of this compute in a room, right? Like you literally couldn't. And now, in theory, the DGX Spark, you know, fits in the palm of your hand, right? Jensen Huang literally had it in his hand. And the station, the DGX Station, a little bigger, but, you know, you can still carry it around. And that thing has...

784 gigabytes of RAM. So there are very few local open source models that you cannot run on that thing. I'm pretty, I'm pretty impressed. Yeah. Muhammad just said, wow. Yeah, I agree with that one. You know, Marie says, some cutting edge technology.

All right. Our next piece of AI news also from the NVIDIA conference. So NVIDIA did reveal their plans for the AI data center and just their GPU roadmap through 2028. So, yeah, if you don't follow too closely, essentially NVIDIA powers the AI industry, right? Their GPUs.

You know, all the companies essentially use them to train their models. So, you know, if you love using any generative AI, there's a good chance that NVIDIA's GPUs were used at some point or actively being used by these companies to run and train their models.

So also, NVIDIA announced updates to the Rubin platform, launching next year, that will significantly boost performance, delivering 3.6 exaflops of FP4 compute. Yeah, so that's probably above, you know, my head and many others, right? But yeah, it's stinking powerful. As well as NVIDIA talked about their upcoming Blackwell Ultra B300 platform, which is coming later this year.

So Rubin will introduce next generation HBM4 memory technology, which just means faster interconnects and greater bandwidth to support increasingly complex machine learning models. Also, there is the entirely new Vera CPU that will accompany the Rubin GPUs, replacing NVIDIA's Grace CPUs. All right, so faster CPUs to accompany GPUs.

Also, Rubin Ultra is slated for 2027 and will take performance to entirely new levels with a new rack configuration featuring up to 576 GPUs. Yeah. So unless you're running the IT department, this might not necessarily pertain to you personally. Although, like I said,

all of the AI systems that we're using will be benefiting from NVIDIA's new and refreshed GPU roadmap. So NVIDIA also talked about the growing demand for AI factories capable of handling vast amounts of data and compute-intensive tasks.

So looking beyond Rubin, NVIDIA also teased its next GPU architecture, named after physicist Richard Feynman, suggesting even greater advancements in compute density and efficiency for 2028. So yeah, we should see, like I said,

some Blackwell Ultra updates later this year, then the Vera Rubin platform in 2026, Rubin Ultra in 2027, and then the Feynman GPU series coming in 2028. All right.

Our next piece of AI news: two pretty big companies are going all in on MCP support. All right, don't worry. I'm going to break that down and tell you what it means. But Zapier and Microsoft have both announced support for MCP, which is the Model Context Protocol.

So MCP was developed by Anthropic, and it's an open source protocol designed to facilitate the integration of AI models with external data sources and tools. So MCP is a protocol that enables your AI assistants to securely connect with thousands of apps and perform actions such as sending messages, scheduling events, and updating records, without any complex coding.

So think of it as sort of like an API, right? It's technically, I believe, a layer on top of an API. But this just allows all of these different AI tools to talk to each other. So, you know, keep an eye out for which other big companies start offering MCP support. So on Zapier's side, Zapier MCP connects to over 8,000 apps worldwide

without complex integrations. So Zapier's MCP enables AI assistants to perform real-world tasks like sending messages, managing data, and scheduling events across 8,000 apps and 30,000 actions. All right. And then Microsoft introduced Model Context Protocol or MCP support in Copilot Studio. Pretty interesting. So building your own low-code or no-code AI agents

So Microsoft has launched MCP in Copilot Studio, enabling seamless integration of AI apps and agents with just a few clicks. So MCP simplifies connecting to knowledge servers and APIs, allowing real-time data access while maintaining enterprise security, like virtual network integration and data loss prevention.

So on the Copilot Studio side, users can access pre-built MCP-enabled connectors in the marketplace, dynamically add tools to agents that you build, and reduce maintenance effort through automatic updates.
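To make MCP a bit more concrete, here's a minimal sketch of what an MCP server exposing a single tool can look like, assuming Anthropic's official `mcp` Python SDK and its FastMCP helper; the `schedule_event` tool is a made-up example, not an actual Zapier or Copilot Studio integration.

```python
# Minimal sketch of an MCP server, assuming the official `mcp` Python SDK
# (pip install mcp). The schedule_event tool is a hypothetical example; a
# real server would call an actual calendar API here.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-scheduler")

@mcp.tool()
def schedule_event(title: str, start_time: str) -> str:
    """Pretend to schedule a calendar event and report the result."""
    return f"Scheduled '{title}' at {start_time}"

if __name__ == "__main__":
    # Serves over stdio so an MCP-aware AI assistant can discover
    # the tool and call it.
    mcp.run()
```

An MCP-capable assistant can then discover and invoke that tool over the protocol, which is the same basic mechanism Zapier and Copilot Studio are plugging into at much larger scale.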

Are you still running in circles trying to figure out how to actually grow your business with AI? Maybe your company has been tinkering with large language models for a year or more, but can't really get traction to find ROI on Gen AI. Hey, this is Jordan Wilson, host of this very podcast.

Companies like Adobe, Microsoft, and NVIDIA have partnered with us because they trust our expertise in educating the masses around generative AI to get ahead. And some of the most innovative companies in the country hire us to help with their AI strategy and to train hundreds of their employees on how to use Gen AI. So whether you're looking for ChatGPT training for thousands,

or just need help building your front-end AI strategy, you can partner with us too, just like some of the biggest companies in the world do. Go to youreverydayai.com/partner to get in contact with our team, or you can just click on the partner section of our website. We'll help you stop running in those AI circles and help get your team ahead and build a straight path to ROI on Gen AI. So, uh, let me know, should we be doing a dedicated MCP show? Y'all, um,

it is a newer protocol. Uh, to be honest, it's something I'm still even learning about myself, uh, and experimenting with, but, uh,

I think, and I hope, maybe that's what Everyday AI is good for, right? There are no, you know, there are no experts in MCP technology. It is brand new. But I do think that it's going to be a very important protocol moving forward, in the same way that most businesses, most enterprise businesses, nowadays can't run without APIs or maybe webhooks, right? That just allow all your different

apps to talk to each other, right? So now you have essentially this AI version of APIs, I'm simplifying it there, uh, that allows your different AI tools and software both to talk to each other and to talk to your data. So I do think this is pretty big. It is a little bit technical, uh, but I just think, you know, kind of how we've had, uh,

these different periods of generative AI, right? So we have the large language models, right? The AI chatbots. And then we had RAG, retrieval augmented generation. And now we have agentic AI, right? I think one of those next big steps forward is probably going to be this MCP protocol. All right. Pedro says, definitely. Joe says, I'm down with MCP. Yeah, you know me.

All right, 90s rap reference for you. All right, our next piece of AI news. Pennsylvania's AI pilot program is saving workers eight hours a week, according to their governor. So Pennsylvania Governor Josh Shapiro revealed some promising results from the state's groundbreaking pilot program integrating ChatGPT into government services.

So Pennsylvania's ChatGPT pilot program saved state employees an average of eight hours per week, according to early results shared by Shapiro. So launched via an executive order in January of 2024, the program initially provided 50 licenses for ChatGPT Enterprise and has since expanded to 175 employees across 14 agencies.

So despite nearly half of participants having never used ChatGPT before in this first wave of the state study, 85% reported positive experiences using the tool, underscoring its accessibility and effectiveness.

So employees participating in the program reported saving approximately 95 minutes a day, so just over an hour and a half, which leads you to that eight hour a week figure, allowing them to focus on more complex tasks and direct interactions with Pennsylvania citizens.

So specific successes included simplifying job descriptions, which reduced hiring and onboarding times from 90 days to 60 days and consolidating 93 IT policies into 34 for the state streamlining their operations. So roles such as state attorneys and construction project managers benefited from AI assistance, showcasing its versatility across different sectors.

So Governor Shapiro emphasized that AI served as a quote unquote job enhancer rather than a replacer, reiterating the importance of keeping humans involved to ensure nuanced decision making. So the program's first phase will conclude on May 31st, with plans to expand access to more employees in the second phase. My take on this? Only eight hours per week?

I don't know. I don't understand that. Like, anyone that's not saving at least

two to three hours a day, right? If you're not saving two to three hours a day minimum by using AI operating systems, right? I think there's a difference. And I think that ChatGPT is really right now the only AI operating system that runs, you know, quote unquote, in the cloud as an AI chatbot. I obviously think Microsoft 365 Copilot is its own beast. But aside from that, I think Google Gemini will get there. I think Anthropic's Claude may get there. But I think right now,

ChatGPT is the only what I would call AI business operating system. And I think they are in a league of their own. But the fact that these Pennsylvania state employees are only saving about 90 minutes a day leads me to believe they need to be trained. And so many organizations, companies reach out to us when they want to train their employees. So whether it's 50 employees, 500, they reach out to us. And I'm usually pretty shocked.

Well, first, it's good that companies reach out because this is what we do every day, right? We live inside of ChatGPT. That is my personal and our team's kind of home base when it comes to AI, large language models or your AI business operating system.

But I can't see a way that most employees aren't saving at least two to three hours a day. If so, that means your employees don't know what they're doing, right? I mean, yes, it depends on what their work actually is. It depends on data access, data security, right? But at the enterprise system, I will say if you use any cloud storage, it's the exact same thing. So yeah, I'm personally shocked.

at how little it was, you know. Eight hours per week means, hey, those Pennsylvania state employees need a little bit of training, like I think most companies do. All right. Next up: Adobe,

which I don't get why all these big companies have their conferences on the same day, right? Adobe had their conference right in the middle of NVIDIA's GTC. So I think they had some pretty exciting announcements that maybe got slept on. But Adobe and Microsoft announced a major collaboration to integrate AI tools from Adobe directly into Microsoft 365 apps.

So Adobe unveiled their Adobe Marketing Agent and Adobe Express Agent, enabling marketers to create content, analyze data, and collaborate without leaving Microsoft apps like Teams, PowerPoint, and Word.

So the Adobe Marketing Agent aims to simplify tasks like audience targeting and campaign tracking, while the Express Agent allows users to generate high-quality visuals directly within Microsoft apps, eliminating the need to switch platforms.

So like I said, the Adobe Express agent lets users generate images for presentations, social posts, and documents through a conversational interface. And the Adobe Marketing agent helps refine audience targeting, pulling insights from Adobe Analytics tools, and creates reports directly within Microsoft apps.

So yeah, if you are a Microsoft organization and you are a heavy Adobe user, this is going to be huge news for you, and just for any marketers. Also, integration with Adobe Workfront streamlines project management and boosts collaboration across teams.

So it should be interesting to see what other big companies start partnering with Microsoft kind of for these, you know, kind of brand or company specific AI agents. So yeah, I think in total, and we shared this in our newsletter in the middle of the week last week as well, Adobe also rolled out 10 different kind of pre-made, pre-built AI agents as well. So

I think pretty exciting news. If you're a marketer, if you use Adobe, time savings right there, right? And then being able to run all of these presentations and generate reports just from within Microsoft Teams, Microsoft PowerPoint, Microsoft Word, not having to jump around. I mean, that's a huge boon for productivity as well.

All right, our next piece of AI news, nothing new necessarily, but an update that I think is worth mentioning.

So over the weekend, OpenAI has made Sora, its AI video tool, unlimited for paid users on the $20 a month plan. So this is pretty big because originally OpenAI's Sora, their AI video generating tool, was only rolled out to ChatGPT Pro

users on that $200 a month plan. And then plus users got very limited access. So now OpenAI just quietly took away all limits. So if you are a paid ChatGPT user on any tier, you should now have unlimited access to Sora.

So this is pretty, pretty exciting. I'd say previously they did have that credit system, uh, for both Plus and Pro users; technically just the Pro limits were much, much higher, and the, you know, $20-a-month ChatGPT Plus limits were pretty low. Uh, well, why did they do this? Well, I think it's because of competitors, right? So, uh,

Right now, I think when OpenAI initially teased Sora, which was now almost like a year ago, I don't think that there was another AI video tool that was in the same conversation. However, it took OpenAI a super long time to actually release Sora.

Here's the thing. Google, Google, uh, Veo. I always forget if it's "Vay-oh" or "V-O," uh, even though I talk to Google people about it. Sometimes I just forget things. Uh,

Veo 2, Google's version, Google's AI video generator, is much better. Google's version is in a class of its own. However, you do right now have to access it through third-party platforms. So even Google doesn't have it available for a front-end user. If you just want to go in and use Google's Veo tool, you can't. You have to use either a third party, obviously a paid service, or use their API, which is pretty expensive.

But there have been some other great AI video companies, especially out of China; Kling AI is one. Here in the US, you know, Runway. So essentially, I think all of these other AI video generators have gotten so much better in the last year. And OpenAI, even though when they first, you know, kind of teased Sora, right, it broke the internet because we hadn't seen anything like that. But it took them, you know, what, like eight months, right?

to actually just start releasing it. I do think Sora and OpenAI have a just better user interface and user experience than a lot of the other tools. And there are a lot of these cool remixing features and the ability to build a storyline very simply, right? Piecing together a lot of these different AI video generations. So I do think, even though Sora is not the best model out there, I do think now, between...

the more unlimited usage and some of these unique features, I do think now OpenAI is kind of in that 1B slot next to Google Veo, which is definitely in the 1A slot. And then I think

You have everyone else, kind of, you know, your Kling, your Runway, your Pika, your Luma, all of those, I think, are right underneath Sora. Although, you know, I think at that point, after Google Veo, it's kind of up for interpretation. It's up for your personal taste after that and what you really value in an AI video generator. But y'all, I'm telling you, the technology on the video space is so fast. So yeah, I actually...

I believe I talk about that with a Dell leader in a conversation that I'm going to be debuting here pretty soon. We were just talking about how it is unbelievable how far this space has come. So if you haven't really looked at it in three to six months, maybe you don't think that AI video is something that your company will use. You're wrong. You're wrong, right? Let me tell you this.

The days of hiring super expensive videography companies, those days are unfortunately dwindling down. I'm not saying that's going to be something that doesn't exist anymore. Obviously, you're still going to have your high-end video production and creative agencies. But I think more and more small and medium-sized companies are going to be using these AI video tools, right? Because you can also start with an image, an AI image, right?

Right. And now we have these capabilities through the AI image tools where you can upload an image. Right. I could have uploaded an image of me interviewing someone at the NVIDIA GTC conference. And, you know, you can use that as a beginning point. Right. So I could just create videos based on real images. Right. If you have a stock photo that you've been using right on your blog post that looks like it's from 1998, right.

You can finally update it and bring some life to it. So I do think, if we were having this conversation a year ago, I'd be like, yo, AI video is not for everyone. Now, AI video is for every business. If you haven't already started using it, you need to get in there. You need to understand it, because consumers demand more

video. They want video. And if your company is not already using video, right, you probably already know, you know, maybe you just don't have the talent on your team. Maybe you don't have the budget. Well, these AI video tools are really leveling the playing field. Yeah. Good question here from Joe saying, when is OpenAI going to focus its attention on DALL-E 4? So yeah, yeah.

Actually, Grok just added the ability to edit images and updated their AI image generation to be available via an API. I think Google, in their Gemini 2.0...

I might have to do a dedicated episode on this. They kind of are killing Photoshop, right? You can upload images inside Google Gemini 2.0. Let me know, live stream audience or podcast audience. I always put my LinkedIn information, my email, even though I'm a little bit behind. So sorry if you've reached out to me in the last couple of weeks. But...

The Google Gemini 2.0, what you can do with images, it is wild to me. As someone that's used Photoshop for more than 20 years, what you can do with simple text commands inside Google Gemini 2.0, it is mind boggling, right? You can just upload an image of yourself and anything that you could think that you could do in Photoshop, you

you can essentially do in there, right? Changing what you're wearing, right? Maybe you're doing a fashion shoot, a product shoot. You know, maybe there's just something annoying in the background and you don't want to have to learn Photoshop or run something super, you know, compute heavy on your computer. You can literally just go into Google Gemini 2.0 now in AI Studio and do it just with text prompts. And the reason I bring that up, Joe, is because X

slash Grok, uh, is starting to roll out a similar feature to what we saw in Google Gemini 2.0. As well as we've just seen rumors, uh, that ChatGPT may be opening this up as well. Uh, so that is not confirmed yet. Uh,

But there have been some rumblings on the internet over the last 24, 48 hours that ChatGPT is starting to test out image editing, which would lead me to believe that they will be rolling out improved image generation as well. We know that Sora, OpenAI's video tool that we just started this AI news piece with, can generate photos. So I don't know if the next...

you know, version of DALL-E might just be called Sora Photos. Or, as an example, DALL-E 4 may just be powered by Sora. We'll see, but I do expect

some image generation updates from ChatGPT soon. Especially, y'all, like I talked about this on the show, I think last week, I mean, the ability for Google Gemini 2.0, where in one shot, you can create, as an example, a blog post, right? My example was I did a blog post, you know, the top five tourist attractions in Chicago. I had it write a blog post and then it did an image, uh,

for each of those five, all in one shot and inline, right? That's wild. It is wild, the increased capabilities of these multimodal AI chatbots. So yeah, hoping we'll see that soon from OpenAI. All right.
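For the technically curious, here's roughly what that one-shot text-plus-inline-images generation looks like through Google's API. This is a sketch assuming the `google-genai` Python SDK and the experimental Gemini 2.0 Flash image-generation model; the exact model ID, prompt, and file names are placeholders that may change on Google's side.

```python
# Sketch of interleaved text + image generation, assuming Google's
# `google-genai` Python SDK and the experimental Gemini 2.0 Flash
# image model (model ID may differ as Google updates it).
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

response = client.models.generate_content(
    model="gemini-2.0-flash-exp-image-generation",
    contents="Write a short blog post on Chicago's top five tourist "
             "attractions, with one illustrative image per attraction.",
    config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
)

# The response interleaves text parts and image parts, in order.
for i, part in enumerate(response.candidates[0].content.parts):
    if part.text:
        print(part.text)
    elif part.inline_data:  # raw image bytes
        with open(f"attraction_{i}.png", "wb") as f:
            f.write(part.inline_data.data)
```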

Let's keep going. A couple more AI stories. This one, I'm not a huge fan of. So Apple is reportedly working on integrating advanced AI capabilities into its wearable devices, including cameras for future Apple Watch and AirPods models. And that's according to Bloomberg's Mark Gurman, who is kind of the leader in getting all the scoops on all things Apple. So according to reports, Apple is developing

multiple versions of future Apple Watch models equipped with cameras to enhance AI functionality, allowing the device to quote unquote, see the outside world. So this aligns with Apple's focus on expanding its visual intelligence technology.

So right now, visual intelligence, which currently relies on third-party AI models like ChatGPT, is being repositioned to use proprietary Apple AI systems. So this shift could reduce dependency on external AI providers and strengthen Apple's control over its AI ecosystem. So cameras on standard Apple Watch models may be embedded within the display, potentially using under-display technology or a camera cutout.

This would give users discreet access to visual AI features directly from their wrists. So the Apple Watch Ultra, which has more design space, or more room to play with, is expected to feature a camera embedded near the digital crown and side button. This placement would make it easier for those Apple Watch Ultra users to scan objects or interact with the environment using their wrists.

Apple aims to bring similar camera-equipped AI functionality to future AirPods as well, further integrating visual intelligence across its product lineup. So the release of these AI-powered wearables is not expected until at least 2027. And I guess, if we're following the cues here, if reports right now are saying something

Apple Intelligence-related is coming out in 2027, you might as well just tack, like, another three years on that, right? A lot of things that Apple promised last year at its WWDC keynote address talking about Apple Intelligence have not even begun to roll out yet. Even though Apple was running marketing commercials

on a huge scale, promoting features that literally are not available, which now they're facing some class action lawsuits over. So again, I'm not going to accidentally poo-poo on Apple Intelligence and go off on a side rant, but, you know, I'm not personally a fan of cameras in watches. I don't know why.

I still think we need some semblance of privacy and trust when it comes to AI and just when it comes to cameras, right? As an example, right? I was on the NVIDIA GTC show floor for all of, I think I had like six minutes to spare, literally. I was jam-packed with interviews. So, you know, I had my Meta Ray-Ban glasses on, which is great. So, yeah.

But I think when people are wearing those, they can kind of see and understand that you're probably recording something, right? There's a little status indicator. So I don't know, like,

Having this on the watch, cameras on the AirPods, I don't know. I'm not a fan of it, even though I use and I enjoy Apple technology. Not a huge fan of throwing cameras and trying to bring AI to my watch. That's just me. Like first, Apple, get it figured out on the phone, right?

You're like 30 years behind. Yeah. What? Yeah. Fred said, new Apple Watch camera? Watch out, locker rooms. Yeah. There are just so many, like, intrusive, like, red alarms blaring. I don't know. Like, does anyone actually want this? This just seems like a

bad idea. So I don't know if Apple is just trying to put out, you know, a bunch of new, you know, AI-enabled products. I mean, the biggest thing is everyone wants to collect more visual data to make their AI models better. Because essentially, right, I was having a conversation about this at the NVIDIA GTC conference. You know, obviously large language models are at the point where they've hit the ceiling in terms of the data available that they can learn and be trained on. But essentially, I like to say it like this.

You know, today's current models, and by today, I mean, you know, years-past models, have essentially been trained on knowledge worker data, or, you know, data that you would presumably read on a screen: text, images, et cetera. Right. Uh,

The next big frontier for data is real world data, world model data, right? Data on how us humans interact with the real world. And that is what is ultimately, I think, going to be that next big leap in terms of AI, right? I talked about that with the Agility Robotics CTO at the GTC conference; it was a fantastic conversation, if you didn't listen to that one. But, you know, the next big piece is companies are going to try to get more and more data from the real world. So they want all of us essentially out there with as many cameras as possible, you know, training their models, uh,

So when we talk about humanoids, when we talk about AGI, ASI, all of that doesn't happen without a ton more data from the real world. So these new versions or these newer models can understand how we interact with the world around us.

All right, next piece of AI news. The General Services Administration, the federal government, so that's the GSA, has unveiled a new generative AI chatbot designed to improve efficiency and automate repetitive tasks.

So the chatbot, which is now available to GSA federal staff, utilizes large language models from companies like Anthropic and Meta to assist with basic tasks, including writing. So according to Wired reports earlier this month, the Department of Government Efficiency, or DOGE, right? Is it "Doge" or is it "Dohj"? I think it's "Doge,"

deployed a similar chatbot called GSAI. That's not confusing. Deployed that to 1,500 workers. So the tool's release coincides with the closure of GSA's 18F digital services team and the downsizing of the Technology Transformation Services, raising questions about the future of federal tech innovation teams.

So GSA officials clarified that the chatbot is not for...

replacing jobs, although that's kind of what it's doing. They did also say it's not intended for official agency decisions, and it operates separately from GSA knowledge bases. So safety controls are in place to prevent sharing sensitive information, and prompts are logged but not classified as federal records.

So the agency aims to measure the tool's success through adoption rates rather than workforce reductions, signaling a focus on cultural integration rather than immediate cost savings.

I don't know. It just seems to me, right now, the stance of the federal government and DOGE is to get rid of as many federal workers as possible and integrate AI tools in their place. So although they're not really coming out and saying, yeah, we're using AI to replace jobs, that is literally kind of what the US government is doing with these huge

cuts across the board, across federal agencies. So you're seeing thousands and thousands of humans being laid off and you're seeing more and more AI tools being used in the US federal government.

All right. A couple more stories. I'm personally excited about this one. So Google has launched a few new features in Gemini. They may be small. You might not have seen them, but I'm personally excited to use them. So Google has unveiled two new AI-powered features for paid Gemini users, Canvas and audio overviews.

So Canvas, like we've seen from some other companies, you know, ChatGPT has their Canvas tool. Claude has...

their version of it, uh, which is called Artifacts. So, uh, Canvas inside Google Gemini enables real-time collaboration with Gemini for document creation and editing. So users can upload, I mean, whatever documents you might want, so class notes, research ideas, and have Gemini draft speeches, edit content, or suggest improvements. Uh,

The feature allows users to adjust tone, length, and other aspects of the text directly. So yeah, if you use ChatGPT Canvas, I'd say that is the closest other AI tool out there right now to Google Gemini's Canvas. So it also supports coding projects,

offering an interactive learning experience. So Google highlighted how Canvas can help users learn by creating simple coding projects, like a tic-tac-toe game, complete with explanations and previews to guide learners through the process. All right. So that's feature number one. Feature number two is Audio Overviews. So if you use NotebookLM, you are definitely familiar with Google's Audio Overviews tool. So, yeah.

I'm not saying I had anything to do with this, but I do DM and talk with the Google team a lot. And I told them, I don't know, about six months ago, I'm like, you guys really need to be rolling out Audio Overviews inside of Gemini. I got kind of a positive affirmation. So I'm sure it was already on the roadmap, but I even told them like, hey, six months ago, you know,

you should be doing this. You know, the Audio Overviews feature is amazing, right? It should be rolled out across Google's suite of products. So it's cool to see this. So Audio Overviews can transform documents into podcast-style audio clips for easy listening. This feature lets users upload PDF files, slides, or research reports and generates conversational audio summaries, making complex information more digestible.

So the Audio Overviews tool builds on Google's previous AI experiments. So like I said, this was initially introduced through NotebookLM.

And this is now available for paid Gemini users on both the mobile and web platforms. So here's how you use it. Like I said, you need to be on a paid plan first and foremost. For Canvas, there should be a new Canvas button where you would normally put in your text prompt.

And for Audio Overviews, I'm guessing there's going to be a way to do this a little bit better visually. But the easiest way to do it right now is you can just upload a bunch of documents and just say, you know, please create an audio overview of these documents.

So, you know, I have a little screenshot here for our live stream audience showing you how to trigger it. It would be cool, and I'm sure Google will add a visual way to trigger Audio Overviews, because I'm sure most people aren't going to notice it otherwise. So literally you can just, you know, upload one long PDF or a couple of text files and, you know, say, create an audio overview of that. And then you're going to get that, you know, very cool overview,

kind of podcast style, but it doesn't have the interactive feature that you have in NotebookLM. So obviously NotebookLM, I think, is still one of my most used AI tools, and there are still benefits to using it versus the new features inside Google Gemini.

All right. Two last stories. So first, OpenAI has unveiled three new voice models. So those models are called GPT-4o Transcribe, GPT-4o Mini Transcribe, and GPT-4o Mini TTS,

uh, designed to improve transcription and text-to-speech capabilities. So OpenAI, uh, their new models are available immediately through APIs for third-party developers and on a new website that they launched, which I think is pretty cool, called OpenAI.fm. So that's a demo site for individual users to test and customize voices. So essentially you're like, okay, what does this mean? Well, uh,

Essentially, now you can integrate this new technology into any of your apps. So if you are a developer, software engineer, or a big company, this model, it's extremely, extremely impressive. OpenAI was already the leader in this space with their Whisper technology.

I love this, right? When a company is like, yeah, even though we're the leader in the space and we could sit by and, you know, maybe not even update Whisper for another couple of years. Nope. They completely, uh, changed the text-to-speech game. So, uh,

It's almost so good it's scary, but the model allows users to change accents, pitch, tone, and emotional qualities of AI voices via text prompts. So these models are built on the GPT-4o base model, which was launched in

May of last year, but have been post-trained for superior transcription and speech performance across 100 plus languages. So yeah, you can do both text to speech, but you can also transcribe audio.

or speech to text, right? So the new model boasts an impressively low word error rate of 2.4% in English, outperforming OpenAI's older Whisper model, which was already a leader in that category. So OpenAI has introduced streaming speech to text for real-time transcription. I'm going to have to build

uh, myself a little app using that to help me ask better questions for my guests on the Everyday AI Show. And that makes conversations feel a little bit more natural. So pricing for the models starts at $6 per 1 million audio input tokens for the more powerful model and scales down for the smaller models. So this is now essentially OpenAI competing with ElevenLabs and Hume.
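For the developers listening, here's roughly what calling these new models looks like, as a minimal sketch assuming OpenAI's Python SDK and the model names as announced; the file paths, voice, and instruction text are placeholders.

```python
# Sketch of the new audio models via OpenAI's Python SDK (pip install openai).
# File paths, voice, and instructions below are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Speech-to-text: transcribe an audio file with gpt-4o-transcribe.
with open("interview.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="gpt-4o-transcribe",
        file=audio_file,
    )
print(transcript.text)

# Text-to-speech: gpt-4o-mini-tts, steering delivery with text instructions.
with client.audio.speech.with_streaming_response.create(
    model="gpt-4o-mini-tts",
    voice="coral",
    input="Thanks for tuning in to Everyday AI!",
    instructions="Speak in a warm, upbeat podcast-host tone.",
) as response:
    response.stream_to_file("outro.mp3")
```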

And like I said, OpenAI.fm is hosting a very rare, for OpenAI, public competition to find the most creative uses of its demo site. So yeah, we'll be sharing the links and more information in our newsletter. All right, last but not least, Claude.

Welcome to the 2020s, Claude. Claude finally has the internet. WTF. All right, so Claude, the AI tool developed by Anthropic, now offers web search capabilities, like five years after everyone else. Not actually, but multiple years.

So the new feature is currently available in the U.S. for paid users only and expands Claude's ability to deliver actionable insights across various industries. So Claude's web search feature enables users to access the latest events, trends, and information by integrating real-time data from the internet into its responses. So when using web search, Claude provides direct citations, which is important,

for sources, allowing users to fact check information easily and ensuring transparency in responses. So web search, like I said, right now it is a feature preview. So you have to enable it at the bottom of your screen. And there are plans to access to, there are plans to roll this feature out to other parts of the world, as well as free users in the future.

So according to reports, Anthropic is using Brave Search for this web access integration, as revealed by updates to its subprocessor list and identical citations found in both tools. So...

Claude dropped the ball on this one. The Anthropic team dropped the ball, right? Having real up-to-date information is essential, right? This is one of the three reasons why about a year ago, I put out a podcast. I said, don't use Claude.

Like, enterprise companies should not be using it as a front-end chatbot. It's different if you're using it on the backend via the API, right? Because then there's a lot more that you can do to ensure accuracy. You can, right, set up your RAG pipelines, all of those things. I still say, up until this web search, it was

kind of dangerous, right? Kind of dangerous if you were rolling out Claude access, I mean, unless you were using it for coding, which I think is by far the best AI model for coding and software development, not even close, right? But for the most part, for everything else, having an offline large language model in 2025 was straight up asinine, right? I think ultimately...

Like Anthropic, it was leaving billions of revenue on the table, right? And you all might think I'm crazy by saying that, but when I did that show a year ago, I got messages from multiple Fortune 500 C-suite people and they were like, you are spot on. We will not touch Claude on the front end because it does not have internet access. Because in those instances, you are relying on...

very old data, right? You always have this knowledge cutoff, right? Oh, August 2024, you know, October 2024. But that is the best case scenario. So many large language models are trained on these huge data sets, uh, and the data in there might actually be multiple years old. So, so many people just blindly trust what an AI model spits out, not knowing it might be pulling in data that's two, three, four, five years old. Uh, so, uh,

I guess you got to tip your hat, I don't know, to Anthropic for finally bringing this out, right? I made a joke. I'm like, this is like if you're a phone manufacturer and you just released text messaging for the first time, it's like, how did you still exist without having this? Wild that, you know, Anthropic just rolled this out now. But, yeah.

With that being said, I probably will use Claude for some more tasks that I just haven't touched with Claude for that very reason, because it can actually be a little dangerous, especially if you don't know what you're doing, to blindly use Claude for a lot of things prior to it having web access. So that's at least good, but kind of nutty that we got there in the first place. All right.

Let me very quickly recap the top AI news stories, the AI news that matters, for this week. So first, we talked about NVIDIA unveiling their new AI-focused desktop systems, GTX, or sorry, DGX, that were unveiled at the GTC conference. Also at the GTC conference, NVIDIA revealed its GPU roadmap going all the way to 2028. Zapier and Microsoft

now officially support MCP, the Model Context Protocol. Pennsylvania just released some initial findings from their AI pilot program, where they're rolling out ChatGPT access to state workers. Adobe partnered with Microsoft and is rolling out some

Adobe marketing AI agents inside the Microsoft 365 Copilot platform. OpenAI quietly made Sora unlimited for paid ChatGPT users. According to reports, Apple is planning to bring AI-powered cameras into its wearables, like watches and AirPods.

The federal agency GSA has launched a generative AI tool amid concerns over worker layoffs and surveillance. Google has announced, I think, two really good features in Canvas and Audio Overviews for paid front-end Gemini chatbot users.

OpenAI has unveiled three new voice AI and text-to-speech models, even though their Whisper model was a leader in the space. And last but not least, Claude has finally unveiled

real-time web search capabilities. All right, y'all, I hope this was helpful. If so, please let me know. Repost this, all right? It does help, right? People are always like, oh, Jordan, I've learned so much. What can I do? Share this. Click that repost button, right? I think so many people are scared of AI and so many people are getting lost

and really not taking advantage. So if you don't stay up to date, you risk falling behind. And y'all trust me because I do this, this is my job. It is nearly impossible for you to have a real job and also stay up to date with everything in the AI world and how it's going to impact

how you do your work, how it's going to impact your company, how it's going to impact your future career. So we do that for you. So please help me out by telling someone about this. Email it to your department, right? If you're giving a presentation on AI, throw our podcast up there as a free resource. Um,

you know, on our website. So make sure you go sign up for our free daily newsletter to recap today's show. But on our website, you can now listen to more than 480 episodes of the podcast, the videos. We put up text recaps as well. It is a free generative AI university. So I hope this was helpful. Please make sure you join us tomorrow and the rest of this week. A lot of exciting

announcements and shows and interviews that I did at GTC, as well as I think we're going to do that tournament style thing. I saw some of our live stream audience be like, heck yeah, bring that. So we're going to do that. So thank you for tuning in. Hope to see you back tomorrow and every day for more everyday AI. Thanks y'all.

And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.