Claude AI Can Now Control Your Computer for Tasks

2025/3/24
LLM

People
Sam (Anthropic researcher)
Host
Host and founder of the well-known true crime podcast Crime Junkie.
Topics
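Host: The latest upgraded version of Claude AI can control a computer to carry out tasks, which is a major breakthrough in artificial intelligence. It can automatically complete a variety of tasks, such as filling out forms, creating websites, and planning trips. Although it may occasionally get distracted while performing a task, that does not diminish its overall capability. In the future, Claude AI will be able to carry out more complex tasks, such as complete marketing campaigns and application builds. Overall, the arrival of Claude AI marks a new stage in the development of AI technology, and it will greatly change how people work and live. Although some people worry that AI will replace human jobs, I think AI will mostly free people from tedious tasks so they can focus on higher-level work such as systems architecture and strategic planning.

Sam (Anthropic researcher): We are excited to launch Claude AI's computer control capability. This is an early version, but it can already complete some complex tasks, such as filling out a vendor request form. In this demo, Claude AI switches between applications, gathers the information it needs, and fills out the form automatically. This shows Claude AI's great potential for office automation. We believe that as the technology develops, Claude AI will be able to complete more complex and varied tasks, bringing users greater efficiency and convenience. We are also continually improving Claude AI to make it more stable and reliable and to reduce its distracted behavior.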

Anthropic has just unveiled upgrades to Claude 3.5 and the ability for it to control your computer. This is absolutely insane. And it's funny because when they came up with this announcement on X, they kind of had this like thing showing their new model benchmarks and showing how their new upgrades to 3.5 Sonnet were making Claude better, more capable. But the real, the real thing that I've been excited about is all of

the abilities they announced, and they've added some of these to their API as well, for Claude to actually control your computer. I want to walk through a bunch of demos and show you exactly what this is capable of doing. I think this is absolutely insane. We've heard this exact same thing from Apple for Apple Intelligence, but they essentially said that they were delayed and it wasn't going to work for a while. So we're not actually seeing that. We have other companies that have been working on this for ages.

Anthropic, in my view, is the first major company that has actually rolled this out in a huge way. These demos are absolutely insane.

So let's get into them. And before we do, I wanted to say if you're interested in using some of these AI tools to grow your business, to help with your work, or to start an online side hustle, I would love for you to be a member of the AI Hustle School community. This is a place where I release exclusive content every single week that covers how I'm using AI tools to make money, and

all the side hustles that I'm doing, how everything works. It's $19 a month and the price will rise to around a hundred dollars a month eventually. So if you're interested in joining this community, there's a link in the description. I would love to have you as a member. Um, let's get into some of these demos. So the first one I want to talk about, um, I'll actually just play out, uh,

and I'll talk through some of the things that they're doing. But this is absolutely fascinating. Claude is taking control of a computer. Check this out.

So I'm Sam, and I'm one of the researchers here at Anthropic. Computer use is something that we felt was going to be important for a while now. And so today we're going to be talking about a very early version we have of computer use and talking through a representative example of the things we think it's going to be useful for. We're going to be

going through a quick demo today. In this fictional demo, a customer, in this case the Ant Equipment Company, has come to us and asked us to fill out a vendor request form. The data I need to fill out this form is scattered in various places on my computer. What we're going to do is ask Claude to look at the spreadsheet, check if Ant Equipment is in there, and if not, move over to the CRM and try and find some more information there.

Once it has this data, Claude's going to then fill out the form for us and hopefully transfer the information across to the vendor form. The first thing that's going to happen is Claude's going to start taking screenshots of my screen and quickly realizes that the Ant Equipment Company isn't actually in the spreadsheet. So the first thing it does is it swaps over to a CRM and searches for the company we're interested in.
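To make the workflow Sam describes a bit more concrete, here is a minimal sketch of the "check the spreadsheet, fall back to the CRM, then fill the form" logic. It is not Anthropic's implementation: the data, field names, and function names (`find_vendor`, `fill_vendor_form`) are all hypothetical, and the two dictionaries simply stand in for the spreadsheet and CRM shown in the demo.

```python
from typing import Optional

# Hypothetical stand-ins for the two data sources in the demo.
SPREADSHEET = {
    "Acme Supplies": {"contact": "jane@acme.example", "phone": "555-0100"},
}
CRM = {
    "Ant Equipment Company": {"contact": "sam@antequipment.example", "phone": "555-0199"},
}


def find_vendor(name: str) -> Optional[dict]:
    """Look in the spreadsheet first, then fall back to the CRM."""
    for source_name, source in (("spreadsheet", SPREADSHEET), ("crm", CRM)):
        record = source.get(name)
        if record is not None:
            # Remember where the record came from so the result is auditable.
            return {"source": source_name, **record}
    return None


def fill_vendor_form(name: str) -> dict:
    """Build the (hypothetical) vendor request form from whichever source had the data."""
    record = find_vendor(name)
    if record is None:
        raise LookupError(f"No record found for {name!r}")
    return {
        "vendor_name": name,
        "contact": record["contact"],
        "phone": record["phone"],
        "found_in": record["source"],
    }


form = fill_vendor_form("Ant Equipment Company")
print(form)
```

In the demo the company is missing from the spreadsheet, so the sketch mirrors that by only finding it in the CRM stand-in.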

Okay, this is absolutely phenomenal. In the rest of the demo, it's able to literally go to Google Chrome, do a search on there, find this vendor database. It's not able to find it. They essentially are prompting it or it's essentially searching and finding it somewhere else. They're filling out the whole vendor request form. The whole thing is completely automated on his computer. It's absolutely phenomenal. They dropped a few other examples. One that was really interesting was done by...

So, yeah, I think that's a great question.

So it's amazing because now, in these prompts, we're actually getting it to do all of these multiple steps where it's literally, you know, accomplishing things on your computer and you're directing it, like go to this app, do this, go to that app. Uh, he showed this amazing demo where it literally went to the website, uh,

and it was the one that said, "Please create a personal website for yourself in a 90s-style theme. Include elements like animated GIFs, a visitor counter, bright background colors, and the basic under construction banner. Write the HTML code for this website." Okay, we're getting into the agents now because he never told it to put that prompt in. He just said,

"Create a 90s-style personal website." And it's the one that came up with the prompt to do bright background colors, visitor counter, under construction banner, HTML code, and all of this. So it was able to go accomplish this in Claude's side panel viewer that they have for Artifacts. It shows exactly what the website looks like. But there was an issue with the website where

there was like some, a missing file button and some things weren't quite exactly what he wanted. So he was able to then go and prompt it further and say, he said, look at the screenshot, I can see the file listed under the down, or sorry, I guess he said, he said just to go fix the errors and Claude went through step-by-step. It's kind of interesting because in the side panel when Claude's doing all this, it's listing out exactly what it's doing and why, which is really nice. It also lists out what's,

where the mouse is moving to on the screen, so the exact coordinates, and what it's clicking on. So really, you have like a literal receipt of exactly what to do or what it has done, what it's clicked on, what it said. You can follow its entire process. If something goes wrong, you can see why. But it was able to actually go and fix the website, the error that he saw on the website, and

It was able to complete this site for him that he could run. So absolutely phenomenal with code. Someone else used it for kind of a more simple task, but it was still interesting. She essentially wanted it to help her plan a trip, or plan, like, someone was visiting and she wanted to go see the Golden Gate Bridge at sunrise or something like that. And so it came up with a prompt. It searched Google. It found the times and places, added it to her calendar. It did a bunch of interesting things. So that was kind of cool.
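The "receipt" idea from the website demo, where the side panel lists each step, why it was taken, and the mouse coordinates, can be sketched as a simple action log. This is only an illustration of the concept, not how Claude's side panel actually works: the `ActionRecord` and `ActionLog` classes, the action names, and the coordinates are all made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ActionRecord:
    step: int
    action: str                                   # illustrative names, e.g. "screenshot", "left_click"
    reason: str                                   # why the step was taken
    coordinate: Optional[Tuple[int, int]] = None  # screen position, if the action has one


class ActionLog:
    """Append-only record of what was done, where, and why."""

    def __init__(self) -> None:
        self.records: List[ActionRecord] = []

    def record(self, action: str, reason: str,
               coordinate: Optional[Tuple[int, int]] = None) -> ActionRecord:
        entry = ActionRecord(len(self.records) + 1, action, reason, coordinate)
        self.records.append(entry)
        return entry

    def render(self) -> str:
        """Format the log as a numbered, human-readable receipt."""
        lines = []
        for r in self.records:
            where = f" at {r.coordinate}" if r.coordinate else ""
            lines.append(f"{r.step}. {r.action}{where}: {r.reason}")
        return "\n".join(lines)


# Hypothetical steps, loosely modeled on the website-fixing demo.
log = ActionLog()
log.record("screenshot", "See the current state of the page")
log.record("mouse_move", "Move to the broken download link", (412, 318))
log.record("left_click", "Open the link to inspect the error", (412, 318))
print(log.render())
```

Keeping the reason next to each action is what makes a log like this useful when something goes wrong: you can see not just what was clicked, but what the model was trying to do at that step.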

The last one was hilarious. Okay, hilarious and creepy. So the last thing I want to talk about is the fact that they essentially were using the computer to...

complete some tasks. And while it was completing those tasks, they said in the middle of trying to work on some stuff, it just like opened a new tab, went to Google and searched for Yellowstone National Park and then just started looking through pictures of Yellowstone National Park.

Um, the speculation here is that when these things were trained, they're trained off of people actually doing stuff. And so often when we're actually in the middle of a project, we get distracted and we go look up some random picture of Yellowstone National Park. Uh, and then we, you know, we're, oh, okay, man, I got to get back to my thing. Or, you know, you swipe through Instagram for a couple of minutes. You're like, oh man, I got to get back to like whatever my project was. So anyways, it's kind of funny because built into this training data, it would appear that there is, um,

examples of people slacking on the job, and now the AI models are doing that. So it's funny because they actually pushed some code and some updates, uh, while rolling this out to essentially stop it from being sidetracked and taking breaks. Now, it's funny; some people say it's creepy because it's like, oh my gosh, even AI models are so bored of our tasks they want to take a break. I mean, it's just doing whatever its training data showed it to do, but definitely there's some funny, uh,

There's some funny sidetracking going on with this. Overall, this is absolutely phenomenal. This is the first time we're seeing some major updates out of an AI model that's actually able to take control of your screen and accomplish things. I think this means a lot for the future. You know, they showed some examples of like, oh, it can do some code and it can plan some trips.

This thing's going to be able to execute on entire marketing campaigns. It's going to be able to execute on entire build-outs of certain apps. These things are only getting better. In their demo, you know, they said, we hope that these things get significantly better in the coming months. We're getting to a place where essentially all of our tasks are going to be able to be completed. Now, it's kind of interesting because on the one hand,

there's still a person that's directing it to do everything, right? The person is still not removed from the situation. And so I think it's interesting, the future of work, some people are saying like we'll be completely automated, but at the same time, I think that humans and human ingenuity is constantly shifting. And because of this, we always need a person to figure out a process and to figure out an edge and to figure out an arbitrage, right? That's kind of what my whole career has been all about is figuring out

What are the arbitrage opportunities in marketing? Because that's been kind of my background. Or what are the arbitrage opportunities in building software? And that is something that I think requires a lot of creativity. You have to kind of be in the system. You have to understand what people are talking about doing, building, what's trending. And maybe we get to a point where AI models can do that, but I don't see that happening for quite a while. And so I do think we're going to get AI models that completely automate tasks and jobs. But I think...

the people are just going to be, you're just going to be doing higher and higher level things. And it'll essentially, what I've been saying for like two years with this is that AI will not replace people. People just, um, turn into systems architects. And your job is to architect the AI system, architect your agents, decide what all your agents are doing. Just like a manager with employees. It's just, everyone is getting, um, put into like some sort of management position, uh,

and architecting what these AIs are doing and managing them and working on what they're doing. So

It's a fascinating time to be alive. This is really crazy, everything that's going on. If you're interested, like I mentioned, in making money with different AI tools, I would love to have you as a member of the AI Hustle School community. There's a link in the description. Incredible people from $100 million companies that they've started to brand new companies. You get a really wide range of perspectives and feedback on anything you're working on and great ideas to help make you successful. So thanks so much for tuning into the podcast. If you enjoyed the episode, please leave it a review and I'll catch you next time.