We're sunsetting PodQuest on 2025-07-28. Thank you for your support!

EP 536: Agentic AI - The risks and how to tackle them responsibly

2025/5/30

Everyday AI Podcast – An AI and ChatGPT Podcast

#generative ai#ai entrepreneurship challenges#biotechnology and neuroscience#ai research#emotional intelligence in the workplace People

Sarah Bird

Topics

@Sarah Bird : 随着生成式AI和代理AI的快速发展，以及多代理AI的出现，负责任的AI面临着新的挑战。我的团队致力于识别新兴AI系统中的风险，并开发工具和技术来测试、缓解这些风险，并使其他人能够成功地做到这一点。人们对负责任的AI的意识和参与度显著提高，这与几年前的情况大不相同。以前，客户认为在AI旅程的早期阶段无需考虑负责任的AI，但现在他们首先询问负责任的AI。媒体和组织在提高人们对AI风险的认识方面发挥了重要作用。我没有预料到这个领域在成熟度和理解力方面会增长这么多。代理AI之所以快速发展，是因为它更完整地实现了AI技术的承诺，能够执行任务，减轻人们的思考负担。代理AI能够代表用户执行任务，这意味着出错的可能性增加，潜在的影响也更大。由于代理AI在没有人工干预的情况下工作更长时间，因此失去了人工监督这一重要缓解措施。需要调整系统来管理、保护和管理这些新的实体，就像管理用户、应用程序和设备一样。代理AI带来了新的风险，例如偏离任务或意外泄露敏感数据。目前实践中的多代理系统将任务分解为子任务，使每个代理都能擅长于较小的任务。这种方法允许对每个代理进行专门测试，并设置护栏以确保它们仅执行特定任务。我对这种多代理模式感到非常兴奋，我们认为人们可以利用它取得成功。当管理更复杂的系统时，需要将系统分解为组件，以便仍然管理每个单独的代理。Foundry observability提供了一个代理监控系统，可以查看代理是否偏离任务，以及是否难以找到合适的工具。需要在单个代理级别保持可见性，以确保安全实践，例如最小权限访问。即使是多代理系统，也应该像管理单个代理一样，对每个实体进行管理并设置护栏。以前，人类在内循环中执行小任务并进行检查，但现在人类更多地转移到外循环中。用户需要更多的测试和监控工具，以便在部署前确保代理系统正常工作，并在代理偏离任务时进行干预。Foundry observability中的监控和护栏可以检测代理是否偏离任务，是否理解用户意图，以及选择工具的准确性，并在出现问题时提醒管理员或用户。人机协作仍然需要，只是人类干预的阶段发生了变化，更多地是在开发前进行测试，以及在部署后进行监控和管理。需要测试组件和系统，并构建测试范例，但不同之处在于需要测试不同的行为和风险类型。我的工作是构建测试系统，以便人们可以测试新的风险类型，例如系统是否理解用户意图，是否容易受到提示注入攻击，以及是否产生版权材料。根据不同的应用场景，需要进行特定的测试，以确保系统适合其用途。许多组织没有意识到应该进行多少测试，因此在最后阶段才开始投资测试，以建立信任。我们在构建AI系统时，应该从系统应该做什么开始，并在系统开发的每个迭代过程中共同开发测试，以评估不同的风险和系统质量。在生成式AI时代，负责任的AI的重点是系统是否产生有害内容，用户是否可以越狱，以及是否意外产生版权材料。代理本质上是部署在系统中的新实体，因此需要像保护和管理用户一样，保护和管理AI。重点更多地放在如何像保护和管理其他事物一样保护和管理AI，而不仅仅是AI的新风险。我们发布了一个新的Entra代理ID，以便可以在系统中跟踪代理，就像其他任何事物一样，并确保将此治理和安全平面与开发人员正在做的事情联系起来。随着代理系统的能力越来越强，负责监控和观察这些代理的人也需要改变他们的技能。我们需要更多的人工智能界面创新，以便人类能够在代理世界中进行有意义的干预。Microsoft Research发布了一个名为Magentic UI的研究系统，用于实验用户与代理交互的不同界面。我们需要更周到地考虑如何让人类以他们能够理解的方式进行表达，并以他们能够做到的方式进行管理，同时仍然发挥他们的作用。我们可以提供工具来弥合AI系统正在做的事情与用户或管理员想要指定的内容之间的差距，从而帮助使这些界面感觉自然，并使AI和人类协同工作。国际安全报告中提出的分类方法分为三种风险类别：故障、滥用和系统性风险。故障是指AI系统做了一些不应该做的事情，例如产生有害内容、混淆并偏离任务或意外泄露敏感数据。滥用包括因不理解AI系统而误用，以及故意滥用这些系统。通过教育和周到的用户界面来解决因不理解AI系统而造成的误用，并使用护栏、监控和防御措施来解决故意滥用。系统性风险是指与AI相关的更广泛的社会和经济影响，例如劳动力准备不足。解决系统性风险需要政策、教育和技能提升计划，而解决故障则需要更多的技术手段。激励人们尝试新工具至关重要，领导力和文化因素在激励人们和让他们兴奋地尝试新工具方面起着重要作用。教育至关重要，工具在某些方面效果很好，但在所有方面效果都不好。在团队内部分享经验，学习哪些模式有效，哪些模式无效，并共同学习。在技术更加成熟并且我们拥有更多标准模式时，更容易说这是你如何使用它的，去做吧。必须睁大眼睛进入，了解风险，理解风险，并适当地选择用例和事物。必须进行投资，不要在不投资测试的情况下就发布这些系统。制定一个有意的负责任的AI计划，这是我们看到的最有效的组织。

Deep Dive

Shownotes Transcript

Translations:

中文

This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Responsible AI used to be much more straightforward.

I don't think it was ever simple necessarily, but with the rate of change when it comes to generative AI and everything we've seen from big tech companies everywhere with agentic AI and not even that multi agentic AI, I think it changes responsible AI

Right. Because it used to be, hey, one human goes in, talks to an AI chat bot, and you could probably more accurately understand what guardrails that organizations need to put in place in order for this thing to work. But what about now when we talk about agentic AI? How does that change things?

The ethics, the governance, the responsibility that we as business leaders need to have in order to make this thing work the right way. And then when it comes to multi-agentic AI, when agents are talking among themselves, divvying tasks up and executing on our behalf, how does that change things? These are big questions. I don't have all the answers, but today I have a fantastic guest who does. All right.

I'm excited for this conversation. I hope you are too. What's going on, y'all? My name is Jordan Wilson and welcome to Everyday AI. This is your daily live stream podcast and free daily newsletter, helping us all not just keep up with AI, but how we can use it to get ahead, to grow our companies and our careers. So if that sounds like what you're trying to do, it starts

right here with this live stream podcast. But where you need to go is our newsletter. We're going to be recapping the most important insights from today's conversation. So make sure you go sign up for that free daily newsletter at youreverydayai.com. We'll also have everything else that's happening in the world of AI. So most days before we get the show started, we go over the AI news. Not going to do that today. So you can just go make sure to check out the newsletter for that.

All right, enough of me. Let's bring on our guest for today. This is one I'm very excited for. Multi-agentic systems and how we do it responsibly. Believe it or not, it's something I think about probably every day. And I have so many questions and maybe you do too. And I'm happy to have someone now that can help us answer those. So live stream audience, please help me. Welcome to the show.

Sarah Bird, the Chief Product Officer of Responsible AI at Microsoft. Sarah, thank you so much for joining the Everyday AI Show. Thanks for having me. I'm so happy to be here. All right. I'm excited. But before we dive into this topic, could you please just let everyone know, like, what the heck do you do as the Chief Product Officer of Responsible AI at Microsoft? It seems like a gigantic job, right, in terms of what it could cover. But what do you do?

Yeah, you know, it involves doing a lot of different things, but at the core, we look at kind of risk we see emerging in new AI systems and then figure out how do we actually go address those risks? What does it take to test them? What does it take to mitigate them? And then how do we make it easy for everyone to do that? And so my team builds tools and technologies once we figure out those patterns that allows everyone to do this successfully.

I kind of talked about it a little bit here in the beginning about how

AI has changed so much, obviously, right, over the last couple of years. But how has your role changed, right, from maybe five years ago when we're first getting glimpses of generative AI technology to co-pilots been out now for almost three years, I think, right? How has your role changed in the products that Microsoft has been building around this technology? How drastically has it changed the last couple of years?

Yeah, I think, you know, in some ways it's pretty similar. We're trying to figure out how to make sure that AI matches our principles, that we can build it responsibly, that people can use it responsibly. But I think the big thing that's changed is people's awareness

awareness of how important this is and level of engagement. So I feel like before generative AI really took off, I was working in this space and I would meet with Microsoft's customers and share what we were doing. And they're like, it's so great that Microsoft is doing this, but we're really early in our AI journey. So we have to get a lot more sophisticated before we even think about responsible AI. And now the first thing that people do is ask about responsible AI before they even get started with AI, which is excellent. And so I really did not expect the

the field to grow so much in maturity and understanding. And actually, I credit media and organizations such as yourself who are helping get the message out there so that people understand the risk and why this is important. But it's a, you know, it's a big change.

Yeah, I think, you know, risk and change are, you know, two topics that are on any business leader's mind right now. Right. And hey, as a reminder to our live stream audience, if you have any questions for Sarah, now's a great time to get them in. But, you know, let's even just look at

what was just announced, right? What Microsoft just announced at its Build conference, which last week, which seems like so long ago now. One thing in particular is the agentic AI and everything around it. It just seems like it's everywhere now within the Microsoft ecosystem and within Copilot. But how has even just the growth of agents, how has that changed what responsible AI even means?

Yeah, I think the thing that's amazing about agents and why I think we're seeing so much growth is they really are, I think, a more complete fulfillment of the promise of this technology. I don't want a system that I just chat with. I want a system that's going to go and complete tasks for me so that I don't have to think about it. And so it's not surprising that we're seeing this huge excitement and people really starting to get value from these systems. But the challenge is,

That now if an agent is actually able to go and do tasks on your behalf, then there's more that can go wrong because it can actually take an action. And that can be bigger surface area that can be higher implications because of the action it's taking.

And we've also lost kind of one of our most important mitigations, which is having the human just directly in the loop having oversight, because now you're going to have agents working for longer periods of time without a human in the loop. And so it really kind of changes the game. And there's a couple of ways where we're thinking about how we go and address this, which is first, agents are...

a little bit like an application, a little bit like a user, but not exactly like either. And so we need to adapt our systems to manage these new entities and secure and govern them in the way that we do users and applications and devices and all sorts of things today. But we also need to address those new types of risks that agents bring in terms of being able to go off task or

accidentally leak sensitive data. And so it's a pretty exciting time, I think, with this new technology coming out and the potential of it, but also pretty fun to think about how we really do this well.

And maybe help our viewers and listeners here better even understand what this means, right? Maybe if you could break it down. So specifically these multi-agentic systems, right? It just sounds like we just add like another buzzword, right? Like every couple of quarters, it's AI, then it's agentic AI, then it's multi-agentic AI, right? With orchestration. But like what

the heck does that mean? So whether it's a co-pilot studio, I think this is also maybe in the Azure AI foundry, but how does that actually work? Multi-agentic orchestration.

Yeah, I think, you know, there's a lot of exciting visions where agents are just inventing things and talking to each other. And you have these really crazy multi-agent systems. I think what we're seeing in practice right now is something much simpler, but still extremely powerful. And I think much easier to deal with from a responsible point of view, which is that people are really using multi-agent systems to break down.

their task into a bunch of subtasks. And what's really great about that is that you can make each individual agent really good at the smaller tasks that it

is doing. And so you can test it specifically to do that task. We can put guardrails around it that ensure that it's only doing that task. And then you can have them coordinate and work together to complete a bigger picture. But every single thing is, you know, a component like in an assembly line doing what it needs to do. And so I think that actually I'm really excited about this multi-agent pattern is something that we think people can be really successful with.

I think the opportunity and the upside of multi-agent orchestration is pretty, pretty obvious, right? How does the risk change, right? When you're not just working with one agent, right? And you kind of mentioned how human in the loop changes, but how does the risk change when you're working with a series or you're orchestrating multiple agents versus just working one-on-one with a single agent? Yeah.

yeah you know i think probably the biggest thing is exactly that you're going to have a more complex system that you're trying to govern and so you do need to break it into these components so that you're still governing each individual agent and not just the system as a whole because you still need visibility into what's happening in there on one of the areas that we released at build was foundry observability and this is exactly giving you uh

a monitoring system for agents so you can see, did the agent go off task? Is it struggling to find the right tool for the job and all that? And so we still need visibility at the individual agent level. You don't want to look at just the multi-agent system and the boundaries, or it's going to be much more difficult to debug it. It's going to be difficult to ensure that you're doing important security practices like least

privileged access and everything. And so even though they're multi-agent systems and they're combining together, I don't think that that has to look different than a single agent working with a human, right? You can have different types of entities in the system and you want to make sure that you're governing each of them and you're having guardrails around each of them.

And, you know, haven't even started to talk about the underlying models. Right. So, you know, this is another thing that, you know, it seems like everything's kind of happening at once because, you know, now these models that are powering the agents can reason and they can plan ahead. Right. And then when you combine that with this multi-agentic formation, you like, again, just the, the, the

The potential and the challenges are just jumping out. But, you know, one other thing that, you know, I kind of heard you say there is being able to work for longer, right? So I think we've always been trained, right? At least, you know, in the first like year or two of generative AI, it's like, okay, I sit, I talk with an AI, I look, I see what it sends back. But

But now it might be many minutes or multiple hours in the very near future where these agents are going and doing work. So how does that change? You talked about the observability and in Foundry observability, but how does it change kind of the human role when now these agents are going to be working longer and deeper together for a much longer duration?

Yeah, the way we think about it is, you know, in the previous era, we had humans really in the inner loop. And it's like you did a small task, and then the human checked, and you did a small task, and the human checked. And so what happens now is we're really moving humans more to the outer loop. And that can still be extremely powerful, but the tools that users need to use then are a little bit different. And so, for example, you

you want to test your agent system a lot more before you deploy it. So you know that it's really working well, that it works well on the task that you're expecting. But then you also need to be able to monitor it. So if it's going off task, then a human can come in and intervene. And so that's some of the new technologies that my team has been developing are specific monitoring inside of foundry observability or guardrails that look and say, okay,

how well is the agent doing at staying on task? How well is the agent doing in understanding the user intent? How accurate is it in picking the right tool for the job? And if any of those seem to be not performing well, our system in Foundry is going to detect that issue and go and alert the human, either the human administrator or the human user, depending on the application and what

makes sense there. And so then the human knows that they should come in and intervene. And they may want to come in and intervene for a specific task, or they may want to come in and intervene and say, oh, I need to make adjustments to my system overall so that it's doing the series of tasks well. And so we still have to build these same human-in-the-loop

It's just where the human goes in the loop is different now. And it's much more in the pre-development setting, what you actually care about and appropriate testing and in the post-deployment monitoring and administration of the system.

Yeah. And I love how you described it there, kind of the inner loop versus the outer loop and how even observability is changing a lot. But one other thing that you just talked about there, Sarah, was testing, right? Which, you know, I think unfortunately some organizations at times get

can glaze over that part in my experience, because when you get these new capabilities, it almost like it's like getting a new toy, right? In Christmas. It's like, you know, if you get a Super Nintendo in 1992, the last thing you're doing is reading the manual. You're putting the game in and you're playing it. How should organizations be testing, you know, just multi-agentic systems? It seems like a Herculean, you know, task.

Yeah, you know, we're going to test the same way that we are testing anything, which is you test the components and then you test the system, right? And you build up testing paradigm. I think what's different is that what we need to test for are different behaviors and different types of risk. And so a lot of what we've been building and the part of my job about making it easy for everyone to do this is building testing systems inside of Foundry that

people can build on top of and people can use to test for these new types of risks. So I mentioned some of the categories that are specific of did the system understand the user intent, right? That's a test that we can run. But also, is the system vulnerable to new types of prompt injection attacks or

Is the system producing copyright material? These are all different things that you want to test for in your application, and you're going to have application specific things you want to test for. If you're doing this in a financial setting or a healthcare setting, there's going to be specific

test that you want to run to see is this fit for purpose. And so we've built the testing platform and many built-in evaluators to make it easy for people to test for this, but also for the ability for them to customize so that they're really getting at what's important for their application so that they can trust the system. But I completely agree with your point that one of the things we've seen is most organizations just

don't realize how much they should be testing right now in this space and so uh they get to the end and they're like okay we're ready to ship this thing we're so excited and then someone points out it does something that they don't quite like and they're how do we know it's not doing this all over the place how do we know that this is ready to go and then that's when they start going and investing in testing and so you know we hear from a lot of customers exactly that that they uh that the

thing that is sort of delaying them into getting into production is building that trust. And, you know, some of that comes through testing. And so one of the things that we are trying to do and, you know, I try to do is educate people earlier because you should start this like when we're building an AI system, we start with what is the system supposed to do? And we build the testing

Right alongside with the development of the system. So we don't wait till the final last mile and then test and find out we have an issue. We're co-developing, you know, looking at the different risks, looking at the quality of the system in every single iteration. And so, you know, I hope that other organizations do that as well. But we're all on a learning journey on how to do this well.

So a lot of these, you know, capabilities that we've been talking about here when it comes to, you know, multi-agentic orchestration, you know, they're fairly new for the general public, at least, right? But, you know, I'm curious, what have you learned or maybe what were you even surprised by on the responsible AI side, you know, as you've been building out

these systems, which I'm sure you've been testing them internally for many months or multiple years. But maybe specifically when it comes to responsible AI and agentic AI in your own internal testing, what has been maybe the biggest surprise or learning that you think would be helpful for business leaders to understand?

Are you still running in circles trying to figure out how to actually grow your business with AI? Maybe your company has been tinkering with large language models for a year or more, but can't really get traction to find ROI on Gen AI. Hey, this is Jordan Wilson, host of this very podcast.

Companies like Adobe, Microsoft, and NVIDIA have partnered with us because they trust our expertise in educating the masses around generative AI to get ahead. And some of the most innovative companies in the country hire us to help with their AI strategy and to train hundreds of their employees on how to use Gen AI. So whether you're looking for chat GPT training for thousands,

or just need help building your front-end AI strategy, you can partner with us too, just like some of the biggest companies in the world do. Go to youreverydayai.com slash partner to get in contact with our team, or you can just click on the partner section of our website. We'll help you stop running in those AI circles and help get your team ahead and build a straight path to ROI on Gen AI. You know, I think that when we, like,

when we switched, when like generative AI started and we were in the era of the chatbots, right? Um, I think that a lot of the focus and responsibly I was just, is this system producing harmful content? Can my user jailbreak it? Did I accidentally produce copyright material? And so a lot of the energy for us was about developing guardrails and testing for these kinds of new AI specific risks we're seeing. I think, um, when it comes to agents, uh,

Agents are, as I was saying earlier, basically a new entity that you're deploying in your system. And people are pretty excited about this. Our Work Trends Index, for example, showed that I think 81% of employers are looking at deploying agents alongside their workforce in the next 18 months, right? And that's a really different paradigm if you're starting to have users in your system and you have agents and you have applications. And so, yeah.

Once you have a new entity, they're saying, okay, we've already figured out how to govern users, right? We have, for example, Entra, which gives every user an ID and helps you have access control on that. And we have Defender that's monitoring your systems and making sure that, you know, you don't see threats coming in on your devices. And, you know, the first question people start asking is, well,

how do we secure and govern agents in that same way? And so a lot more focus actually on not just the novel risks that we see with AI, but just being able to secure and govern AI like any other thing. And it sounds really basic, but I think there was just

much less energy in that before agents came along. And so a lot of what we released at Build, for example, is a new Entra agent ID so that agents can be tracked in your system just like anything else and making sure that we're connecting this governance and security plane with what the developer is doing so that when I build a developer,

when I build an agent in Foundry or in Copilot, it just has an identity already attached. So I've done the right thing for my organization and my organization can govern it the way it needs to. And so I don't think, you know,

Maybe 12 months ago, I expected that I would be spending as much time kind of learning about how all of these traditional sort of security patterns work today. But that's where we're at. And I think that's one of the most important things with agents. Yeah. And a kind of related question here from Cecilia that I think maybe a lot of people don't

are thinking. So she's asking here on LinkedIn, does this human in the loop model create a different level of users with higher skill levels to understand the hallucinations and derail it? Yeah, that's a great question. Like as the agentic systems become more capable, how do the human in the loops need to maybe change their skill set in order to better monitor and observe these agents?

Yeah, I love this question because I think that's exactly right. If you're in a different point in the loop, you are doing a different job, right? And so we mentioned that one of the things that you want to look at is testing. Well, testing is not...

looking at a single example and saying, did the system do the right thing? Yes, I'm going to approve this. You're often then looking at aggregates and looking at numbers overall and saying, okay, you know, if 99.8% of the time it does the right thing, is that good enough for my task? And so you are making decisions with like different type of information. And

And so, you know, I think the answer to the question is we're still figuring it out. And this is a place where I'd actually love to see a lot more innovation. And it is this human AI interface and how we design it for the world of agents where humans are farther out in the loop. And so actually one of the things we released at Build coming out of Microsoft Research is a system that is called the Magentic UI.

And it's a research system for people to play with and experiment with different interfaces for users to interact with agents, basically. And so you can try different interaction patterns for all of us to learn which ones are really working, what is the best way for the human to intervene in a way that's meaningful for them, for the skill set that they have.

And this isn't a new problem, though, because one of the things that's exciting about this, especially this recent wave of technology, is it's so much better at coding. And so you have people now who can essentially complete coding tasks who are maybe not experts at coding and reintegrate.

We had, you know, when we, inside of Microsoft, you know, teams come forward and say, we built this great thing. It's for people that can't code and it's going to code for them. But if there's a bug, they're just going to find the bug. And you're like, how are they going to find the bug? The whole thing is that they don't code. And so we have to be more thoughtful about how can the human speak a, you know, and a way that they have the ability to and govern in the way they have the ability to and still kind of play their part.

So one of the things is that's where AI is a really helpful tool for this. So one of the things we've built, for example, is a system that looks at

does it understand the user intent and then alerts you if it seems to be confused about what you want. And so you can focus on just really specifying your intent and our system's going to check if the agent is confused. And so you don't have to go look at that. And so where we can provide tools that bridge the gap between what the AI system is doing and what the user or the administrator wants to specify, then we can help make those interfaces actually feel natural and AI and humans work together. Yeah.

Speaking of humans and AI working together, I think a big part of responsible AI is being able to understand and maybe even categorize the different risks that businesses are facing. Sarah, could you walk us through what are the big, you know, maybe risk categories that we need to understand when it comes to proper agentic AI implementation and just doing it responsibly?

Yeah, so I really like, and we've been using these at Microsoft, the categorization that came out in the international safety report that came out at the AI conference.

Action Summit in Paris. And it has three categories. So the first is malfunctions. And that is the AI system doing something that it's not supposed to be doing. And that could, for example, it be producing harmful content, or that could be it getting confused and going off task.

It could be that it's leaking sensitive data accidentally, right? And those are some of the big ones we see with agents. It's vulnerable to prompt injection attacks, right? Those are all types of malfunctions. The next category is misuse.

And you can see kind of two types of misuse. I might misuse and use an AI system because I don't understand what it does. I don't understand if it's appropriate for my task. And that we address through things like education and thoughtful user interfaces so that people really understand what the AI system does and what it doesn't do.

Then of course, unfortunately, we live in a world where people are also going to intentionally misuse these systems. And there, you know, we look at where we can have guardrails and monitoring and defenses and traditional security approaches for that. And the last risk and something that's very top of mind for me is the systemic risk with AI. And so, for example, with agents,

I mentioned that people are going to deploy these alongside their workforce. And that's really exciting because people are going to get to focus on much more interesting tasks and have agents do the very undifferentiated tasks in their work. But that is a different way of working. And so preparing the workforce to actually be ready for this new skill set and

collaborate with these tools, that's, you know, some systemic type of risk that we need to go and address. And so when we think about responsibilidad, we have to look holistically across all of these, but it's pretty different tools that we're using to solve each of these risks with malfunctions being much more technical. And then with systemic risk, we're looking much more at policy and education and upskilling programs and a very different type of work to address those risks.

That's one of my favorite topics to just think about is, you know, this concept of, you know, getting the workforce ready and upskilling, right? Preparing for the future. How should business leaders be doing that?

Because it's one thing I even struggle with because even agency, it seems like is changing so much, right? As these models, you know, now they can, they can think and they can plan and they can reason on top of, you know, you know, working with each other. And a lot of times, you know, business leaders are like, this is what I've spent.

you know, 20 or 30 years doing, right? Like this is my agency, right? This critical thinking, right? So how do we need to get the workforce ready for working hand in hand with agentic AI? Yeah. So first I think it's motivating people to want to do it. I mean, as you said, if you've been doing your job the same way for 20 years, you're

Depending on your personality, you might not want to just go randomly pick up new tools. You might not even know that new tools are available. And so, you know, important part is the leadership and the cultural element of incentivizing people and making people excited to try new tools. Now, I personally have found that AI is.

is a huge boost in my work. Most of the systems I told you that we were developing to address these different risks, those are AI powered. Those are all things we couldn't have done five years ago without generative AI. And so my job and what we're able to do and build is totally changed because of AI.

And so I'm very excited to pick up the latest and greatest new AI thing, but we need kind of everyone to have that moment where they see how it changes their job in a good way. And then that makes you want to try more. And so one is definitely incentivizing it. Number two, I think is education.

The tools work well for some things, they don't work well for everything. And so one of the things that we do even within my own team at Microsoft is have learning sessions where people share, look, I built an agent that did this and isn't it cool and it worked really well. And I've tried this and I'm struggling to get it to work. And so having people learn from each other

about the patterns of what's working and what's not, and really having this be a shared learning journey and not something that just everyone is on in their own. And so I think those are some of the important parts of this. And as the technology gets more mature and we have sort of more standard patterns, then it will be easier to say, this is how you use it, go do that. But in the earlier days right now, it's a lot of also getting people to experiment and find the things that are gonna work best for them in their job.

I think that's a great way to look at this as we all grapple with what this means. But, you know, Sarah, as we wrap up the show here, I mean, we've covered a lot when it comes to responsible AI, right? We've talked about new capabilities with this multi-agentic orchestration, the increased risk and responsibility and opportunity. And also we dove in a little bit on the future of human agency and

preparing the workforce for a more agentic future. But as we wrap here, what is maybe the one most important takeaway that you have for business leaders when it comes to understanding the risk of agentic AI and doing it responsibly?

Yeah, I am. You know, I think you have to go into it eyes open, like you need to know that there are risks and understand the risk and pick use cases and things appropriately. And you have to invest, like the idea of you're just going to go ship these systems without investing in testing. I just really don't recommend that. And so, you know, really going in with an intentional plan for responsible aid. That's the organizations that we see the most being the most effective.

I love that. Being intentional, investing, and testing. I think that's a great way to end today's show. I love it. So thank you so much, Sarah, for your time and for helping us all understand a little bit better the risk and how to tackle them responsibly when it comes to agentic AI. Thank you so much for your time and joining the Everyday AI Show. Yeah, thanks for having me.

All right, y'all, that was a lot. We just got so much in like so many insights just dumped on our heads. If you didn't catch it all, don't worry. We're going to be recapping it all in today's newsletter. So if you haven't already, please go to your everyday AI dot com. Sign up for the free daily newsletter. Thank you for tuning in. We'll see you back for more everyday AI. Thanks, y'all.

And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.

EP 536: Agentic AI - The risks and how to tackle them responsibly 31:45 Share

Everyday AI Podcast – An AI and ChatGPT Podcast

Deep Dive

Shownotes Transcript

We're sunsetting PodQuest on 2025-07-28. Thank you for your support!

EP 536: Agentic AI - The risks and how to tackle them responsibly