
AI and Safety: How Responsible Tech Leaders Build Trustworthy Systems (National Safety Month Special)

2025/6/26

AI and the Future of Work

People
Ben Kus
Eric Siegel
Navindra Yadav
Silvio Savarese
Topics
Silvio Savarese: As Chief Scientist at Salesforce, I believe building AI that is safe for users is essential. That means our AI must comply with trusted AI principles, ensuring accuracy, safety, transparency, and sustainability. We never use customer data to train models, and we use technical safeguards to prevent data leaks and bias. In the enterprise setting, we need to build more specialized, smaller models so we can better control the data and the output, reducing hallucinations, toxicity, and bias. We should also recognize that AI should play a supporting role for humans rather than replace human decision-making.

Ben Kus: As CTO of Box, I believe permission control is critical in enterprise applications. AI has to respect different access permissions; you cannot expect the model to understand permissions on its own, so they require special handling. We need to use AI safely so it does not become the biggest source of data leakage. At the same time, AI should not make final decisions directly; people should make those decisions. While using AI to boost productivity, we must stay responsible about the current state of the technology. Our culture is to do no harm and to take our customers' trust seriously. Even if a demo looks cool, keep testing it to find the boundaries where the AI fails.

Eric Siegel: My ethical concerns are in the predictive space, particularly discriminatory decisions that touch on civil rights. I believe AI and machine learning models are perfectly designed to replicate human bias. By quantifying and spotlighting bias, we can see a quantified reflection of the injustices of today's world. By transparently recording a model's activity, we can adjust for inequities and implement social justice in how models are deployed. Adjusting for differences in false positive rates is analogous to affirmative action.

Navindra Yadav: As CEO of Theom, we protect data wherever it lives. The technical challenge we face is moving along with the data into each data store without installing agents or anything of that sort. We need to do data protection cost-efficiently, with all data staying in the customer's environment. Our analysis automatically classifies data and determines its criticality. We also estimate the value of the data by referencing what it trades for on the dark web. We use NLP to look at the context of the data so we can label it more accurately. Our goal is to reduce false negatives and to keep improving the product's precision and recall.


Chapters
A brief introduction to the episode, highlighting the focus on AI safety during National Safety Month and introducing the four guests who share their insights.
  • Episode focuses on AI safety during National Safety Month.
  • Features four experts discussing AI safety in various contexts.

Transcript


This is Dan Turchin from AI and the Future of Work. Welcome to this month's highlight episode featuring four great previous guests discussing AI safety. As you know, we're using ElevenLabs to digitize my voice for these special episodes. The Real Me approved the content and, of course, approved the digital twin. Let me know what you think in the comments. ♪

Welcome to this special National Safety Month edition of AI and the Future of Work, observed every June. National Safety Month is a reminder that creating safe workplaces is a shared responsibility. For those of us building AI systems, it's a moment to reflect on how technology can reduce risk, not add to it. You'll hear insights from four amazing past guests: Silvio Savarese, Navindra Yadav, Ben Kus, and Eric Siegel.

Each of them brings a unique perspective on how AI is helping humanize work and protect employees and their data. Let's start with how large organizations are approaching AI safety at scale.

And our first guest is Silvio Savarese, Executive Vice President and Chief Scientist at Salesforce. He leads AI research focused on building trusted systems that prioritize privacy, security, and responsible innovation. Listen as he explains how Salesforce structures its research efforts to make sure AI development stays aligned with core safety principles, even in highly complex environments.

One of the other things that we've taken into account when we're designing AI for humans is what could go wrong that might have unintended implications and potentially ways that could harm humans. Sometimes we talk about, you know, the ethics of AI and

being very cognizant of what could go wrong and how it might impact the end users. What's your perspective, or even your team's perspective, on what it means to exercise AI responsibly? Yeah, in short, it means that we want to build

an AI that is safe for users to consume. So this is the short answer. The longer answer is that we want to make sure the AI is compliant with what we define as the trusted AI principles. And this is actually work that we are doing in collaboration with our partners on the ethics team, with Paula Goldman and other collaborators at Salesforce,

where we really work closely to understand what the most important principles are for guiding a safe AI, a trusted AI. Some of these principles include, first of all, accuracy: delivering verifiable results that balance the accuracy, precision,

and recall of these models. This is very important, and it is especially important for the enterprise, because at the enterprise level you cannot afford mistakes. If you want to generate an email that advertises a certain product, you don't want to state wrong information about that product, right? It has to be factually correct. This is very important. The second thing is safety: making

every effort to mitigate biases, toxicity, and any kind of harmful output. This requires a lot of work behind the scenes, conducting bias, explainability, and robustness assessments. And that is also where red teaming is very important in this kind of phase.

It's also very important to protect the privacy of any information that could potentially be exposed during the process of generating content. This is particularly true when it comes to generative AI, because we don't know exactly what kind of content will be produced, and the data that we inject into the training process absolutely cannot contain any private information about users or customers.

There's also an aspect around transparency. When collecting data to train and evaluate models, we need to respect data provenance and ensure that we have the right consent to use the data. So with customers, you know, we definitely make sure that, first of all,

the default is that we don't use any customer data for training or for building models. You know, Marc Benioff has been saying that customer data is not our product. This is something that we take very seriously. There are situations where we do have pilots with customers; in those cases customers might opt in to sharing some of their data, but we make sure that this data is never used for training models, because we never know how a generative model can spit out

what we call regurgitation. It's not a very appetizing word, but it actually makes the concept clear: regurgitating the prior data back into the output of a generative AI.

Also, we care about empowerment: recognizing cases where AI should play a supporting role to humans, augmenting humans, helping humans in situations where additional help, additional support is needed. And finally, sustainability. This also happens to be one of the values of Salesforce: we strive to build models which are not only accurate, but also have an opportunity to reduce or contain the carbon footprint. Great.
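
Silvio's accuracy principle is about balancing accuracy, precision, and recall rather than trusting any single number. As a quick illustration of those metrics only (not of any Salesforce tooling), here is a minimal Python sketch computed over an invented confusion matrix, for example a check of whether generated product claims are factually correct.

```python
# Toy illustration of the accuracy/precision/recall balance mentioned above,
# computed from a made-up confusion matrix for a binary check
# (e.g. "is this generated product claim factually correct?").

def metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    precision = tp / (tp + fp) if tp + fp else 0.0   # of flagged items, how many were right
    recall = tp / (tp + fn) if tp + fn else 0.0      # of true items, how many we caught
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy, "f1": f1}

if __name__ == "__main__":
    # 40 correct detections, 10 false alarms, 5 misses, 45 correct rejections (invented numbers)
    print(metrics(tp=40, fp=10, fn=5, tn=45))
    # A model can look accurate overall while precision or recall lags,
    # which is why the principle calls for balancing all three.
```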

Completely unsolicited compliment, but Salesforce is way ahead when it comes to AI customer safety, a really good model. I'd encourage, and we can put a link to this in the show notes, but the Einstein Trust Layer lays out a really good, really mature framework for how to think about these things. So I like the comment you made about Benioff and customer data not being our product.

But through Einstein, you do let customers, I don't know if it's through RAG techniques or fine-tuning a model, et cetera, but you do let them introduce their own data that they own, which obviously can't then be shared with other tenants. My question is, one of the challenges that AI-first vendors have these days is that there's a lot of credible finger pointing that can happen when something goes wrong.

The customer says, Salesforce, I trusted you to not let the model whatever, hallucinate, introduce bias, et cetera. And the vendor, certainly not pointing a finger at Salesforce, but you can credibly say, we own the platform, we own the algorithms. We told you how to use it, but you introduced the data that had the encoded bias in it.

Everybody's pointing a finger, and yet ultimately it's the end user, the customer, et cetera, who ends up suffering the consequences of these biased outcomes. How do you think about who owns ultimate responsibility when there's plausible deniability all around the table?

Yeah, this is an excellent point. But I think it's important to make a clear distinction between the consumer space and the enterprise space, to actually create a separation between these two aspects, although, of course, there are some areas of overlap between them. The thing is, most of the large language models that have been deployed and have been very popular so far have been developed and

dedicated to consumer-space applications: for instance, companion AI, or acting as an assistant. Let me write an essay, let me find this kind of information on the web, give me a brief summary. So this is helping consumers perform certain tasks, making them more productive, more effective, more efficient.

In this case, large language models need to absorb a lot of information, because in order to span from one task to the other, in order to span from literature to Shakespeare to science,

to how to grow a bonsai, you need to have all possible knowledge on this planet. And typically, this knowledge comes from the internet; that's what those large language models are trained on. Some of that internet data might be subject to copyright laws, which leads to legal issues. Some might contain biases. Some might contain toxicity. So at the end of the day, it's very difficult to control what the output will be when all of that data gets fed into the models.

In the enterprise space, the needs are different, right? You have very use-case-specific situations. The use cases are very specific. The tasks are specific. The domain can be specific, right? You operate in a CRM, you operate in financial services, you operate in healthcare. So you don't have to include all possible domains in one single model. There's an opportunity for building models which are much more specialized and, in fact, smaller than those huge,

you know, gigantic large language models. And when you build smaller models, you also have more control over the data that you use in training. You still want those conversational capabilities, which are important because that's what makes this generative AI effective in practice, this ability to converse with AI, which is the new breakthrough we have seen over the past years. But at the same time, the type of output that we are expecting from the models can be

aligned with a specific task. And in this case, you can use much smaller models and train them on data that is much more controlled. This allows us to reduce hallucination, reduce toxicity, reduce biases, and align those models with the customer's expectations. When it comes to privacy, that's actually a different story. We never use any customer data for training models.

The way the Trust Layer works in the Einstein 1 platform is that all the customer information sits in Data Cloud, which is our own data platform, our infrastructure platform. And when there is a prompt, when there is a need for injecting some information, personal information, private information, this information comes from Data Cloud and is added to the prompt.

And through grounding, through RAG, through these other techniques, it allows us to enrich the prompt,

and then this gets fed into the model. And the model has zero retention, so it doesn't retain any information, any data that goes into the model. It's more like a faucet: all the water goes through the faucet, and the only thing the faucet does is regulate the flow of the water, but the faucet doesn't keep the water, right? So all the water goes through. The same thing here: through these policies we've established with external vendors, or with our own models,

the data doesn't sit in the model and isn't used for training. So there's no risk that the data we add to the prompt gets used in training. This allows us to preserve privacy and confidentiality. And eventually, after the model produces an output, there's another layer that checks for toxicity, for biases, for hallucinations. There are some very important approaches and methods that allow us to assess the confidence

of the model, to say, "Okay, I think that this is the answer, but I'm not sure to what degree. In this case, I need extra human validation to assess the quality of the output." So these are some of the steps that we're taking to ensure that we mitigate those issues as much as possible.
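
For readers who want a more concrete picture of the grounding-with-zero-retention pattern Silvio describes, here is a minimal Python sketch: customer context is fetched at request time, added to the prompt, sent to a model configured not to retain its inputs, and the output is screened before anyone sees it. All function and class names here are illustrative assumptions for this sketch, not the Einstein Trust Layer API.

```python
# Illustrative sketch of a "grounding with zero retention" pipeline, in the spirit
# of the interview. All names are hypothetical; this is not Salesforce's implementation.

from dataclasses import dataclass

@dataclass
class ScreenedResponse:
    text: str
    confidence: float
    needs_human_review: bool

def fetch_customer_context(customer_id: str, query: str) -> str:
    """Pull only the customer records relevant to this query (RAG-style grounding).
    In a real system this would hit a data platform; here it is a stub."""
    return f"[records for customer {customer_id} relevant to: {query}]"

def call_zero_retention_model(prompt: str) -> str:
    """Send the grounded prompt to a model configured not to retain or train on inputs.
    The 'faucet' analogy: data flows through, nothing is stored. Stubbed here."""
    return f"Draft answer based on: {prompt[:60]}..."

def screen_output(text: str) -> ScreenedResponse:
    """Post-generation checks for toxicity, bias, or low confidence before the user sees it."""
    blocklist = ("guaranteed returns", "medical diagnosis")   # toy policy terms
    flagged = any(term in text.lower() for term in blocklist)
    confidence = 0.55 if flagged else 0.92                    # placeholder scoring
    return ScreenedResponse(text=text, confidence=confidence,
                            needs_human_review=confidence < 0.7)

def answer(customer_id: str, query: str) -> ScreenedResponse:
    context = fetch_customer_context(customer_id, query)      # grounding, not training
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    draft = call_zero_retention_model(prompt)                 # no data retained by the model
    return screen_output(draft)                               # toxicity/bias/confidence gate

if __name__ == "__main__":
    result = answer("acme-001", "Summarize this account's open support cases.")
    print(result.text)
    print("Route to human review:", result.needs_human_review)
```

The design choice to notice is that customer data only ever flows through the prompt path, never into training, and that low-confidence or flagged outputs get routed to a human.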

Silvio emphasized the importance of trust and transparency in enterprise AI. Data leaks rank high among National Safety Month hazards. Let's learn from an expert how AI is preventing them. Navindra Yadav is the CEO and founder of Theom. His team uses AI to protect sensitive data where it lives. In this clip, he explains how natural language processing helps detect personal information, assess its value, and block unauthorized access. What's the hardest problem that you and the team have solved so far?

The first and hardest thing is that this is a paradigm change, because now we're protecting data wherever the data lives. So how do you move along with the data into each one of these data stores without scaring people about, oh, there's now some engine running inside my

data stores. So the whole technical challenge of implementing that without people installing agents or anything of that sort, without installing proxies, was the biggest technical challenge. And that's where a lot of the IP that Theom has built resides.

Second thing was how to do this in a cost efficient manner, because now we are crawling all of the customer's data inside their environment. Everything lives with the customer. So think of this as Google built for an enterprise, but runs inside the enterprise because it has to crawl and figure out, okay, what's happening to the data and so on and so forth.

Yeah, that was the hardest technical challenge. The first one being run it wherever the data is so that we don't get bypassed. And the second part is about maintaining the cost structure and no data leaves the customer's environment. I referred in the intro to assigning a criticality score to data. Is that something that the customer manually does or is it semi-automated or fully automated?

It's fully automated. What these crawlers essentially do, or what this analysis essentially does, is classify the data: it breaks the data down into small tokens and figures out, okay, what this token maps to. For example, a really simple toy example would be: this is a credit card number, or this is a social security number, or this is something specific to the healthcare industry or the financial industry, or this is a trading pattern, and so on and so forth.

Depending upon the industry vertical, we automatically default what the criticality of this item should be. Further, what Theom also puts an estimate around is the dollar value of that data. So we look at the dark web

and see what attackers are actually willing to pay for this data, what this data is actually trading for, and so on and so forth. That forms the seed of this pricing information for Theom. So Theom now starts attaching dollar values to different assets. This is the lower estimate that we put on the data, and we tell the customers, okay, this is

at the minimum what we believe your data is worth in this particular store. And again, customers can hit our APIs with more custom information, and then we can price it more appropriately for their environment. So really, the criticality of the data and the financial value of the data is what we expose to people.

Why this becomes interesting is that now CIOs know what cyber insurance they should be taking out for different stores. Are they underinsured? Are they overinsured? It leads to all those interesting conversations from there.
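
To make the pipeline Navindra describes more tangible, here is a small Python sketch of the general pattern: detect entity types in a record, assign a default criticality by industry vertical, and attach a rough lower-bound dollar value. The detection rules, criticality table, and prices are invented placeholders for illustration, not Theom's implementation or real dark-web figures.

```python
# Toy sketch of automated data classification and criticality scoring, in the spirit
# of the approach described above. Rules, defaults, and dollar values are made up.

import re

ENTITY_PATTERNS = {
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

# Default criticality (0-10) per entity type and industry vertical (placeholder numbers).
CRITICALITY = {
    ("credit_card", "financial"): 9,
    ("credit_card", "healthcare"): 8,
    ("ssn", "financial"): 9,
    ("ssn", "healthcare"): 9,
    ("email", "financial"): 4,
    ("email", "healthcare"): 5,
}

# Rough per-record value estimates in USD (placeholder "seed prices", not real figures).
VALUE_ESTIMATE = {"credit_card": 10.0, "ssn": 4.0, "email": 0.1}

def classify_record(text: str, vertical: str):
    """Return detected entity types with a criticality score and a lower-bound value."""
    findings = []
    for entity, pattern in ENTITY_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            findings.append({
                "entity": entity,
                "count": len(matches),
                "criticality": CRITICALITY.get((entity, vertical), 3),
                "estimated_value_usd": len(matches) * VALUE_ESTIMATE[entity],
            })
    return findings

if __name__ == "__main__":
    sample = "Customer jane@example.com paid with 4111 1111 1111 1111; SSN 123-45-6789."
    for finding in classify_record(sample, vertical="financial"):
        print(finding)
```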

That single use case of being able to identify, or essentially label, data based on a regex or something like that, to figure out if it's a credit card number or a birthday or an address or a phone number, things like that: whole companies are doing just that.

Talk us through, I mean, I've spent some time on this problem. I mean, it's actually a very hard problem to solve when you look at spanning geographies and, you know, there's all sorts of complicated parameters. How mature is that ability to, before we even get to the dark web and the value of the data, how mature is the technology being able to accurately assign a label to it?

Great point. We're using the same technology, or similar technology, that ChatGPT and others use: BERT, transformers, and so on and so forth. We use these besides regex and the like. Regex is cheap and good for some sorts of problems. What we do is bring NLP into the equation: we try to look at the context of the data. For example, someone says,

"the last four digits of my SSN are" or "the last four digits of my social are" or things like that. So now when you look at that context, you see four numbers, and you know this is a social security number because the context of the string, of this unstructured data, said something about that. So we let our NLP engines detect things like that. Wherever they pick things up, it all depends on your training sets and how well they've been crafted. That's one.

There are other things we do: whenever we find something like the classic examples of credit cards or ISINs, we do a Luhn check on top of it. So in this industry there are CRC checkers; we do semantic analysis of the data and then we say, okay,

this truly is a valid credit card number because it has passed this particular Luhn check, and so on and so forth, and we feed that in. So we do a whole bunch of mitigation actions before we even report any piece of data as a certain entity. All of this is essentially to reduce false positives. Are there zero false positives? No, no such technology exists, as far as I know.

It's always this game of improving your training set so that you come up with better results. Whenever a human tells us, okay, you've gone and mislabeled this, reinforcement learning happens in our engines, and we take it from there. That's really where we are at this point in time. So the cost of a false positive would probably be that your financial assessment of the value of the data may be wrong. That's correct. Because you're going to end up assigning, right?

I was thinking more about the cost of a false negative. If there is, let's say, financial data that you're not identifying, that could potentially be much more damaging. Bingo, that's exactly what I wanted to say. Sorry I interrupted you. But yes, a false negative is my worry. When they have something

really sensitive and we did not classify it as really sensitive, that's the risk. So we put quite a few guardrails around this, and that's where the effort goes, to try to reduce our false negatives as well as false positives. It's all about precision and recall. Again, it's a constant game of improving the product and taking it from there.
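
To make the Luhn-check step concrete, here is a minimal Python sketch: a regex finds candidate card-like numbers, and only candidates that pass the Luhn checksum are reported, which is the kind of semantic validation Navindra describes for cutting false positives. The pattern and reporting logic are simplified assumptions for illustration.

```python
# Minimal illustration of validating a regex hit with a Luhn checksum before
# labeling it a credit card number -- the kind of semantic check described above.

import re

CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_valid(number: str) -> bool:
    """Standard Luhn checksum: double every second digit from the right,
    subtract 9 if the doubled digit exceeds 9, and require sum % 10 == 0."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:          # every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_credit_cards(text: str):
    """Report only candidates that pass the Luhn check, reducing false positives."""
    hits = []
    for match in CANDIDATE.finditer(text):
        raw = match.group()
        if luhn_valid(raw):
            hits.append(raw.strip())
    return hits

if __name__ == "__main__":
    text = ("Order ref 1234 5678 9012 3456 is not a card, "
            "but 4111 1111 1111 1111 is a valid test number.")
    print(find_credit_cards(text))   # only the Luhn-valid number is reported
```

A false negative here would be a real card number the regex never surfaces at all, which is why the detection layer, and not just the validation layer, is what has to keep improving.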

We've explored how AI can protect data and prevent system failures. But what happens when the models themselves introduce risk? Meet Eric Siegel, founder of Predictive Analytics World and author of the AI Playbook. He spent his career examining the unintended consequences of AI in high-stakes decisions. In this segment, he unpacks why predictive models often mirror human bias and why transparency and fairness must be built in from day one.

Learn how AI guardrails turn risk into resilience. My ethical concerns are in the predictive area, which has been the main focus of my career. And it has to do with civil rights. Is the model making discriminatory decisions? Because all of these operational decisions it's driving can potentially be very consequential. It's not just optimizing business and number crunching.

A model that makes predictions on a per-individual basis is essentially a kind of mini policy that's then either directly informing, maybe being the deciding factor for, or in some cases, literally, as I discussed, automating decisions. And if it's making decisions that discriminate based on a protected class, like race, religion, or ethnicity, or is

showing a certain kind of bias, and people call it machine bias, and there's a famous ProPublica article called Machine Bias that's very often cited, where the costly errors, and the model's going to make errors, just like humans do. It's just hopefully it'll make fewer of them. It's not a magic crystal ball.

But if the errors that are costly, such as those that end up keeping somebody in jail longer or denying them credit or housing, happen with higher proportional frequency for certain underprivileged protected groups or racial groups, for example...

than for others, you know, from one group to the other, that's a really serious civil liberties concern. So I've actually taken a lot of interest in that and written a bunch of op-eds in, like, the San Francisco Chronicle and the Scientific American blog. There are like 12 op-eds; if anyone wants to read my work, I put them all together at civilrightsdata.com. We'll link to that in the show notes. One of the things I frequently say is that AI and machine learning models

are perfectly designed to replicate human bias. And a lot of eyebrows go up when I say that from people who either are naive about the potential harm that AI models can create or just choose to turn a blind eye.

What's your perspective on, as a data scientist, how do we mitigate the impact of having human bias seep into these models, given that, to your point, they can make

predictions that have significant implications on, for example, who dies, who lives, who gets loans, who gets educated, things that really have a dramatic impact on people's lives. That's the central question in what I'm obsessed about in this area. And I think what you said, that they're perfectly designed to replicate human bias, is spot on. But actually, guess what? It's kind of good news. And follow me on this.

By putting it on the table, quantifying it, and putting a spotlight on exactly what it is, what you're seeing is a quantification of the injustices that persist because of the world as it stands now, the historical inequities that put some groups in less advantageous positions than others. Now that manifests, for example, in the relative

rates of those kinds of costly mistakes that I was referring to. To put it into technical terms, what we're talking about is...

false positive rates, where it says, oh, this person's at high risk of committing a crime again if we release them from prison, or this person's at high risk of not paying back their loan. In the cases where those predictions are wrong, those are called false positives or false alarms. The question is whether the false positive rate is higher for one group than for the other. So by quantifying it, putting a spotlight on it, now, instead of it being a bunch of people doing it inside the

human head, and you can see my head on the video, I know we're not going to video here, but my head is the ultimate black box, right? We can't see into each other's minds or brains.

What we can do is quantify what happens when there's this transparency. And although models can be opaque to a certain degree, we do have transparency because we can record their activity. So this provides us the opportunity to adjust for those inequities and actually implement social justice in the way models are deployed. Now,

Unfortunately, that does open a can of worms. And essentially, the way you would adjust for the difference in false positive rates would be analogous to affirmative action.

I'm pro-affirmative action. I see those differences in false positive rates as a manifestation of historical injustice. I see the opportunity to adjust for it, but doing so means reintroducing the protected class. So then, overall, the model might be colorblind in the sense of not directly making a decision based on race, for example,

but you do need to reintroduce the protected category, just the same as with affirmative action, to adjust for it. So it doesn't eliminate today's polarized kind of political debate around the topic, but at least it helps us look more concretely at what the issue is and what we could potentially do about it.
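
Here is a small Python sketch of the metric Eric keeps returning to: the false positive rate computed separately for each group, whose gap is the disparity at issue. The records and group labels are fabricated for illustration, and the adjustment itself (for example, per-group decision thresholds) is only hinted at in a comment.

```python
# Toy illustration of measuring the disparity described above: the false positive
# rate (false alarms among actual negatives) computed per group. Data is invented.

from collections import defaultdict

def false_positive_rates(records):
    """records: iterable of dicts with keys 'group', 'predicted_high_risk', 'actual_reoffended'."""
    fp = defaultdict(int)   # predicted high risk, did not reoffend
    tn = defaultdict(int)   # predicted low risk, did not reoffend
    for r in records:
        if not r["actual_reoffended"]:            # FPR is defined over actual negatives
            if r["predicted_high_risk"]:
                fp[r["group"]] += 1
            else:
                tn[r["group"]] += 1
    return {g: fp[g] / (fp[g] + tn[g])
            for g in set(fp) | set(tn) if fp[g] + tn[g] > 0}

if __name__ == "__main__":
    data = [
        {"group": "A", "predicted_high_risk": True,  "actual_reoffended": False},
        {"group": "A", "predicted_high_risk": False, "actual_reoffended": False},
        {"group": "A", "predicted_high_risk": False, "actual_reoffended": False},
        {"group": "B", "predicted_high_risk": True,  "actual_reoffended": False},
        {"group": "B", "predicted_high_risk": True,  "actual_reoffended": False},
        {"group": "B", "predicted_high_risk": False, "actual_reoffended": False},
    ]
    print(false_positive_rates(data))  # roughly {'A': 0.33, 'B': 0.67}: that gap is the disparity
    # Adjusting for it, e.g. with different decision thresholds per group, is the
    # "affirmative action"-like step discussed above.
```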

We've heard how predictive models can magnify human bias. Now let's look at what it means to build enterprise AI systems that are secure, compliant, and useful. Ben Kus is CTO at Box. In this segment, he shares how his team builds AI features that respect customer trust, avoid data leaks, and prioritize long-term value over hype. You'll hear why speed alone isn't enough. Real impact comes from finding failure points early and knowing when to slow down.

On this podcast, we talk a lot about the future of work in the context of what it means to practice AI responsibly. And in the seat that you sit in, I imagine you're constantly thinking about the art of the possible, but perhaps constrained.

by not just what we can do, but what we should do. What does responsible AI even mean to you? So for me, responsible AI is a big and very important term. But in terms of what it means for me and my job at Box, as sort of a major user of AI, one of the key things for us is to be mindful of where the Content Cloud lies. We have hundreds of thousands of customers. We have, you know,

hundreds of billions of files. We have hundreds of petabytes of data, and we're trusted to keep some of the most valuable data from these organizations and keep it safe, keep it secure, make sure it's used properly. And that's fundamental to everything that we do. And then the moment, though, that you add AI to it, you just get these really giant questions, these really difficult, like, how is this going to work? I think a lot of people who are doing not necessarily enterprise-focused apps, they sometimes miss this idea that permissions...

are really critical and sometimes very hard. So for instance, let's say you have AI and it has access to all of your employee data.

It has access to all of your financial data, including an upcoming earnings report. Let's say it has access to maybe an upcoming M&A deal you're doing. What's going to happen when somebody starts asking questions of the AI? We've seen people who go build their own thing, or maybe companies that are smaller in scope, and they sometimes just forget this really critical point: every single person in your enterprise has access to different things, and the AI needs to respect that. And that can be hard. You can't just train a model and expect it to figure that out.
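
As a concrete sketch of the pattern Ben is pointing at, enforcing access control in application code and filtering content by the requesting user's permissions before anything reaches the prompt, here is a minimal Python example. The permission model and names are invented for illustration; this is not Box's implementation.

```python
# Minimal sketch of permission-aware retrieval: documents are filtered by the
# requesting user's access rights *before* anything is put into the AI prompt.
# The model itself is never expected to enforce permissions. All names are illustrative.

from dataclasses import dataclass, field

@dataclass
class Document:
    doc_id: str
    text: str
    allowed_groups: set = field(default_factory=set)

@dataclass
class User:
    user_id: str
    groups: set

def documents_visible_to(user: User, documents: list) -> list:
    """Enforce access control in application code, not in the model."""
    return [d for d in documents if user.groups & d.allowed_groups]

def build_prompt(user: User, question: str, documents: list) -> str:
    visible = documents_visible_to(user, documents)     # permission gate happens here
    context = "\n".join(f"- {d.text}" for d in visible) or "- (no accessible documents)"
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

if __name__ == "__main__":
    docs = [
        Document("earnings-q3", "Draft Q3 earnings report ...", {"finance-leads"}),
        Document("handbook", "Employee handbook: PTO policy ...", {"all-employees"}),
    ]
    intern = User("intern-42", {"all-employees"})
    print(build_prompt(intern, "What is our PTO policy?", docs))
    # The earnings draft never enters the prompt because the intern lacks access.
```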

And so for us, when we're thinking about practicing AI responsibly, the first thing is to make sure that it is safe and secure and respects permissions. We use this term enterprise grade, and that's a huge part of it. In many cases, the enterprise customers we talk to are very excited about AI, and they're also very scared of AI

because of these kinds of challenges. It's so new that the technologies, and even the companies, are very different. And so you have to stay focused on this: you do not want your AI to become the biggest source of data leakage that you've ever experienced, and if you're not careful, that can happen. Now, on the responsibility side, one of the challenges is that,

as people are dealing with business-critical processes and thinking through, you know, the content that they have and the workflows inside of Box and elsewhere, we often give them recommendations about what to do or what not to do. Even though we go out of our way to really make sure the AI is not hallucinating and is giving you very accurate answers, it will sometimes be wrong. No one should ever guarantee you that the AI is always right, just as no person is always right. And

And so, when people are thinking about these kinds of decision-making processes, we recommend they don't use the AI to actually make the final decision on things.

And to us, a big part of our responsibility is to help train people on how to use our content and workflows. And we tell people, if something's important enough, a classic example would be: the AI is probably capable of looking at a resume, and if you give it a bunch of information about what you want from a candidate, it can probably tell you whether it thinks the person is a good fit.

But you shouldn't do that. You should have maybe the AI pull out some key facts for you, and then you should have the person make the decision. And somewhere along there is the line about being responsible with the current state of technology, while also getting the benefits from sort of the productivity and other sort of ways that you can work better using this new technology. As technologists, we're so accustomed to only asking what could go right as we get excited about new technologies. And I think that this is the first

time in decades when it's equally important that we ask what could go wrong.

And as the leader of a technology organization, I just wonder, you know, how do you think about the personal responsibility you have to create a culture where you may be perceived as putting your foot on the brake in terms of innovation, because you want to enforce, as part of the culture, the principle of doing the right thing, not just doing what you could do?

I think this is one of the benefits of being at a company that takes very seriously the trust our customers place in us. We hold some of the most important things in their companies, so we take that very seriously. And to me, the concern is a little bit less about whether we're going to do something irresponsible. That's very much in the culture. One of our key things is to not

do harm in any of these ways. It's just, you know, we're enterprise grade from that perspective. But the other part of it is that the new technologies are so innovative and so different that you sometimes just keep doing things without necessarily thinking through how this will end up. And to me, this is a common thing with AI: normally, when you're building stuff and you have new features or whatever, you have a hackathon, you're thinking of new ideas, and

It takes so long to get something that actually works and works well. And with AI, like generative AI, it's very fast.

you almost collapse the first 60% of a product development cycle into a day or two. And that, I think, throws people off sometimes, because then they believe, oh, it's so close to being done, let's just keep going and finish it. Instead, a lot of the work in making a really good AI-oriented product is around what happens next. What's the quality of it? What happens in the corner cases when something goes wrong? Is it actually giving you what you want? And so even though the demo looks cool, keep

keep trying it, try to get it to fail and explore where the failure boundaries are. And that's because generative AI is somewhat probabilistic. It has that sort of challenge that you have to manage. National Safety Month is a reminder that progress in AI isn't just about speed. It's about responsibility, whether it's protecting sensitive data, detecting bias or building resilient systems. Safety must be designed into the process of evaluating and deploying AI models.

If any of these ideas resonated with you, listen to the full episodes. Links are in today's show notes. And if you know anyone who needs to learn more about AI safety, pass this one along or let us know in the comments. Thanks for listening to this special edition of AI and the Future of Work. Until next time, stay safe and stay curious.