Welcome to the Future of UX, the podcast where we dive into the trends, challenges, and innovations shaping the future of design and technology. I'm Patricia Reiners, and in each episode we explore what's next in the world of UX so you can stay ahead of the curve. And today we are tackling a huge question: how do we measure a good AI experience?
AI is shaping everything: from hiring decisions to medical diagnoses, from content recommendations to self-driving cars. But here is the thing: we don't even have clear UX standards for AI yet.
And if we don't measure UX in AI products properly, we risk building products that are functional but maybe untrustworthy, accurate but unethical, powerful but maybe frustrating to use. In this episode, we will explore how traditional UX metrics fail in AI-driven products. We are also going to talk about the Amazon hiring AI scandal and how biased algorithms can create terrible UX. We'll talk about the problem of explainability, that is, why we often don't understand AI decisions.
And we will talk about whether we need a universal AI UX certification, like accessibility or performance ratings. I would say, let's dive right in. A few years ago, Amazon tried to use AI to automate hiring decisions, which sounds like a great idea, right? The goal was to find the best job candidates without human bias. But here's what happened.
The AI was trained on historical hiring data. And guess what? Historically, Amazon hired more men than women for tech jobs. So the AI learned that men were better candidates and automatically downgraded resumes that contained words like "women's chess club" or "female leadership team". The result: instead of eliminating bias, the AI amplified it.
And yeah, Amazon scrapped the project. But what's the UX lesson here? AI products don't just need to be technically accurate. They also need to be fair, transparent, and user-centered. We need better ways to evaluate AI experiences before they go live.
And the big question now is: how do we define and how do we measure the user experience of AI? UX that is not just functional, but also ethical and fair. To address these challenges, organizations have started developing AI performance benchmarks.
I'm going to talk a little bit about one example, the Stanford CRFM Transparency Index, which measures how well an AI model explains its decisions.
Because the big problem with AI is that when it recommends something or presents a result, it's very difficult, almost impossible, to understand how the AI came to that decision. And even if the AI were transparent and explained how it reached the decision, what you would see would basically be calculations: very complex and very difficult to understand.
So the AI benchmarks today look at fairness and bias: does the AI treat all users fairly? Performance, of course: is the AI fast and efficient? And explainability: do users understand why the AI made a decision? This is super important. But these benchmarks focus on more or less technical aspects, not so much on the user experience. An AI might be accurate but still feel unreliable or unhelpful to users. It's a little bit like evaluating a restaurant only based on the calorie count of the food, ignoring the taste, the service, and the atmosphere.
So do we need an AI UX score? Let's talk a little bit about why traditional UX metrics fail in AI products. First of all, usability: AI adapts and changes, making predictability a challenge. Then engagement: AI-driven platforms optimize for time spent. But is more time spent always better UX? When we think about social media or TikTok's addictive algorithm, it might be good for business to keep users on the platform as long as possible, but is this really a good user experience? And then task completion: AI may complete tasks, but sometimes in unexpected ways.
But does that mean the UX is good? Imagine an AI that perfectly identifies the most qualified job candidates. But what if it prefers candidates with certain names or backgrounds? Or what if it excludes non-traditional career paths? UX is not just about efficiency; it's about trust, fairness, and transparency. So how do we build trust in AI?
Imagine you're using an AI-powered medical diagnosis tool. You enter your symptoms and it says, yeah, you have a 65% chance of cancer or another serious illness. But it doesn't tell you why. No explanation, no reasoning. Would you trust it? Probably not. You might be a little bit worried, but you probably wouldn't trust it, right?
And here it gets really obvious why explainability matters. AI decisions can feel random if users don't understand them. There's also a really good takeaway for you if you are designing AI products: help the user understand why a certain decision was made, because a lack of transparency can lead to distrust even when the AI is correct.
And some AI models are so complex that even their creators don't fully understand them. With AI, you don't always simply get from A to B; sometimes you end up at D and you don't really know how that happened. So the big solution to that is user control: let users adjust preferences. Keep the user in the loop, basically.
As well as transparency, which we already talked about: show why the AI made a certain decision. And feedback loops: always ask for feedback. We know this from traditional digital products, but for AI it's even more important, because letting users correct the AI's mistakes helps the system learn and become better. A really good example here is Google's search algorithm. It's constantly changing.
But users have no control over it. So imagine if Google let users see why certain results were ranked higher. Wouldn't that improve trust? I would definitely think so. Although from the business perspective it might be a bit tricky, because Google also earns money with sponsored ads at the top, where it would of course be a tiny bit difficult to explain why something is shown to a user when it's an ad that companies paid for. So the really big question now is: should AI products have a UX certification? I'm thinking, what if AI products had to pass a UX certification before launching, just like accessibility or security compliance? Potential benefits could be
that the product is fair, transparent, and user-friendly, and that it hopefully creates trust between users and AI-driven systems. This would also force companies to think about AI UX from the start and to include UX designers in the whole process. But let's also think about the other side: the challenges. The first is a big question: who sets the standards for an AI UX score? Someone needs to come up with criteria, certain rules. We don't have that yet, but it might be helpful. Also, how do we measure ethics and transparency? This is still a bit challenging, but I think there could be solutions. And would it slow down innovation? Hmm.
Think of food safety regulations, for example. We don't just allow any food on the market; food needs to meet certain regulations, especially here in Europe, in Germany or in Switzerland, where I'm based. That covers what's used in production, antibiotics, all these kinds of things. So why should AI, why should digital products, be any different? Why shouldn't we have these kinds of safety regulations too? And would you trust AI more if it had a UX certification? Here in Europe, we actually do have some very, very strict rules when it comes to AI. For those who are based here, you've probably heard about it: the EU AI Act.
And this actually became law, I think, in August last year. It's basically the world's first big set of rules for artificial intelligence, only for the EU. It aims to make sure that AI is safe, ethical, and respects people's rights. This is not so much about the user experience, but it goes a little bit in that direction. I would say there's quite a controversy around this regulation, because on the one hand, it protects people's rights and prevents AI from being used in unfair or harmful ways. It also builds trust in AI: you have clear rules that help people feel a little bit safer using AI products. And it sets a global example; a little bit like the GDPR privacy laws, these AI rules could influence laws in other countries.
And it basically works like this: it categorizes AI systems into different risk levels. The first is banned AI, so unacceptable risk: AI systems that are dangerous to people's rights, like social scoring systems similar to China's. They are completely forbidden in the EU. Then we have high-risk AI: AI that is used in hiring, in healthcare, in finance, or in law enforcement. It must follow very strict rules to make sure it's safe and fair. And then we have limited-risk AI, which includes chatbots and generative AI like ChatGPT or Midjourney. They must be transparent, and people need to know they are interacting with AI.
So who must follow these rules? The law applies to AI developers, businesses, and users inside the EU. And even companies outside the EU must follow these rules if they sell AI products in Europe. I also mentioned that it's quite controversial, because it comes with some problems and concerns. As you've seen, there are a lot of restrictions, so some tech leaders worry that the rules are maybe a little bit too strict and will slow down AI innovation in Europe. The rules are also a little bit unclear: the law's broad definition of AI makes it very hard for companies to know which tools are affected. And it could especially hurt startups, because smaller AI companies might struggle with high compliance costs, making it harder for them to compete.
The AI Act, I think, from my perspective, is an important step toward making AI fair and responsible, but it could also make it so much harder for companies to develop and use AI in Europe. And yeah, I am not so sure if we will be able to really close the gap with the US and China and position Europe as a leader in AI innovation.
Although I think it's definitely a step in the right direction. Now, this law is not about UX, it's just about AI regulation, but UX could be a next step. And I'm thinking more globally here, not only about the EU. So, a little conclusion to summarize: measuring AI UX is one of the biggest challenges in design today, and traditional UX metrics definitely aren't enough.
AI needs transparency, it needs trust, it needs fairness baked into its design. And maybe the future includes a universal AI UX standard. But yeah, we are not there yet. I'm curious to hear your thoughts. What do you think? Should AI products be required to meet UX standards before launch? Let's discuss. Feel free to share this in the show notes.
And if you like this episode, feel free to rate it. Give me a five-star review. This helps me to find amazing people for the podcast that I can interview. This also helps me to produce the episodes. And yeah, this is just a nice support if you are listening to the episodes, if you're enjoying it. So thank you so much for listening. And I would say see you in the next episode.