Today, we're airing an episode produced by our friends at the Modern CTO Podcast, who were kind enough to have me on recently as a guest. We talked about the rise of generative AI, what it means to be successful with technology, and some considerations for leaders to think about as they shepherd technology implementation efforts. Find the Modern CTO Podcast on Apple Podcasts, Spotify, or wherever you get your podcasts.
- Machine learning can detect fraudulent activity, but it may also mistakenly flag good behavior. Find out how one platform company manages these risks and trade-offs on today's episode. - I am Naba Banerjee from Airbnb, and you are listening to "Me, Myself, and AI." - Welcome to "Me, Myself, and AI," a podcast on artificial intelligence and business. Each episode, we introduce you to someone innovating with AI. I'm Sam Ransbotham, professor of analytics at Boston College.
I'm also the AI and business strategy guest editor at MIT Sloan Management Review.
And I'm Shervin Khodabandeh, senior partner with BCG and one of the leaders of our AI business. Together, MIT SMR and BCG have been researching and publishing on AI since 2017, interviewing hundreds of practitioners and surveying thousands of companies on what it takes to build, deploy, and scale AI capabilities and really transform the way organizations operate.
Today, Shervin and I are speaking with Naba Banerjee, Director of Trust Product and Operations at Airbnb. Naba, thanks for joining us. Thank you. A pleasure to be here. I think most people know Airbnb, but maybe give us a brief overview of the company and what you do.
Airbnb stands for a place that fosters connection and belonging. You know, millions and millions of hosts open up their homes to complete strangers. And you and I and our families get to travel to an exotic destination, and instead of staying at a hotel, you get to truly immerse yourself in the local culture.
And at times, if you're really lucky, you get to stay at the house of a host and actually experience hosting as it's meant to be, thereby creating and fostering connection and belonging. In a world that is increasingly becoming insular, where connection is becoming a rare thing, I am so glad to be part of a company that is trying to create more connection and belonging.
What my team, the trust and safety team at Airbnb, does spans the entire customer journey, right from the time you create an account on Airbnb. Throughout that journey, at times it is possible that something could go wrong. A fake account could get created, or your account could get taken over. The listing you're trying to stay at could be fake. The reviews you're looking at may not be exactly genuine.
Our job is to make sure that you can focus on your magical stay and the host can focus on being that perfect host. We try to anticipate some of these risks and minimize the chance of those kinds of issues happening. Using technology and data to enable that magical user journey is what my team does. I think we all hear headlines when something goes wrong. I mean, how much goes wrong in the course of the gazillions of transactions that you do?
That's a great question. In 2021, we had around 70 million trips happen on Airbnb, and less than 0.1% of those trips resulted in a host or a guest reporting a potential issue with the stay.
And when our team went in and investigated and looked at where real harm happened or somebody had to be removed from the platform, that number is even smaller. I know that even one bad incident is too many and should not be happening. But in the bigger scale of things, what you come back to realize is that the majority of people are actually good. 99.9% of people are just meaning to go about their daily business, enjoy a great stay, or be a great host.
But on that very rare occasion when something bad does happen, we want to make sure that we reduce the risk of it. If you really look at where a person is causing harm to another person, those incidents are a fraction of a fraction of a fraction. And when we talk about safety, we are also looking at things like a carbon monoxide leak, a slip and fall, or property damage where glass broke and you got a stain on your carpet.
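To make the scale concrete, here is some illustrative arithmetic using only the figures quoted above; the exact counts are made up, not Airbnb data.

```python
# Rough scale of the figures quoted above; illustrative arithmetic only.
trips = 70_000_000
report_rate = 0.001                     # "less than 0.1%" of trips
reports_upper_bound = trips * report_rate
print(f"{reports_upper_bound:,.0f}")    # 70,000 reports, at most
# Confirmed harm is "a fraction of a fraction of a fraction" of even that.
```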
What I'm hearing from what you're saying is your role is, number one, a good role. You're the protector of the good and the warder-offer of evil, right? I love that. You should be part of our branding team. This is a good thing, right? It's a very purposeful role. And the second part of it is you're talking about microscopically small incidents. Yeah, needle in a haystack. Right? And those are fun problems.
Tell us more about what happens under the hood. I would say it's three parts. One is we start with the why, which you framed really well: we are the protector of the good, trying to ward off the bad. We can never say that we stop everything bad; to Sam's point, you cannot do that. But how are we learning and getting better? That's where AI and ML come in: how can we use the power of data and technology to learn and get better? Every year, the risks don't stay the same, and the data keeps growing. The second part is, how do we use technology responsibly? You could go too blunt and try to block as much as possible, specifically to keep fraud and safety incidents down.
But then if the data that you're using to make these decisions has bias in it and is potentially creating unfairness, you could close your doors to a lot of good people who genuinely are just trying to have a good stay.
So that's using technology responsibly. And the third part is having self-governing mechanisms. We have our privacy team, our data privacy team, our InfoSec team, and our anti-discrimination teams, which are, I think, really interesting, almost like mirrors to the trust team, as well as other teams in Airbnb. Every time we are laser focused on a particular fraud vector, whether bot detection, account takeover detection, or party detection, they are constantly making sure that we are using the data in a safe way, in a transparent way, giving users control, and not introducing bias.
And finally, having ways to give people a path back. Based on the data available to us, the technology available to us, and the maturity of that technology, we will make our best possible decision. But if we get it wrong, can we create transparency and a process for users to appeal and get back on the platform when we do block or remove them? So that, again, we can learn and get better. That, at the highest level, is how my team works under the hood.
And what are some of the use cases? One of the use cases that we have talked about quite a bit, you know, I'll tell you a little story. When I joined the trust team in 2020, the world had just shut down. The pandemic was raging. And while the world was waiting for a vaccine, we went into lockdown mode. That was a blunt instrument, right? To prevent people from exposing themselves to situations that would make them unsafe.
Similarly, Airbnb became an unfortunate product-market fit for unauthorized house parties. When the hotels and bars shut down, that less than 0.1% of users who were looking for places to throw parties started renting out Airbnbs. And we realized it was causing a lot of pain to our communities of hosts and guests who use this platform, which is based on a foundation of trust.
So initially, we had to use some very blunt instruments. We first instituted our global party ban, saying that Airbnb does not allow any kind of party whatsoever. Second, we started looking at patterns and found that someone under 25 booking an entire home for just one or two nights, less than 50 or so miles away from their home, or on an account that was created yesterday: those all seemed to be signals that lead to parties. So we implemented a blunt rule called "under 25" and put that in place. Now, we had to be careful, because those rules don't exactly work globally, but that worked in North America, and we saw right away that there was a reduction in the number of unauthorized parties. But over time, just as it happens with rules, we saw that people started to game them. They would get an older friend to book the reservation for them; instead of one night, they would book three nights; and so on and so forth. And the impact to good users was so high that we said we need to move to machine learning. We need to get smarter. So we did our first pilot in Australia, where we segmented the country and compared areas that had this new party model we had built against areas that did not. Did we see a reduction? Again, this is like looking for a needle in a haystack, but we still saw 35% fewer parties in the areas that had the party model. And when we started experimenting in North America, we saw that it was as effective as the heuristic, with less impact on good users. That gave us the confidence to keep moving forward, because not doing anything was not an option. We had to do something and be smart about it. That was one of the use cases where we could go out and talk about it, not claiming that we stop all parties, but at least Airbnb is taking a strong stance.
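As a rough illustration of the kind of blunt heuristic described here, a minimal sketch in Python; the field names, schema, and cutoff are hypothetical stand-ins, not Airbnb's actual rule.

```python
from dataclasses import dataclass

# Hypothetical booking attributes; invented names, not Airbnb's schema.
@dataclass
class Booking:
    guest_age: int
    entire_home: bool
    nights: int
    miles_from_home: float
    account_age_days: int

def party_risk_rule(b: Booking) -> bool:
    """Blunt heuristic: flag when the signals described above co-occur."""
    signals = [
        b.guest_age < 25,          # young guest
        b.entire_home,             # booking an entire home
        b.nights <= 2,             # very short stay
        b.miles_from_home < 50,    # close to the guest's own home
        b.account_age_days <= 1,   # account created yesterday
    ]
    return sum(signals) >= 4       # illustrative cutoff

# Example: a last-minute local booking by a brand-new account gets flagged.
print(party_risk_rule(Booking(22, True, 1, 10.0, 0)))  # True
```

The ML model that replaced this kind of rule learns such interactions from data instead of hard-coding them, which is how it could match the rule's catch rate with less impact on good users.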
One common thread that I'm hearing in how you're describing the problem and the solution is a need for constant adaptability, ongoing learning, and using the right instrument for the right job. At times it's a rule; then maybe it's unsupervised learning; then there's human intervention. It does sound like quite a multidisciplinary effort, including the part where the rules change or the fraudsters' mode of operation changes. And you also have the issue of: Is it too blunt? Does it limit some unintended but good or benign behaviors? How does the human and machine collaboration or interaction work in these situations?
You summarized it really well, Shervin. It's not a one-and-done thing. You build it, and then you have to constantly learn and optimize, both from the kinds of decisions you're making and from how the world around you is evolving. And that is something humans are really good at. You do have to have the discipline of training your models with the latest data, with the right data, and making trade-off decisions about the optimal threshold at which you're going to say, OK, this is risky, and we are ready to act on that risk.
At Airbnb, we have been constantly on this journey of how to leverage humans in the loop. I have a trust operations team in addition to the product and planning teams. This operations team consists of full-time employees as well as agents globally. And whenever we build a new model, in the beginning we often have the model make a risk-threshold-based decision: this is the top 1%, this is the next tranche, this is the next tranche.
At times we will route that to human beings to create labels on top of it, "I agree with the model's decision" or "I don't agree with the model's decision," which then serves as training data back to the model. And when we have more confidence in the decision-making capability of the model, we move toward auto-decisioning. This party model, to your point, is right now completely auto-decisioned. But what ends up happening is, when we get an appeal back from a user saying, "I was incorrectly blocked," that serves as data coming directly from the customer. An agent then looks at that appeal and may say that, actually, yes, this decision was incorrect. If the agent approved, the customer's appeal was correct, and no incident happened, we know that was a false positive. But let's say the agent approved and then that customer went on to throw a party: that was a false negative. We are learning from that continuously, and we do it iteratively. First, we have the model route to a human. And again, there is no guarantee that the human is better at decision-making than the model, but we constantly measure the performance. Then we move toward more auto-decisioning and less human decisioning. And sometimes, even after the party model has made its decisions and people have gone about their merry way and booked the reservation, we will still have humans look at the top 1% riskiest reservations to try to correct anything they can.
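A minimal sketch of that routing-and-feedback loop; the threshold, function names, and labels are invented for illustration and are not Airbnb's implementation.

```python
# Hypothetical human-in-the-loop routing: invented names and thresholds.
def route(risk_score: float, auto_decisioning: bool,
          threshold: float = 0.99) -> str:
    """Send the riskiest tranche to auto-block or human review."""
    if risk_score < threshold:
        return "allow"
    return "auto_block" if auto_decisioning else "human_review"

def label_from_appeal(agent_overturned: bool, incident_followed: bool) -> str:
    """Turn an appeal outcome into a training label, as described above."""
    if agent_overturned and not incident_followed:
        return "false_positive"   # block was wrong; user really was good
    if agent_overturned and incident_followed:
        return "false_negative"   # appeal granted, but harm still occurred
    return "true_positive"        # block upheld by the agent

print(route(0.995, auto_decisioning=False))              # human_review
print(label_from_appeal(True, incident_followed=False))  # false_positive
```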
So it's about balancing when humans are the right ones to make the decision and when the machine should make the decision. It has been a hard, nonlinear journey that we have been on. You mentioned that at first you're maybe reviewing more cases and then learning from those cases.
But at the same time, these 70 million transactions are happening constantly. And, for example, when you gave the illustration of the hotel and bar shutdown at the beginning of the pandemic, that's a time when things are happening in real time
to individual hosts and customers. How do you balance that trade-off of needing to get something in place quickly while also needing to go through a process? That is probably the hardest thing I have dealt with in my entire career. Before this, I was managing all of product for samsclub.com, which is part of Walmart, and I thought my job was hard then. There, while we were serving customers and helping them check out, it still felt like we were on a growth path, and we could choose how fast or how slow we grew. I always had teams serving people who were trying to check out and couldn't, while we were building long-term architecture. But people's lives were not at stake, nor was people's money, with fraudsters using stolen credit cards and swindling the company out of millions of dollars. At Airbnb, there is a team that is trying to protect the good and ward off the bad.
What we have to get really good at is prioritization, which is also very hard to do in the world of trust and safety. And in many ways, the choices are what you said in the beginning, Sam. These kinds of economies are not risk-free. They cannot be risk-free. And the good collectively outweighs the evil. So then it becomes all about prioritization. But I have to say, we've talked to many people in roles analogous to yours in other organizations. You mentioned Sam's Club; others are using technology, digital, AI, and ML to drive a variety of use cases. I think yours stands out in the sense that
It is a continuous balancing act and the stakes are high.
And then you have this sort of thread of purpose and the humility of saying, well, we believe that we'll make mistakes and the system cannot be mistake-free. Yeah. But over time, things are getting better. Which brings me to my next question: How do you measure your effectiveness? Is it in terms of the number of incidents, or some measure of the severity and frequency of bad things happening?
I'm glad you asked this question, because I was naturally going to go toward this. At the highest level, the metrics that we measure ourselves on are the number of fraud incidents per million trips, the number of safety incidents per million trips, as well as good user impact. That's the balancing metric: to see how many good listings we blocked and how many good hosts and good guests we prevented from moving forward, to truly understand our false positives and false negatives. At the end of the day, it's good trips that we are looking at across the board. We are also looking at dollars, in terms of the fraud loss in dollars that is potentially happening.
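A minimal sketch of those top-line metrics; all counts below are invented purely for illustration, not real Airbnb figures.

```python
# Illustrative metric calculations; every number here is made up.
def per_million_trips(incidents: int, trips: int) -> float:
    return incidents / trips * 1_000_000

def good_user_impact(blocked: int, later_found_good: int) -> float:
    """Share of blocked users later judged good: a false-positive proxy."""
    return later_found_good / blocked if blocked else 0.0

print(per_million_trips(70, 70_000_000))   # 1.0 incident per million trips
print(good_user_impact(1_000, 120))        # 0.12
```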
We are looking at customer support tickets that are coming in, and at user NPS from anyone who encounters a friction thrown by our teams. So at a basic level, we are looking at these metrics. But one of the challenges I've run into for the first time in my career: it's not so much in the fraud world, where we can A-B test and are constantly iterating our models. When you do that, you are probably also preventing quite benign behaviors and intentions as well. But as you said, there is a point where human judgment needs to trump and say that this is something where even the risk to one person might be too high. And yes, we always talk about exploration and exploitation when it comes to machine learning and ongoing feedback loops. But there's always a cost to learning, right? When I send you a marketing message you may not like, the cost of learning is very minimal, so I will send you many of them and do all kinds of A/B testing. But when I do that kind of A-B testing on groups or populations where the stakes are higher, you might as well make the human judgment call of not doing it and rely on retrospective data. And I think you elucidated that point quite well for us. Yeah. One question that we ask ourselves is: What makes a human being trust another human being? What makes a human being trust Airbnb?
And we are realizing that we can run all kinds of models in the background, and you wouldn't even know whether you were in the top 1% or 2% risk threshold or marked as a good user.
But there is also a dialogue that needs to happen between Airbnb and you as a guest or a host, or between the host and the guest, that truly instills trust. Like when we ask someone, would you be comfortable sending your 17-year-old daughter to a stranger's house in Greece when she wants to backpack before joining college? You can probably tell this is a personal experience. I have a 17-year-old who's about to go to college.
And I realized that my first thought is no, not happening. And I run trust and safety at Airbnb. But then I think: well, what if I could talk to the host? What if she's going to stay with someone else in the house, and that person is a mom like me? Yes. If we as Airbnb just said, we are running all these models in the background, don't worry, your daughter is safe, I don't think that's going to fly. As a parent, you will want the information you need to see for yourself, even though your judgment may not be better than the machine's.
So we are also working on what it is that we need to say and do. And sometimes when we say things like Airbnb is not going to allow one-night stays that are booked at the last minute within the same neighborhood, I think it makes a lot of parents feel better, too, that, OK, at least this option is gone. You know, they will probably figure something else out.
And that message cannot be convoluted. That message has to be simple and clearly understood. In addition to telling people what not to do, we also need to encourage the good behaviors they should do: talk to the host, talk to the guest, ask questions. We launched a program for solo female travelers because we were starting to see a slightly higher rate of personal safety incidents in private rooms with solo female travelers, so we started encouraging them to find out: Is there going to be a lock on your room? Is that lock going to work? Will you have access to the bathroom just by yourself? Which spaces are shared, and which are not? So this was not a model that was stopping anything. This was simply an educational module that we published
in multiple languages, which was really helpful for our solo travelers. Part of my team's work is not just to build invisible defenses but also to create trust, and the perception of trust, upfront through education and messaging. Very well said. You mentioned you're a mother of five, and I have to say, I think you stand on very solid ground when it comes to telling your kids
what they can and cannot do, because you have so much data and so much experience. Versus me: when I say something, they're like, well, how would you know? You could say, well, trust me, I see 70 million transactions a year and I know what's going on. And growing. Yes. I'm happy to talk to your kids, by the way, anytime. Yes, that would be great. I'll take you up on that.
One thing about your measurements that I thought was interesting, maybe come back a little bit to that: You mentioned some positive metrics too. I think it's a little tempting, and even in this conversation we've done it, to gravitate toward the negative. That's the news problem in general: bad news sells papers. And in the course of this conversation, we've shifted toward that. But some of the metrics you mentioned, I thought, had a more positive ring to them, didn't they?
Yeah, absolutely. And this was an epiphany that we have had over the last few years, because any kind of machine learning model typically does well when there is a lot of data to learn from.
And with these kinds of cases, where out of 70-plus million trips, 0.1% of those result in even a report, and a fraction of a fraction of that results in an actual incident where someone gets removed, there is very little to learn from. As a result, these models take time to mature. Of course, with technology evolving, that is going to get better. But we realized that we have a lot more data about good user behavior. We have a lot more users who are actually coming in, booking their stay months in advance, probably checking in on time, leaving the property even better than they found it, communicating with the host and the guest, and leaving honest reviews. So if we can flip the switch, and while we continue to look for the anomalies and the trends and the bad actors, get really good at learning what good behavior looks like,
while making sure that we are also measuring for potential bias or discrimination and privacy compliance in how we collect and use this data, then that can be really powerful in informing when something does look risky.
For example, say an account is typically accessed from the US, and suddenly we see that IP now accessing from the Philippines. It feels like, oh, this is an account takeover. But if this is a good user who typically travels around the world quite a bit, maybe this is not an account takeover, and there is history there, from the good user behavior, for us to be smarter about what normal looks like and what an anomaly looks like: not broadly, but very specifically, based on the user segment.
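A minimal sketch of that idea, judging a login against the individual user's own history rather than a global rule; the function name, fields, and cutoff are hypothetical.

```python
# Hypothetical per-user anomaly check; invented names and cutoff.
def looks_like_takeover(login_country: str, home_country: str,
                        recent_countries: set[str]) -> bool:
    """Flag a login only when it falls outside this user's own pattern."""
    if login_country == home_country:
        return False
    frequent_traveler = len(recent_countries) >= 5   # illustrative cutoff
    known_location = login_country in recent_countries
    return not (known_location or frequent_traveler)

# A US-based globetrotter logging in from the Philippines is not flagged;
# the same login on a stay-at-home account would be.
print(looks_like_takeover("PH", "US", {"US", "FR", "JP", "TH", "AU"}))  # False
print(looks_like_takeover("PH", "US", {"US"}))                          # True
```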
So this is not new stuff, but I think we sometimes tend to obsess about learning what bad behavior looks like and getting good at detecting that, as opposed to complementing our picture of what an anomaly looks like with what normal behavior looks like. Yeah. And you have such a wealth of insights and information in a field that is really evolving. I mean, it's not like
a typical collaborative filtering situation, where users like you also bought this. I mean, there's so much more that goes on in terms of matching. This must be a very interesting and worthwhile problem to get your arms around. Yeah, and it requires us to get out of our trust and safety silo and work really closely with our search relevance and personalization teams.
Because our job, unlike my job when I worked in e-commerce, is not just to do the collaborative filtering, saying people like you bought this, so you should buy this, or you traveled to Paris, so maybe next you want to go to Rome. It's more like: you're traveling with a family this time; you have little kids; you probably want daycare, and we know that this host offers daycare, so this location might be great for you. Or for the same customer traveling alone, maybe we need to offer different recommendations. Having these trust and safety signals embedded into our search relevance algorithms can be really powerful in matching the right person with the right listing.
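As a sketch of what embedding a trust signal into a relevance ranking could look like; the weights, scores, and function are invented for illustration, not Airbnb's actual ranking.

```python
# Hypothetical blend of relevance and trust/safety scores; invented weights.
def rank_score(relevance: float, trust: float, w_trust: float = 0.3) -> float:
    """Blend a relevance score (0-1) with a trust/safety score (0-1)."""
    return (1 - w_trust) * relevance + w_trust * trust

listings = [("cabin", 0.92, 0.60), ("loft", 0.85, 0.95)]
ranked = sorted(listings, key=lambda x: rank_score(x[1], x[2]), reverse=True)
print([name for name, _, _ in ranked])  # ['loft', 'cabin']: trust reorders close matches
```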
Naba, you didn't start at Airbnb. Tell us a bit about how you got there. You referenced Walmart and some of your background, but how did you end up in your current role? I was the first female engineer in my family. My father took a chance on me and said, you are curious, you're a forever learner; do you want to do engineering? I had no idea what engineering was at that time, but I signed up anyway and went on to become the first female engineer in my family. I worked with Tata Consultancy Services and then went on to join Cognizant.
I came through Cognizant to the US and worked for some time at AAA. And I would just keep going to different projects and different jobs that would help me learn something completely new. Walmart happened around 2006, when I was asked to work on supply chain there, and if you want to learn supply chain, that's the company to go to.
While I was doing supply chain, every year I kept getting bigger and bigger responsibilities and projects, until I found myself leading almost all of the supply chain teams at Walmart, then going to Sam's Club, which is part of Walmart, to lead all of product: front end, cart and checkout, marketing, and pricing, which I had never done before. It gave me a chance to get a well-rounded experience of how the back end of a large retailer works as well as how the front-end customer experience works.
Right after Sam's Club, I was asked to lead Search. Around that time, I was starting to get really interested in a world that runs on machine learning and AI, and Search was a great landing ground for me. I did a course from MIT, interestingly, to learn about applications of AI and ML, and that gave me the courage to take on the Search job. And from there, Airbnb happened, after almost 13 years at Walmart.
I was so curious about the travel industry and this marketplace that was so different from anything I had done at Walmart. So the theme here is that I have always gravitated toward something that helps me leverage the skills I have, but then use my curiosity and learning to do something completely new and build myself up as I go along. That's kind of my background.
Naba, we have a section now where we ask you a series of rapid-fire questions; just tell us the first thing that comes to your mind. What is your proudest AI/ML moment?
You're asking me to choose between my different teams, which is career limiting for me. But I will say that the work we did on party detection and party risk reduction was groundbreaking; no one else in the industry did it the way we did. So I am really proud of that. But I am proud of all my teams, just for the record. That's a great answer. What worries you about AI?
What worries me is how powerful it is. We had a chance to see Sam Altman up close and in person when he came to Airbnb, and to see just how fast this technology is improving and how much capability it has. I worry about it falling into the wrong hands. I worry that while my team has this technology, the fraudsters have this technology too.
And so I worry that if today we are worried about fake IDs and fake spam messages being sent to our hosts and guests, how much more advanced is this technology going to get, and how much more difficult is it going to get for us to tell the good from the bad? But I think it will be AI to the rescue as well, in detecting fake AI. Your favorite activity that involves no technology? Painting with watercolors.
I love painting nature. So paper, water, colors, and just peace and tranquility, no technology: that is my favorite thing to do. And of course, hugging my kids and spending time with them, just listening about their day, trying to get them off their cell phones while they talk to me. But yes, those two. Very well said. The first career you wanted: What did you want to be when you grew up? I wanted to be a teacher. I come from a family of teachers, and
I feel so much joy when I see someone's eyes light up with the gift of knowledge. I am a forever learner; every job I've taken has been completely different from my previous job because I love to learn. So yes, that's what I wanted to be. And you probably do a fair amount of teaching in your current role anyway. I do a lot of mentoring. You've taught us quite a lot. Thank you. Thank you. I do do a lot of mentoring, because I feel that if someone else can see me do what I do, that will make them feel that they can do it too. If I can just do that, you know, that's my life's mission accomplished. What's your greatest wish for AI in the future? My greatest wish, for the world actually, is to not be so afraid, to give it a chance. Because I think sometimes our fear of the bad holds us back from embracing the good. There is so much wasted effort that goes into activities that
should be automated through AI. So many patients are not getting treatment; so many companies need help and so much funding to stand up basic things that could be done by AI; so many underdeveloped countries could gain so much advantage. I know that when it falls into the wrong hands, it can be used for bad, but the world has more good people than bad people. And I believe in the power of us using AI for good, using our collective goodness.
Well, I think everyone will resonate with your idea that there is more good in the world than bad. We do tend to hear mostly the negative stories, so it's refreshing to hear both how much attention you're paying to preventing those and some of the good stories your platform enables. Thank you for taking the time to talk with us today. We've enjoyed it. Thank you, Sam. Thank you, Shervin. My pleasure. Thanks for listening. Next time, we're joined by Zan Gilani, Principal Product Manager at Duolingo.
Please join us. Thanks for listening to "Me, Myself, and AI." We believe, like you, that the conversation about AI implementation doesn't start and stop with this podcast. That's why we've created a group on LinkedIn specifically for listeners like you. It's called AI for Leaders, and if you join us, you can chat with show creators and hosts, ask your own questions, share your insights, and gain access to valuable resources about AI implementation from MIT SMR and BCG. You can access it by visiting mitsmr.com/AIforLeaders. We'll put that link in the show notes, and we hope to see you there.