
Less Algorithm, More Application: Lyft’s Craig Martell

2021/3/16

Me, Myself, and AI

People
Craig Martell
Sam Ransbotham
Topics
Craig Martell: In real-world applications, the importance of the algorithm itself is declining. What matters more now is how the algorithm fits into a larger engineering pipeline so that it effectively delivers on business goals. That involves engineering and operational concerns such as data cleanliness, data timeliness, and model latency. The algorithm itself may account for only 15% of the problem; the remaining 85% comes down to engineering and operational efficiency. Sam Ransbotham: With the advent of AutoML and packaged tools, the training of future data scientists should focus more on data application and engineering practice than on the algorithms themselves. Sam Ransbotham: AI is increasingly a business problem rather than a purely technical one, and it needs to be addressed from the perspective of business strategy and process improvement.

Chapters
The episode introduces the concept that as algorithms become commoditized, their importance diminishes, shifting focus to their application and integration into business processes.

Transcript


Today we're airing an episode produced by our friends at the Modern CTO Podcast, who were kind enough to have me on recently as a guest. We talked about the rise of generative AI, what it means to be successful with technology, and some considerations for leaders to think about as they shepherd technology implementation efforts. Find the Modern CTO Podcast on Apple Podcasts, Spotify, or wherever you get your podcasts. Are algorithms getting less important?

As algorithms become commoditized, it may be less about the algorithm and more about the application. In our first episode of Season 2 of Me, Myself, and AI, we'll talk with Craig Martell, Head of Machine Learning at Lyft, about how Lyft uses artificial intelligence to improve its business. Welcome to Me, Myself, and AI, a podcast on artificial intelligence in business. Each episode, we introduce you to someone innovating with AI.

I'm Sam Ransbotham, professor of information systems at Boston College. I'm also the guest editor for the Artificial Intelligence and Business Strategy Big Ideas program at MIT Sloan Management Review.

And I'm Shervin Khodabandeh, senior partner with BCG, and I co-lead BCG's AI practice in North America. Together, MIT SMR and BCG have been researching AI for five years, interviewing hundreds of practitioners and surveying thousands of companies on what it takes to build, deploy, and scale AI capabilities across the organization and really transform the way organizations operate.

Today we're talking with Craig Martell. Craig is the head of machine learning for Lyft. Thanks for joining us today, Craig. Thanks, Sam. I'm really happy to be here. These are pretty exciting topics. So, Craig, head of machine learning at Lyft. What exactly does that mean and how did you get there? So let me start by saying I...

I'm pretty sure I won the lottery in life, and here's why. I started off doing political theory academically, and I have this misspent youth where I gathered a collection of master's degrees along the way to figuring out what I want to do. So I did philosophy, political science, political theory, logic, and I ended up doing a PhD in computer science at Penn. And I sort of

thought I was going to do testable philosophy. And so the closest to that was doing AI. And so I just did this out of love. Like I just find the entire process and the goals and the techniques just absolutely fascinating. All part of your master plan. Not at all. I just fell into it. I fell into it. So how did you end up then at Lyft?

So I was at LinkedIn for about six years. And then my wife got this phenomenal job at Amazon and I wanted to stay married, so I followed her to Seattle. I worked for a year here at Dropbox, and then Lyft contacted me. And I essentially jumped at the chance, because the space is so fascinating. I love cars in general, which means I love transportation in general. And the idea of transforming how we do transportation is just a fascinating space.

And then in my prior life, I was a tenured computer science professor, which is still a big love of mine. And so I'm an adjunct professor at Northeastern just to make sure I keep my teaching skills up. Craig, your strong humanities background in philosophy, political science, you mentioned logic, all of that. How did that play for you in your overall journey?

So that's really interesting. When I think about what AI is, I find the algorithms mathematically fascinating, but I find the use of the algorithms far more fascinating because from a technical perspective, we're finding correlations

in extremely high-dimensional, nonlinear spaces. It's statistics at scale in some sense, right? We're finding these correlations between A and B. And those algorithms are really interesting, and I'm still teaching those now, and they're fun. But what's more interesting to me is: What do those correlations mean for people? Like, I think every AI model launched is a cognitive science test. We're trying to model the way humans behave. Now,

For automated driving, we're modeling the way cars behave in some sense, but it's really, we're modeling the right human behavior given these other cars driven by humans. So for me, I just, I think the goals of AI, I look at them much more from humanity's perspective, although I can nerd out on the technical side as well. Can you say a bit more about how Lyft organizes AI and ML teams?

We have model builders throughout the whole company. We have a very large science org. We also have what we call ML SWEs, that is, ML software engineers. I run a team called Lyft ML, and it consists of two major teams. One is Applied ML, where we leverage expertise in machine learning to tackle some really tough problems; the other is the ML Platform, which drives my big interest in operational excellence: making sure ML is effectively hitting business metrics.

What do you think? Because I think, Craig, you're still teaching, right? Yeah, I adjunct teach at Northeastern University here in Seattle. So what do you think your students sort of should be asking that they're not, or maybe stated another way, what would they be most surprised when they enter the workforce and actually do AI in the real world? The algorithms themselves are becoming less important. Yes.

I'm hesitant to use the word commoditized, but to some degree they're being commoditized, right? You could pick one of five, one of seven model families for a particular problem. You could try them all. But what's really happening, or what I think is the exciting thing happening, is

how those models fit into a much larger engineering pipeline that allows you to measure and guarantee that you're being effective against a business goal. And that has to do with the cleanliness of the data, making sure the data is there in a timely way, classic engineering things like: Are you returning your features at the right latency? So the actual model itself has shrunk from, say, 85% of the problem

to 15% of the problem. And now 85% of the problem is the engineering and the operational excellence surrounding it. And I think we're at a point of inflection there. So, with the advent of AutoML and these packaged tools, and your point that over time it's less about the algo and more about the data and how you use it, do you think the curricula, the training, and just the overall approach and

orientation of data scientists 10 years from now will be dramatically different? Should we teach them different things, different skills? Because it used to be that a lot was focused on creating the algorithms, trying different things. And I think you're making the point that that's sort of plateauing. What does that mean in terms of the workforce of the future?

Yeah, I think that's great. I'm going to say some controversial things here and I hope not to offend anybody. Well, that's why I asked. So I hope that you will. So if you look just five or 10 years ago, in order to deliver the kind of value that tech companies wanted to deliver, you needed a fleet of PhDs, right? The technical ability to build those algorithms was extremely important.

I think the point of inflection there was probably TensorFlow, 2013-ish, where it wasn't yet commoditized. You still needed to think very hard about the algorithm, but actually getting the algorithm out the door became a lot easier. Now there's plenty of frameworks for that. I wonder, and this is a real wonder, I wonder the degree to which we're going to need specialized machine learning, AI, data science training going forward. I think...

CS undergrads, or engineering undergrads in general, are all going to graduate with two or three AI classes. And those two or three AI classes, combined with the right infrastructure in the company, the right way to gather features, the right way to specify your labeled data: if we have that ML platform in place, people with two or three strong classes are going to be able to deliver 70% of the models a company might need. Now, for that other 30%, I think you're still going to need experts for a while.

I do. I just don't think you'll need it like you used to, where almost every expert had to have a PhD. Yeah. I actually resonate with that, Sam, in an interesting way.

It sort of corroborates what we've been saying about what it takes to actually get impact at scale, which is like the technical stuff gets you only so far, but ultimately you got to change the way it's consumed and you got to change the way people work and the different modes of interaction between human and AI. And that's, I guess that's a lot of the humanities and the philosophy and the political science and how sort of the human works more so than what the algo does.

Well, that's a good redirection, too, because if we're not careful, that conversation slips us into the curriculum being more DevOps. And so what Shervin's pointing out is that maybe that's a component too, of course, but there's process change and more, let's say, business-oriented initiatives. So what other kinds of things are you trying to teach people? Or what other kinds of things do you think executives should know? I mean, we can't have

everybody know everything. That would be a bit overwhelming. I mean, perhaps that's ideal, if everyone knows everything. But what exactly do different levels of managers need to know? I think the top decision maker needs to understand the dangers of a model going awry. And they need to understand the overall process, that you really need labeled data. Like, there's no magic here. They have to understand there's not magic.

So they have to understand that labeled data is expensive, that getting the labels right and sampling the distribution of the world that you want correctly is extremely important. I believe they also have to understand the life cycle in general, which is different from two-week sprints where we're going to close these Jira tickets, right? That data gathering is extremely important, and that could take a quarter or two. And that the first model you ship probably isn't going to be very good,

you know, because it was trained from a small labeled data set, and now you're gathering data in the wild. So there's a life-cycle piece that they need to understand, and they need to understand that, in a lot of ways (maybe not for car driving, but for recommendations), the first couple of models you ship get iteratively better. So I think that's extremely important for the top.

I think for a couple of levels down, they need to understand, like, the precision-recall trade-off, the kinds of errors your model can make. Your model can either be making false-negative errors or false-positive errors. And I think it's extremely important as a product person that you own that choice. So if we're doing document search, I think you care a lot more about false positives. You care a lot more about precision. You want the things that come to the top

to be relevant. And for most search problems, you don't have to get all the relevant things. You just have to get enough of the relevant things. So if some relevant things are called non-relevant, you're okay with that, right? But for other problems, you need to get everything. Document search, that's fine. But yeah, Lyft as well. Put it in the context of one of these companies where you've had a precision or recall trade-off, false positive, false negative. Yeah.

I think luckily at Lyft, we have nice human escape hatches, which I think is extremely important. Like all these recommendations ideally should have a human escape hatch. So if I recommend a destination for you and that destination is wrong, it's- It's okay. No harm, no foul. You just type the destination in. Mm-hmm.

Right. So I think for Lyft as a product, we're pretty lucky, because with most of our recommendations, which are trying to lower friction to get you to take a ride, it's OK if we don't get them exactly right. There's no real danger there. Self-driving cars, that one's tough, because you want to get them both right. You want to know that's a pedestrian, and you also want to make sure you don't miss any pedestrians. And the idea of putting a human in the loop there is much more problematic than just saying, all right, here's some destinations, which one do you like? Right. Yeah. Yeah.
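To make the precision-recall trade-off Craig describes concrete, here is a minimal, illustrative sketch in Python: precision penalizes false positives, recall penalizes false negatives. The counts below are hypothetical, not Lyft data; they only show how favoring one kind of error over the other plays out in a search-style problem versus a safety-style problem.

```python
# Illustrative only: precision vs. recall from hypothetical confusion-matrix counts.

def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision penalizes false positives; recall penalizes false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Search-style problem: missing some relevant documents (false negatives) is fine,
# as long as what we surface is relevant (we favor precision).
print(precision_recall(tp=80, fp=5, fn=40))   # ~(0.94, 0.67)

# Safety-style problem (e.g., pedestrian detection): a miss is costly,
# so we accept more false positives to push recall up.
print(precision_recall(tp=118, fp=30, fn=2))  # ~(0.80, 0.98)
```

The point of the sketch is that choosing which error to tolerate is a product decision, which is exactly the ownership Craig says a product person has to take.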

Craig, earlier you talked about how AI in real life is like a bunch of cognitive science experiments because it's ultimately about... For me, at least. Yeah. And it brought up the idea of unconscious bias. And so we as humans have become a lot more aware about our unconscious biases across everything, right?

Because they've been ingrained through generations and stereotypes, et cetera. And just our past experience, right? Like a biased world creates a biased experience, even if you have the best possible intentions. Exactly, right? And so I guess my question is, clearly there is unintended bias in AI, has to be. What do you think we need to think about now so that 10, 20 years from now,

that bias hasn't become so ingrained in how AI works that it would be so hard to then course correct.

It already has. So the question is, how do we course correct? So let me start by saying I was on a panel for Northeastern about this movie Coded Bias. So if you haven't seen the movie Coded Bias, you should absolutely see it. It's about this MIT Media Lab undergraduate black woman who tried to do a project that didn't work because facial recognition just simply didn't work for black females. It's an absolutely fascinating social study.

The data set that was used to train the machine learning, so the facial recognition algorithm, was gathered by the researchers at the time, and the researchers at the time were a bunch of white males. And this is a known issue, right? There's a skew in the way the data set is gathered. Look, there's a similar skew in all psychological studies. Psychological studies don't apply to me; I'm 56. Psychological studies apply to college students, because those are the readily available subjects, right?

So these were the readily available people, because of the biased world. And so that's how the data set came about. So even with no ill intention, the world was skewed, the world was biased, the data was biased. It didn't work for a great number of people. Not a lot of females were part of the training set. And then the darker your skin, the worse it got. And there's all kinds of technical reasons why darker skin has less contrast, blah, blah, blah. But that's not the issue here.

The issue is, should we have gathered the data that way? What is the goal of the data set? Who are our customers? Who do we want to serve? And let's sample the data in such a way that it's serving our customers. We talked about this earlier about the undergrads. I think that's really important. One way to get out of that is diversity in the workplace. I believe this so strongly. And you ask everybody, all of these diverse groups, to test the system.

and to see if the system works for them. When we did image search at Dropbox, we asked all of the employee resource groups: please search for things that in the past have been problematic for you, and see if we got them right. And if we found some that were wrong, we would go back and regather data to mitigate those issues.

So look, your system is going to be biased by the data that's gathered. Fact, just a fact. It's going to be biased by the data that's gathered. You want to do your best to gather it correctly. You're probably not going to gather it correctly because you have your own unconscious bias, as you point out. So you have to ask all the people who are going to be your customers to try it, to bang on it, to make sure it's doing the right thing. And when it's not, go back and gather the data necessary to fix it. So I think the short answer is diversity in the workplace.
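The practical check Craig describes, asking every group to bang on the system and regathering data where it fails, can be approximated in code by slicing evaluation results per group. A rough, hypothetical sketch follows; the group labels and records are placeholders, not any company's data.

```python
# Illustrative only: compare a model's accuracy across subgroups to surface
# the kind of skew a biased training set produces. The data below is made up.
from collections import defaultdict

def accuracy_by_group(records):
    """records: iterable of (group, y_true, y_pred) tuples."""
    correct, total = defaultdict(int), defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {g: correct[g] / total[g] for g in total}

records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1),
    ("group_b", 1, 0), ("group_b", 1, 1), ("group_b", 0, 1),
]
# A large gap between groups is the signal to go back and regather data.
print(accuracy_by_group(records))  # e.g., {'group_a': 1.0, 'group_b': 0.33}
```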

Craig, thanks for taking the time to talk with us today. Lots of interesting things. Yeah, my pleasure. These are really fun conversations. I'm pretty nerdy about this, so I enjoyed it very much. Your enthusiasm shows. Really insightful stuff. Thank you. Thank you, guys. Well, Shervin, Craig says he won the lottery in his career, but I think we won the lottery in getting him as a guest for our first episode of season two. Let's recap. I mean, he made a lot of good points. Clearly, the

commoditization of algorithms over time and how it's more and more going to be about tying it with strategy, going back to key business metrics, making change happen, the usage. I really like this point on what it takes to

get the bias out of the system, and how bias is already in the system. The commoditization is particularly important. I think it resonates with us because we're talking about this from a business perspective. And so what he's saying is that, you know, a lot of this is going to become increasingly a business problem. When it's a business problem, it's not purely a technical problem. I don't want to discount the technical aspects of it. And certainly, you know, he brings plenty of technical chops to the table. But

He really reinforced the "this is a business problem now" aspect. Yeah, I mean, in five minutes, he basically provided such a cogent argument for our last two reports, right? The 2019 and 2020 ones. It's about strategy and process change and process redesign and reengineering. And it's about human development

and AI interaction and adoption. And what's also a business problem is the managerial choice. I mean, he came back to that as well. He was talking about how some of these things are not clear-cut decisions. There's a choice between which way you make a mistake. That's a management problem, not a technical problem. And...

It also requires managers to know what they're talking about, which means they need to really, really understand what AI is saying, what it could be saying, what its limitations are, and what the art of the possible is. And I also really like the point that as you get closer to the developers and the builders of AI, you have to really, really understand the math and the code, because otherwise you can't guide them.

Although, don't you worry that we're just running into this thing where everyone has to understand everything? I mean, I feel like that's a tough sell. Like if the managers have to understand the business and how to make money and they have to understand the code. I mean, having everyone understand everything is obviously important. Well, I guess the question is how much do you have to understand everything? I mean, a good business executive already understands everything to the level that he or she should to the point of asking the right questions.

I think you're right. But I think, isn't this like what Einstein said that...

You don't really understand something unless you can describe it to a five-year-old. You know, you can describe gravity to a five-year-old and to a 20-year-old and to a grad student in different ways, and they will all understand it. The question is, at least you understand it rather than you say, I have no idea there is such a thing as gravity. So basically, teaching and academics are really important? Is that what Shervin has just gone on the record as saying? I think the idea that

managers and senior executives need to understand AI itself is not a slam dunk because you're raising the right question. What is the right level of understanding? And so what is the right level of synthesis and articulation that allows you to make the right decisions without having to know everything? But isn't that what a successful business executive does with every business problem?

And I think that's what we're saying, that with AI, you need to know enough to be able to probe.

But suffice it to say, it's not a black box the way a lot of technology implementations have been in the past. And that gets back to the whole question of learning more, where to draw the line, and understanding that balance. I guess after the discussion of gravity, each one of those people would understand more about gravity than they did before. And so it's a matter of moving from the current state to the next state. Yeah.

Craig made some important points about diversity in the workplace. If the team gathering data isn't hyper-aware of the inherent biases in their data sets, algorithms are destined to produce biased results. He refers to the movie "Coded Bias" and the MIT Media Lab researcher Joy Buolamwini. Joy is the founder of the Algorithmic Justice League. We'll provide some links in the show notes where you can read more about Joy and her research.

Thanks for joining us today. We're looking forward to the next episode, when we'll talk with Will Grannis, who has the unique challenge of building the CTO function at Google Cloud. Until next time. Thanks for listening to "Me, Myself, and AI." If you're enjoying the show, take a minute to write us a review. If you send us a screenshot, we'll send you a collection of MIT SMR's best articles on artificial intelligence, free for a limited time. Send your review screenshot to [email protected].