
AI+Data in the Enterprise: Lessons from Mosaic to Databricks

2025/2/26

Founded & Funded

People
Jon Turow
Jonathan Frankle
Topics
Jonathan Frankle: I firmly believe that AI technology should be in everyone's hands, and that everyone should be able to customize AI systems to their own needs. At MosaicML, our core goal was to make machine learning more efficient and accessible to everyone. Now at Databricks, we continue that mission and have extended it to more stages of the AI lifecycle. We advocate the concept of "data intelligence": AI should be tailored to each company's unique data and processes, rather than chasing so-called "general intelligence." We have seen many enterprise customers move from prototype to production, thanks to the arrival of open source models and a clearer understanding of where AI is actually useful. Enterprise customers should focus on whether an AI system solves their problem and what it costs, not on the underlying technical details. Successful AI systems usually combine multiple techniques, such as RAG, fine-tuning, and reinforcement learning. In MosaicML's early days, we built a reputation and attracted customers by writing blogs, working in open source, and talking loudly about our work. The collaboration with Replit was an important early win: it proved our technical capability and brought us more customers. Landing the first customer is extremely hard, but once you have one, the ones that follow come much more easily. Today my role has shifted from pure scientist to a bridge connecting research, product, and customers. I talk with the research team, the product team, and customers to make sure AI actually solves real problems. I have had to learn how to be a good manager, how to communicate with people from different backgrounds, how to handle conflict, and how to motivate a team. I am very interested in LLM evaluation products and data annotation tools, because they are critical infrastructure for AI. Data annotation remains a key part of the AI world; no matter how advanced our models get, we still need more of it. I also pay close attention to robotics and AI governance policy. I believe robotics can dramatically change our lives and solve many real-world problems, and at the same time we need to think seriously about how to govern AI so that it benefits humanity.

Jon Turow: This episode explores AI's shift from hype to real-world application, and the challenges and lessons enterprise customers face along the way. We sit down with Databricks Chief AI Scientist Jonathan Frankle to discuss how AI startups turn cutting-edge research into real products and how to meet the distinctive needs of enterprise customers. We also discuss how AI is being applied across industries, and how effective communication and storytelling help drive its adoption.


Transcript


Enterprise customers are really evidence that you're going to be able to scale your business, that you have some traction with companies that are going to be around for a while, that have big budgets, that when they invest in a technology, invest for the long run. But on the flip side, the best customers are often other startups because there's no year-long procurement process. They're willing to dive right in and you can get a lot more feedback much faster.

Welcome to Founded and Funded. I'm Jon Turow, partner at Madrona. And today I have the privilege of hosting Jonathan Frankle, Chief AI Scientist at Databricks, which he joined as part of that company's $1.3 billion acquisition of MosaicML, a company that he co-founded. Jonathan is a central operator at the intersection of data and AI.

He leads the AI research team at Databricks, where they deploy their work as commercial product and also publish research, open source repositories, and open source models like DBRX and MPT. On this episode, we're diving into the evolving landscape of infrastructure for data and AI and how that democratizes access to these critical technologies.

Jonathan shares his insight on the initial vision behind MosaicML, the transition to Databricks, and how production-ready AI is reshaping the industry. We'll explore how enterprises are moving beyond prototypes to large-scale deployments, the shifting skill sets AI founders need to succeed, and Jonathan's take on exciting developments like test-time compute.

Whether you're a founder, builder, or curious technologist, this episode is packed with actionable advice on thriving in the fast-changing AI ecosystem. Jonathan, it's an honor to have you here. Welcome to the show. Thank you so much for having me. I can't wait to take our private conversations and show them to everybody. We always learn so much from those conversations, and so let's dive in. You've been supporting builders with AI infrastructure for years.

first at Mosaic and now as part of Databricks. And I'd like to go back to the beginning. Let's start there. What was the core thesis of MosaicML? And how did you serve customers then? So the core thesis, quite simply, was making machine learning efficient for everyone. The idea that this is not a technology that should be defined by a small number of people, that should be built to be one-size-fits-all in general,

but that should be customized for everybody by everybody for their own needs based on their data. In the same way that we don't need to rely on a handful of companies if we want to build an app or write code, we just go and do it. Everybody has a website. Everybody can define how they want to present themselves and what they want to do with that technology. And we really firmly believed in the same thing for machine learning and AI, especially as things started to get exciting in deep learning.

And then, you know, of course, LLMs became a big thing about, you know, I guess halfway through our Mosaic journey. And I think that mission matters even more today, to be honest. We're in a world where we bounce back and forth between, you know, huge fear over the fact that only a very small number of companies can participate in building these models and huge excitement whenever a new open source model comes out that can be customized really easily and all the incredible things people can do with it.

And, you know, I firmly believe that this is technology that should be in everyone's hands to kind of define as they like for the purposes they see fit on their data in their own way. You know, it's a really good point. And you and I have spoken publicly and privately about the democratizing effect of all this infrastructure. And I would observe that the aperture

of functionality that Mosaic offered, which was especially about hyper-efficient training of really large models, put that capability in the hands of lots more companies. That aperture is even wider now that you're at Databricks. You can democratize more pieces of the AI lifecycle. Can you talk about how the mission has kind of expanded?

Yeah, I mean, it was really interesting. Matei, our CTO, I was looking at his notes for a meeting that we had for our research team last week, and he had just written in his notes kind of casually, you know, our mission has always been to, I think it was to democratize data and AI for everyone. I was like, wait a minute, that sounds very familiar. And, you know, I think we may chat at some point about kind of this acquisition and why, you know, we chose to work together. It's the same mission.

We're on the same journey. Databricks, obviously much further along than Mosaic was and wildly successful, but it's great to be along for the ride. And so I think the aperture is widened for two reasons. One is simply that you don't need to pre-train anymore. There are awesome open source base models that you can build off of and customize. So even pre-training was the kind of thing that wasn't quite for everyone.

But that's not necessary anymore. You can just get straight to the fun part and customize these models through prompting or through RAG or, you know, through fine tuning or through RLHF these days. And the aperture is also widened to the fact that now we're at the world's best company for data and data analytics and the world's best data platform.
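
As a concrete illustration of what "customize through RAG" can look like, here is a minimal sketch in Python. The keyword retriever, the prompt template, and the `call_llm` placeholder are illustrative assumptions, not any particular product's API; a real system would use an embedding model and a vector index rather than word overlap.

```python
# Minimal retrieval-augmented generation (RAG) sketch: retrieve the most
# relevant documents for a query, then place them into the prompt of an
# off-the-shelf model. `call_llm` is a placeholder for whatever hosted or
# open source model you use.

def overlap_score(query: str, doc: str) -> int:
    """Crude relevance score: number of shared words (a real system would
    use an embedding model and a vector index)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k documents that overlap most with the query."""
    return sorted(corpus, key=lambda d: overlap_score(query, d), reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Ground the model in company-specific context before asking the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Use only the context below to answer.\nContext:\n{joined}\nQuestion: {query}"

def call_llm(prompt: str) -> str:
    """Placeholder for a call to any base model (hosted API or open weights)."""
    return f"[model answer grounded in: {prompt[:60]}...]"

if __name__ == "__main__":
    corpus = [
        "Refunds are processed within 5 business days of approval.",
        "Enterprise contracts renew annually unless cancelled 30 days prior.",
        "Support tickets are triaged by severity, then by account tier.",
    ]
    question = "How long do refunds take to process?"
    print(call_llm(build_prompt(question, retrieve(question, corpus))))
```

The point of the sketch is that the base model itself stays untouched; the customization lives in which of your own documents end up in the prompt.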

And what is AI without data and what is data without AI? So we can now start to think much more broadly about a company's entire process from start to finish with a problem they're trying to solve. What data do they have? What is unique about that data and unique about their company? And then from there, how can AI help them or how can they use AI to solve problems? And back again, this is a concept we call data intelligence. The idea that it's really meant to be in contrast to general intelligence.

General intelligence is the idea that there's going to be one model or one system that will generally be able to solve every problem or kind of make significant progress in every problem with minimal customization. And at Databricks, we kind of espouse the idea of data intelligence, that every company has unique data, has unique processes, a unique view on the world.

that is captured within their data and within how they work and, you know, within their people. And AI should be shaped around that. The AI should represent the identity of the business, and the identity of that business is captured in their data. And, you know, obviously it's very polemic to say data intelligence versus general intelligence. The answer will be somewhere in between. But honestly, every day at work feels like I'm doing the same thing I've been doing since the day Mosaic started, just now

at a much bigger place with a much bigger ability to make an impact in the world. There's something very special about the vantage point that you have, that you're seeing this parade of customers who have been on a journey from prototype to production for years now. And the most sophisticated among them are now in production. And so for that, I have two questions for you.

Number one, what do you think it was that has finally unblocked and made that possible? And number two, what are those customers learning who are at the leading edge? What are they finding out that the rest of the customers are about to discover? So I'm going to, I guess, reveal how much less I'm a scientist these days and how much more I've become a business person.

I'm going to use the hype cycle as the way to describe this. And it breaks my heart and makes me sound like an MBA to do this. But among enterprises, there are always the bleeding edge early adopter tech first companies. They're the companies that catch on pretty quickly and the companies that are more careful and conservative. And what I'm seeing is those companies are all in different places in the hype cycle right now. For the companies that are really early adopters and tech forward,

the peak of inflated expectations, they hit that like two years ago, around the time ChatGPT first came out. They hit the trough of disillusionment last year when it was really hard to get these systems to work reliably. And they are now getting productive and getting things shipped in production. And they've learned a lot of things along the way. I think they've learned to set their expectations properly, to be honest, and which problems make sense and don't make sense. This technology is not perfect by any stretch.

And we're still, I think the more important part is we're still learning how to harness it and how to use it, in the same way that, you know, punch cards back in the 1950s or 60s were, you know, still Turing complete and a little bit slower, but just as capable as our computing systems today from a theoretical perspective. But 50 years of software engineering later, and it's much easier to build and architect a system that will be reliable and build it in a modular way and all these principles we've learned.

Those companies are furthest along in that journey, but it's going to be a very long journey to come. And we know how big of a system we can build at this point without it keeling over, and where the AI is going to be unreliable and where we need to kick it up to a human, which tasks make sense, which tasks don't make sense. A lot of them I've seen have kind of whittled it down into very bite-sized tasks. The way that I typically frame it for people is, you know, you should use AI either in cases where

It's open-ended and there's no right answer or where a task is hard to perform, but simple to check. And you can have a human check. I think GitHub Copilot is a great example of this, where you could imagine a situation where you ask AI to just write a ton of code. And now a human has to check all that code and understand it. And honestly, it may be just as difficult as writing the code from the beginning or pretty close to it.

Or you can have the AI suggest very small amounts of code that a human can almost mechanically accept or reject. And you're getting huge productivity improvements, but this is a scenario where the AI is doing something that is somewhat more laborious for the human, but the human can check it very easily.

And I think finding those sorts of sweet spots is where kind of the companies who have just been at this the longest, they've also been willing to take the risk and invest in the technology. They've been willing to try things. They've been willing to fail, to be honest. They're willing to just take that risk and be okay if the technology doesn't work the first or second time and keep whatever team they have doing this going and trying it again. And then you have companies that are kind of, I think a bunch of companies are in the trough of disillusionment right now.

companies that are kind of a little less on the bleeding edge. And then a bunch of companies are still at that peak of inflated expectations where they think that AI will just solve every problem for them. And those companies are going to be very disappointed in a year and very productive in two years. You know, naturally, a lot of founders who are going to be listening are asking, how do they get in these conversations? How do they identify the customers that are about to exit the trough?

And how do they focus on them? What would you say to those founders? I have two contradictory lessons from my time at Mosaic.

The first is that VCs love enterprise customers because enterprise customers are really evidence, at least if you're doing B2B, that you're going to be able to scale your business, that you have some traction with companies that are going to be around for a while, that have big budgets, that when they invest in a technology, invest for the long run. But on the flip side, the best customers are often other startups because there's no year-long procurement process. They're willing to dive right in.

They understand where you're coming from and understand the level of service you'll be able to provide because they're used to it. And you can get a lot more feedback much faster. But that is taken as less valuable validation. Even when I'm evaluating companies, enterprise customers are worth more to me. But startup customers are more useful for building the product and moving quickly. And so the answer is strive for enterprise customers. Don't block on enterprise customers. I think that's fair. And I think optimizing for learning is really smart.

But there's another thread that I would pull on, and this is something that I think you and I have both seen in the businesses that we've built, which is the storytelling. And I won't even say GTM. The storytelling around our product can be segmented, even if the product is horizontal, as so many infrastructure products are.

Mosaic was a horizontal product. Databricks is a horizontal family of products. But there are stories that we tell that explain why Databricks and Mosaic are really useful in financial services, really useful in healthcare. And there's going to be a mini adoption flywheel in each of these segments, where you do want to find the fast customers first,

and then the big customers as you dial that story in. And there may be product implications, but there may not be. I think that's a great point. And there are stories, I think, along multiple axes. These days, kind of in a social media world and in a world where just everybody's paying attention to AI,

there are horizontal stories you can tell that will get everyone's attention. And I think, I mean, one of the big lessons I took away from Mosaic was talk frequently about the work you're doing and have some big moments where you, where you really buckle down and you do something big. Don't disappear while you're doing it, but

you know, releasing the MPT models for us, which sounds so quaint only a year and a half later. And it really was only a year and a half ago that we trained a 7 billion parameter model on 1 trillion tokens. And it was the first kind of open source, commercially viable replication of the Llama 1 models,

which sounds hilarious now that we have a 680 billion parameter mixture-of-experts model that just came out. And the most recent Meta model was a 405 billion parameter model trained on 15 trillion tokens. It sounds quaint, but that moment was completely game changing for Mosaic. And it got the attention up and down the stack and across all verticals, across all sizes of companies and led to a ton of business.

And further moments like DBRX more recently, kind of same experience. And so storytelling through these important moments, especially in an area where people are paying close attention, actually does kind of resonate universally.

But at the same time, I totally hear you on the fact that for each vertical, for each size of company, there is a different story to tell. And I think my biggest lesson learned there is getting that first customer in any industry or in any company size or anything like that is incredibly hard. Somebody has to really take a risk on you before you have much evidence that you're going to be successful in their domain. Having that one story you can tell leads to a ton more stories. Once you work with one bank...

a bunch of other banks will be willing to talk to you. But getting that first bank to sign a deal with you and actually do something, even for the phenomenal go-to-market team we had at Mosaic, was a real battle. They had to really fight and convince someone that they should even give us a shot, that it was worth a conversation. Can you take me back to an early win at Mosaic where you didn't have a lot of credentials to fall back on?

Yeah, it was a collaboration we did with a company called Replit. Before we had even released the MPT models, we were just chatting with Replit about the idea that we could train an LLM together, that we'd be able to support their needs there. And they trained MPT before we trained MPT. They were willing to take a risk on our infrastructure. And we delayed MPT because we only had a small number of GPUs and we let Replit take the first go of it. I basically didn't sleep that week because I was monitoring the cluster constantly.

We didn't know whether the run was going to converge. We didn't know what was going to happen. It was all still internal code at that point in time. But Replit was willing to take a risk on us. And it paid off in a huge way. It gave us our first real customer that had trained an LLM with us and been successful and deployed it in production. And that led to probably a dozen other folks signing on right then and there. And the MPT model actually came out after that. How did you put yourself in a position for that lucky thing to happen?

We wrote a lot of blogs. We shared what we were working on. We worked in the open source. We talked about our science and we built a reputation as the people who really cared about efficiency and cost and the people who might actually be able to do this.

We talked very frequently about what we were up to. And that was kind of a lesson we had learned early on where I don't think we talked frequently enough, but we wrote lots of blogs. When we were working on a project, we would write part one of the blog as soon as we hit a milestone. We wouldn't wait for the project to be done and then do part two and part three. And those MPT models were actually, I think, part four of like a nine month blog series on training LLMs from scratch.

And that got Replit's attention much earlier and started the conversation. Maybe one way of looking at it, if you want to be cynical, is selling ahead of what your product is. But actually, I look at it the other way, which is to show people what you're doing and convince them that they can believe you're going to take that next step. And they want to be there right at the beginning when you first take that next step because they want to be on the bleeding edge.

I think that's what got the conversation started with Replit and kind of put us in that position. But we were going to events all the time, just talking to people, trying to find anyone who might be interested in enterprise that had a team that was thinking about this. And there were a bunch of folks we were chatting with, but we had already started contracting deals with folks. But Replit was able to basically move right then and there. They were a startup. They could just say, we're going to do this and write the check and do it. So being loud about what it is that you stood for,

and what it is that you believed and being good at it. Like, I think we worked really hard to be good at one thing and that was training efficiently. You know, you can't fake it till you make it on that. Like we did the work and it was hard and we struggled a lot, but we kept pushing at the strong encouragement of Naveen and Hanlin, our co-founders. They kicked my butt to keep pushing, even when it was really hard and really scary. And we were burning a lot of money,

but we got really good at it. And I think people recognize that and you know, it led to customers, it led to the Databricks acquisition. And I'm now seeing this among other small startups that I'm talking to, in the context of collaboration, in the context of acquisition, anything like that, the startups I'm talking to are the ones that are really good at something. It's clear, they're really good at something. It's been clear through their work, I can check their work, they've done their homework, and they show their work.

You know, those are the folks that are getting the closest look because they're genuinely just really good at it. And you believe in them and you know the story they're telling is legitimate. There's one more point on this, which I think complements and extends what you just said, that you folks believed in something. And this is not about a story. And it's not about...

results either. You believed training could be and should be made more efficient. And a lot of the work you were doing anticipated things like Chinchilla that quantified

how it could be done later. Oh, we didn't anticipate. We followed in the footsteps of Chinchilla. Chinchilla was like early visionary work. And I can say this, you know, Erich Elsen, who worked on Chinchilla, is now one of my colleagues on the Databricks research team. But I mean, there are a few moments if I really want to look for the pioneers of just truly visionary work that was quite early. And when I look back,

is just kind of like tentpole work for LLMs now. Chinchilla is one of those things. The other is EleutherAI putting together the Pile dataset, which was done in like late 2020, like two years before anyone was really thinking about LLMs. They put together what was still the best LLM training dataset into 2022. But we did genuinely believe in it, I think, to your point. Like we believed in it and we believed in science.

We believed that it was possible to do this and through really, really rigorous research. We were very principled and had our scientific frameworks that we believed in our way of working. We had a philosophy on how to do science and how to make progress on these problems. OpenAI believes in scale and now everybody believes in scale. We just believed in rigor, that doing our homework and measuring carefully would allow us to make consistent methodical progress.

And that remains true and remains the way we work. It's sometimes not always the fastest way of working, but at the end of the day, it leads to consistent progress. So here we are in 2025 and amazing innovation is happening and there's even more opportunity than there has been, it seems to me. Even more excitement, even more excited people. How do you think the profile and the mix of skills in a new team

should be the same and how it should be different compared to when you formed Mosaic? So it depends on what you're trying to do. We hire phenomenal researchers who are rigorous scientists, who care about this problem and are aligned with our goals, who share our values, who are relentless and,

honestly, who are just great to work with. I think the importance of culture cannot be overstated, and conviction is the most important quality. If you don't believe that it is possible to solve a scientific problem, you will lose all your motivation and creativity to actually solve it because you're going to fail a lot. And at the first failure, you're going to give up. But beyond that,

I think this is data science in its truest form. Like, I never really understood what it meant to be a data scientist, but this feels like data science. You have to pose hypotheses about which combinations of approaches will allow you to solve a problem and about measuring carefully and developing good benchmarks to understand whether you've solved that problem. I don't think that's a skill that's confined to people with PhDs. Far from it. So...

The fact that Databricks was founded by a PhD super team now means that more than 10,000 enterprises don't need a PhD super team when it comes to their data. And I look at our Mosaic story through to our Databricks story now in the same way.

We built a training platform and a bunch of technologies around that. And now we're building a wide variety of products to make it possible for anyone to build great AI systems in the same way that when you get a computer and you want to build a company, you don't have to write an operating system. You don't have to build a cloud. You don't have to invent the virtual machine. I mean, abstraction is the most important concept in computer science. And

Databricks has had a PhD super team to build that low-level infrastructure that required it to build Spark and Delta and Unity Catalog and everything on top of that. And now it's the same thing for AI. The future of AI isn't in the hands of people like me. It's in the hands of people who have problems and can imagine a solution to those problems. In the same way that I'm sure Tim Berners-Lee, who pioneered the web, did not exactly imagine, I don't know, TikTok.

That was not what he had in mind when he was building the World Wide Web. The kinds of startups I'm most thrilled about engaging with today are companies that are using AI to make it easier to get more out of your health insurance, making it easier for you to solve your everyday problems, making it easier for you to just get a doctor's appointment or for a doctor to help you, for us to spot medical challenges earlier. That's the kind of people who are empowered because they don't have to go and build an LLM from scratch.

to do all that. That layer has now been created. So the future is in the hands of people who have problems and care about something. For a PhD super team these days, there's still tons and tons of work to do in making AI reliable and usable, building the tools that these folks need, building a way for anyone to build an evaluation set in an afternoon so that they can measure their system really quickly and get back to work on their problem.

There's a ton of really hard, complex, fuzzy, like machine learning work to do. But I think the interesting part is in the hands of the people with problems.
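
To make the "evaluation set in an afternoon" idea concrete, here is a minimal, hedged sketch of what such a thing can be: a handful of hand-labeled cases and a pass-rate function. The cases and the `toy_system` stand-in are invented for illustration and do not refer to any Databricks product.

```python
# Minimal sketch of an evaluation set built by hand: a few labeled examples
# plus a simple pass/fail check. The system under test is any callable that
# maps an input string to an output string.

from typing import Callable

# Each case is a question plus the substrings a correct answer must contain.
EVAL_SET = [
    {"input": "What is our refund window?", "must_contain": ["5 business days"]},
    {"input": "When do enterprise contracts renew?", "must_contain": ["annually"]},
    {"input": "Who gets triaged first?", "must_contain": ["severity"]},
]

def run_eval(system: Callable[[str], str]) -> float:
    """Score the system: fraction of cases whose output contains every
    required substring. Returns a number between 0 and 1."""
    passed = 0
    for case in EVAL_SET:
        output = system(case["input"]).lower()
        if all(req.lower() in output for req in case["must_contain"]):
            passed += 1
    return passed / len(EVAL_SET)

if __name__ == "__main__":
    # Stand-in system; in practice this would be your RAG or fine-tuned pipeline.
    def toy_system(question: str) -> str:
        return "Refunds are processed within 5 business days."

    print(f"pass rate: {run_eval(toy_system):.0%}")
```

The value is less in the scoring code than in the labeled cases themselves, which is why building an eval is described later in the conversation as a form of data annotation.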

How is your role changing as you adopt these kinds of AI technologies inside Databricks and you try to be, I'm sure, as sophisticated as you can be about it? I'm still a scientist, but I haven't necessarily written a ton of code lately. But I spend a lot of time these days connecting the dots between research and product and research and customer and research and business.

And then come back to the research team and say, I think we really need to do this. How can RL help us do that? And then go to the research team and say, you've got this great idea about this cool new thing we can do with RL. Let me go back to the product and try to blow their mind with this thing that they didn't even think about because they didn't know it was possible. Show up with something brand new and convince them we should just build a product for that because we can and because we think people will need it. And so in many ways, I'm a bit of a PM these days.

But I'm also a bit of a salesperson these days, you know, but I'm also a manager and I'm trying to continue to grow the incredible skills of this research team, both the people who have been with me for four years and the people who have just arrived out of their PhDs and make them into the next generation of successful Databricks talent that stays here for a while and then maybe goes on to found more companies like a lot of my former colleagues at Mosaic have. So it's kind of a little bit of everything, but I have had to make this choice about whether I'm just going to be really, really deep in

as a scientist, write code all day, get really, really good at getting the GPUs to do my bidding, or get good at leadership and running a team and inspiring people and getting them excited and growing them, or get good at thinking about product and customers and what combination I wanted to have there. And that combination has naturally led me away from being the world's expert on one specific scientific topic, but towards something I think is more important for our customers, which is understanding how to use science to actually solve problems.

There's an imaginative leap that you have to make from the technology to the persona of your customer and the empathy with that, that I imagine involves being in a lot of customer conversations. But it's an inversion of your thinking. It's not, here's a hard problem that we've solved. What can we do with it? It's keeping an index of important problems in your head and spotting possible solutions to that maybe.

I actually think it's the same skill as any good researcher. No good researcher should just be saying, I did a cool thing, let me find a reason that I should have done it. Sometimes, very occasionally, this leads to big scientific breakthroughs. But for the most part, I think a good, productive, everyday researcher should be taking a problem and saying, how can I make a dent in this? Or finding what the right questions are to ask and just asking them and coming up with a very basic solution.

All of these sound like just product scenarios to me, whether you're building an MVP, like figuring out a question that hasn't been asked before that you think is important to be asking and building an MVP and then trying to figure out whether there's product market fit or the other way around, finding a problem and then trying to build a solution to it. I don't think much research should really involve just saying, I did this thing because I could. That is very high risk and

It's hard to make a career out of doing that all the time because you're generally not going to come up with anything. I'm going out and trying to figure out what the important questions are to be asking, both asking new questions and then checking with my PM to see if that was the right question to ask and talking to my customers. It's just now instead of my audience being the research community and a bunch of PhD students who are reviewers and convincing them to accept my work,

my audience is now customers and I'm convincing them to pay us money for it. And I think that is a much more rigorous, much higher standard than getting a paper into NeurIPS. I had dinner with a customer earlier this week and they're doing some really cool stuff. They have some really interesting problems.

I'm going to get on a plane in two weeks and just go down to their office for the day and meet with their team all day and just learn more about this problem because I want to understand it and bring it back to my team as a question worth asking. You know, it's not 100% of my time, but...

I think you should be willing to just jump on a plane and go chat with an insurance company and spend a day with their machine learning team, learning from them and what they've done and hearing their problems and seeing if we can do something creative to help them. That's good research. And if you ever sent me back to academia, that's probably still exactly what I'd do. One of my favorite things that you and I spoke about at NeurIPS some weeks ago was the existence of a high school track

at the NeurIPS academic conference about AI. And I wonder if you could share a little bit about that and about what you saw and what that tells you about the next wave of thinking in AI. So the high school track at NeurIPS was really cool and also really controversial.

for a number of reasons. Is this just another way for students who are incredibly well off and have access to knowledge and resources and a parent who works for a tech company to get ahead further? Or is this an opportunity for some really extraordinary people to show how extraordinary they are and for people to learn about research much earlier than certainly I did and try out doing science?

But there are kind of generational changes in the way that people are interacting with computing. This is something that, you know, my colleague Hanlin, who was one of the co-founders of Mosaic, has observed. And I'm totally stealing from him. So thank you, Hanlin. Seeing companies that are founded by people who clearly came of age in an era where your interface to a computer was just typing in natural language, whether it was to Siri or especially now to ChatGPT. And that is just the way they think about a user interface. You want to build a system?

We'll just tell the AI what you want. And on the back end, we'll pick it apart and figure out what the actual process is in an AI-driven way, build the system for you and hand it back to you. That's a very different way of interacting with computing. But that's the way a lot of people who have grown up in tech over the past several years think about it, people who are graduating from college now or have graduated in the past couple of years, or especially people who are in high school now. ChatGPT is their iPhone. ChatGPT is their personal computer.

You know, it's not buttons and drop downs and dashboards and checkboxes and, you know, apps. It's just tell the computer what you want. And it doesn't work amazingly well right now. Someday it probably will. And that day may not be very far away, but that's a very different approach and one that is worth bearing in mind. I want to switch gears a little bit and get to a technical debate that we've had over the years as well, which is about the mix of techniques.

enterprises and app developers are going to use to apply AI to their data. And of course, RAG and in-context learning have been exciting developments for years because it's just so easy and it's just so appealing to put data in the prompt and reason about that with the best model that you can find. But there has been a

wave of excitement, a renewed wave of excitement, I'd say, around complementary approaches like fine-tuning and test-time compute, reinforcement fine-tuning from OpenAI, and lots more. And I wonder if now is the moment for that from a customer perspective, or if you think we're far ahead of our skis, and what's the right time and mix of these techniques that enterprises and app developers are going to want to use?

My thinking has really evolved on this. And you've watched that happen. But I think we've reached the point where the customer shouldn't even know or care. I just want an AI system that is good at my task. And I want to define my task crisply.

and I want to get an AI system out the other end. And whether you prompt, whether you do few-shot, whether you do an RL-based approach and fine-tune, whether you do LoRA or whether you do full fine-tuning, or whether you use DSPy and do some kind of prompt optimization, that doesn't even matter to me. Just give me a system, get me something up and running, and then improve that system. Surface some examples that may not match

what I told you my intention was. And let me clarify how I want you to handle those examples as a way of improving my specification for my system and making my intention clearer to you. And now do it again and improve my system. Let's have some users interact with the system and gather a lot of data.

And then let's use that data to make the system better and make the system a better fit for this particular task. Who cares whether it's RAG? Who cares whether it's fine-tuning? The only thing that matters is did you solve my problem and did you solve it at a cost I can live with? And can you make it cheaper and better at this over time? And from a scientific perspective, that is my research agenda right now at Databricks. But you shouldn't care how the system was built.

You care about what it does and how much it costs, and you should be able to specify, this is what I want the system to do in all sorts of ways, natural language, examples, critiques, human feedback, natural feedback, explicit feedback, everything. And the system should just improve and become better at your task the more feedback you collect. And your goal should be to get a system out in production, even if it's a prototype, as quickly as possible. So you start getting that data and the system starts getting better. And the more it gets used, the better it should get.
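
A rough sketch of that loop, under stated assumptions: `build_system` stands in for whatever mix of prompting, retrieval, or fine-tuning produces the next version of the system, and the human feedback is simulated with a fixed answer key. None of the names refer to a real API; the point is only the shape of the loop, where every round of corrected examples sharpens the specification.

```python
# Hypothetical sketch of the improvement loop described above: deploy a
# prototype, collect feedback on real interactions, fold corrections back
# into the task specification, and rebuild.

from typing import Callable

def build_system(spec: dict) -> Callable[[str], str]:
    """Produce a system from the current specification (instructions plus
    accumulated corrected examples). A real builder would choose prompting,
    retrieval, or fine-tuning behind this interface."""
    def system(query: str) -> str:
        examples = "; ".join(f"Q: {q} -> A: {a}" for q, a in spec["examples"])
        return f"[answer to '{query}' given '{spec['instructions']}' and examples: {examples}]"
    return system

def collect_feedback(system: Callable[[str], str], queries: list[str]) -> list[tuple[str, str]]:
    """Run the prototype on real queries and let a 'human' flag outputs that
    miss the intent. Here the human is simulated by a fixed answer key."""
    answer_key = {"refund window?": "5 business days"}  # stand-in for real review
    corrections = []
    for q in queries:
        output = system(q)
        expected = answer_key.get(q)
        if expected and expected not in output:
            corrections.append((q, expected))  # clarified intent, kept as a labeled example
    return corrections

spec = {"instructions": "Answer support questions from policy docs.", "examples": []}
for round_num in range(3):
    system = build_system(spec)
    corrections = collect_feedback(system, ["refund window?", "renewal date?"])
    spec["examples"].extend(corrections)  # the spec gets sharper every round
    print(f"round {round_num}: {len(corrections)} new corrections folded back in")
```

In practice the answer key is replaced by actual users or reviewers, and `build_system` is where the choice between prompting, RAG, fine-tuning, or longer context gets made and re-made on each iteration.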

And the rest, whether it's long context or very short context, whether it's RAG with a custom embedding model and a re-ranker, or whether it's fine-tuning, at that point, you don't really care. And so the answer should be a bit of all of the above. And I think most of the successful systems I've seen have had a little bit of everything or have evolved into having a little bit of everything after a few iterations. In previous versions of this conversation, you've said, dude, RAG is...

RAG is it. That's what people really want. There are other things you can do to extend it, but so much is possible with RAG that we don't need to look past that horizon yet. And I hear you saying something very different now. I hear you saying customers don't care, but you care. And you sound like you're building a mix of things. Yeah, I think what I'm seeing, the more experience I get, is there is no one-size-fits-all solution,

that RAG works phenomenally well in some use cases and absolutely keels over in other use cases. And it's really hard for me to tell you where it's going to succeed and where it's not. My best advice to customers right now is try it and find out. So there should be a product that can do that for you or help you go through that scientific process in a guided way so you don't have to make up your own progression. And so really, for me, it's now about like, how can I meet our customers where they are? Whatever you bring to the table,

Tell me what you want the system to do. And right now, we'll go and build that for you and figure it out together with your team. But I think we can automate a lot of this and make it really simple for people to simply bring what they have, declare what they want, and get pretty close to what a good solution or at least the best possible solution will look like. It's also part of my recognition that this isn't a one-time deal where you just go and solve your problem. It's a...

repeated engagement where you should really just try to iterate quickly, get something out there and get some interactions with the system. Learn whether it's behaving the way you want it to, learn from those examples, and go back and build it again and again and again, and do that repeatedly until you get what you want. And a lot of that I think can be automated. At least that's my research thesis, that we can automate this, or at least have a very easy guided way of going through this process,

to the point where anybody can get the AI system they want if they're willing to just come to the table and describe what they want it to do. What's the implication for this sphere of opportunity of new model paradigms such as test-time compute, now even open source with DeepSeek? So I would consider those to be two separate categories.

I was playing this game with someone on my team actually earlier today where he was telling me like, yeah, DeepSeek has kind of changed everything. I was like, didn't you say that about Falcon and Llama 2 and Mistral and Mixtral and DBRX and so on and so on and so on? We're just living in an age where the starting point we have keeps getting better.

And we get to be more ambitious because we're starting further down the journey. This is like when our friends at AWS or Azure come out with a new instance type that's more efficient or cheaper.

I don't go and look at that and go like everything has changed. I go and look at that and go, those people are really good at what they do. And they just made life better for me and my customers. And we get to work on cooler problems and a lot more problems have ROI because, you know, some new instance type came out that's faster and cheaper. It's the same thing with models.

And for new approaches, it could be something like DPO, or it could be something like test-time compute. And, you know, those are probably not comparable with each other. But these are just more things to try. These are more points in the trade-off space. I think about everything in life as a Pareto frontier on the trade-off between cost and quality. And test-time compute gives you this very interesting new trade-off,

possibly between the cost of creating a system, the cost of using that system, and the overall quality that you can get. Every time another one of these ideas comes out, the design space gets a little bigger, more points on this trade-off curve become available, or the curve moves further up and to the left or up and to the right, depending on how you define it. And life gets a little better and we get to have a little more fun. And for this product and the system that we're all building at Databricks,

things get a little more interesting and we can do a little more for our customers. So I don't think there's any one thing that changes everything, but it's just, it's constantly getting easier and constantly getting faster and constantly getting more fun to build products and solve problems. And I love that. A couple of years ago, I had to sit down and build a foundation model if I wanted to work with it. Now I already start way ahead. I love that. Jonathan, I've got some rapid fire questions that I'd like to use to bring us home.

Bring it on. Let's do it. So what's a hard lesson you've learned throughout your journey? Maybe something you wish you did better, or maybe the best advice that you received that other founders would like to hear today. So I'll give you an answer for both. I mean, the hardest lesson I've learned is honestly, it's been the people aspects. It's been how to interact productively with everyone, how to be a good manager.

I don't think I was an amazing manager four years ago, fresh out of my PhD. And my team members who have been with me that long or the team members who were with me then will surely tell you that. I like to hope the team members who are still with me think I'm a much better manager now. And the managers who have managed me that entire time, who have trained me and coached me, think I'm a much better manager now. Learning how to interact with colleagues in other disciplines or other parts of the company, learning how to handle tension or conflict,

in a productive way, learning how to disagree in a productive way and focus on what's good for the company, learning how to interact with customers in a productive way and a healthy way, even when, you know, sometimes you're not having the easiest time working with the customer and they're not having the easiest time working with you. Those have been incredibly hard-won lessons. That's been the hardest part of the entire journey. The part where I've grown the most, but also the part that has been the most challenging.

The best advice I've received, probably from my co-founders, Naveen and Hanlin. Like one piece of advice from Hanlin that sticks in my mind is just, he kept telling me over and over again that a startup is a series of hypotheses that you're testing. That kept us very disciplined in the early days of Mosaic, stating what our hypothesis was, trying to test it systematically, finding out if we were right or wrong. That hypothesis could have been scientific, it could have been product, it could have been about customers and what they'll want.

But turning that into a systematic scientific endeavor made it a lot easier for me to understand how to make progress when things were really hard, and they were really hard for a long time. I know that wasn't a rapid fire answer to a rapid fire question, but it's a question I feel very strongly about. Aside from your own, what data and AI infrastructure are you most excited about and why? I think there are two things I'm really excited about. Number one is

products that help you create an evaluation for your LLMs. I think these are fundamental infrastructure at this point. There are a million startups doing this, and I think all of them are actually pretty phenomenal. I could probably give you a laundry list of at least a dozen off the top of my head right here, and I bet you could give me a dozen more that I didn't name because we're all seeing great pitches for this. I have a couple that I really like, a couple that I've personally invested in, but

I think this is a problem we have to crack. It's a really hard problem. And I think it's a great piece of infrastructure that is critical. The other thing that I'm really excited about personally is data annotation.

I think that just data annotation continues to be the critical infrastructure of the AI world. No matter how good our models get and how good we get at synthetic data, there's always still a need for more data annotation of some kind. And revenue just keeps going up for the companies that are doing it. The problem changes, what you need changes. I don't know. I think it's a fascinating space. In many ways, it's a product. In many ways, like my customers these days, the data scientists at whatever companies I'm working with,

are also doing data annotation or trying to get data annotation out of their teams. Building an eval is data annotation. And, you know, I mentioned two things. These are both my second favorites. I think they're the same at the end of the day.

One is about going and just buying the data you need. The other is about tools to make it easy enough to build the data you need that you don't need to go and buy it. And I have a feeling both kinds of companies have made a lot of progress on AI augmentation of this process. But when I do the math on the original Llama 3 models, and this is the last time I really sat down and did the math, my best guess was $50 million worth of compute and $250 million worth of data annotation.

And that's the exciting secret of how we're building these amazing models today. And I think that's only going to become more true with these sorts of reasoning models where I don't know that reasoning itself is going to generalize, but it does seem like you don't need that many examples of reasoning in your domain to get a model to start doing decent reasoning in your domain.

And that's going to put even more weight on figuring out how to get the humans in your organization, or to get humans somewhere, to help you create some data for your task so that you can start to bootstrap models that reason on your task. Beyond your core technical focus area, what are the technical or non-technical trends that you are most excited about? So I think there are two, you know, one as a layperson and one as a specialist. As a layperson, I'm watching robotics very closely.

I think, you know, for all of the interesting data tasks that we have in the world, there are just a lot of physical tasks in the world that it would be amazing if a robot could perform. Like, thank goodness for my dishwasher. Thank goodness for my washing machine. I can't imagine what my life would look like if I had to scrub every dish and scrub every piece of clothing to keep it clean. Robotics is in many ways already in our lives. These are just very specific single purpose robots. But

If we can really make a dent in that problem, and I don't know if we will this decade or in three decades, like VR, I feel like robotics is a problem that we keep feeling like we're on the cusp of. And then we don't quite get there, but we get some innovation. I love my robot vacuum. That is the best investment I've ever made. I got my girlfriend a robot litter box for her cats a few weeks ago. I get texts every day going, oh my God, this is the best thing ever.

And this is just scratching the surface of just the daily tasks we might not have to do. I would love something that could help people who, for whatever reason, can't get around very easily on their own to get around more easily, even in environments where they're not necessarily built for that.

I have a colleague who I heard say this recently, so I'm not going to take credit for it. But the idea of just things that make absolutely no logistical or physical sense in the world that you could just do if you had robots. In Bryant Park right now, right below our Databricks office in New York, there's a wonderful ice skating rink all winter. If you were willing to just have a bunch of robots do a bunch of work, you could literally take down the ice skating rink every night and set up a beer garden.

and then swap that every day if you really wanted to. Things that just make no logistical sense because they're so labor intensive. You could just do that. And suddenly that makes a lot of sense. You can just do things that are very labor intensive and resource intensive. So that gets me really excited.

From data intelligence to physical intelligence. Ah, well, somebody's already coined the physical intelligence term, but yeah, I don't see why not. And honestly, we're dealing with a lot of physical intelligence situations at Databricks right now. So I think data intelligence is already bringing us to physical intelligence, but there's so much more one can do. And we're just scratching the surface of that. It cost Google, what, $30 billion to build...

extraordinary autonomous vehicles. And the whole narrative in the past year has completely shifted from autonomous vehicles are dead and that was wasted money to, oh my gosh, Waymo might take over the world. So I'm excited about that future. I just wish I knew whether it was going to be next year or in 30 years. The other trend though, I mean, I spend a lot of time in the policy world and I think that's maybe even a good place to wrap up. Before I was an AI technologist, I was an AI policy practitioner.

That's how I got into this field in the first place. That's why I decided to go back and do my PhD. I spend a lot of time these days just chatting with people in the policy world, chatting with various offices, chatting with journalists, just working with NGOs, trying to just make sense of this technology and how we as a society should govern it. It's something I kind of do in my spare time. I don't do it officially on behalf of Databricks or anything like that, just because I think it's important that we, as the people who know the most about the technology,

try to be of service. But I think coming as a technologist and asking how can I be of service and what questions can I answer? And can I help you think this through and figure out whether this makes sense? It's a very fine line and you need to be careful about it. But if you really come in with kind of your heart set on figuring out how to be of service to the people whose job it is to speak on behalf of society or to think on behalf of society, you can make a real difference. But you've got to build a reputation and build trust over many years.

But the flip side is you can do a lot of good for the world. That is definitely a good place to leave it. So Jonathan Frankle, Chief AI Scientist of Databricks, thank you so much for joining. This was a lot of fun. Thank you for having me.