Hi, listeners. Welcome to No Priors. This week, we're speaking to Arvind Jain, CEO and co-founder of Glean. Glean is an AI-powered enterprise search and knowledge management platform that lets you access all the internal documents, Slack messages, and other content your company may have, and also enhance workplace productivity by using different applications on top of that. Prior to Glean, Arvind had a really storied career. He co-founded Rubrik. He was early at Google and worked on search there, amongst other things.
And so we're very excited to have him here today. Arvind, welcome to No Priors. Thank you for having me. So I'm really excited about this. I've known you for years and Elad's known you for maybe 15 more years than that. You're an amazing, repeat successful founder with Rubrik and Glean.
I want to start by just asking you about search. You've been a search guy since before it was cool, for a long time when it felt, not solved, but not as dynamic. How broadly has search changed because of LLMs? I've been working on search for almost 30 years now, a long, long time. The paradigm has completely shifted. I would say that search had been static for a long time. It was this
keyword-based paradigm. People ask questions, you take the words, try to find them in documents, and bring those documents up to the users. But LLMs have completely changed it. The main thing they have done for search is that they have allowed us to really deeply understand
the question that a user is asking. And similarly, they allow us to very deeply understand what a document is about. So you can actually match people's questions with the right information conceptually. And that gives us so much more power. It's not brittle anymore. And I think it's been a foundational technology
to really evolve search into these new experiences that you're seeing these days, where you can go far beyond just surfacing a few links to an end user, to actually deeply understanding their questions and answering them directly using the knowledge that you have. If I remember correctly, Glean got started in the more traditional search world
and that, as these foundation models and LLMs have come to the fore, you've really shifted how you think about both the capability set that you provide and how you bridge things. Could you tell us a bit more about how you started off building the systems, how that's shifted, and then how you've mapped new use cases against it? Because you're now effectively this really interesting platform
that can be used in all sorts of ways inside of an organization, around the corpus of information they have. I'd love to even just hear about the technology transition. How did you think about that? When did it happen? I think you really lived through it in a really meaningful way. You know, we had good timing, I would say. So I started thinking about building Glean in
late 2018, and started the company in early 2019. And the interesting thing is that transformers as a technology had emerged by then. Now, the whole world was not talking about it, but in search teams, like at Google, we saw
the power of embeddings and how they could fundamentally change search. And so we had the luxury of actually seeing this in action. So version one of our product already used transformers for semantic matching. We didn't have these terms; nobody used to call it vector search. These terms had not been invented yet,
or generative AI, for that matter. And so internally, we used to call it embedding search. And it was a core technology that we started out with. So you were super early to it, actually. Yeah. And the models at the time were not as powerful as today. We started with this BERT model that Google had put in the open domain, which was trained on all of the internet's data and knowledge.
And we would then take those models and, for every customer of ours, build custom embeddings on their business content. And that would power the semantic part of the search. But remember, in search as a technique, there's been a lot of focus on embeddings and vector search over the last few years. But that's actually only one part of
building a good search system. Because if you think about an enterprise, imagine a company that has been around for a few decades. They have tons and tons of information spread across many, many different systems. A lot of that information has become obsolete by now because it was written many years back.
And so when you build a search product, it's not enough to say, hey, I want to understand somebody's question and I'm going to match it with the information that semantically or conceptually matches what the user is asking. You've got to solve for other problems too. You've got to actually pick information that's correct today, that is up to date,
that has some authority, like somebody who's an expert on this topic has actually written that document. So we had to do all of those other things too, to truly pick the right knowledge and bring it back to people. So we started with building the product in that shape and form. It was a very different product, actually; nobody had solved enterprise search as a problem before. In fact, the interesting thing that I remember is that even though I was coming off of a successful company, like with Rubrik, where we had good success,
I don't think people really wanted to invest in enterprise search, or in me for that matter, because this problem was not exciting. It was traditionally a very bad problem, right? There were all these search engines. FAST, I remember, in the early Google days, was
sort of an enterprise search attempt, I think, based in Norway. There were lots of attempts at this. A lot of attempts and no successes. Why do you think it didn't work? Because it felt like an awful market. It was like a graveyard of all these companies that tried to solve the problem and didn't. Part of it was just that I think search is a hard problem. In an enterprise, even getting access to all the data that you want to search,
it was such a big problem. In the pre-SaaS world, there was no way to go into those data centers, figure out where the servers were, where the storage systems were, and try to connect with the information in them. It was a big challenge. SaaS actually solved that issue. Most of those search companies started in the pre-SaaS world. They failed because you just couldn't build a turnkey product. But SaaS allowed you to actually build something. My insight was
that, look, the enterprise world has changed. We have these SaaS systems now, and SaaS systems don't have versions. All customers have the same version, they are open, they're interoperable. You can actually hit them with APIs and get all the content. I felt that the biggest problem was actually solved, which was that I could easily go and bring all the enterprise information and data into one place and
build this unified search system on top. So that was actually a big unlock. - So it was the rise of these connectors and APIs internally. So you're using Google Docs instead of older school systems, or you're using Slack, or you're using these new tools that now provide you access to the data or underlying content. - You guys must remember Google Search Appliance? - Yeah. - The idea of, I need to slurp your data continuously into a hardware appliance in order to actually do search, is ludicrous. - It was a challenge.
And by the way, on the origins of Glean: at Rubrik, we had this problem. We grew fast. We had a lot of information across 300 different SaaS systems and nobody could find anything in the company. People were complaining about it in our pulse surveys. And I always run IT in my startups. So this was a complaint that came to me; I had to solve it.
So I tried to buy a search product and I realized there's nothing to buy. That's really the origin of how Glean got started as a company. And so that was one big thing: SaaS made it easy to connect your enterprise data and knowledge to a search system. That made it possible for us to, for the very first time, build a turnkey product.
But there are a lot of other advances as well. One is that businesses have so much information and data. One interesting fact: one of our largest customers has more than 1 billion documents inside their company. Now, compare this: when Elad and I were working on search
at Google in 2004, the entire internet was actually about a billion documents. There's a massive explosion of content inside businesses. So you have to build scalable systems, and you couldn't build a system like that before, in the pre-cloud era. I would have spent all my time just trying to build that scalable distributed system, which we don't have to do anymore thanks to all the cloud technology. And then, of course, transformers. That's really the big unlock that we had, which was
that we could actually understand enterprise information more deeply. And it was very necessary in the enterprise compared to on the web. On the web, even if you don't have good semantic understanding, there is so much that you can learn from people's behavior because you have a billion people coming and using your product. In the enterprise, you don't have that luxury. So you have to make up for that lack of signal from users with other techniques, and transformers are one of them.
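As a rough illustration of the point made above, that semantic matching is only one part of a good search system and that freshness and authority matter too, here is a minimal Python sketch. It is not Glean's ranking function: the `Doc` fields, the two-dimensional toy embeddings, the 180-day half-life, and the way the three signals are multiplied together are all assumptions made up for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    title: str
    embedding: list[float]  # toy vector; a real system would get this from a model
    age_days: float         # how long ago the document was last updated
    authority: float        # hypothetical 0..1 score, e.g. written by a known expert


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score(query_vec: list[float], doc: Doc, half_life_days: float = 180.0) -> float:
    """Blend semantic similarity with freshness and authority (illustrative weights)."""
    semantic = cosine(query_vec, doc.embedding)
    freshness = 0.5 ** (doc.age_days / half_life_days)  # exponential decay with age
    return semantic * (0.5 + 0.5 * freshness) * (0.5 + 0.5 * doc.authority)


def rank(query_vec: list[float], docs: list[Doc]) -> list[Doc]:
    """Return documents ordered from best to worst blended score."""
    return sorted(docs, key=lambda d: score(query_vec, d), reverse=True)


docs = [
    Doc("PTO policy (2016 draft)", [0.9, 0.1], age_days=3000, authority=0.3),
    Doc("PTO policy (current)", [0.85, 0.2], age_days=30, authority=0.9),
    Doc("Build system guide", [0.1, 0.9], age_days=10, authority=0.8),
]
query = [1.0, 0.0]  # stands in for the embedding of a question about PTO
ranked = rank(query, docs)
```

In this toy data the stale 2016 draft is the closest semantic match to the query, but the blended score puts the current policy first, which is the behavior the passage argues for.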
It sounds like you feel a combination of, I'd call it, more traditional IR and search techniques and embeddings is relevant. Do you think that persists? Where would you want bespoke infra or signals like freshness and authority, and how much do models just do in the end? Yeah, I mean, I think there's always this thought that the models will have
near-infinite context windows and you can just give them everything and they can figure things out automatically. But I don't think they're anywhere close to that happening. I'll give you an example. Let's say that models are mimicking human intelligence, right? So they're getting more and more capable of working the way we work
as humans. But as a human, imagine I give you a question and then I say, here's everything. In a completely non-organized fashion, I give you a whole bunch of documents, 1 million of them. Let's imagine you have the memory powers and the speed; it still just feels like a very complicated thing. It's very hard to make sense of
information that is, for example, being given to you out of order. Say I give you one document that is from today, something from four months back, something from three years back, then something again from two days back. If I give you information in a manner where it's not organized in any shape or form, then as a human, you're going to have a lot of difficulty reasoning over it. So we think about the models the same way. There is a good amount of work that you have to do to present the information
to the model in some organized fashion. That's when they're going to do a much better job reading that information, reasoning over it, and giving you the answers. And sure, you can give them more and more over time, but it still matters how you provide them with the right information. Now that you have this corpus of information, you've basically aggregated all the internal documents of a company, which in itself is incredibly useful just for search.
But you've also gone down the route of enabling applications to be built on top of it in different ways. Could you talk a bit about that and what are some of the common use cases that you're seeing? So we started with this vision of building a Google for your work life. But then models got better and developed these reasoning and generation capabilities. So first, it changed our product. Our new product, Glean Assistant, looks and feels more like ChatGPT.
So instead of me asking questions and seeing a bunch of links coming back to me, now you converse with Glean. You ask questions and it works just like ChatGPT: you come and ask a question, and it's going to take all of the world's knowledge
and, additionally, it's going to take all of your internal company's data and knowledge and use that in a safe and secure manner, knowing who you are and what information you can use within the company, to answer questions for you. So that's the first progression in terms of our product. We evolved from being a Google to something that looks more like ChatGPT, a more powerful version of ChatGPT
inside your company. As you build that, you can think of Glean Assistant more like a personal assistant that you're giving to every employee in your company. It's a tool, it's your sidekick, it's always available to help you with whatever questions or tasks you have. It's going to use all of your company's context and data to help you with your work. But,
you know, businesses are actually a lot more interested not in that, but in thinking about how they can transform their company with AI, or how they can take specific business processes where they're spending a lot of money
and bring automation into them with AI. So before agents became the talk of the day, and of course everybody's building agents now, but early last year when agents had not yet taken off, people were asking us, hey, we need to build more curated applications using this data platform that you have.
So as an example, HR teams would come to us and say, look, we love Glean Assistant. People come in there, ask questions about benefits and PTO and vacation policy and whatnot, and it works great, but sometimes it uses content that's not authorized or blessed by us. And if somebody is coming and asking questions on people-related topics, we want Glean to only use the curated content that our people team has created.
And we want it to behave in a particular way, with a particular tone and all of that. So that was a request that we started to get last year: can we create more specific curated experiences, function by function, for different use cases? So we started to build that, and we were not calling them agents, we were calling them apps.
Now, of course, people think of them more as agents because it's no longer just asking questions and getting answers. You want these specific functional experiences to actually replace the business process, which involves not just answering questions, but actually doing some work in those systems.
Arvind, when you talked about access to the right data with the right authority, it really begs the question of access control, right? In a platform like Glean, when you have all this unstructured data,
this seems much more complicated. What is your overall stance, or how do you think this is going to work in the future? Yeah. Well, so look, enterprise information, in some sense, is governed and it's protected. Most of the knowledge, I should say like 90% of the knowledge inside the company, is private in some shape or form. You'll have a document that maybe is private to you, or that you share with a few other people. That's the nature of
enterprise knowledge. That's fundamentally the way it works. And you can't, for example, build a model inside your enterprise, dump all of your internal company's data and knowledge into it, and then make that model available to everybody in the company. Because if you do that, you're leaking information inside the company. You're letting somebody in the engineering team see sensitive stuff which
probably only HR teams should be able to see, as an example. So any AI experience that you build inside the company has to think about security and governance and permissions at a fundamental level. And that's what we do in Glean. When we connect with all these different systems inside an enterprise, if we index a particular document from Google Drive or a conversation from Slack, we also keep track of which users can actually access that information.
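The idea of storing who can access each item alongside the indexed content, and filtering on that at query time, can be sketched roughly as follows. This is only an illustration under stated assumptions, not Glean's implementation: the `IndexedDoc` and `User` classes, the group names, and the naive substring match that stands in for real retrieval are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedDoc:
    doc_id: str
    source: str  # e.g. "drive" or "slack" (labels invented for this sketch)
    text: str
    allowed_users: set[str] = field(default_factory=set)
    allowed_groups: set[str] = field(default_factory=set)


@dataclass
class User:
    user_id: str
    groups: set[str] = field(default_factory=set)


def can_access(user: User, doc: IndexedDoc) -> bool:
    """A signed-in user may see a document if listed directly or via a group."""
    return user.user_id in doc.allowed_users or bool(user.groups & doc.allowed_groups)


def search(user: User, query: str, index: list[IndexedDoc]) -> list[IndexedDoc]:
    """Return only matching documents the user is permitted to see.

    The substring match is a placeholder for real retrieval; the point is that
    the permission check is applied before anything reaches the user or a model.
    """
    terms = query.lower().split()
    return [
        doc
        for doc in index
        if can_access(user, doc) and any(term in doc.text.lower() for term in terms)
    ]


index = [
    IndexedDoc("d1", "drive", "Salary bands for 2024", allowed_groups={"hr"}),
    IndexedDoc("d2", "slack", "Engineering onboarding guide", allowed_groups={"eng", "hr"}),
]
engineer = User("alice", groups={"eng"})
hr_partner = User("bob", groups={"hr"})

engineer_hits = search(engineer, "salary", index)
hr_hits = search(hr_partner, "salary", index)
```

Here the engineer's search for "salary" comes back empty while the HR user sees the salary document, even though both queries run against the same index.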
And this is fundamental. Any access to data that's going to happen through our platform is going to match those permissions. The users have to be signed in, and we will only let them use information that they have permissions for. And this is an important problem to solve. Unless you have infrastructure like that, you cannot roll out AI safely inside your enterprise. I learn a lot from people who work on search, especially search at any sort of scale, because you get all sorts of weird user behavior.
And so, related to your idea of, you know, us with our personal assistant team, what are some behaviors you see from end users, in terms of how they're using Glean or AI in general, that you think we should just do more of? I'm always very surprised when I learn from Google people about the behaviors around navigational search, and how many queries are one word, or what the popular queries are, and those sorts of
patterns. And so I'm sure you see Glean and AI super users. One of the biggest surprises for me: I always felt that we were building such an intuitive product. There's little, there's no UI. There's one box and you ask a question, you put in a search, and what's the big deal? Why do you have to learn how to use this?
And we realized that as we added more and more of these natural language capabilities, and the ability for you to ask a really long question, like a paragraph-long set of instructions given to us,
people won't do it. I think everybody has been trained over the last 20 years to type in one or two keywords. Google has sort of taught us what search can do. So with search, we never had a problem. We launched our product and had immediate high usage. Nobody was confused about how to use the product. With Assistant, people didn't know what to do with it.
There were some people with more curiosity, and they would ask all kinds of questions that we couldn't actually answer. For example, somebody says, hey, what should I do with my life? But anyways, coming back to this, one of the key learnings is that
AI is actually very unintuitive. For most people, you have to expose these capabilities to them in an incremental fashion, with things that are more meaningful to their day-to-day work. For example, if I'm an engineer, prompt the user sometimes: look, you can actually learn about a new piece of technology. I can create a two-page tutorial for you right now.
And you have to understand what people's core work is. Then you can give them these prompts, prompts for them to start experimenting and get excited about trying something out with AI. One thing, in fact, which I would also add here is this.
A lot of the time with AI, businesses are excited. They have a lot of dollars to spend on AI. But they're also asking for ROI. They're going to make all this investment. What are the returns? What are the efficiency gains that I'm going to be getting? Or what are the top line improvements that I can make to my business?
There's a lot of focus on that. And one thing that often gets overlooked is education. Because the world is changing. Imagine that three years from now, you wake up and you're the CEO of a large enterprise. What do you want to see in your workforce? You want to see people who are trained and are AI-first.
They're experts, they know how to leverage the strengths of AI, because this is a difficult technology. It's not perfect, it's not easy. It makes mistakes, it hallucinates, but yet it's powerful. And if you become an expert, you can get a lot done with it. That has to be the objective today. As leaders think about AI, how do you give people tools that
motivate them to bring AI into their day-to-day work? You had an amazing career between being early at Google, starting Rubrik, now starting Glean and running it. What was unexpected about doing Glean? Because you'd gotten to so much scale. You'd done such amazing things in the context of Rubrik. What was hard or unexpected or just very different about Glean that you didn't anticipate? From a product side, one of the most interesting things for me was
how hard it was to actually roll the product out to our customers. We had a very different journey at Rubrik compared to Glean. At Rubrik we were in an established market: there were budgets, there were dollars, and you had to replace an old technology with a new technology. Here, we were in a market where we had no
budgets. There was no concept of buying a search product in the enterprise. And everybody thought that, yeah, this is an important problem, but it's not a line item in my business priorities. It's a vitamin, not a painkiller. People are living without it. Well, yeah, that's true. I mean, you live without something you don't have. That's by definition true. So we had a lot of challenges. We had to do a lot of evangelism to actually get
the right folks who wanted to be the innovators, for them to make that bold call and buy a product that they're not used to buying. So that's the first part of it. You have to create the market for this, which was difficult. And second, which is actually a very interesting one: our product was working well. It was doing good search, letting people find things. But then we started to hear from businesses that, oh, I'm scared of good search.
I don't want a good search product in my company because I have all these governance gaps. I have sensitive information all over the place, and now people are discovering these things. So when we launched, for example, people found salaries of other people. At one of our customers, somebody found a sensitive M&A doc for something that had not yet happened.
So people actually were very, very scared of having good search. That was an interesting challenge. We did some good work. We were doing it safely and securely. But if you don't have good governance, now we can't sell because the product is so good. It seems like LLMs should be able to help with that,
right, in terms of classifying documents and surfacing, hey, this one may be sensitive, do you want to secure it, et cetera? Yeah, that's exactly right. We were actually forced to build that. We were forced to go above and beyond respecting permissions in individual systems, to knowing who you are and what you're asking. Should you have the right to even ask the question? Or, when the information comes back, does it
feel safe enough for us to show it to you? So in that sense, we ended up becoming a security product. A lot of companies actually buy us to fix governance in their data and systems and become AI-ready: AI-ready for the Glean search product and the Glean Assistant, but also for all the other AI products that you can buy inside the enterprise. So that was a very interesting journey. But then personally, for me,
at Rubrik, I wasn't the CEO. I ran R&D as one of the founders of the company. And here I actually learned how to become a CEO. And I don't think I've learned it yet. That's a constant
challenge and learning that I go through, because fundamentally, I'm still an engineer. Everything I do, that's the mindset that I have. So growing out of that into being able to run a large business, that's a personal transformation that I'm going through. One thing that I think is striking is that from a go-to-market perspective, you all have really focused on big enterprises, right? And you mentioned some of these enterprise data needs.
A lot of people always just want to do PLG, and you've really done the top-down sale. It's been incredibly successful. You've done it twice now, right? Because Rubrik was largely that as well. Could you talk a little bit more about when it makes sense to do
big direct enterprise deals versus a PLG motion and how you think about that as you build businesses? Because I think it's very differentiated and most people just can't pull that off. So I'm curious about how you think about when to do it and then how to do it. Just to be candid, when we started, my dream was to do PLG. I'm an engineer, and I wanted the company to have engineers and the product to sell itself on the web. Who doesn't want that? It was something that was a desire for us.
But the problem is that our product is by definition a company-wide product. We cannot offer the product to one individual inside a company. Even for one person, their search needs require us to search over the entire company's information for them. So it's expensive. You have to index all of your company's data and knowledge. And so we never had
a concept where we could make it available to one or two or 10 people inside the company. So we were sort of forced, just structurally, to build in that fashion, where it is enterprise; we roll the product out company-wide, to every employee.
That's what makes it cost effective. But coming back to your question, the standard approach that I think companies prefer now is that they think of PLG as basically lead gen, as a funnel you nurture and expand using an enterprise sales motion.
So the right recipe for me, if I had a choice: I would start both motions simultaneously. I wouldn't say that, look, for the first three years, I'm going to focus on just being PLG and then bring in enterprise sales later, because you're leaving a lot on the table. Timing always matters. And so you have to start the motions at the same time. Arvind, one thing that we have talked about that I feel like must have been...
I mean, hard. The priors on this market were not great, right? And we talked a little bit through the rationale of,
you know, you feeling like you really saw the problem internally anyway, and understanding that there are these architectural, foundational things that have changed in terms of the movement to SaaS and API-based integrations and such. But still, I think it's a really good question of advice for founders or maybe people joining startups. When should you agree with the priors that something is a bad market, or how should you think about that question? So I'll share a few things on this.
Number one, I think as engineers, first of all, there are always doubts. The more you look at priors, the more likely you're going to ultimately kill your own idea. There is a lot sometimes. Everything's been tried. Yeah, everything's been tried. A lot of things have failed. And I think for any given idea, there are 10 reasons why it won't work as you start to go into details. Sometimes a simpler approach is helpful, which is...
well, there's a problem. You talk to people, they have this pain and they feel it, which clearly means that nobody is actually solving it yet, because the pain exists.
And so don't go into details anymore. Just do it. Things will get figured out over time. At least for me, it was actually unusual. I'm an engineer by training and I'm naturally trained to question. And there's a lot of self-doubt in my mind. So I don't know what happened to me when we started Glean. Because there were all these people saying, no, don't do it. And somehow they couldn't discourage me. I just felt that...
this was an exciting problem. I knew everybody in the world has this issue. Even at Google, it was a big joke we had internally. All of us were spending all our time making it easy for people to find things, but not for us internally at Google.
It was super hard to find anything inside the company. So I think I somehow found that conviction. I was sort of being lazy, not willing to go into the details and look at all those priors, and just do it, just solve it. I mean, that's what I think worked for us in this particular case. I feel like Glean had three big components to it that all came together that you mentioned earlier, right? There was the need that you identified just as
somebody running IT for your own company, and to your point, it goes back to Google that this was a need. And every company that I've talked to has always wanted to build search and directories and all this stuff. The second thing is this rise of connectors and APIs in the context of existing enterprise software that everybody's using, so you can extract the data more easily. And the third thing was the big shift in terms of the underlying technology, right? The shift in terms of what search is capable of: these foundation models, embeddings, et cetera. Given the latter two,
are there other big opportunities that Glean isn't going to work on that you've identified as really interesting areas that suddenly are tractable again? I think for us right now, the focus remains on the two core products that we have. So the way we think about our company is that we have this really powerful end user
AI assistant that helps every person work differently in the future. And then we have this agent platform that you can use to inject AI into every one of your business processes, make them better, make them more efficient.
And I think we have been making big promises on both to our customers. The way I describe and pitch our product to our customers is the following: come to Glean, ask any questions or give it any tasks. Glean will use all of the world's knowledge and all of your internal company's data and knowledge in a safe and secure way and answer those questions for you or complete those tasks. But actually, I just promised you that Glean does everything. You don't have to work anymore. We're a long, long way from actually even delivering on the pitch that I just mentioned to you. I think we have to
understand knowledge properly, pick the correct information, throw away the old information. There are so many challenges there, so many issues. People talk about hallucinations as a big problem with AI models. We feel like a bigger problem for us is not even hallucinations. It's that most of the time you can't even find the right information. Sometimes it's not there. People are asking questions, but nobody wrote it down. Sometimes we are not able to
find the needle in the haystack and we pick the wrong thing. And so there are a lot of challenges. And I think we will be working on this problem for a long, long time. And I don't see us having
any need, by the way, to do something different. Just solving this one problem itself is a big, big success. So we're going to stay focused on these two products. But let me also talk to you a little bit about the vision for the future. I think it's sort of well accepted that AI is going to change everything. AI is going to change how people work. AI is going to change how businesses even look and feel, what kind of workforce you have in the future.
And one thing that's going to fundamentally happen is that each one of us is going to have this amazing team
of, call it assistants, co-workers, coaches that are truly personal to you. You're always surrounded by that team. And this team knows everything about you, your work life, what you need to do today, and proactively helps you, does 90% of your work for you, and also helps you get better at what you do: upskills you, acts as a coach.
And that's the world that we want to be living in. Today, there are some people who already live in that world. For example, being a CEO, you get the luxury to actually have all of that. You have assistants, you have people on staff, you have an exec team, you have a coach.
But in the future, that's going to be something that all of us are going to have, regardless of how senior we are, maybe even a new grad joining the workforce. That's what we are trying to go and solve for. We're trying to build that amazing personal team around every individual. That's going to make us all 10Xers. And that's just a natural extension of our Glean Assistant product: keep evolving it, make it better and better over time. Yeah.
Yeah, Arvind, thanks so much for joining us today. Yeah, it's excellent. Yeah. Fun, fun questions. It's always nice to see you. Yeah, likewise. Find us on Twitter at NoPriorsPod. Subscribe to our YouTube channel if you want to see our faces. Follow the show on Apple Podcasts, Spotify, or wherever you listen. That way you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.