
322: How AI is Revolutionizing Finance and Decision-Making: Insights from wunderkind George Sivulka, Hebbia CEO

2025/2/10

AI and the Future of Work

Chapters
The podcast starts by discussing the future of work and how AI agents will transform it, shifting roles from "doing" to "orchestrating agents."
  • AI agents will handle most of the "doing" in companies.
  • Human roles will shift to orchestrating AI agents.

Transcript

I ultimately think that agents will be used to do not only a lot of single steps, but actually most of the doing in a company. And almost all of our roles will go from doing to actually orchestrating agents in doing.

Good morning, good afternoon, or good evening, depending on where you're listening. Welcome to AI and the Future of Work, episode 322. I'm your host, Dan Turchin, CEO of PeopleRain, the AI platform for IT and HR employee service. Our community is growing. We recently celebrated our one millionth listener. We get asked all the time, how can you meet each other? And so we launched a newsletter, our Beehive newsletter. We will share a link to register in the show notes. We share...

Oftentimes, additional fun facts and some fun stuff that doesn't always make it into the regular shows. Go join us there. If you like what we do, please tell a friend and give us a like and a rating on Apple Podcasts, Spotify, or wherever you listen. If you leave a comment, I may share it in an upcoming episode. Like this one from Oscar in my hometown, San Diego, who is a CPA and listens while commuting.

Oscar's favorite episode is that great one with May Habib, the amazing CEO of Writer, the unicorn AI company that recently raised $200 million at a $1.1 billion valuation and also just a great empathetic leader. We learn from AI thought leaders weekly on this show. The added bonus, you get one AI fun fact each week. Today's fun fact, Monica Brown writes in the Chicago Booth Review about how investors should be using AI to make better decisions.

Investment funds are using LLMs to derive insights from earnings call transcripts, 10-K regulatory filings, annual reports, social media, and streaming news headlines.

LLMs create direct trading signals and develop new predictive variables for their forecasting models. It makes sense to ask whether the advantages of LLM strategies will disappear as soon as everyone else uses them too. That's been the outcome with arbitrage strategies in the past. However, the opportunities here appear more bountiful.

With the field in its early stages, researchers are still finding new ways to apply AI to tease out investment insights and trading opportunities. My commentary: don't make financial decisions based on what the LLMs tell you to do. Do use LLMs to accelerate your research. By knowing how they work, where they source their data, and how they calculate the output text, you can understand where they're vulnerable to making mistakes

but also when they can be trusted. And of course, we'll link to the full article in today's show notes, which is quite relevant to today's conversation.

George Sivulka was called a wunderkind by none other than tech legend Peter Thiel. He worked at NASA as a teenager and graduated from Stanford with a bachelor's degree in math in two and a half years. Cardinal George's company, Hebbia, raised $130 million in July at a $700 million valuation from a murderers' row of investors, including Andreessen Horowitz, Index Ventures, Google Ventures, and Peter Thiel himself.

Hebbia is used today by asset managers doing financial research at companies like Centerview Partners, Charlesbank, and Premier. The company grew revenue 15x over the past 18 months. George founded Hebbia in 2020 and is a pioneer in the use of RAG techniques to mine company proprietary data to limit LLM outputs to sanctioned content. And without further ado,

George, it is my pleasure to welcome you to AI and the Future of Work. Let's get started by having you share a bit more about your background and how you got into the space. Thanks so much, Dan. And it's great to be here.

The background of Hebbia is best encapsulated by the story of the problem that we set out to try and tackle back in 2020. We were first pioneering large language model applications and actually coming up with the first productionized version of RAG far before it was an industry buzzword. At the time, one of the foundational insights that was

kind of baked into my psyche from my time at Stanford was the idea that you should always start a company where there's a lot of pain. There's a lot of literature on how some companies are vitamins and some are painkillers. But I think any startup class or any startup kind of community at Stanford will tell you, focus on the pain.

And most of my smartest friends when I was in my Stanford PhD, I kind of saw them graduate undergrad and move into, if they're really smart and really lucky, investment banking roles, private equity roles, all kinds of different financial services, kind of analyst or associate positions. And they were some of the smartest people. And catching up with them three months, six months, a year into these careers,

The one thing that was underlying every single one of their experiences was the amount of pain that they were feeling. They would go out, think that they would go be discovering or creating something like net new. And instead, they'd kind of come back and they'd say, hey, my entire job is just searching through long documents or data rooms or doing really repetitive, rote and mundane tasks.

And I'd never heard of this massive disillusionment that some of America's and really the world's smartest people undergo in these careers. And I said, well, there's definitely an opportunity here. It was latent in the back of my mind. At the same time, in my PhD, I was working on and really trying to investigate meta-learning, the idea of teaching machines to learn to learn. And a lot of this was before the release of any of the models.

And I remember in June of 2020, OpenAI released, not even ChatGPT, but GPT-3. And the title of that paper was something along the lines of, hey, large language models are meta-learners, are multitask learners. And I remember thinking, well, I was going to work on the most important technology of all time, and it got scooped out from under me. And maybe the thing that I could do with my life was instead to go and work on the most important product of all time.

And I set out to start Hebbia with this understanding: hey, there's a huge amount of pain in financial services and, really, the knowledge economy writ large. Knowledge work is inherently painful, document-based, very repetitive. Think cubicle. And then there was this new technological revolution that I knew was about to unfold.

I saw GPT-3, I knew this would be important technology, these meta-learning models. And I said, well, if there's anything to work on in my life, and it can't be the most important technology, maybe it's going and working on the most important problem. And so we started Hebbia as a product studio, actually creating many of the foundational applications that now you see across enterprises. So your research was fairly broad, meta-learning.

And what you do at Hebbia is very applied. What was that transition like from thinking like an academic to thinking like an entrepreneur? There's a very foundational piece of being an entrepreneur, which is challenging the norm or trying to probe into things that might not be as they seem.

And the idea that anyone can go out and create a really impactful, really large, really important company, it kind of seems like a stretch. But in academia, you're kind of tasked with going out and saying, hey, this is the landscape of what exists. How can I push that frontier? Or how can I think differently or turn something on its head?

And there's some academics that really focus on working on things that kind of push the entire field forward. And that's a common critique with academia. I've always been very interested in kind of like the edge or things that are a bit more contrarian or kind of pushing back on what society thinks. And I think that lent itself actually very naturally and well to a startup ecosystem and really a product studio. Go back to the pre-ChatGPT days when you were experimenting with the technique that eventually became RAG.

What was the insight that led you to focus on financial services? Why is it better applied in maybe a narrow use case versus applying it to any document? Yeah. So I think ultimately, there's maybe even a misconception about Hebbia that it's only for financial services. The reason we started in financial services first actually kind of harkened back to a bit of an analogy that I was making around the most

kind of close analog in history, in recent history, to a foundational technological revolution on the scale of AI. And that was the introduction of compute in general. And so if you think about, you know, computers really came on the scene and really picked up kind of societal importance and impact 60 to 80 years ago.

And for the large part of their early life, there weren't really great products around computers. You had terminal windows, you had geeks, and a variety of people just hacking away on these things until the invention of Microsoft Excel. And Microsoft Excel, you could argue, well, if computing was the important technological revolution, Excel was the most important product of the last 100 years.

It took computing and made it accessible and understandable to not only people in financial services, but really it ended up going out and now 2 billion people use it worldwide, probably more today. Governments run on Excel. Every single company, every single person in the world learns how to use a spreadsheet when they're in middle school or earlier. It is the foundational product for computing.

And if you look at their go-to-market, Excel launched in 1985, and from 1985 to 1986 it went from 0% to 90% market penetration in financial services. And that was where they had the largest pain point.

And financial services actually moves really fast when there's some alpha or there's some really strong return. And so you kind of saw it start in financial services. Financial services moved very quickly. And then it was just pushed into every part of the knowledge economy, into portfolio companies, and everywhere else.

And that's a very similar path that we're on at Hebbia. We're rapidly penetrating financial services, but you already see really strong traction, not only in financial services, but also in the legal vertical. We're now picking up lots of Am Law 50 and Am Law 100 customers, and in insurance across the board.

Not only for reps and warranties insurance and commercial implications, but also for a variety of other different types of policies in auditing and all of these different applications. I mentioned a few of the reference customers in the intro. Share a typical success story. Yeah, I think almost all of our customers want AI. They see it as this incredibly important and impactful thing.

But almost all of them are still just experimenting with it. They have a few different tech-forward users that are going out and they're using AI in a small way. They're spinning the wheels on it. When people get Hebbia, they expect that it'll be something like a chatbot, something like all the other AI application companies in the world that are very good at doing a single task. Hey, rewrite this email. Or hey, go and find something in this document or summarize this document.

But that's actually just one step. It's like the calculator. It can do only a single step on the way to actually augmenting an entire workflow. They'll get Hebbia. And really quickly, they'll realize that the entire platform is built around answering questions or doing tasks that require many steps.

And that can affect any workflow. So if you want to go into investing as a canonical example, one of the questions that people always want to answer, they just get a new opportunity, they just get a new company, they're reviewing a new pitch deck, and they want to answer, hey, is this company a good investment? ChatGPT or any other kind of search product on the market will go in and try to find areas in the document where it says whether or not it's a good investment.

But obviously, that's biased. Obviously, the CEO is going to say all kinds of things and make it look like a great investment. What you would actually do if you were a good investor is take all the marketing materials, take everything on the internet, take everything off the web. And you'd say, hey, give me something about the strength of their management team that might not be mentioned. Or go look into their competitors and see what's strong about their competitive positioning. Or tell me something like their customer concentration.

And each of these individual steps, you can think of as a decomposition of that question. Hebbia will take a question like, is this company a good investment? Review documents like a data room, a CIM, or anything else you bring if you're an investor. And it can then go and answer customer concentration, strength of the management team, and everything else that you'd like to know, down to your specific investing process.
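To make that decomposition idea concrete, here is a rough Python sketch of the pattern George is describing; the ask_llm() helper and the three sub-questions are illustrative stand-ins, not Hebbia's actual pipeline.

```python
# Rough sketch: decompose a broad question into narrower sub-questions,
# answer each against the documents, then compose an overall view.

def ask_llm(prompt: str) -> str:
    raise NotImplementedError  # placeholder: swap in whatever LLM you use

def assess_investment(documents: str) -> str:
    # Hypothetical sub-questions; a real process would reflect the firm's
    # own investing criteria.
    sub_questions = [
        "What is the strength of the management team, beyond what the deck claims?",
        "How does the competitive positioning compare with named competitors?",
        "How concentrated is the customer base?",
    ]
    # Each sub-question is one "step" answered narrowly over the data room / CIM.
    findings = {
        q: ask_llm(f"Documents:\n{documents}\n\nAnswer narrowly: {q}")
        for q in sub_questions
    }
    # Only then is the broad question posed, grounded in the individual findings.
    summary = "\n".join(f"- {q}\n  {a}" for q, a in findings.items())
    return ask_llm(
        "Given these findings, assess whether this company looks like "
        f"a good investment:\n{summary}"
    )
```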

We talk frequently on this show about what it means to practice responsible AI. And with a product like Matrix, the product from Hebbia, which, gosh, is in control of making a lot of important decisions, I'd be curious to know, from the perspective of the CEO and your culture, what do you think about in terms of

taking ownership or accountability for unintended consequences. Not because your customer asked bad questions, but because Matrix may do something. These models are statistical; they're looking at statistical probabilities. The decisions can have high-impact ramifications. How do you think about the responsibility of you as a developer? It's incredibly important to our company, and really to the collective dialogue of our industry, to think about the ethical implications and then how to responsibly roll out this technology.

You can think of new technologies as really a tool or an opportunity to go out and do anything. But when they have some slight bias or they have some slight kind of tendency to skew information, that becomes incredibly dangerous. With Matrix, we actually engineered the entire product to be very human-centric, human-first, and transparent.

And that's very different than anything else in the market. Whereas other things might just reply or show you a few search results and a few citations, Hebbia actually shows you its work. It builds a piece of collateral, which is actually the matrix, that will show you every step the AI takes: where documents add up and where they don't add up.

And if data is mixed or data is slightly skewed one way or the other, it won't skew just to the majority; it'll tell you, okay, 52% of the time people say this and 48% of the time folks disagree. It will actually show you that breakdown. It'll show you every side of a problem. And that's inherent not only in how we think about our responsibility, but how we build our product.

In investing, that's even more important, because you care about not only the majority opinion, but maybe the secrets or the information that people might not be talking about.

This is unanswerable, but you're pretty good. So I'm going to ask it anyway. Maybe it's unfair. Hit me. What do you think about the cost of a false positive? So you make a poor decision that someone acts upon. Is that higher or lower than the cost of a false negative, where Matrix has a right answer and, because it's not confident enough, it chooses not to provide it?

I think that it's interesting. It's a problem, maybe even a paradox that we've tried to engineer the product for. And so maybe it won't really skew false negative or false positive, but rather it'll show you, it'll really say, I'm not sure. And these are the possibilities. And so to the point of it being human-centric and human-first, it puts the onus back on the user and is more around surfacing the insight rather than making the insight.

And so the majority of the time, we're not actually trying to make the insight. I don't believe in a world where AI is running everything. You could think of it this way: if every investment firm had the exact same instance of ChatGPT Enterprise, the exact same instance of whatever application they'd like, they'd all get the exact same answer.

Maybe it has their own data. But ultimately, what Hebbia tries to convey is it'll also understand your own custom process. It'll treat your firm and your investing criteria differently than it will treat another firm and its investing criteria. And that's not only for investors or for finance. If you're a lawyer, a lot of the time what makes a great law firm

is not the data that they have, but actually the way that they engage with their clients, the way that they understand things, the way that they deal with problems. And so that process is actually just as important as the data. And whereas other companies are kind of working on the models, training the models, or giving them just your data, Hebbia is saying that's not enough. We're also going to give you your own custom process.

But a big part of that custom process involves assimilating insights from external sources, obviously. So the RAG search would restrict it to some known sanctioned content. Presumably, to do this research, you want to extend it to content that hasn't been sanctioned, right?

100%. If you actually think about how RAG works, it's ultimately taking a group of sanctioned content or a group of whatever content you'd like and giving you just the most similar, the most relevant results and feeding that to a large language model to answer a question.
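For listeners who want the mechanics spelled out, here is a minimal sketch of the retrieval loop George just described; the embed() and generate() helpers are hypothetical stand-ins for an embedding model and an LLM call, not Hebbia's stack.

```python
# Minimal retrieval-augmented generation loop: embed the corpus, keep only
# the chunks most similar to the question, and hand just those to the model.
import numpy as np

def embed(text: str) -> np.ndarray:
    raise NotImplementedError  # placeholder: swap in a real embedding model

def generate(prompt: str) -> str:
    raise NotImplementedError  # placeholder: swap in a real LLM call

def rag_answer(question: str, chunks: list[str], k: int = 5) -> str:
    q = embed(question)
    # Rank every chunk of sanctioned content by similarity to the question.
    ranked = sorted(chunks, key=lambda c: float(np.dot(q, embed(c))), reverse=True)
    # Only the k most relevant results survive and are fed to the model.
    context = "\n\n".join(ranked[:k])
    return generate(
        "Using only the context below, answer the question.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```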

What really matters and what Matrix does is it'll take in entire worlds of information, that sanctioned content, but also other things as well. And it'll say, okay, hey, we're going to give you not just the relevant results, but just everything that's even remotely related, package it into something that's a bit more unbiased and level-headed in its analysis of whatever it's been given, and then present that to the user. Makes sense.

So it's not often that I get to hang out with computational neuroscientists. So I got to ask you a question here. So you use the term meta-learning and you talk about teaching machines, teaching them how to learn. I love the concept. I often bristle, though, at these analogies where we liken machine learning models to human brains. You know, the concept of learning

warrants discussion. But really, we're teaching machines to pattern match. And maybe from the philosophical side of that computational neuroscience background: what does it mean to learn? And, based on that definition, are we in fact teaching machines to learn?

I love philosophical questions. I mean, even our name relates to Hebbian learning, which is both a way that you can train neural networks and the way that the biological brain learns. And if you think about what it means, what it actually means in practice, when you're training a model, you are giving it access to the world's information, and it's predicting the most likely next sequence. And you see learning as an emergent property from it.

So forget about learning during training; learning during inference is more of what this is all about. You could give any LLM a variety of examples, and it can then take those examples and generalize them to new examples it's never before seen, or that aren't in its training data set.
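A small illustration of that inference-time learning: a few-shot prompt like the one below, handed to almost any capable LLM, is usually enough for the model to pick up a pattern it has never seen verbatim. The task and example pairs here are made up for illustration.

```python
# Sketch of "learning during inference": the model sees a few examples in its
# prompt and generalizes the pattern to a new input, with no weight update.

few_shot_prompt = """Rewrite each phrase as a formal memo subject line.

Input: heads up, servers were down last night
Output: Notice of Overnight Server Outage

Input: we crushed the Q3 numbers
Output: Summary of Strong Q3 Financial Results

Input: the new hire starts monday
Output:"""

# Passed to an LLM, the continuation would typically be something like
# "Announcement of New Employee Start Date", a mapping the model infers
# from the two examples alone rather than from training on this task.
print(few_shot_prompt)
```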

And if you think about kind of what humans do or what animals do or biological systems, we'll learn very early on, hey, if I pick something up and I let it go midair, it will fall. And I'll do that with maybe a block. Then maybe I'll do that with a spoon. Maybe I'll do that with an entire plate of food. Very commonly, children and toddlers will experience this.

And then we know that, hey, if I pick up the TV remote, or if I pick up mom or dad's computer or phone, if I drop it, it's also going to break and shatter just like that plate. Or that gravity is something that we can kind of pick up on inherently. And

People had a conception of this far before Isaac Newton, right? But it was almost an innate understanding of the world. And you see that these models actually are picking up and understanding not only things that are in their training data, but that they actually are generalizing pretty well to things that are completely outside of training data. And that poses the argument and poses really the conclusion that these models are building what's called a world model. They understand how the world works. They understand logic.

And it's quite beautiful that language actually is the only thing that is really needed to understand richly how the world works. We see these models generalize to principles, even physics principles, just based on all of the language that humanity has created. And so I would definitely call that learning. And I think that these models are learning actually in a very similar way to how humans learn, even if the way that they compute is very different than how biological brains compute. It's that last clause I think is really important to capture. Like,

Anything that involves prediction tasks or computing, computers are really, really efficient at doing that thing over and over again really reliably. But when it comes to your point, I think you described it really well, those world models, the understanding...

the physical attributes of the world, I think that's where the brain is just so incredibly power efficient. And as humans, we learn from so few examples. I guess at some point you can make the analogy hold up, but we just fundamentally learn in ways that are so different from what machines do when they, quote, learn. So I would like to test the boundaries of the analogy.

Sure. I'd say that biological brains, especially human brains, are fundamentally very random. They're stochastic machines, right? And you could give any number of humans the same problem and we'll probably come up with tons of different ways of thinking through it or predicting it.

And if you look at large language models, depending on the temperature or the randomness that you inject into their inference, they'll actually create a variety of different possibilities as well.
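For the curious, here is roughly what that temperature knob does mechanically, sketched in plain NumPy and independent of any particular model: the next-token scores are divided by the temperature before sampling, which sharpens or flattens the distribution.

```python
# Toy demonstration of temperature sampling over a next-token distribution.
import numpy as np

def sample(logits: np.ndarray, temperature: float, rng: np.random.Generator) -> int:
    # Lower temperature sharpens the distribution (nearly deterministic);
    # higher temperature flattens it (more varied possibilities).
    scaled = logits / max(temperature, 1e-6)
    probs = np.exp(scaled - scaled.max())
    probs = probs / probs.sum()
    return int(rng.choice(len(probs), p=probs))

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.5, 0.2, -1.0])  # made-up scores for four candidate tokens
for t in (0.2, 1.0, 1.5):
    draws = [sample(logits, t, rng) for _ in range(10)]
    print(f"temperature={t}: {draws}")
```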

But I think humans have a benefit of being stochastic, right? If you think about maybe the world's best investors or even the world's best lawyers, this idea of creativity or of the randomness of the human brain actually ends up being a superpower. Take what the market thinks and turn it on its head, right? Do the contrarian thing. These are all fundamental truths that make great investors great or great knowledge workers great. Don't fit into the mold or fit into the norm. I think that large language models

will make humans better at experimenting. They'll make them better, especially with really great applications, at doing the stochastic thing. And they'll let us go down different rabbit holes and enable us in ways that we can't even imagine yet. You just said the key thing, make humans better. And yet, I had Peter Voss on this podcast recently. He's one of the co-inventors of the term AGI. And we had a spirited discussion about what

the end game of AGI, artificial general intelligence, should be. And he quite proudly, you know, claims the objective is to replicate everything that a human can do, only better because, you know, it's a digital brain.

And that's not, in my estimation, making humans better. That's aiming to replace them or perhaps even, in the most cynical sense, to confuse humans into thinking that they're interacting with humans when they're interacting with digital likenesses of humans. What's your sense of kind of the concept of AGI and making better humans with technology versus potentially replacing them?

Yeah, I think that there's a variety of different ways that you can talk about AGI and kind of what that means or what impact that will have on society. I think one of the ways that I really like to think about it is that I believe foundation models are already smarter than any living human being.

Already, no human being can pass the MCAT and the LSAT and do world-class math. I don't think there's a single human being that has the intelligence of the latest foundation models. And that will only continue to be true.

But there's a distinction around, okay, are these things being used to replicate humans? Or, I always call it, creating more noise in the world. More marketing copy, more emails written, more emails in my inbox, more meeting notes that are summarized. Versus AI that, instead of creating more noise, gets the signal from the noise, or reduces the amount of information and just distills insights from it.

And I don't really love how generative AI has been used mostly to generate. I actually think it's important that Hebbia, as part of our mission, is all about making sense from all the noise that is being created. And I think that's a much better societal use of AGI-esque technologies. Yeah, that's a better model. Complementing humans versus...

replacing them, or, yeah, confusing humans into thinking that they're interacting with other humans. And yet, you know, every day, it's a relatively small tech community that's building these technologies. Intentionally or unintentionally, we're making decisions about the future of humans interacting with machines. It's always interesting to have these discussions with others. I mean, you are one of those members of the really small community making these decisions, so I'm curious to get your reaction to

Even if it's unintentional, you know, how you think about these important decisions that you and your team are making. Absolutely. So these days, here at the beginning of 2025, we are inundated with all kinds of conflicting visions of what AI agents can do, will do, should do.

And a lot of what you described about kind of moving from tasks to jobs, or maybe objective-driven activities assigned to machine learning models, certainly is akin to what we talk about as AI agents, however you might define that term. When you think about, you know, playing out Hebbia, you know, a decade, and it's...

just ubiquitous, right? Beyond your wildest imagination, right? It's worked. What does the world of work look like when we're interacting continuously with, you know, Hebbia Matrix today and then future versions of it? Yeah, I think that the world of agents will actually be not too dissimilar to the world that we have today,

which might be a bit of a contrarian take. I think Hebbia actually, in many ways, has pioneered the agent. We were one of the first companies to go out and do these multi-step workflows that I talk about. And we're probably even the number one agentic AI company at scale, actually deployed with real use cases for a lot of customers. And what we see is that the work of an employee isn't actually replaced as much as it is changed. And I actually liken it to

creating more managers. What's going on at the lowest levels of every company, the nodes at the bottom of any hierarchy, is a lot of manual work; it's a lot of doing rather than a lot of thinking. I ultimately think that agents will be used to do not only a lot of single steps, but actually most of the doing in a company. And almost all of our roles will go from doing to actually orchestrating agents in doing.

And so people will become managers of AI agents before they'll become replaced by AI agents. And

The interface through which you would manage, let's say, 10,000 junior employees, which I think every company will soon have, whether you're the largest Fortune 100 company in your industry or a mom-and-pop bake shop, will be even more important. Having 1,000 employees as a mom-and-pop bake shop, or even as a new grad out of college, is a management problem and an orchestration problem, even more than it is a technical or a doing problem.

And we see a lot of the businesses that actually turn to Hebbia and make a lot of use of it starting to lean on this as a way of saying, we're upskilling our workforce for a new kind of work altogether. Rather than the doing, they're doing more of the thinking, the discovering. It's empowering every single aspect of the organization. I like that answer. That's very satisfying.

I know I let this one go way over. Thank you for being a good sport. But you're not going anywhere without answering one last question for me. I'm curious about your personal journey. What has your journey been so far? I know Hebbia is just getting started, but from White Plaza, so to speak.

We share a common alma mater; it's the main campus at Stanford. To being the CEO of a fast-growing company, amazing investors, amazing customers. What have you learned about yourself along the way? I think that

The journey of being an entrepreneur is very much something that will humble you 10 times out of 10. The market does not want your startup to exist. It's fundamentally pretty efficient. It's not perfectly efficient, but it's pretty efficient. And it doesn't like random PhD students going and dropping out and trying to impact the way that the largest firms in the world do work.

And I think that there's, no matter who you are as an entrepreneur, there's a resilience that you find in yourself as you continue to just climb and never quit in the face of a very, very daunting task and a very daunting environment. And I think the act of creation is one of the most beautiful things that you can do in your life. So working at a startup has been an amazing gift, all the ups and all the downs.

The odds are definitely conspiring against us, and yet we do things that are irrational sometimes because we need to, because it's that itch that you have to scratch. You can't not build this thing because you owe it to the world. 100%. It's a vocation. It's not just a job. And working on, I think Hebbia is working on the most interesting problem in the most interesting industry in the world. The work that Hebbia is doing in effectively creating an infinite

context window for these models so that they can learn and understand exactly how we work is the limiting factor. It's no longer, hey, training larger models or building o3 or o4 or whatever they're going to come out with. It's actually taking what exists already and solving this next problem. And I think that that's the work that we do. It's a gift to be working on it.

Brilliant. Hey, George, where can the audience learn more about you and the good work that your team's doing? I think just Google Hebbia. We've got a blog. We've got a variety of other pieces of collateral. And we're increasingly sharing more success stories about how financial services, legal, and then a variety of other random applications are coming out of the work that we've done with Matrix. Yeah, we'll be continuing to share there.

Thanks, Dan. George, if you're willing to talk computational neuroscience, then you've got an open invite to come back and hang out anytime. I'm always, always, always down, and I'm looking forward to it. Hey, this was a great one. Thanks for hanging out, George. And gosh, that's all the time we have for this week on AI and the future of work. As always, I'm your host, Dan Turchin, and we're back next week with another fascinating guest.