I have been in this field for a while, from a sheer interest standpoint. I spent ten years in industry as a computer architect and software architect, and then went back and got a PhD in neuroscience, for the reason of: can we actually bring intelligence to machines, and do it in a way that's economically feasible? I'm very happy and excited about where the world has gone. I mean, being the sideshow is never as fun, to be honest with you.
You can get passion from that, because when everyone's telling you you're the sideshow, you have to have passion to keep going. And I think you can use that as strength. But really, the whole point of having that passion, that strength, is to make something that's meaningful, something that's lasting, something that really does change the course of human evolution.
Hi there, and thanks for listening to the a16z AI podcast. I'm Derrick Harris. If you're listening to this episode on the day it's published and you're in the United States, happy Black Friday. This is actually a re-airing of our first episode from back in April, featuring myself, Databricks VP of AI Naveen Rao, and a16z partner Matt Bornstein.
A lot might have changed in the AI world since then, but the discussion about the state of enterprise LLM adoption and the overall market demand for LLMs remains both valid and insightful. All the background you should need is that Naveen has been in the AI space for more than a decade, building everything from custom chips to models, and that we recorded this on the heels of Nvidia's GTC event in March, so we naturally kick off the discussion on the topic of Nvidia.
As a reminder, please note that the content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund. For more details, please see a16z.com/disclosures.
Nvidia has also been really on top of each trend, so kudos to them for doing that really well. I mean, they've been able to see the trend for whatever thing it is, low precision, tensor cores, what have you, and execute extremely well.
So it's just a formidable competitor for anyone to go up against. And then, you know, everyone talks about the CUDA lock-in, that sort of thing. I actually don't think that's the reason anymore.
I think it's just that they've become the gold standard, and it introduces risk to move to any other hardware platform. We are always looking at new hardware to see if we can find a better TCO; basically, effective FLOPS per dollar is the number I look at. And it's hard, because they do build a good part, and we can extract a lot out of it with the mature software stack that exists. That's really what keeps it locked down: it's just the maturity.
But do you look at other platforms? Or, if you're at liberty to divulge what those are, my brain immediately goes to what the cloud platforms are building. There are some startups experimenting in this space, too, but I'm curious what you're looking at.
We talk to all of them. At this point in time, it's still very hard to move away from Nvidia, because if we're trying to go build models for some purpose, that represents the shortest path to the end goal. If you do anything else, it introduces some friction at this point. Now, I think by the end of the year, that might actually change.
There might be some other players that are capable of getting to the end goal without so much friction. We're building our software stack to make that really easy for our customers, to really deliver the best TCO to our customers through a stack they already know how to use. We've got a lot of folks who are building on top of the Mosaic and Databricks stack. We can abstract away all these hardware details; we can make it such that our customers have more choice.
And on that point, language models have, for the most part, standardized around the transformer architecture. Obviously, it seems like that's creating an opening for chip companies to sort of tailor their products to a more homogeneous set of workloads. Do you think that's true? And if so, do you think it's a good thing or a bad thing for the industry?
I mean, it's definitely true. If you go back five, six years, you had to support so many different families of neural networks: ResNets and RNNs and LSTMs and this and that, everything. So it was actually quite a bit more difficult to launch some piece of hardware, because you had to have enablement for all these different things and do optimization for all these different things.
Now, as you said, it's kind of transformer or diffusion model, and I would say the diffusion models are still pretty important workloads. Those two things have a pretty confined set of primitives.
And so you can just go optimize it. Now, is that good? I'm not really sure what good means.
I don't think it's the be-all and end-all. I think there are some intrinsic problems with transformers that are still not quite addressed. Now, whatever the next thing is, we're going to build off of the learnings from the transformer.
I mean, hallucinations and grounding are still problems. And are they solvable within the confines of a transformer architecture? I'm not entirely sure. I think there's going to have to be some modification here. I mean, we're doing things like RAG, which I'm sure you've heard of, retrieval-augmented generation.
And this is basically, you can think of it as a way of extending the context window to a whole bunch of other documents using approximate search. It's a pattern that works and it adds value today. But it's a bit of a hack, because of the approximate search and the embedding models and all of this.
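For readers who want to see the shape of what's being described, here is a minimal sketch of that retrieval-augmented generation pattern: embed the documents, find the nearest neighbors for a query, and prepend the hits to the prompt. The embedding function here is a random stand-in rather than a real encoder, and the final model call is left out; it is an illustration of the pattern, not any particular product's API.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in embedding model; in practice this is a learned text encoder."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(64)
    return v / np.linalg.norm(v)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Approximate-search stand-in: rank documents by similarity to the query."""
    q = embed(query)
    scores = [float(q @ embed(d)) for d in docs]
    top = np.argsort(scores)[::-1][:k]
    return [docs[i] for i in top]

def rag_prompt(query: str, docs: list[str]) -> str:
    """'Extend the context window': prepend retrieved documents to the question."""
    context = "\n".join(retrieve(query, docs))
    return f"Use only this context:\n{context}\n\nQuestion: {query}"

# The assembled prompt is what would actually be sent to the LLM.
print(rag_prompt("What is our refund policy?", ["Refunds take 5 days.", "Shipping is free."]))
```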
So can we actually start to make this more a part of the neural network paradigm from the start, and make it much more grounded in what's given as truth? That's a hard problem, even just philosophically. So I think the transformer being a standard paradigm is a good thing for hardware vendors, for sure, because it gives them an opportunity to actually get in the game.
And that's what we're going to see this year; that's why I think this year is where we'll actually see some competition. Is it good for the industry? I think it's a bit of an over-rotation on one architecture for now, but that's just how these things go, right?
We've got something that works, so we keep chasing it, you know. So I think whatever it is will have to be some sort of modification of that paradigm to move forward.
And since you're one of the few people who really is an expert on both the hardware and the software side, can you just give us some intuition for why building custom chips was so hard when you had to support, say, CNNs and RNNs and many different architectures? And, kind of looking forward, how long will it be for the chip industry to retool if we do move off transformers?
One part of it is that a fundamental concept in hardware is constraining the number of computational motifs, motifs meaning a set of operations that happen over and over again. So say you do a matrix multiplication, times a scaling term, with a lookup table after it: if you see that same thing happen over and over again, you build a big piece of hardware that optimizes that.
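As a toy illustration of that motif idea, here is the pattern he names, a matrix multiply, a scale, and a lookup-table nonlinearity, applied layer after layer; repeated structure like this is what a custom chip can bake into silicon. The sizes, scale factor, and table are made up for illustration.

```python
import numpy as np

def lut_activation(z: np.ndarray, table: np.ndarray) -> np.ndarray:
    """Lookup-table nonlinearity: quantize the input and read the result from a table."""
    idx = np.clip(((z + 4.0) / 8.0 * (len(table) - 1)).astype(int), 0, len(table) - 1)
    return table[idx]

def motif_layer(x: np.ndarray, W: np.ndarray, scale: float, table: np.ndarray) -> np.ndarray:
    """The repeated motif: y = lookup(scale * (W @ x))."""
    return lut_activation(scale * (W @ x), table)

rng = np.random.default_rng(0)
table = np.tanh(np.linspace(-4, 4, 256))   # precomputed nonlinearity
x = rng.standard_normal(8)
for _ in range(3):                          # the same motif, over and over again
    x = motif_layer(x, rng.standard_normal((8, 8)), 0.5, table)
print(x)
```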
So building custom hardware has always been about finding those dominant motifs and patterns and then basically building something very bespoke for them. There's an inherent trade-off between that and flexibility. CPUs were kind of the dominant paradigm for a long time because we cared very much about the sequential processing of instructions.
That's what most applications, Microsoft Office or whatever, cared about. Now we've gone to this world where it's very data-parallel, and having the right amount of flexibility to enable the next generation of algorithms, like the transformer, while also having good enough performance on certain primitives, really has been the trade-off.
And when we started to build hardware at Nervana, for instance, we were very focused on a particular set of multilayer perceptrons and convolutional nets. But then we had things like ResNets, which are convolutional nets with a different sort of flow of information, and that presents issues.
We changed the way we potentially do convolutions: can we do them in the frequency domain instead of the spatial domain? That actually changes the motifs again. A lot of that kind of worked in favor of something like a GPU, which was a little more flexible.
But now that we have something that's more stereotyped, like the transformer, it gives us an opportunity to go build something a little bit less flexible but more performant. You can take it to the very limit: imagine I just took a fully trained neural net, take your favorite LLM off the shelf, and I literally burned that into silicon.
I can find all kinds of optimizations, because the zeros just become, well, they're not even gates; they don't take any die area anymore. My matrix multiplications have a bunch of zeros in them, and when you run that through a logic-optimization tool to build the hardware, you'll literally throw away a bunch of transistors because you have zeros propagating through the system.
So now you have this thing where, when you go all the way to complete inflexibility, I'm just going to burn in one neural network, I actually can find a ton of optimizations. But is that really going to be something that has enough of a market for me to sell it, enough to justify the cost of development? I think that's the trade-off here.
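A rough way to see why burning the weights in opens up optimizations: once the weights are constants, every zero weight is a multiply that never needs to exist. In software you skip it; in hardware, the synthesis tool never builds the gates for it. A toy sketch, with the sparsity level invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))
W[rng.random(W.shape) < 0.7] = 0.0     # pretend 70% of the trained weights are zero

x = rng.standard_normal(256)
y = W @ x                              # the result is the same either way

# Generic hardware multiplies everything, zeros included.
dense_macs = W.size
# If the weights are burned in as constants, only the nonzero multiplies need to exist.
nonzero_macs = int(np.count_nonzero(W))

print(f"multiply-accumulates that survive: {nonzero_macs} of {dense_macs} "
      f"({nonzero_macs / dense_macs:.0%})")
```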
What does that market look like, though? You mentioned the market size for this. Is the market the current players, right, the handful training these large foundation models? Or is this something where enterprises are realistically going to invest in what are very prescribed and inflexible chips?
There might be a market for both, to be perfectly honest with you. Training models and building bespoke models for different domains is just going to be a thing; it is a thing right now.
And there's a lot of reasons. For one, if you're in a regulated industry, you want control. Also, companies want differentiation.
They want to make models that are specific to their data and to their customers, so they have something that's differentiated from their competitors. But at the same time, there are some dominant use cases, like, say, ChatGPT. GPT-4 has changed a bunch over the last year.
So I know people like to say, oh, it's a one-year-old model, and that's not really true, it's always changing under the hood. But let's say, just for instance, that maybe it is a one-year-old model, or that a one-year-old model is still useful. If that's the case...
Well, if I can run enough inferences through this, or I have enough subscribers, say a hundred million subscribers at twenty bucks a month, it kind of starts to justify the cost of building a chip. Building a chip of this complexity would be something like thirty million bucks as the upfront cost, and then each inference would be so much cheaper to run.
I think it's actually worth it. So in some cases we can start to think a little bit differently. We used to think about hardware as this big, expensive, immovable thing; in current days, when people are raising a billion dollars for a round, that's not that much.
Anymore, I can go spin a chip. I can build a new chip every six months; it costs me thirty million bucks, big deal. So I think if we start looking at it from that perspective, if I know what my time-to-live for a model is, and I know how many people are going to use it or how many tokens it's going to generate, I can actually start to come up with a financial model that makes sense to justify building a new chip for each one of these things.
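The back-of-the-envelope math being gestured at, using the illustrative numbers from above (a hundred million subscribers at twenty dollars a month, roughly thirty million dollars to spin a chip, a six-month model lifetime); these are round numbers for intuition, not real figures:

```python
subscribers = 100_000_000          # illustrative subscriber count
price_per_month = 20               # dollars per subscriber per month
chip_upfront_cost = 30_000_000     # one-time cost to design and build the chip
model_lifetime_months = 6          # the "time to live" of one model generation

revenue_over_lifetime = subscribers * price_per_month * model_lifetime_months
print(f"revenue over one model lifetime: ${revenue_over_lifetime:,}")
print(f"chip cost as a share of that revenue: {chip_upfront_cost / revenue_over_lifetime:.2%}")
# Roughly $12B of revenue against a ~$30M chip: even a small per-inference
# saving pays the chip back well within the model's lifetime.
```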
It's crazy how fast that transformation happens, right? A few years ago it was, like, only Google that would do that, and who else? And now it's companies that are a handful of years old doing it.
I like that. I do want to shift here from the hardware to the software side. My impression is that your preference is custom-trained models, but I'm curious where the trade-offs are, or what the right use case is for each of those approaches.
I don't think there's one answer to that. We want to support customers where they are and add value. I mean, today we actually see most of our business is custom training runs, with people going to production with those.
We anticipate that the inference side of things ramps up even more. So I think at steady state, and I look at it like fine-tuning plus pre-training, I'd call all of that just training.
That's all training, and inference is just the deployment side. It's going to be close to fifty-fifty in terms of dollars; that's my guess, because when you train, you obviously want to put something into production.
When you put something into production, you gather feedback and you want to go back and train. So those two things kind of work symbiotically with each other. Maybe it'll change at some point.
I hear many people say, no, it's all going to be inference. Well, that doesn't actually happen, for this particular reason: when I build a model, a model is a snapshot in time.
It's not something that's immutable and will live for a long time. By my measure, from what I've seen, a model has a time-to-live of about six months, and even for the best models, look at GPT-4, that's true as well.
You start seeing pretty big modifications to it after about six months. And if we look at it from my perspective, it's like, well, I'm going to put a model into production for six months, then I'm going to go back, take my learnings from having deployed that model, and make the model better, which means going back to train.
So these two things keep going hand in hand. I mean, I don't really care; we're going to go where the trends take us, but that seems to be my observation so far: training and inference just scale up together.
This seems like a shift, though, to say basically this application we're running has a six-month time-to-live and then we need to redo it. Might that affect how a company even functions in that regard? That seems like a faster turnaround for something that's kind of foundational than a lot of firms are going to be used to.
I think that is true. I mean, you see this in the chip world too: chips are, you know, a two-year time-to-live, something like that. By two years, the chip is kind of, not obsolete, but it's old.
But, you know, there could be some deployments where folks will keep a particular model that's already been compliance-checked and everything, and they'll probably keep that model in production for much longer. You know, chips go into cars, and they run for ten-plus years; the car is on the road for ten years.
There's a lot of checks that go into those chips, and those things live for a long time. But in a data center, things move much faster because we have these upgrade cycles.
And I think in the same way, we're going to see that the models at the forefront, the ones you're getting feedback against all the time, you're going to be updating a lot more. And when I say a six-month time-to-live, I don't mean that you throw it away and build something completely different after that; you're basically going to continue to support your application better.
That's one of the things we've built inside Databricks: the ability to gather feedback from a model that's deployed and use that feedback to improve the model. You can actually create supervised fine-tuning datasets from it, fine-tune the model, and put the new model into deployment. It's a continuous development cycle sort of thing. We see this in software all the time.
It's not necessarily that you've changed the entire architecture of your application; it's that you're in continuous development. You get feedback, you get bug reports, you see performance issues, you feed those back, you CI/CD them, and you deploy updates. And I think in the same way, we're going to see this with models quite often: people are going to be versioning, moving to the next thing, over and over again.
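A sketch of the loop being described: deploy, collect feedback, turn the good interactions into a supervised fine-tuning set, retrain, redeploy. The function names are placeholders for whatever training and serving stack is in use, not a specific Databricks API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Interaction:
    prompt: str
    response: str
    thumbs_up: bool          # feedback gathered from the deployed model

def build_sft_dataset(feedback: list[Interaction]) -> list[dict]:
    """Keep the interactions users liked as (prompt, response) training pairs."""
    return [{"prompt": i.prompt, "response": i.response} for i in feedback if i.thumbs_up]

def continuous_cycle(model, deploy: Callable, collect_feedback: Callable,
                     fine_tune: Callable, rounds: int = 3):
    """Deploy -> gather feedback -> fine-tune -> redeploy, like CI/CD for models."""
    for version in range(rounds):
        deploy(model, version)
        feedback = collect_feedback()
        sft_data = build_sft_dataset(feedback)
        if sft_data:                      # only retrain when there is signal
            model = fine_tune(model, sft_data)
    return model
```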
What is the difference, then, between doing this inside a company like Google or OpenAI and the usual suspects, and actually doing this inside your standard enterprise? Because what I keep hearing is that this is the year we're going to see LLMs actually happening inside enterprises. I'm curious what that looks like, because for years we were told this is not for everyone, this is for the experts. But now it seems like it actually is becoming kind of for everyone.
Yeah, that was a large part of what we did at MosaicML, and now Databricks is making it possible for everyone to do this. One of our famous use cases is Replit, the company with the distributed IDE.
Two guys built a state-of-the-art coding model on our platform. They were two guys who knew what they were doing, they were very good, but two people basically did this with our tools. That wasn't something you could do even inside Google five or six years ago.
Companies like Google and OpenAI build huge teams that go and build the infrastructure, make the infrastructure work, deal with failures, all of that. We've basically taken that pattern and built it for everybody. So we've made tools that make it easy to wrangle infrastructure.
What we're seeing, I mean, we're seeing lots of enterprises train models. I'd have to check the latest figure, but we're over a hundred thousand LLMs trained. It's not that it's coming; it's here.
Enterprises are building their own models because they have the tools now that make it tractable for them, and the cost has gotten to a point where it's not a hundred million dollars; they can do something really meaningful for under a million dollars. And I think those two things coming together is what actually made it happen. What does that look like? Well, I think a lot of enterprises are figuring this out.
A lot of it sits under the CIO many times; you know, it's an IT kind of thing, whoever does the infrastructure. But then there's some kind of line-of-business data scientist or ML engineer who's wrangling the data and is the one who uses the tools. That's a slightly different paradigm from what people used before with data platforms, which generally all sat under the CIO.
So we have a slight deviation from that within companies. Typically, you'll see an ML team spun up somewhere close to the business, where it matters. And those folks are the ones saying, I need these tools because I need to go build this, I need to customize this model, I know exactly what I want to do.
There are smart folks everywhere now, undergrads coming out of schools like Stanford or wherever, who understand a lot about LLMs and how to tune them and make them do what they want. You know, IFT, SFT, RLHF, they know all this stuff.
At least conceptually. So I think the talent is getting to a point where it's proliferated into many enterprises. It's just that you're not going to see the density.
You're not going to see a hundred-person infrastructure team in these lines of business. You're going to see a five- or ten-person team. So they need tools that abstract the hard stuff away, and that's who we sell to.
Can you just help us understand how a small model can actually outperform GPT-4 right now? Like, how does that actually work, and why?
Yeah, I mean, GPT-4 is trained on a lot of data. It has a lot of representational capacity. But most companies want it to work in a particular domain; they just don't need it to have all of that.
So I think you can look at it with an analogy: if I have a human who is kind of good at everything, versus a human who is really good at one thing, the person who is really good at one thing typically adds more value to the world, because we can have lots of humans who are each good at one thing. So making something that's a jack of all trades but really a master of none is less valuable. And what we're finding is that we see it over and over again in different domains.
You beat GPT-4, no problem, when you have domain-specific data trained into a small model. Now, that model is not good at general stuff, let's be really clear. If you make something that's really good at customer support for a bank or something, it's not going to be good at talking about philosophy. It's not going to be good at doing math.
It's not going to be good at other problems. It's just not built for that. It's built for customer support, and it knows the products, but that's what the enterprise cares about.
And you kind of look at it like: the cost of training, or set that aside, the cost of serving this very large model for a specific use case is complete overkill. Everyone is seeing that now. It's like, I can build something that's one one-hundredth the cost to serve, that's one one-hundredth the size,
but it does my domain just fine. So why would I use this thing that costs me a hundred times as much? That's the calculation that's happening.
Do you have some examples you can share? Because I think folks would be really interested. This is a common claim people make, right? It's almost become accepted wisdom that if you train a small model, or fine-tune a small model, you can beat GPT-4. But then when you actually go looking for examples, they're harder to find than you might think.
I mean, the Replit one is a great example. That's code completion, and it's because they had a dataset from their own customers. So keep that in mind, right? It's not just a smaller model trained on domain-specific data.
It's also very high-quality data that went into that model, data OpenAI just didn't have. That's not their domain. Replit has a ton of data from their customers trying to use a tool like this, asking for a code completion, and they have all that data logged.
They used that to train the model, so the model is actually good. Internally, we have a coding model as well that we use for a coding assistant. It's exactly the same story.
We've made the model much higher performing than GPT-4. We don't force anyone internally, by the way. We're an enterprise company, we build tools, and we don't force those groups to use our models over GPT-4. We use OpenAI, we use all the tools out there. Everybody who builds an application inside Databricks for our customers should use the best possible thing.
And so we look at it like, well, if they don't flip over, that's proof positive that they couldn't get better performance from a different model. But if they do flip over, then they could. And we actually see a number of cases internally, on code generation, on assistant-type stuff, where teams are flipping over. We're starting to do demos now using our own models, because, again, we understand that domain, we understand the customers' usage, and we have datasets that characterize it. And now we can go fine-tune the model and make it really good.
One of the really big trends we've seen in recent years is a change from the supervised learning era to the unsupervised learning era. We've been investing in AI companies for eight years, call it, and even longer than that. But I think one of the big takeaways from the supervised learning era, at least for us, was that it was really hard to convince customers,
these big companies, like you said, to staff up a team of twenty, fifty, a hundred of what at the time were called machine learning experts or machine learning engineers, collect all this data, label it, feed it through a custom training job, figure out the model architecture. It was really complicated, really expensive, and really hard to get right to do machine learning on your own. Versus,
now, in the unsupervised learning era with these large pretrained models, that's not really necessary, right? Like, to your point, two people can just take a model off the shelf and do something amazing, or at least create a great demo. Yet you're sort of making this really compelling point that fine-tuning your own model, or even training your own model from scratch, can do even better.
And it's just funny to me; it almost sounds like we're arcing back toward supervised learning a little bit. And so the question is: have pretrained models just sort of broken through the noise and convinced people this is important? And now some form of supervised learning, or direct training on proprietary data, is the real answer, and the market just needed a shock to get there? Or is there something else going on, like maybe we found a happy medium or something like that?
Let me slightly correct your terminology. This is not unsupervised; this is self-supervised. It's autoregressively trained, so the supervision comes from the intrinsic ordering of words. Once you move to autoregression, you don't need to have labeling; the labeling is intrinsic, in the data.
Which is pretty cool. And just to maybe explain what you're saying: when we have these very long sequences of text that we're training on, the quote-unquote supervision, or the quote-unquote labeling, comes from the fact that a human wrote this, and so that particular sequence of words sort of has the supervision built in.
Correct, exactly. Intrinsically, the words are ordered in a way that has meaning, and understanding what that meaning is and how it was derived is kind of what these models can do, which is pretty cool. But I think it's more the happy-medium argument you're making.
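A tiny illustration of where the "labels" come from in self-supervised, autoregressive training: the target at each position is simply the next word in a sequence a human already wrote, so no separate annotation step is needed. The sentence below is invented for the example.

```python
text = "the customer asked for a refund and the agent approved it"
tokens = text.split()

# Each training example is (context so far -> next token). The ordering of the
# words is itself the supervision; no human labeling step is required.
examples = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

for context, target in examples[:3]:
    print(f"input: {' '.join(context)!r:35} label: {target!r}")
```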
What we're finding now is that every one of these LLMs is also trained with supervised learning. Supervised fine-tuning has to happen. You can't just pretrain a model and put it out into the wild; it'll go off the rails and do weird things right away.
You know, there were some early attempts, if you remember; I think it was Tay, from Microsoft, and things like that. These were just sort of wild models deployed without much fine-tuning afterward, and you got some weird behavior. So you can't really do that.
Every one of them actually goes through the phase of pretraining, where you get the inherent structure of language, and then you start to actually say, well, these are the patterns of input and output that I want: somebody asks this kind of question, format it this way; these things are inherently bad, don't do this, and so on.
So that's where the supervision comes in, and the supervised datasets have gotten pretty big. They're not nearly as big as pretraining, but they're getting pretty big. And so actually making a really great model is much more about the supervised part.
What we've now done really well is we've mixed the autoregressive, self-supervised part with a paradigm that does different kinds of supervision. And that supervision can even be small; it can be a hundred examples. But if you train it the right way and construct the loss the right way, you can actually get profound changes in the behavior of the model, built on top of that self-supervised learning. So that's the paradigm shift in my mind.
So you're exactly right in that pure supervised learning required you to go and build a very high-quality dataset that was completely supervised, and that was hard and expensive, and then you had to do a bunch of other engineering. And so it didn't quite take off; it was just too hard.
But now we get this sort of smooth gradient of performance, where I say, well, I have this base model that's pretty good, it understands language, it understands concepts. Then I can start layering in the things that I do know, and the more I know, the more I can put in. And if I don't have some information, that's okay; I can still get to something that's useful. I can put it out in the wild, or in something short of the wild, get feedback, and then make the model better. So that's actually the current paradigm.
A technology leader at every enterprise on earth right now is probably thinking about starting a project like this, and rightfully so. I mean, this is probably the most important technology change we've seen in, you know, ten years, twenty, maybe longer. But how do I really know, if I'm in that position, other than coming to talk to you and coming to talk to Databricks, that I actually have a problem and a good dataset, and the kind of thing that would actually work here?
Well, that's a great question. I think the dataset almost always exists. Almost all of these companies gather data, something that is either already formatted in a way that's useful or that they can do some simple munging on and format correctly. Like, you know, call-center-type stuff:
you have a bunch of transcripts, and that's all you need. You basically just need examples of what kind of behavior you want. And like I said, what's really cool about the pretraining-plus-SFT-and-IFT paradigm, supervised fine-tuning and instruction fine-tuning, is that I can take a small number of examples of stuff I have, show value, show that things got better, and then ramp it up.
So what we find with most enterprises is, you know, maybe they have some subset, like somebody did something good where they went and said, I'm going to gather data from this customer experience or whatever in a way that's really clean. They have a bunch of other data from, like, the last twenty years that's all dirty and crappy and hiding in a million nooks and crannies. But they have one nice, clean, small dataset.
So now they can take that, they can fine-tune, and go, wow, okay, I've got something that's pretty cool. Okay, now let's justify going back and doing the archaeology: digging up all that old data, reformatting it, and putting it in to get something even better. So that's the trend we're seeing today: justify the archaeology to go find all the other data.
This is like the Indiana Jones version, you know: you go find the special artifact, and then once you get that, you can go around the world and dig up all this other core data.
Exactly, yeah. Most of these businesses, especially digital-native businesses, have these golden datasets, which are basically characterizations of their business. You have buyer behavior, you have user-interaction behavior, you have all this stuff, but no one knew how to use it, right?
I mean, this was the promise of AI more than ten years ago: that you could take that data and actually do something useful with it. And it wasn't true until maybe today. So the archaeology is real, because we've been gathering data for twenty years. There's stuff in weird formats, on weird storage, in weird places, a drive sitting somewhere, a tape drive sitting somewhere, that has really good data.
I'm now worried for the poor data engineers who are pulling, like, one dusty old manuscript out of the stack and just getting crushed. Oh, that's totally happening now.
You know, because it's like, wait, I can actually take that data and do something with it, and it's differentiated. I've got something that none of my competitors have.
So you mentioned building something useful, and I thought that was an accurate way, to some degree, of describing it. I think the other term people would use is probabilistic, in a sense. How does that evolution look, in terms of getting to the point where, or maybe it's just a shift in how we actually think of applications, enterprises wrap their heads around the fact that, yes, these applications are useful, but they are definitely not deterministic applications, and they're also not guaranteed to be?
Yeah, that's a challenge, especially in regulated industries. They like to have models that are super deterministic: I know what the inputs are, and I can predict the outputs very precisely, in distribution and out of distribution.
All of that stuff is very well characterized. When something is generative, it generates new stuff; it does new mash-ups, and it's somewhat unpredictable. The best way we see to characterize this for companies is to figure out what good is and what bad is for your business.
So come up with an evaluation metric. We do this in the academic and industry world, where there are fifty different evals out there, like MMLU and HellaSwag and all that kind of stuff, right? We use those things to say this model is better than that one. But when it comes to something that's very specific or domain-specific, the intrinsic problem is that I don't actually have a way of giving it an exam.
And many times businesses have a way of saying, well, I know when a human is good; I put someone in front of customers, right? I've got the checks that say this person is approved to be in front of the customer or in front of the media or whatever. But we don't have a good exam to say, once you pass this exam, you're a hundred percent good.
So I think that's the hard part right now, especially for regulated industries: coming up with that exam, that evaluation criterion, such that it gives them signal. Okay, once I run a model through it, I can compare the relative merits of this model and that model on my domain. And so a lot of what we're doing now is advising companies to do that: come up with your success criteria, write it down, and then we'll start constructing evals for you.
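A minimal sketch of what "write down your success criteria and turn them into an eval" can look like: a handful of domain prompts, each with a pass/fail check, scored across candidate models. The cases, checks, and stand-in models here are invented; real evals would be richer (rubrics, human review, held-out data), but the shape is the same.

```python
from typing import Callable

# Domain-specific "exam": each case is a prompt plus a pass/fail criterion.
eval_cases = [
    {"prompt": "Customer asks how to reset a password.",
     "passes": lambda out: "reset" in out.lower()},
    {"prompt": "Customer asks about wire transfer limits.",
     "passes": lambda out: "limit" in out.lower()},
]

def score(model: Callable[[str], str]) -> float:
    """Fraction of domain cases the model's output satisfies."""
    results = [case["passes"](model(case["prompt"])) for case in eval_cases]
    return sum(results) / len(results)

# Stand-ins for a small fine-tuned domain model versus a general-purpose one.
small_domain_model = lambda p: "You can reset it in settings; wire limits are $10k."
general_model = lambda p: "I'm happy to help with your question."

print("domain model:", score(small_domain_model))   # 1.0 on this toy exam
print("general model:", score(general_model))       # 0.0 on this toy exam
```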
I was going to say customer service seems easy because it's so bad to begin with; an improvement is an improvement. But on the other hand, I think someone engineered an airline chatbot into selling them a ticket for, like, a dollar or something, so there's a risk to this. I did want to ask: you wrote a blog post a while back about thinking of LLMs as kind of, maybe, a relational database, or some sort of structured representation of a company's DNA, for lack of a better term, and all of its data. I was hoping you could elaborate on that concept, because it does seem, to your point, like if you can actually train on, whatever, decades' worth of data, you do start to put together some sort of knowledge base on how this business actually works.
We're always struggling to find the terms that people will latch onto, that they understand, so we can explain what the new thing is, because the new thing is inherently new. It has some new capabilities, and it's not exactly a database; that's sort of a metaphor. Yes, you can put data in, you can put knowledge in, but models can essentially reason over that new data.
And that happens during the training process or the tuning process: when you introduce this new data, it can actually start to find useful representations of that data to reason around it. For instance, if I started rattling off a bunch of terms from PubMed about genetics, someone who doesn't know about genetics is going to be lost very quickly. But after hearing the terms a lot, you actually start to understand: well, this is the mental model of how genes work, this is the mental model of the stop codon and the promoter and this and that. You start to understand these concepts.
And then when you hear new text, you can actually say, oh, okay, I can find the thing that's interesting about this text, which would have just escaped me before, because you have that mental model of the domain. And so this kind of customized reasoning is actually what's important.
And now that term, reasoning, we can start to use it a little bit. I think the industry is getting a little more mature, in that when I say the word reasoning now, people don't balk. If I had said that two years ago, it would have been, I don't know what that means. Now we can define it a bit more clearly: it's something that forms a mental model around your data.
That's the customized reasoning that happens when you introduce customer data: when you pretrain or fine-tune something, you start to build a custom reasoning engine. So I think we're moving from the database metaphor to this metaphor now, but it's really describing the same phenomenon: I have something that can take custom data, form some kind of representation of it, and then do something useful with that for my business, if that makes sense.
Makes sense. And speaking of databases, I'm curious: data infrastructure, at least at the platform layer, takes a long time to move. It stays in place for a long time, right? People are still running the same database they've been running. How should we think about foundation models in that regard, as part of the data infrastructure of a company? Because the space seems to move a lot faster and, to your point, does more reasoning than just processing data.
So yeah, I think right now, for the next couple of years, we're just going to be in this space where we're not laying down one foundation model for a long period of time. And you know, you saw this at the beginning of databases. I was probably in middle school or something when databases were a big deal, like when Oracle was becoming big.
And all these different companies were growing in the early nineties. But at that time, I don't think a company would actually commit to one database for long, because there was always a new technology, something faster, something better, something with its own schema for my data.
You can kind of think of a large language model as a flexible schema; it sort of comes up with the schema based upon the data. Whereas back then, we were building a schema that was intrinsically better for some use case, right? Like, I had something that was serving real-time stuff, and it had a better schema for recall.
It was faster, versus something that could handle a very large number of keys, for instance. These are the kinds of trade-offs people made. So I think even back then, you would see a company probably not commit to one database for a long time.
Now those patterns have become very established, and I can commit to a database for a long time because it's mature, I know it's going to work, it's going to be supported, and it's become bedrock for my business at this point.
It's going to take a while for foundation models to get to that point. But I think this is actually why we've started seeing this proliferation of domain specificity and even customer specificity: once I build a model that understands my domain really well, you know, our vision is that that kind of model actually starts to become more agentic. It actually becomes something I can trust.
And let's say it just runs over all my data all the time and can go around and find interesting insights: okay, you know what, you could actually reduce your costs on your supply chain by rearranging this stuff. We're not there yet. But once we have a thing that can really understand my business well, that has this kind of customized reasoning for my business and actually gives me novel insights that I trust, over and over again, then we'll start to get to the point where we have an established, mature thing that'll live longer than six months.
You've been in this world for a long time. You've seen it from a bunch of different angles, and I've just always found you to be an especially thoughtful person. I want to ask a little bit about the personal side of this: you founded a company that was very successful, got acquired by a big company, stayed there for a while, and then came out to another startup.
I think there are a bunch of us in the AI world who have been waiting for AI to actually work for years; I know I count myself personally in that category. And conversely, there are a lot of new people developing with AI for the first time, learning about AI for the first time.
Like you said, kids learning in college about convolutions; it makes me kind of sad that I'm not in college right now, you know what I mean? So I'd love for you to talk a little bit about your personal journey through AI, and what the current wave means to you, and all that.
I have been in this field for a while, from a sheer interest standpoint. I spent ten years in industry as a computer architect and software architect, and then went back and got a PhD in neuroscience for the reason of: can we actually bring intelligence to machines, and do it in a way that's economically feasible?
That last part is actually very important, because if something is not economically feasible, it won't take off, it won't proliferate, it won't change the world. So I love building technologies into products, because when someone pays you for something, that means something very important.
They saw something that adds value to them. They're solving a problem that they care about, they're improving their business; it's worth something to them.
They're willing to part ways with their money and give it to you for your product. That means something. And I think once you've established that you can do that over and over again, now you're starting to see real value. Now you're onto something, right? I've been waiting to see AI add more value and really work for a long time.
I mean, we had all these wild ideas in the mid-2000s around how to build intelligent machines, and a lot of those are going to come back around. We've fallen into these kind of local minima of paradigms: backpropagation, convolutional nets, transformers. They're all sort of local minima, but they're working toward, maybe, some sort of greater view of what intelligence could be.
So I'm very happy and excited to see the idea of machine intelligence go mainstream. All the discussions we have, even the stupid ones, when we're talking about, you know, robots killing the world and all kinds of crazy doomer stuff, even that, to be completely honest with you, is actually interesting to me, from the perspective that we've now made this part of the conversation, part of our normal social construct. It's not a weird side thing anymore.
I was part of the weird side thing for many years. But now it's not. It's something that is going to be big and is going to deliver value. When we see that uptake, we'll see more investment. We'll start to solve these problems, find new ways of solving these problems, and we'll reinvent the paradigm.
And I think in ten or twenty years, we'll actually start to have things that can truly be agentic, where they can formulate hypotheses, create actions, execute actions, observe consequences, update their hypotheses, and then actually start to do that in a stable way through very complex behaviors and circumstances. That's very interesting, and I look at this on a much longer arc of time; that's where I see it.
Ten years ago, I didn't know how we were going to get there, but now I think we have some bits and pieces, some primitives in place, where we can actually do this. Yeah, it's going to take, you know, a hundred megawatts to do it, maybe, so it's not twenty watts of energy like your brain, but that's progress. So I'm actually super excited and pumped about this.
I see it as: when I'm eighty years old and looking back, we will have really made a dent, we will have completely changed the world, and that, I think, is very exciting. It's frustrating as it is in the short term, and short term here is ten years. This is not a small thing to do; it's a, you know, hundred-year kind of thing.
So, was it more fun to be part of the weird sideshow world, sort of, you know, shouting that AI was important and that we were going to get there when nobody believed it? Or is it more fun now, being the billion-dollar exit, everybody wanting you on their podcast, headlines across the newspapers?
I was going to ask a related question, actually, which was: in your personal life, are you more interesting to people now? Is it easier to explain what you do?
Yeah, it is easier to explain what I do, that is true. So that's made that part of my life easier, I would say. But I think it's also made it annoying, because people just want to talk about that.
Everyone has their pet theories now, and they want to tell me their pet theory and ask my opinion on it, and I'm like, can we just talk about something else? Come on, I'm off work right now, I'm taking a day off, I want to talk about something else. But overall, I think I'm very happy and excited about where the world has gone.
I mean, being the sideshow is never as fun, to be honest with you. You can get passion from that, because when everyone's telling you you're a sideshow, you kind of have to have passion to keep going.
And I think you can use that as strength. But really, the whole point of having that passion, that strength, is to make something that's meaningful, something that's lasting, something that really does change the course of human evolution. And that's what I'm here for. I'm here to be a part of building the next set of technologies that really let humans influence the world in greater and more profound ways.
Amazing. Hopefully you enjoyed listening to Naveen as much as we enjoyed speaking with him. We have some exciting new episodes to come, as well as more from the a16z archive, so subscribe to the feed so you don't miss them. Thanks again for listening.