The top five themes in AI startups in 2024 are: 1) A closer race in foundation models, with OpenAI no longer dominating as it did in 2023. 2) Open-source models becoming increasingly competitive, especially in areas like math and instruction following. 3) The price of intelligence dropping significantly, with OpenAI's API costs decreasing by 80-85%. 4) New modalities like biology, voice, and video beginning to work effectively. 5) The debate around the end of scaling, with new paradigms like test-time compute scaling emerging.
The competitive landscape for foundation models has shifted because OpenAI is no longer the clear leader. In 2023, OpenAI's models were significantly better than others, but by 2024, proprietary and open-source models have become increasingly competitive. For example, Google's models now outperform OpenAI's in some evaluations, and open-source models like Llama are among the top performers in specific areas like math and adversarial robustness.
The cost of AI intelligence has dropped dramatically in 2024, with OpenAI's flagship model API costs decreasing by 80-85% over the past year and a half. This trend is not limited to OpenAI; across the industry, the price per token for AI models has significantly decreased, making it more affordable for companies to leverage AI capabilities at scale.
New modalities in AI showing promise in 2024 include biology, voice, and video. For example, Chai Discovery released an open-source biology model that outperforms AlphaFold3. Low-latency voice models are creating new interaction experiences, and video models like Sora and HeyGen are enabling advanced capabilities such as lip-syncing and dubbing for live speeches.
Open-source AI models have become increasingly competitive in 2024, particularly in areas like math, instruction following, and adversarial robustness. Models like Llama are now among the top performers in these domains. However, there are still areas, such as agentic tool use, where proprietary models maintain an advantage due to more specialized training and data.
In 2024, AI startup funding has seen a substantial recovery, with foundation model labs raising upwards of $30-40 billion. However, the broader funding environment remains rational, with most money going to companies demonstrating real traction and growth. The narrative of an AI bubble is largely debunked, as startups are raising funds based on actual outcomes rather than hype.
Startups are addressing the challenge of competing with incumbents by focusing on better products, innovative business models, and leveraging AI to create new user experiences. Incumbents may have distribution and data advantages, but startups are finding success by rethinking workflows, offering outcomes-based pricing, and building products that are more efficient and user-friendly than traditional solutions.
Emerging opportunities for AI startups in 2024 include first-wave service automation, better search and productivity tools, democratization of creative and technical skills, and enabling layers like compute and data. Startups are also exploring new markets traditionally considered difficult for venture capital, such as legal, healthcare, and education, by leveraging AI to make capabilities cheaper and more accessible.
The debate around scaling in AI models is evolving with the recognition that there are limits to the benefits of increasing scale. However, new paradigms like test-time compute scaling are emerging, where models can dynamically allocate compute resources based on the complexity of tasks. This approach is particularly effective in well-constrained domains like math and physics, but challenges remain in less-defined areas.
Open-source models play a significant role in the AI ecosystem in 2024 by providing competitive alternatives to proprietary models. They are particularly effective in specific domains like math and instruction following, and they contribute to lowering the cost of intelligence. Open-source models also enable startups to experiment and innovate without the high upfront costs associated with proprietary solutions.
Welcome to Latent Space Live, our first mini-conference held at NeurIPS 2024 in Vancouver. This is Charlie, your AI co-host. When we were thinking of ways to add value to our academic conference coverage, we realised that there was a lack of good talks just recapping the best of 2024 going domain by domain.
We sent out a survey to the over 900 of you who told us what you wanted and then invited the best speakers in the Latent Space Network to cover each field. 200 of you joined us in person throughout the day with over 2,200 watching live online. Today, we're kicking it off with a keynote on the state of AI startups.
Sarah Guo, founder of Conviction and host of the No Priors podcast, and Pranav Reddy, partner at Conviction and former engineer at Neeva, will be unpacking the top five themes of 2024: what ideas are working and what's not, from shifting market opportunities to why the supposed advantages of big tech incumbents might not be as strong as they seem. As always, don't forget to check the show notes for the YouTube link to their talk as well as their slides.
Watch out and take care.
So I'd start by just giving 30 seconds of intro. I promise this isn't an ad. We started a venture fund called Conviction about two years ago. Here is a set of the investments we've made. They range from companies at the infrastructure level in terms of feeding the revolution to foundation model companies, alternative architectures, domain specific training efforts, and of course applications.
And the premise of the fund, Sean mentioned I worked at Greylock for about a decade before that and came from the product engineering side, was that we thought that there was a really interesting technical revolution happening, that it would probably be the biggest change in how people use technology in our lifetimes, and that represented huge economic opportunity.
And maybe that there'd be an advantage versus the incumbent venture firms in that when the floor is lava, the dynamics of the markets change, the types of products and founders that you back change, it's a lot for existing firms to ingest and a lot of their mental models may not apply in the same way.
And so there was an opportunity for first-principles thinking, and if we were right, we would do really well and get to work with amazing people. So we are two years into that journey, and we can share some of the opinions and predictions we have with all of you. Pranav is going to start us off. A quick agenda for today: we'll cover some of the model landscape and themes that we've seen in 2024, what we think is happening in AI startups, and then some of our latent priors on what we think is working in investing.
I thought it would be useful to start from what was happening at NeurIPS last year, in December 2023. In October 2023, OpenAI had just launched the ability to upload images to ChatGPT, which means that up until that moment, hard as it is to believe, roughly a year ago you could only put text in and get text out of ChatGPT. The Mistral folks had just launched the Mixtral model right before the beginning of NeurIPS. Google had just announced Gemini; I very genuinely forgot about the existence of Bard before making these slides.
And Europe had just announced that they were doing their first round of AI regulation, but not to be their last. And when we were thinking about what's changed in 2024, there's at least five themes that we could come up with that feel like they were descriptive of what 2024 has meant for AI and for startups. And so we'd start with
First, it's a much closer race on the foundation model side than it was in 2023. So this is LMArena, the Chatbot Arena: users compare generations from specific prompts, so you get two responses from two language models and answer which one of them is better. The way to interpret this is that roughly a 100-point Elo difference means you're preferred two-thirds of the time. And a year ago, every OpenAI model was more than 100 points better than anything else.
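As a quick sketch of where that two-thirds rule of thumb comes from, the standard Elo expected-score formula can be evaluated directly. This is the textbook logistic formula, not anything specific to the Arena's exact methodology:

```python
def elo_win_prob(rating_diff: float) -> float:
    """Expected score (preference probability) for the higher-rated model,
    given the Elo rating difference, using the standard logistic formula."""
    return 1.0 / (1.0 + 10 ** (-rating_diff / 400))

# A 100-point Elo gap implies the stronger model is preferred ~64% of the time,
# i.e. roughly two-thirds, matching the rule of thumb in the talk.
print(round(elo_win_prob(100), 2))  # -> 0.64
```

A zero-point gap gives exactly 0.5, and the curve saturates slowly, which is why even a 100-point lead still leaves the weaker model winning about a third of head-to-head votes.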
And the view from the ground was roughly like, OpenAI is the IBM. There is no point in competing. Everyone should just give up, go work at OpenAI, or attempt to use OpenAI models.
And I think the story today is not that. It would have been unbelievable a year ago if you told people that, A, the best model today, at least on this eval, is not from OpenAI, and B, that it was from Google; that would have been pretty unimaginable to the majority of researchers. But actually, there are a variety of proprietary language model options, and some set of open-source options, that are increasingly competitive.
And this seems true not just on the eval side, but also in actual spend. So this is Ramp data. There's a bunch of colors, but it's actually just OpenAI and Anthropic spend. OpenAI spend at the end of last year, in November of '23, was close to 90% of total volume. And today, less than a year later, it's closer to 60% of total volume, which I think is indicative both that language models are pretty easy APIs to switch out, and that people are trialing a variety of different options to figure out what works best for them.
Related, the second trend that we've noticed is that open source is increasingly competitive. So this is from the Scale leaderboards, which are a set of independent evals that are not contaminated. And on a number of topics that the foundation model companies clearly care a great deal about, open-source models are pretty good: on math, instruction following, and adversarial robustness, the Llama model is amongst the top three of evaluated models. I included agentic tool use here just to point out that this isn't true across the board; there are clearly some areas where foundation model companies have had more data or more expertise in training against these use cases. But open-source models are surprisingly, and increasingly, effective.
This feels true across evals. This is the MMLU eval, and I want to call out two things here. One is that it's pretty remarkable that the ninth-best model, two points behind the best state-of-the-art models, is actually a 70-billion-parameter model. I think this would have been surprising to a lot of people; the belief was largely that most intelligence is just an emergent property, and that there's a limit to how much intelligence you can push into smaller form factors.
In fact, a year ago, the best small model, meaning under 10 billion parameters, would have been Mistral 7B, which on this eval, if memory serves, scored somewhere around 60. And today it's the Llama 8B model, which is more than 10 points better. The gap between what is state of the art and what you can fit into a fairly small form factor is actually shrinking.
And again, related, we think the price of intelligence has come down substantially. This is a graph of flagship OpenAI model costs, where the cost of the API has come down roughly 80-85% over, call it, the last year to year and a half, which is pretty remarkable. And this isn't just OpenAI, either; this is across the full set of models. This is from Artificial Analysis, which tracks cost per token across a variety of different APIs and public inference options. And we were doing some math on this: if you wanted to recreate the kind of data that a text editor had, or something like Notion or Coda, it's somewhere in the volume of a couple thousand dollars to create that volume of tokens. That's pretty remarkable and impressive. It's clearly not the same distribution of data, but just as a sense of scope, there's an enormous volume of data that you can create.
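To make the "couple thousand dollars" claim concrete, here is a back-of-the-envelope sketch. Both numbers are illustrative assumptions, not quotes from any provider's price sheet or from the talk's actual calculation:

```python
# Back-of-the-envelope cost to generate a large text corpus via an LLM API.
# Both figures are assumed for illustration only.
price_per_million_tokens = 2.50   # assumed USD price per 1M output tokens
corpus_tokens = 1_000_000_000     # ~1B tokens, a rough stand-in for the text
                                  # in a large collaborative-docs product

cost = corpus_tokens / 1_000_000 * price_per_million_tokens
print(f"${cost:,.0f}")  # -> $2,500
```

At these assumed rates, a billion generated tokens lands squarely in the "couple thousand dollars" range, and every further price cut shifts that figure down proportionally.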
And then fourth, we think new modalities are beginning to work. To start quickly with biology: we're lucky to work with the folks at Chai Discovery, who just released Chai-1, an open-source biology model that outperforms AlphaFold3. It's impressive that this is roughly a year of work with a pretty specific data set and pretty specific technical beliefs, but models in domains like biology are beginning to work. We think that's true on the voice side as well. I'll point out that there were voice models before; things like ElevenLabs have existed for a while.
We think low-latency voice is more than just a feature; it's actually a net-new interaction experience. Using voice mode feels very different than the historical transcription-first models. The same is true of many of the Cartesia models.
And then a new, nascent use case is execution. So Anthropic launched computer use for Claude, OpenAI launched code execution inside of Canvas yesterday, and then I think Devin was just announced as available for everyone to try at $500 a month, which is pretty remarkable. It's a set of capabilities that has historically never been available to the vast majority of the population, and I think we're still in early innings. Cognition, the company, was founded under a year ago; the first product was roughly nine months ago, which is pretty remarkable.
Pretty impressive. - And if you recall, like a year ago, the point of view on SWE-bench was that it was impossible to surpass, what, like 13% or so? And I think the whole industry now considers that, if not trivial, accessible. - Yeah.
The last new modality we wanted to call out, although there are many more, is video. I got early access to Sora and managed to sign up before they cut off access. So here is my favorite joke in the form of a video; hopefully someone here can guess it. Yeah, "you're telling me a shrimp fried this rice?" It's a pretty bad joke, but I really like it. And I think this one...
The next video here is from one of our portfolio companies, HeyGen, which does the translation, lip sync, and dubbing for live speeches. So this is Javier Milei, who speaks in Spanish, but here you will hear him in English, if this plays. And you can see that it captures the original tonality of his speech and performance. I think the audio here doesn't work, but we'll push something publicly. Does it? Did you try it?
Let's give it a shot. Yeah. Let's give it a shot. Excellent. Yeah, and you can hear that this captures his original tone and the emotion in his speech, which is definitely new and pretty impressive from new models. So the last point that we wanted to call out is the much-purported end of scaling. I think there is a great debate happening here later today on this question. But we think, at minimum, it's hard to deny that there are at least some limits to the clear benefits of increasing scale. But it also seems like there are new scaling paradigms. So test-time compute scaling is a pretty interesting one. It seems like OpenAI has cracked a version of this that works, and we think, A, foundation model labs will come up with better ways of doing this, and B, so far it largely works for very verifiable domains: things that look like math and physics, and maybe secondarily software engineering, where we can get an objective value function.
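One simple way to picture test-time compute in a verifiable domain is best-of-n sampling against an objective checker: spend more compute by drawing more candidate answers, and let a value function pick a verified one. This toy sketch uses a made-up noisy "model" and an exact-arithmetic verifier; it illustrates the general idea only, not how any lab actually implements it:

```python
import random

def generate_candidate(question: tuple) -> int:
    """Stand-in for sampling an answer from a model: usually right, sometimes off."""
    a, b = question
    return a + b + random.choice([0, 0, 0, -1, 1])  # hypothetical noisy sampler

def verify(question: tuple, answer: int) -> bool:
    """Objective value function: trivial to write in math-like domains."""
    a, b = question
    return answer == a + b

def best_of_n(question: tuple, n: int) -> int:
    """More test-time compute (larger n) raises the odds of a verified answer."""
    candidates = [generate_candidate(question) for _ in range(n)]
    for c in candidates:
        if verify(question, c):
            return c
    return candidates[0]  # no verified answer found; fall back to the first sample

random.seed(0)
print(best_of_n((17, 25), n=64))  # with 64 samples, a verified 42 is effectively certain
```

The open question flagged in the talk is exactly the `verify` function: it is easy for arithmetic or unit tests, and hard for open-ended domains where no cheap objective check exists.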
And I think an open question for the next year is going to be how do we generate those value functions for spaces that are not as well constrained or well defined. And so the question that this leaves us in is like, well, what does that mean for startups?
And I think a prevailing view has been that we live in an AI bubble. There's an enormous amount of funding that goes towards AI companies and startups that is largely unjustified based on outcomes and what's actually working on the ground. And startups are largely raising money on hype. And so we pulled some PitchBook data, and the 2024 number is probably incomplete since not all rounds are being reported.
It largely suggests that there actually is a substantial recovery in funding, and maybe 2025 looks something like 2021. But if you break out the numbers here a bit more, the red is actually just a small number of foundation model labs, what you would think of as the largest labs raising money, which is upwards of $30 to $40 billion this year.
And so the reality of the funding environment actually seems like much more sane and rational. It doesn't look like we're headed to a version of 2021. In fact, the Foundation Model Labs account for an outsized amount of money being raised, but the set of money going to companies that are working seems much more rational. And we wanted to give you, we can't share numbers for every company, but this is one of our portfolio companies growing really, really quickly.
We think zero to $20 million with just PLG-style selling is pretty impressive. If any of you are doing better than that, you should come find us; we'd love to chat. And so here is what we wanted to try and center discussion on. This is certainly not all of the companies that are making $10 million or more in revenue and growing, but we took a selection of them and wanted to give you a couple ideas of patterns that we've noticed that seem to be working across the board.
The first one that we noticed is first-wave service automation. We think there's a large amount of work that doesn't get done at companies today, either because it's too expensive to hire someone to do it, it's too expensive to provide them the context that would enable them to be successful at whatever the specific role is, or it's too hard to manage that set of people. For Sierra and Decagon, the customer-support-style companies, it's really useful to do that next level of automation, and there's obviously growth in that. And for Harvey and EvenUp, the story is that you can do first-wave professional services and then grow beyond that.
The second trend that we've noticed is better search and new friends. It's pretty impressive how effective text modalities have been. Character and Replika have been remarkably successful companies, and there's a whole host of not-safe-for-work chatbots as well that are pretty effective at just text generation; they're pretty compelling mechanisms. And on the productivity side, Perplexity and Glean have demonstrated this as well. I worked at a search company for a while, and I think the changing paradigm of how people capture and learn information is pretty interesting. We think it's likely that text isn't the last medium; there are infographics, or presentations of information that seem more useful, or forms of engagement that are more engaging. But this feels like a pretty interesting place to start.
So one thing that I've been investing in for a long time is the democratization of different skills, be they creative or technical. This has been an amazing few years for that across different modalities: audio, video, general image and media, text, and now code and really fully functioning applications.
One thing that's really interesting about the growth driver for all of these companies is that the end users, in large part, are not people that we, the venture industry (you know, the royal we), thought of as important markets before. And so a premise we have as a fund is that there's actually much more latent demand for creativity, visual creativity, audio creativity, technical creativity, than people assumed, and AI applications can really serve that. I think Midjourney in particular was a company in the vanguard here, and nobody understood it for a long time, because the outside view was: how many people want to generate images that are raster, that are not easily editable, that can't be used in professional contexts in a complete way? And the answer is an awful lot, for a whole range of use cases. And I think we'll continue to find that, especially as the capabilities improve. And we think the range of quality and controllability that you can get in these different domains is still very deep, and we're still very early.
And then, if we're in the first or second inning of this AI wave, one obvious place to go invest and to go build companies is the enabling layers; the shorthand for this is obviously compute and data. I think the needs for data have largely changed now as well. You need more expert data, you need different forms of data. We'll talk later about who has, let's say, reasoning traces in different domains that are interesting to companies doing their own training. But this is an area that has seen explosive growth, and we continue to invest here. OK, so maybe time for some opinions.
There was a prevailing narrative, coming in some part from companies and in some part from investors. It's a fun debate as to where the value in the ecosystem is and whether there can be opportunities for startups. If you remember the phrase "GPT wrapper," it was the dominant phrase in the tech ecosystem for a while. And what it represented was this idea that there was no value at the application layer: you had to do pre-training, and nobody was going to catch OpenAI in pre-training.
And this isn't a knock on OpenAI at all. These labs have done amazing work enabling the ecosystem, and we continue to partner with them and others. But it's simply untrue as a narrative. The odds are clearly in favor of a very rich ecosystem of innovation: you have a bunch of choices of models that are good at different things, you have price competition, you have open source.
I think an underappreciated impact of test-time scaling is that you're going to better match user value with your spend on compute. And so if you are a new company that can figure out how to make these models useful to somebody, the customer can pay for the compute, instead of you, as a startup, taking the capex for pre-training or RL upfront. And as Pranav mentioned, small models, especially if you know the domain, can be unreasonably effective. And the product layer, if we look at the cluster of companies that we described, has shown that it is creating and capturing value, and that it's actually a pretty hard thing to build great products that leverage AI.
So broadly, we have a point of view that I think is actually shared by many of the labs, that the world is full of problems and the last mile to go take even AGI into all of those use cases is quite long.
Okay, another prevailing belief, or, you know, another great debate that Sean could host: does the value go to startups or incumbents? We must admit some bias here, even though we have friends and former portfolio companies that would be considered incumbents now.
There are markets in venture that have traditionally been considered too hard, just bad markets for the venture capital spec, which is capital-efficient, rapid growth that's venture-backable, where the end output is a company worth tens of billions of dollars of enterprise value. And these included areas like legal, healthcare, defense, pharma, education; any traditional venture firm would say bad market, nobody makes money there, it's really hard to sell, there's no budget, et cetera. And one of the things that's interesting is that if you look at the cluster of companies that has actually been effective over the past year, some of them are in these markets that were traditionally not obvious.
And so perhaps one of our more optimistic views is that AI is really useful, and if you make a capability that is novel, that is several orders of magnitude cheaper, then you actually can change the buying pattern and the structure of these markets. And maybe the legal industry didn't buy anything because there wasn't anything worth buying for a really long time. That's one example.
We also ask: what was the last great consumer company? Maybe it was Discord or Roblox, in terms of things that got started and built really enormous user bases and engagement, until we had these consumer chatbots of different kinds and perhaps the next generation of search. As Pranav mentioned, we think the opportunity for social and media generation and games is large and new in a totally different way. And finally, in terms of the markets that we look at,
I think there's broad recognition now that you can sell against outcomes and services rather than software spend with AI, because you're doing the work, versus just giving people the ability to do a workflow. But if you take that one step further, we think there's elastic demand for many services. Our classic example: there are on the order of 20 to 25 million professional software developers in the world, and I imagine much of this audience is technical.
Demand for software is not being met, right? If we take the cost of software and high quality software down two orders of magnitude, we're just going to end up with more software in the world. We're not going to end up with fewer people doing development. At least that's what we would argue.
And then finally, on the incumbent-versus-startup question, the prevailing narrative is that incumbents have the distribution, the product surfaces, and the data; don't bother competing with them; they're going to create and capture the value and share some of it back with their customers. I think this is only partially true.
Incumbents have the distribution. They have always had the distribution. The point of the startup is you have to go fight with a better product or a more clever product and maybe a different business model to go get new distribution. But the specifics around the product surface and the data, I think, are actually worth understanding. There's a really strong innovator's dilemma. If you look at the SaaS companies that are dominant, they sell by seat.
And if I'm doing the work for you, I don't necessarily want to sell you seats; I might actually decrease the number of seats. The decades of work, the millions of man- and woman-hours of code that have been written to enable a particular workflow in CRM, for example, may not matter if I don't want people to do that workflow of filling out the database every Friday.
And so I do think that this sunk cost or the incumbent advantage gets highly challenged by new UX and code generation as well.
And then one disappointing learning that we found in our own portfolio is that, in many cases, no one has the data we want. So imagine you are trying to automate a specific type of knowledge work, and what you want is the reasoning trace: all of the inputs and the output decision. That sounds like a very useful set of data. And the incumbent companies in any given domain never saved that data; they have a database with the outputs, some of the time. And so I would say one of the things that is worth thinking through as a startup, when an incumbent says they have the data, is: what is the data you actually need to make your product higher quality?
Okay, so in summary, our shorthand for the set of changes that are happening is software 3.0. We think it is a full-stack rethinking, and it enables a new generation of companies to have a huge advantage. The speed of change favors startups: if the floor is lava, it's really hard to turn a really big ship.
I think that some of the CEOs of large companies now are incredibly capable, but they're still trying to make 100,000 people move very quickly in a new paradigm. The market opportunities are different: these markets that we think are interesting and very large, that represent a trillion dollars of value, are not just the replacement software markets of the last two decades. It's not clear what the business model for many of these companies should be. Sierra just started talking about charging for outcomes. Outcomes-based pricing has been this holy-grail idea in software, and it's been very hard, but now the software does more of the work.
There are other business model challenges. And so, you know, our companies, they spend a lot more on compute than they have in the past. They spend a lot with the foundation model providers. They think about gross margin. They think about where to get the data. It's a time where you need to be really creative about product versus just replace the workflows of the past. And it might require ripping out those workflows entirely. It's a different development cycle.
I bet most of the people in this room have written evals, compared the academic benchmark to a real-world eval, and said, "That's not it." And how do I make a user understand the non-deterministic nature of these outputs, or fail gracefully? I think that's a different way to think about product than in the past. And we need to think about infrastructure again. There was this middle period where the cloud providers, the hyperscalers, took this problem away from software developers, and it was all just going to be, I don't know, front-end people at some point. We are not there anymore. We're back in a hardware era where people are acquiring and managing and optimizing compute, and I think that will really matter in terms of capability in companies.
So I guess we'll end with a call to action here and encourage all of you to seize the opportunity. It is the greatest technical and economic opportunity that we've ever seen. Like we made a decade plus career type bet on it. And
We do a lot of work with the foundation model companies. We think they are doing amazing work and they're great partners and even co-investors in some of our efforts. But I think all of the focus on their interesting missions around AGI and safety do not mean that there are not opportunities in other parts of the economy. The world is very large and we think much of the value will be distributed in the world through an unbundling and eventually a rebundling.
as often happens in technology cycles. So we think this is a market that is structurally supportive of startups. We're really excited to try to work with the more ambitious ones. And the theme of 2024 to us has been like, well, thank goodness. This is an ecosystem that is much friendlier to startups than 2023. It is what we hoped. And so please ask us questions and take advantage of the opportunity.
I think we have some questions. We're also taking questions online. Companies can go from zero to 20 in such a short amount of time; do you think that they can also disappear in a short amount of time? I can take this one. I mean, I think you've seen companies go from zero to 80 million and stall out pretty badly, actually. So your data is correct. There's going to be a set of challenges that are just the challenges of scale. I think sometimes the revenue numbers in these companies can overstate the maturity of the businesses themselves: they need to figure out how to serve customers, they need to scale their leadership, they need to prepare to service those customers at the right quality level. And, you know, the company that we showed that went zero to 20, that company has 20 people, and they have X hundred thousand users, which is very challenging. And so I think there's a set of good, hard problems that these companies will have.
Like most catchphrases or memes, it wouldn't have caught on unless there was some seed of truth. And so there was a set of companies described by this term "GPT wrapper" that were not more than a somewhat trivial set of prompts and SEO pages that directed people to a particular use case.
And I think that's likely not a durable position as a technology company. And so it's not a very clean answer for you. It's a nuanced one. But some of the value that is represented by this, let me just scroll back to it.
Some of this value that is represented by this cluster is durable, and that's the thing that we are interested in. The zero-to-20, and the zero-to-80-and-then-collapse, is actually valuable; it's just not durable, right? Users are voting for it, and other people can compete. And so we kind of separate these two questions: which of these companies is defensible, and where is the revenue or the usage not a novelty, but something that's really important to work or play or communication?
Sean, do you want me to take questions or do you want to do it?
Hi. So if all of these companies need a lot more money and this is the greatest economic opportunity ever, don't we need much bigger venture funds? Orders of magnitude bigger? And won't the economics of those funds be really broken if they're still raising $40 million? Like, could I invest in a bunch of seed funds?
Okay, this is a bit of a triggering question for me, because I take a particular point of view on it. Hopefully without arrogance: we've chosen to raise funds that are relatively small as early stage investors. And part of it is the view that — like this company; I think they've spent maybe $7 million to date.
And so the view that all AI product companies, or all AI companies in general, are very expensive is objectively not true. We have several companies that are expensive in the traditional sense of SaaS: we've got to go hire a lot of go-to-market people, we have to pay them, and there's a J curve on that investment before it comes back in repeatable SaaS revenue. And I think...
We have companies that are profitable or break even and have been incredibly efficient. And we have companies that spend a lot up front. And so I think there's an entire range. Our view as a firm is that very early on,
my friend Elad has a funny phrase here, which is "no GPU before product-market fit." I think that is not always true — we have given people GPUs before anything. But there's a shred of truth in it, which is that you can experiment.
Thank you to the OpenAIs and Anthropics and other companies of the world that allow great product people to experiment at very low cost, very incrementally. And so I think much of our portfolio looks like those companies, where you're going to see what kind of value you can bring to users without spending a ton upfront. As one example, we just saw
new fine-tuning interfaces for o1 come out. The amount of data that you need, in theory, to improve those models for a particular domain is very small. If that pans out, that's incredibly encouraging as well.
So I would say our goal is to work with the most important companies in AI with a relatively small fund. And I think most companies don't actually benefit from a huge amount of capital upfront.
The only thing I would add to that is that an interesting trend is that we work with a number of second-time founders whose point of view this time around is: we're never going to make the company that big again. I think it's not a surprise. I was doing the math in my head, and this rough ratio of a million dollars of revenue per employee of an early stage company holds true for a remarkable number of our companies — a number of our companies have more millions in revenue than they have employees. And the point of view of a bunch of them is: we're going to keep it that way. We're not going to grow into a giant team; AI will make us much more efficient. And if you believe in the grand vision that much of the intellectual labor we do should actually just be captured by some number of models, then we can build much more long-term efficient businesses than we have been able to historically.
I do think it's an interesting question, because if we think there is this much opportunity — well, opportunity doesn't come evenly. And so I'd say our investment pacing is higher than mine has been traditionally. Another part of our view is that we want to offer founders a certain service level.
Founders can decide if they want that or not, but it's very time expensive to us. We can only work with that many companies. We think many more are really interesting and that is one of the reasons that Pranav and I did this program for the ecosystem called Embed where we can work with a larger set of companies. We own less, but we give them a network and some guidance. And it is genuinely because there are more interesting things that we think are going to work than we can work on in a traditional artisanal venture sense.
And shameless plug: applications will open in January. I think if I press a button to request it...
So fancy. Cool. Thanks for the talk, it was awesome. I work for a Series C enterprise-focused company called Writer. One of the interesting things about the multimodality trend that we're seeing in the enterprise is that beyond vision, we're not actually seeing a lot of demand for multimodality. We'll get asked about audio and video stuff, but then when we ask what the use case is, it's sort of like, "I don't know." So I'm curious if you and your portfolio companies are seeing that in the enterprise space, and if so, what use cases? The multimodality stuff seems great at the consumer level; I'm curious if you're seeing anything in the enterprise. I think it's a good call-out. The data enterprises have is mostly text, structured data, some SQL data. I don't think your average enterprise has that much vision, video, or audio data that is that interesting. But I think that will change.
Maybe it's because I'm lazy and disorganized, but humans are very unstructured. They don't necessarily think in terms of relational database schemas and hierarchical management of their own information. And I think there's a future where we take that away from people, and the capture of information that you're going to use for different enterprise workflows
enables more multimodal use, if that makes sense. The obvious example would be the companies from perhaps a half generation ago — the Gongs of the world — that captured video and surfaced keywords and initial insights for sales reps. But the communications within an organization, the decisions made — I think there will be much more capture, especially of video. Making use of it requires companies to do that capture first, so we kind of require this intermediate step.
There's a company in our last Embed batch called Highlight — still a prosumer company today, to your point that the consumer and prosumer side is ahead of the enterprise — that has this premise: we're going to use the multimodality by doing on-screen and audio capture; that's what this little bubble is. And I think it's a powerful idea.
By the way, just a quick check — Peter, Isaac, are you here? You're welcome. Hi, thanks. There's sort of a meme going around that the price of intelligence is going to go to zero. You can kind of see this with GPT-4o, and with Gemini Flash you can get a million tokens a day, which is probably enough for a small company, right? So I'm curious:
as these large companies lose tons of money for market share, how are startups going to respond? How does that change things? Okay. I think it is impossible for anything to be too cheap, so I'll start with that. I would also say: this company with the awesome revenue chart — I'm pretty sure we paid $5 to $7 million to a foundation model provider in this period of time. And so, if there was a secondary theme to this talk, it's that demand is elastic in so many ways, especially for technology. When you make things cheaper, we want things to be more intelligent. And if you make hundreds of calls in order to deliver one output, then suddenly the fact that the cost of a call has come down 85% doesn't help you enough.
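To make that elasticity point concrete, here is a back-of-the-envelope sketch. Every price and call count below is invented for illustration; these are not real API prices.

```python
# Back-of-the-envelope: an 85% per-call price cut doesn't make a
# multi-call product cheap. All numbers here are invented.

OLD_PRICE_PER_CALL = 0.010                       # hypothetical $/call, 18 months ago
NEW_PRICE_PER_CALL = OLD_PRICE_PER_CALL * 0.15   # after an 85% price drop

def cost_per_output(calls_per_output: int, price_per_call: float) -> float:
    """Model spend required to deliver one user-facing output."""
    return calls_per_output * price_per_call

# A simple app makes one call per output; an agent or search-and-validate
# pipeline might make hundreds of calls per output.
simple_old = cost_per_output(1, OLD_PRICE_PER_CALL)    # 0.010
simple_new = cost_per_output(1, NEW_PRICE_PER_CALL)    # 0.0015
agent_new = cost_per_output(300, NEW_PRICE_PER_CALL)   # 0.45

# Per-call prices fell 85%, yet the 300-call pipeline still costs
# 45x what the old single-call product did per output.
print(simple_old, simple_new, agent_new)
```

The cheaper each call gets, the more calls products tend to spend per output, so total spend can keep rising even as unit prices collapse.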
So yes, it's an incredibly compelling idea, intelligence too cheap to meter. But — maybe this is really old school of me — for the last two decades, the internet and compute and software and data pipelines still haven't been cheap enough. We would do more if they were free. The other, physical barrier that we've run into is
that when models are really large, if you're not going to quantize and distill and do domain-specific things, they're hard to run. You need a lot of compute, just to state the very basics. And even with the foundation model providers, we are seeing people run into inference capacity issues. I do not know if this is true, but one way to read Anthropic's pricing change is that there's not enough capacity.
And so I think incredible kudos to the open source ecosystem, incredible kudos to OpenAI for staying on this drumbeat of offering cheaper and cheaper intelligence in every generation. But we have companies that are spending a lot of money on, let's say, search and validation systems with many calls, and we think that will continue.
I think you can see that as well in the price charts that we had before. o1 pricing is still absurd. With love. Yeah, but the volume of tokens — absurd. How could they? I think it is really interesting, because the other part of this is that if you look at the test-time compute scaling,
it's a log scale, and it's easy to forget how much compute that is. Historically, a small set of companies took on the majority of the financial burden of generating high-quality models as a result of overtraining: you just overtrain the hell out of your model, and then it's useful for everyone else. If the customer has to pay instead, that's a lot of money. If high-quality generation means I pay for on the order of thousands of attempts, that ends up being pretty expensive.
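A rough sketch of who pays under the two regimes; every number below is invented purely to show the shape of the argument, not real training or inference economics.

```python
# Where the compute bill lands: train-time vs. test-time scaling.
# All figures are illustrative placeholders.

# Train-time scaling: the lab pays once to overtrain, then that cost
# is amortized across every query the model ever serves.
TRAIN_COST = 100_000_000          # one-time training spend, $
LIFETIME_QUERIES = 1_000_000_000  # queries served over the model's life
amortized_per_query = TRAIN_COST / LIFETIME_QUERIES  # $0.10 per query

# Test-time scaling: the customer pays per attempt, on every query.
PRICE_PER_ATTEMPT = 0.02   # hypothetical $ per attempt
ATTEMPTS_PER_QUERY = 1_000 # "on the order of thousands of attempts"
test_time_per_query = PRICE_PER_ATTEMPT * ATTEMPTS_PER_QUERY  # $20 per query

print(amortized_per_query, test_time_per_query)
```

Under these made-up numbers, the per-query bill shifts from ten cents amortized by the lab to twenty dollars paid directly by the customer, which is why the financial burden moves to whoever runs the queries.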
Question from YouTube. Hi, YouTube! You talked about price going down. There's also the other dimension of capabilities going up, and people always worry about getting steamrolled by OpenAI. So the question is: what are some specific ways that you've seen companies build to prepare for better models like GPT-5 or O2? How do you future-proof?
So I think the most common refrain, from at least OpenAI but I think the model companies generally, is that you should build a company where you're excited when you hear a new model is coming out, not anxious. I would have one edit to this, which is that in the limit it covers the majority of things worth building today — I don't know, should you hire a sales team at all if you think models will be perfectly capable? One framing I've thought about is that you should
decide how much you believe foundation models will improve on some core learning or intelligence capability, and then build your company on that prediction. An example here would be:
I think there's a generation of these copywriting companies that were largely subsumed by ChatGPT. And the story for many of them was the original usage was they understood better than other people how to get the model to learn what my intent was in generating some piece of content, some piece of SEO content, or they understood how to ingest information about my business.
And it's not hard to imagine the next generation models are just natively better at this. The context length gets longer, you can stuff more into the context length, you can crawl and learn more about external websites. All that is relatively cheap. And so if the core thesis of the company looks like we don't think models will be capable of doing that, that feels likely short-sighted. On the other hand,
there are a number of delivery mechanisms that are far out of range of what models will do. Sarah had a good example of this: there are some businesses where the limiting factor is not actually intelligence. The limiting factor for a number of businesses is access to a specific set of people. We work with a pharmacy services company where a core question is whether, long term, you can negotiate pricing contracts. The core issue there isn't intelligence — you need some amount of scale and then the ability to negotiate contracts. So I think many businesses are not just a function of your ability to efficiently compute some small set of things.
I gave this presentation with Pranav and I'm like, oh, I'm so biased — it just sounds like startups are going to win everything. I like to play this game: what investment decision do you regret from the past year? It's a really fun game. I'm super fun, yes. But one of the decisions that I regretted was actually a company that operates in a space that feels very core to perhaps the foundation model companies and to hyperscale software players, where there's tons of ecosystem risk around the company. And by the way, the people are amazing, the metrics were amazing. We were just like, oh, they're going to get crushed. So with everything I said, I still overestimated the incumbents' ability to compete and make aggressive strategic decisions.
And so I think it's really hard to overstate how important it is to understand whether somebody could steamroll you if they focused all of their effort and all their best people on a particular area — and whether they actually will. The copywriting example is illustrative because it's just not hard to see that understanding the context of a business from its website and a couple of documents, making prompting a little bit easier, adding some buttons that replace some prompts, or doing suggested queries — it's just not a lot of work. But there are things that are a lot of work, like having taste in developer products and distributing something amazing. So if you ask me — we have to make predictions in this business — I worry more about under-projecting capability than about over-projecting, at least in the short term. And I worry more about expecting too much from the incumbents and being too afraid of them than about being not afraid enough. Maybe that's just one investment regret talking. That's right.
Either one of you. Yeah. We have one more from online. Oh, okay. Okay, you can do that. It has to do with hardware — I'll just shout. How do you see AI changing hardware, and in what ways? For example, do you see a new Apple coming out and transforming hardware to that level? Not Apple specifically, but Humane-type situations, where they're trying to push very general AI hardware. Okay, I'd approach this from
two dimensions.
Every investor wants a new consumer hardware platform to exist, because it would be so valuable. And the question is, why should it? I can think of two very good reasons. One is that the usage patterns you can imagine for AI applications mean the specs you'd want are different. What if I want to capture image or video 100% of the time, and that's a determinant of my battery life, of my sensors, of how I manage my network, and so on? What if I want to run local models all the time — maybe most of the phone should be a GPU? I think the usage patterns are perhaps very different for the next generation of the intelligence in your hand. I think it's a hard thing to pull off. Another reason
that you could believe in a new hardware device is that the advantages of the existing consumer platforms go away. At the extreme, should you have individual applications that track a single habit — "drink water today, Sarah"? I can generate that pretty easily now. Maybe the single-function applications that live in the mobile phone ecosystems become part of a more general intelligence, and that ecosystem matters less.
So I think there are different arguments for this, and we continually look for opportunities to invest here. I don't think this is exactly what you asked, but — we invested in a company this past year that is doing robotics. For many years at Greylock, my prior firm, I thought of robotics as an easy way to lose a lot of money over a long period of time. And I think that is true when you look at the outcome set for classical robotics, even for the companies that got to scale of distribution with an industrial robot or a single-use consumer robot. But
It's really cool that algorithms and generalization from the broader machine learning field seem to apply here as well. And so I think being imaginative about what physical intelligence looks like is also something we're excited about.
So related to agents — everyone has been chatting about agents, and you're seeing more agent usefulness in production — but I'm curious: at the infrastructure layer, what infrastructure primitives do you think are required for agents to actually work, and keep working, in production? Okay, we've talked about this a little bit. I'm not sure if our points of view on this are the same. I think it's really hard to tell. My suspicion is that
if you look at the number of true agents that work, the number roughly rounds to zero — maybe it's low single digits or low double digits now. Double digits, yeah. And they're all relatively recent, I would say from the beginning of this year. We saw a bunch of agent framework companies, and I empathize with the root of the question, which is that it's just really hard to tell what any of these companies need, especially when the set of companies that works really well is unclear.
I think there's a lot of valid anxiety about what the foundation model companies want the interface to be. The computer-use interface is a pretty low-level one — the Anthropic version actually just makes specific clicks — and rumors of other interfaces are much more general: they take actions on a specific webpage or in entire browser environments. At a high level, I imagine there's the full scope of tools. I worked at a search engine for a while: crawl seems pretty useful, live data seems pretty useful, and an API that looks something like
"here's a URL, give me the set of data that's available," or "here's a URL and a user login, let me take some action on this page," seems pretty useful. And then I don't know what the right place to operationalize this and commercially develop a product is. If I were building a company here, I think it's useful to just remain agile: the core set of infrastructure is consistently useful — a crawler is consistently useful — and then one day you can figure out how to expose it better.
But I empathize with the difficulty: it's really hard to know what works for a bunch of agent companies. And my suspicion is that the most successful agent frameworks will come from the most successful of these agent companies — the ones that solve these problems in-house for themselves and then operationalize them externally. It's some version of: React is really useful because React was well adopted inside Facebook first.
I think we can say that there are missing components in the ecosystem where if there was a default, lots of agent developers would use it. And so identity and access management is a big problem.
If you could make agent development feel more like traditional software development, I think a lot of people would use that and be like, "Oh, it magically retries until it gets something and then it gives me data back about how well it's working." I think it's pretty easy to actually imagine the utilities in the abstract that would be useful to the ecosystem.
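As a sketch of that "magically retries until it gets something, then gives me data back" utility: the wrapper below retries a model call until its output validates, and returns telemetry alongside the result. `call_model` and `parse_output` are hypothetical stand-ins for whatever model client and output validator a team actually uses; nothing here is a real framework API.

```python
# Sketch of a retry-until-valid utility for agent development.
# `call_model` and `parse_output` are hypothetical caller-supplied hooks.
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def call_with_retries(
    call_model: Callable[[str], str],
    parse_output: Callable[[str], T],  # raises ValueError on invalid output
    prompt: str,
    max_attempts: int = 5,
    backoff_s: float = 1.0,
) -> tuple[T, dict]:
    """Retry a model call until its output validates; return the parsed
    result plus telemetry about how well the call is working."""
    stats: dict = {"attempts": 0, "failures": []}
    for attempt in range(1, max_attempts + 1):
        stats["attempts"] = attempt
        raw = call_model(prompt)
        try:
            return parse_output(raw), stats
        except ValueError as err:
            # Record why validation failed, wait, and try again.
            stats["failures"].append(str(err))
            time.sleep(backoff_s * attempt)  # linear backoff between tries
    raise RuntimeError(f"no valid output after {max_attempts} attempts")

# Usage with a fake model that produces garbage twice, then a number:
_calls = {"n": 0}
def fake_model(prompt: str) -> str:
    _calls["n"] += 1
    return "garbage" if _calls["n"] < 3 else "42"

def parse_int(raw: str) -> int:
    if not raw.isdigit():
        raise ValueError("not a number")
    return int(raw)

value, stats = call_with_retries(fake_model, parse_int, "count widgets",
                                 backoff_s=0.0)
```

The point is less the retry loop itself than the returned `stats`: making reliability data a first-class output is what would make agent development feel like traditional software development.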
But the entire environment is fluid, right? If you think about other things in infrastructure — will more workloads need vector indices? Yes. But what is the shape of company that gets to be durable here? We don't know yet, and we'll keep looking at it. As Pranav said, I think we look to the handful of companies in our portfolio that are agents working at some scale and look for the patterns there, versus trying to intuit it right now.
My cache hit was wrong, I should have updated. It's a dozen, not a small number. It's been a long six months, guys. I think one last question, then there's a whole bunch of online stuff we won't get to, but... Mark. It seems like there should be more consumer companies. Why aren't there? Or is it just a matter of time? I think simply a matter of time. We...
We keep bringing people into Embed, we keep looking. And genuinely — this is not a knock on the research community or the really young set of founders who focused on AI companies first — the diffusion-of-innovation curve that applies to customers, I think, also applies to entrepreneurs. Researchers saw the capability first, and they're like, we should do something with this, this is going to be amazing. That will continue to happen; our portfolio is heavily overrepresented with people from the research community pushing the state of the art with creative technical ideas. I think very young people were also quite early to AI, because they're like, of course this makes sense — they've never known other technology, it's been ChatGPT all the way. And their opportunity cost is lower than if you're the best product person at an amazing product organization and you have to leave your job to start a new company. And it's been a really long two years. I feel like that's just started to happen, where some of the talent that has that intuition — and maybe it's just the next Zuck,
some dropout who figures out the pattern of social interaction and is really AI-native about this stuff. I also think there's a chance that some of the people who have built intuition for consumer adoption and consumer interfaces are just taking a little while to also build intuition for AI products,
and now they're showing up, starting companies, and experimenting. So we have a lot of confidence it is going to happen over the next few years; it's a matter of time. Okay, I think we're out of time — I'm just trying to defer to Sean here — but thank you so much. Please call us. Thank you, that was amazing. Thank you.