
The 10 Trillion Parameter AI Model With 300 IQ

2024/11/1

Lightcone Podcast

People

Harj, Jared, Leslie Kendrick, Mark Blyth, Mark Mandel
Topics
Mark Mandel: I think if the O1 model is as magical as we imagine, it could be a double-edged sword for founders and developers. On one hand, it may be so powerful that OpenAI dominates and captures all future value. On the other, it could simplify developers' work, letting them focus on user experience and product details, making their products more competitive. I've observed that at the current stage of AI development, developers spend a great deal of time optimizing prompts and ensuring output accuracy. If AI models become more deterministic and accurate, developers can put more of their time and energy into user interfaces, user experience, and other software engineering work. Ultimately, whoever builds the best user experience and handles all the nitty-gritty details will win.

Leslie Kendrick: I agree with Mark. I think AI models with ten trillion parameters will bring enormous innovation potential, a milestone perhaps comparable to the arrival of GPT-2. Exponential growth in parameter counts will drive a new wave of AI development, much like the impact of GPT-3.5's release. Moreover, we've already seen that today's most advanced AI models are approaching artificial general intelligence (AGI), capable of handling most of the day-to-day work of knowledge workers. A ten-trillion-parameter model could have intelligence far beyond human level, bringing unpredictable breakthroughs. However, we must also recognize that the impact of current AI technology is not yet widely felt. The example of the Fourier transform shows that major technical breakthroughs can take a long time before the public perceives them. AI lives mainly in software, so it may be adopted faster than hardware technologies, and integrating AI into consumer devices such as smart glasses will accelerate its spread.

Large language models can be distilled into smaller, cheaper models, broadening their range of applications. OpenAI has already implemented model distillation in its API. Many companies prefer distilled smaller models over the large ones. The large language model market is diversifying, and developers increasingly use a variety of different models.

Early tests of the O1 model show powerful capabilities, including building applications from documentation and code. O1's power could let OpenAI monopolize value, but it could also lower the barrier to building and foster competition. O1 may make the time developers once spent on prompt optimization unnecessary, further lowering the barrier to entry.

As model accuracy improves, more tasks can be done with LLMs. AI can significantly improve a company's efficiency and profitability, enabling fast growth with high margins and even helping struggling companies become profitable. Some companies have already begun replacing internal systems with AI. Vertical applications are an important direction for AI's future. O1 will have a major impact on developers of AI applications. Whether OpenAI can keep its lead in AI, and whether it can achieve a truly defensible technical breakthrough, remains unclear. O1 increases demand for GPUs, and O1 suits different kinds of applications depending on the target users and use case.

Voice AI is an important direction for AI applications, and voice AI technology has made significant progress. Many enterprises have not yet taken AI seriously; acceptance may vary by generation, and enterprises typically take a long time to adopt new technology. AI is advancing extremely fast and is changing how developers work; new code-assistant tools are spreading very quickly. In the developer-tools market, YC-funded companies are an important bellwether. AI's progress will ultimately benefit developers, and the main competition in AI will center on user experience and details. Powerful AI models could accelerate scientific and technological progress: they can analyze vast amounts of information, speeding up scientific discovery and potentially producing breakthrough findings.

Sarah Friar: The next generation of AI models will require enormous capital investment, because scale effects are critical.

Harj: As model accuracy improves, more tasks can be done with LLMs.

Jared: The O1 model raised Dry Merch's accuracy from 80% to nearly 100%.

Mark Blyth: OpenAI's real-time voice API could threaten industries that depend on call centers. Voice AI is an important direction for AI applications, and the technology has made significant progress.


Key Insights

Why is OpenAI's $6.6 billion funding round significant for the future of AI models?

OpenAI's $6.6 billion funding round, the largest venture round ever, highlights the capital-intensive nature of developing next-generation AI models. The CFO, Sarah Friar, emphasized that scaling laws now require orders-of-magnitude increases in model size, making compute resources and talent critical investments. This funding will enable OpenAI to build models that are significantly larger and more powerful, potentially unlocking new capabilities and innovations.

What potential impact could a 10 trillion parameter AI model have on innovation?

A 10 trillion parameter AI model, two orders of magnitude larger than current state-of-the-art models, could lead to a leap in innovation similar to the transition from GPT-2 to GPT-3. Such a model might unlock new scientific discoveries, improve reasoning capabilities, and enable applications that were previously impossible. It could also lead to a flourishing ecosystem of AI-driven companies, much like the boom seen after GPT-3's release.
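The orders-of-magnitude framing can be sanity-checked with quick arithmetic. The parameter counts below are the rough public figures discussed in the episode; exact sizes of the closed models are not disclosed.

```python
import math

# Rough public parameter counts (the closed models' exact sizes are undisclosed).
GPT2 = 1.5e9          # ~1.5 billion parameters
GPT3 = 175e9          # ~175 billion parameters
HYPOTHETICAL = 10e12  # the 10 trillion parameter model discussed here

# The GPT-2 -> GPT-3 leap was roughly two orders of magnitude:
leap_then = math.log10(GPT3 / GPT2)
# Going from a ~100B-parameter model to 10T is the same kind of jump:
leap_next = math.log10(HYPOTHETICAL / 100e9)

print(f"GPT-2 to GPT-3: {leap_then:.2f} orders of magnitude")  # ~2.07
print(f"100B to 10T:    {leap_next:.2f} orders of magnitude")  # 2.00
```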

How does the O1 model change the landscape for AI builders and founders?

The O1 model introduces a more deterministic and accurate AI, reducing the time founders spend on prompt engineering and output accuracy. This allows them to focus on core software development, user experience, and business growth. However, there is a concern that O1's power could centralize value capture within OpenAI, potentially limiting opportunities for other builders. On the other hand, it lowers the barrier to entry, enabling more competition and innovation.

What are the implications of AI models like O1 for enterprise automation?

AI models like O1 can significantly enhance enterprise automation by improving accuracy and reducing the need for human intervention. For example, companies have automated up to 60% of customer support tickets, leading to cost savings and improved efficiency. This allows businesses to achieve cash flow break-even while maintaining growth, creating substantial enterprise value and freeing up resources for further innovation.
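The break-even mechanics can be sketched with back-of-the-envelope numbers. Only the "60% of tickets automated" figure and the $50M revenue come from the episode; the burn and support-cost figures below are illustrative assumptions.

```python
# Illustrative arithmetic for the support-automation example.
# Only the 60% automation rate and $50M revenue are from the episode;
# the burn and support-cost figures are assumptions for demonstration.

revenue = 50_000_000       # $50M annualized revenue (from the episode)
burn = 5_000_000           # assumed annual net loss before automation
support_cost = 9_000_000   # assumed annual customer-support spend
automated_share = 0.60     # 60% of tickets automated (from the episode)

savings = support_cost * automated_share
net = -burn + savings

print(f"savings: ${savings:,.0f}")   # $5,400,000
print(f"net position: ${net:,.0f}")  # roughly break-even under these assumptions
```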

How does the rise of AI voice applications impact industries like call centers?

AI voice applications, such as OpenAI's real-time voice API priced at $9 per hour, pose a significant threat to industries reliant on call centers. These applications can handle tasks like debt collection and logistics coordination with high accuracy and low latency, potentially replacing human workers in these roles. This shift could lead to cost savings for businesses but also disrupt traditional employment models in these sectors.
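The $9-per-hour price point invites a simple cost comparison. The human-agent wage and overhead figures below are illustrative assumptions, not data from the episode.

```python
# Back-of-the-envelope: AI voice API vs. a human call-center agent.
# All cost figures except the $9/hour API price are illustrative assumptions.

API_COST_PER_HOUR = 9.0     # OpenAI real-time voice API pricing cited above

# Hypothetical fully-loaded cost of one offshore call-center seat.
agent_wage_per_hour = 6.0   # assumed base wage
overhead_multiplier = 1.8   # assumed management, facilities, telecom, QA

human_cost_per_hour = agent_wage_per_hour * overhead_multiplier

print(f"API:   ${API_COST_PER_HOUR:.2f}/hour")
print(f"Human: ${human_cost_per_hour:.2f}/hour (assumed)")
# Under these assumptions the API is already cheaper than the loaded seat cost,
# before accounting for 24/7 availability and elastic scaling.
```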

What role does model distillation play in making AI more accessible?

Model distillation allows larger, more expensive models like O1 to train smaller, cheaper models that retain much of the original's capabilities. This makes AI more accessible by reducing inference costs and latency, enabling broader adoption. For example, OpenAI has enabled distillation from O1 to GPT-4o mini, allowing developers to use smaller models for routine tasks while reserving the larger model for more complex problems.
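In outline, distillation trains the small model to match the large model's soft output distribution rather than hard labels. Here is a minimal sketch of that objective, using the classic temperature-scaled formulation with NumPy; this is an illustration of the general technique, not OpenAI's internal implementation.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between softened teacher and student distributions.

    The student is trained to match the teacher's *soft* outputs, which
    carry more information than hard labels alone. T^2 rescales gradients
    to keep them comparable across temperatures.
    """
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1)
    return (T ** 2) * kl.mean()

# Toy example: a student that already mimics the teacher has near-zero loss.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 10))  # 4 examples, 10-way vocabulary
aligned_student = teacher + rng.normal(scale=0.01, size=teacher.shape)
random_student = rng.normal(size=teacher.shape)

assert distillation_loss(aligned_student, teacher) < distillation_loss(random_student, teacher)
```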

How are developers diversifying their use of AI models beyond OpenAI?

Developers are increasingly diversifying their use of AI models, with models like Claude and Llama gaining significant market share. For instance, Claude's developer market share among YC companies jumped from 5% to 25% in six months. This diversification reflects a shift away from OpenAI's dominance, as developers seek models that better suit specific use cases, such as coding or legal applications.

What are the potential long-term implications of AI models with 10 trillion parameters?

AI models with 10 trillion parameters could revolutionize scientific and technological progress by analyzing vast amounts of data and generating original insights. They might unlock breakthroughs like room-temperature superconductors, fusion energy, or advanced space travel. Such models could act as a 'rocket to Mars' for human intelligence, accelerating discoveries and solving complex problems that have eluded human researchers for decades.

Transcript


If O1 is this magical, what does it actually mean for founders and builders? One argument is it's bad for builders because maybe O1 is just so powerful that OpenAI will just capture all the value. You mean they're going to capture a light cone of all future value? Yeah, they'll capture a light cone of all present, past, and future value. Oh my god.

The alternative, more optimistic scenario is we see ourselves how much time the founders spend, especially during the batch on getting prompts to work correctly, getting the outputs to be accurate. But if it becomes more deterministic and accurate, then they can just spend their time on bread and butter software things. The winners will just be whoever builds the best user experience and gets all these nitty gritty details correct.

Welcome back to another episode of The Light Cone. We are sort of in this moment where OpenAI has raised the largest venture round ever, $6.6 billion with a B. Here's what Sarah Friar, the CFO of OpenAI, said about how they're going to use the money. It's compute first and it's not cheap.

It's great talent, second. And then, of course, it's all the normal operating expenses of a more traditional company. But I think there is no denying that we're on a scaling law right now where orders of magnitude matter. The next model is going to be an order of magnitude bigger and the next one on and on. And so that does make it very capital intensive.

So, it's really about orders of magnitude. Let's live in the future. There's 10 trillion parameters out there, 10 trillion parameter large language models, two orders of magnitude out from the state of the art today.

What happens? Like, are people actually going to be throwing queries at and actually using these 10 trillion parameter models? It seems like you'd be waiting, you know, 10 minutes per token. Yeah, for a bit of context right now, the frontier models, I mean, it's not public exactly how many parameters they have, but they're roughly in the 500s-of-billions-ish range, like Llama 3 at 405 billion.

Anthropic is speculated to be 500 billion, GPT-4o roughly around that much. Getting to 10 trillion, that's two orders of magnitude, right? I think the level of potential innovation could be the same leap we saw from GPT-2, which was around 1 billion parameters and was released alongside the scaling laws paper, one of these seminal papers where people figured out, okay, this is the transformer architecture that we figured out.

what if we just throw a bunch of engineering at it and just do a lot of it? Where does this scale with this logarithmic type of scaling? Then this was proved out when GPT-3 got released. That was about 175-ish billion parameters. So that's like that two-order-of-magnitude leap. And we saw what happened with that. That created this new flourishing era of AI companies. And we saw it, we experienced this back in 2023,

when we started seeing all these companies building on top of GPT-3.5 that was starting to work, and it created this giant wave. So we could probably expect, if this scaling law continues,

The feeling will be similar to what we felt from that year of transition in 2022 to 2023. Yeah, that was the moment when everything changed. So that would be pretty wild if that happens again. I think there's one interesting aspect to this, which is clearly the current generation state-of-the-art models that are available, especially given O1 chain of thoughts, right?

they sort of basically rival normal intelligence. Like you could make a strong case that AGI is basically already here. The majority of the tasks that 98% of knowledge workers do day to day

It is now possible for a software engineer, probably sitting in front of cursor, to write something that gets to 90-98% accuracy and actually do what a human knowledge worker with 120 IQ would be doing all day.

And that's sort of writ large, like there are probably hundreds of companies that each of us have worked with over the past few years that are literally doing that day to day right now. You know, the weird interesting question is like at 10 trillion parameters at, you know, 200 to 300 IQ, like sort of ASI beyond what a normal human being normally could do.

you know, what does that unlock? There's a great article in The Atlantic with Terence Tao, famously this Australian mathematician who is literally north of 200 IQ, and how he uses ChatGPT right now. And it's sort of unlocking new capabilities for him. There are some examples of this happening, you know,

quite a few times in human history. Like, you could argue that nuclear power was that: fission. You had to actually model theoretically that something like nuclear fission was possible before anyone experimentally tried to do it. Fourier transforms. Yeah, maybe the thing is...

If we think a lot of the capabilities right now are here, but it's not evenly distributed. If you go walk down the street and you talk to the random Joe, they don't feel the AI. They're just living their normal life and stuff is still just normal. It hasn't changed.

But I think the counterexample is just sometimes these discoveries take time for it to really pan out. This is an example we're discussing. Fourier transform was this mathematical representation that Joseph Fourier discovered in the 1800s. That was like a seminal thesis that he wrote.

about representing series of functions that were repeating in periods. And before Fourier transform, they were written as these long sums, a series of sums that are very expensive to add them up and figure out how to really

model the equation, basically. But he found this very elegant way: instead of just doing sums of series, you could basically collapse all these math functions into sine and cosine waves that only need two variables, basically the amplitude and the period, and you could represent every periodic signal and function. I mean, it's really cool math, and it's like how some of these LLM use cases sound: okay, cool, it can do all this coding.

But the Fourier transform took another 150 years, until the 1950s, when people figured out what to do with it. It turned out that Fourier transforms were super good at representing signals. And we need signals basically to represent everything in the analog world digitally, because bits are ones and zeros. And how do you compress that? And

one of the big applications as well is radio waves; it made telecommunications a lot more efficient. Image representation, encoding, information theory, it just unlocks so much of the modern world. Like, the internet and cell towers work because of this theory. But it took 150 years until the average Joe could feel the Fourier transform. Interesting. That's a really powerful idea. Yeah.
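The amplitude-and-period representation described above is exactly what the discrete Fourier transform computes. A minimal NumPy sketch, recovering the two sine waves hidden in a sampled signal:

```python
import numpy as np

# Sample one second of a periodic signal: a 5 Hz sine (amplitude 1.0)
# plus a weaker 12 Hz sine (amplitude 0.3).
fs = 256                                  # samples per second
t = np.arange(fs) / fs
signal = 1.0 * np.sin(2 * np.pi * 5 * t) + 0.3 * np.sin(2 * np.pi * 12 * t)

# The FFT collapses the waveform into per-frequency amplitudes: the
# "two variables per wave" representation described above.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(fs, d=1 / fs)
amplitudes = 2 * np.abs(spectrum) / fs    # scale to recover sine amplitudes

dominant = freqs[np.argmax(amplitudes)]
print(f"dominant frequency: {dominant:.0f} Hz")  # 5 Hz
```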

I mean, that took a while then. I mean, apparently in the 1950s, that's the moment that color TV happened. Unlocked by Fourier transforms as well. That's right. If you apply it to the AI stuff that's happening today, though, it's like, one, where do you start the clock ticking from? Like, it's not clear if you start it from...

the ChatGPT moment two years ago, or from just, like, all of the research that's been going on for decades. We've talked about this before, but we might actually be, like, decades into this now, and it's starting to hit, like, the inflection moment, potentially. Yeah, for sure. I mean, if we run with Diana's example

of fast Fourier transforms, like all the math that's underpinning all this new AI stuff is linear algebra stuff that's like 100 years old. It just turns out that...

But now we have all the GPUs to compute it. I guess that's one potential way that these 10 trillion parameter models actually alter the face of what humans are capable of. Like, they sort of unlock something about the nature of reality and our ability to model it. And then somehow it leads to either nuclear weapons or the color TV. The other big thing is, just because this is all in software, like, yeah,

compared to, like, Fourier transforms, where a lot of the applications were in physical devices, right, like record players or telephones, like you said. And so it takes a while for the technology to get adopted, because you have to buy your updated device, all these things. Now we have, like, Facebook and Google,

who have, you know, a pretty decent percentage of, like, the world using their software already, as soon as these things start rolling out. And I feel like another thing that's starting to be noticed is Meta in particular coming out with their Meta Ray-Bans, the consumer device. Like, I think for consumers, once this becomes something that's visual in your, like,

smart glasses, plus a voice app that you can talk to and it is indistinguishable from a human being, that's going to be a real change the world moment for people. They'll start feeling the AI once they can talk to it all the time. I mean, it seems like there's really a bifurcation in what we might expect when we have this capability. At the extreme end, you're going to have people like Terence Tao pushing the edge and boundary of our understanding of our modelable world.

And then maybe that's actually worth tens or hundreds of millions of dollars of inference to run these 10 trillion parameter models. And then the more likely way this ends up being useful for the rest of us is actually in distillation. So there's some evidence that, for instance, Meta's 405b was mostly useful to make their 70 billion parameter model much, much better.

And you actually see this today. There's sort of this moment there where we thought that people might just go to GPT-4 and distill out all the weights. And it seems like there's some evidence that certain governmental entities are doing that already.

But GPT-4 itself, and 4o, OpenAI itself has now enabled distillation internal to its own API. So you can use O1, you can even use GPT-4 or 4o, to distill down to a much cheaper model that's internal to them, like GPT-4o mini.

And that's sort of their lock-in capability. Yeah, I don't think this is talked about much, but it is interesting that you have these giant models, like the 400 or 500, whatever, billion parameter models, that are basically the teacher models, because they're the ones mega-trained with everything, and took forever.

And they are the teacher model, master model, that teaches a student model, which are these smaller ones that are faster and cheaper, because doing inference for a 400 or 500 billion parameter model is very expensive. So we have evidence that all these distillation models are working. Companies in the batch that are building from the latest and greatest, they're not going for the...

giant model with all of the parameters and give me the biggest thing to do so that it works the best. We have evidence that's not the case. People are not going for the big model.

And we actually have stats on the batch. I mean, Harj, we kind of talked about them. Yeah, Jared ran some numbers on this, and it's fascinating. But I think the bigger meta point is even the fact that the startups or the founders building this stuff are choosing, like, the smaller models versus bigger models. They just have choice. And even, like, a year ago, when this entire industry started

existing. Everything was built on top of ChatGPT. It was 100% market share, the ChatGPT wrapper meme. And I feel like we've, especially over the last six months, seen people start talking about the other models, like Claude Sonnet getting this word of mouth for almost being better at codegen than

ChatGPT, and people just starting to use different models. And so the numbers that Jared ran for the Summer 24 batch are fascinating, because it seems that that trend has just continued. We have more diversification of LLMs and models that developers are building on top of. And some of the stuff that really stood out is Claude

has, even just in the six months from the winter batch to the summer batch, gone from like 5% developer market share to like 25%. Of companies in the batch. Yes, of companies in the batch, which is huge. I've never seen a jump like that. Right.

Llama's gone from 0% to 8%. One thing that we know from running YC for a long time is that whatever the companies in the batch use is a very good predictor of what the best companies in the world are using, and therefore what

the products will be most successful. A lot of YC's most successful companies, you could have basically predicted which ones they would be based on just running a poll of what the companies in the batch use. If we just take OpenAI's latest fundraise off the table, and the latest, like, the O1 model, off the table for a second, it would seem like amongst developers and builders, OpenAI was losing. Like, they went from being the only game in town to just, like, bleeding market share to the other

models at a pretty rapid rate. The interesting thing, though, is maybe they are coming back. Like, what was the stat that you pulled, Jared? It seems like 15% of the batch are already using O1, even though it's not, like, fully available yet. It's only, like, two weeks old now. Yep. Yeah.

And we're seeing some interesting things with O1. We're actually hosting right now, in person, as we speak, downstairs, a hackathon to give YC companies early access to O1. And Sam himself was here at the kickoff. There's a bunch of OpenAI researchers and engineers working on it. And it's only been about four hours of hacking, and I already saw some demos as I was walking by to see some teams.

And they already built things that were not possible before with any other model. MARK MANDEL: Do you have some examples? LESLIE KENDRICK: One of the companies I'm working with is Freestyle. They're building a cloud solution fully built with TypeScript, if you're familiar with durable objects, with this really cool framework that makes front end and back end seamless to develop and is really cool to use. What was cool about--

them is they'd just been working on it for a couple hours, and I saw a demo that was mind-blowing. They basically got a version of the Repl.it agent working with the product. All they had to prompt O1 with was some of their developer documentation and some of their code, and they could just prompt it, build me a web app that writes a to-do list, and it was just boom, it just worked. And it was able to reason and inference

with the documentation, and it took a lot longer, but it arrived and built the actual app. What's interesting for us to talk about is, if O1 is this magical, what does it actually mean for founders and builders? And one argument is it's bad for builders, because maybe O1 is just so powerful that OpenAI will just capture all the value, and everything that could be valuable to

and build on top of this stuff will just be owned by them. You mean they're going to capture a light cone of all future value? Yeah, they're going to capture a light cone of all present, past, and future value. Oh my God. The alternative, more optimistic scenario is we see ourselves how much time the founders spend, especially during the batch, on the tooling around getting...

Getting prompts to work correctly, getting the outputs to be accurate, human in the loop, all of this time spent just getting the core product working is not deterministic. But if it becomes more deterministic and accurate, then they can just spend their time on bread and butter software things, like better UI, better customer experience, more sales, more relationships. In which case it's like,

it may be a better time to start now than ever, because maybe all of the knowledge you learned around how to get, like, the prompts accurate and working was just

temporary knowledge that's no longer relevant as these things get more powerful. Actually, we had this conversation with Jake Heller from Casetext, where getting the legal copilot to work to 100% was the huge unlock. And it was really hard. Yep. He, you know, had this whole talk about all the things he had to do to actually get the thing to be accurate enough. Yeah. Imagine if he didn't have to do any of that. If, just on day one, you could be guaranteed 100% accuracy, as though you're just building a web app on top of a database. Yeah.

The barrier to entry to build these things goes way down, so there's going to be more competition than ever. And then it will probably just look more like a traditional winner-takes-all software market. Jared has an example. So there's a company, Dry Merch, that you work with, and they went from 80% accuracy to pretty much 99%, or for all intents and purposes, 100%,

using O1, and it unlocked a bunch of things. You want to talk about them? Yeah, just by swapping out GPT-4o for O1. I think there might be an even more bullish version, Harj, which is that there are use cases right now that people...

are not able to use LLMs for because even though they're trying to get the accuracy high enough, they just can't get it accurate enough to actually be rolled out in production. Especially if you think about really mission-critical jobs where the consequences of mistakes are dire, pretty hard to use LLMs for that. As they keep getting more accurate, those applications will start to actually work. I guess there is a lot of evidence inside the YC...

greater portfolio. I was meeting a company from 2017. I think I tweeted about them. They were

$50 million annualized revenue at that point, but growing 50% a year. A year or two ago, they were not profitable. They knew that they needed to raise more money. But in the year since, they automated about 60% of their customer support tickets. And they went from something that needed to raise another round imminently to something that was totally cash flow break-even while still growing 50%.

year on year. That's sort of like the dream scenario for building enterprise value, because you're big enough that you know you're a going concern, and then you're literally compounding your growth with, like, no additional capital coming in. So it's companies like that that actually end up becoming, like, half a billion, a billion dollars a year in revenue and, like,

driving hundreds of millions of dollars in free cash flow. I mean, that's sort of the dream for founders at some level. And I think that that's one of the more dramatic examples that I've seen thus far. And I think it's sort of not an isolated case. You know how we're sort of talking, it's 2024 now, and we're still in this overhang moment where companies sort of on this path raised way too much money at, you know,

30x or 40x, you know, next-12-months revenue multiples, like,

seemingly struggling, but also never going to raise another round. This is actually pretty good news for them, because they actually can go from not profitable, to break-even, to then potentially very profitable. I think that narrative is not out there, and I think it's really, really good news for founders. It's already started to catch attention. Didn't the Klarna CEO get a lot of attention a few weeks ago for... I mean, it's unclear how much of it is real or not, but at least they're pitching that they're just...

you know, replacing their internal systems of record for HR and sales with home-built, LLM-created apps, at least that was the insinuation. Yeah, what is it? They got rid of Workday. Yeah, that was it. Yeah. That's pretty wild, honestly. I mean, so that's good. If you treat OpenAI as the Google of the next 20 years, you want to invest in Google and all the things that Google enabled, like Airbnb. Google could do Airbnb. It probably won't.

Yeah. Just from, like, I don't know, Coase's theory of the firm, it's probably just, like, too inefficient and too difficult, requires too much domain expertise, to actually pull that off. So what are sort of the new Googles that are getting built out? There's these vertical agents. What are some examples that we'll have that we can talk about? I loved working with this company called TaxGPT from the last YC batch. They started off actually really...

really, literally a wrapper, and, like, you know, it's in the name: TaxGPT. But my favorite example about them is, like, you know, it turned out that for tax advice, doing basic RAG, you know, it's sort of like Casetext, actually, it was, you know, being able to do

RAG on existing case law and existing policy documents from the IRS or internationally. That was just sort of the wedge that got them in front of tens of thousands of accountants and accounting firms.

And now what they're doing is building an enterprise business on document upload. So you sort of get them for cheap or free for the thing that people are Googling for. And then once they know about you and trust you, you get like this $10,000 or $100,000 a year ACV contract that then takes over real workflow that actually extinguishes tens to hundreds of hours of work per accountant. Yeah.

Another interesting thing about the O1 model is we were just saying, originally ChatGPT was the only thing you could build on top of. OpenAI was the only game in town. Then there were all these models. I think the sort of alpha leak we have here, like right now in this room, is downstairs people are building at the cutting edge of O1 that even the public does not have access to. And what we're seeing is that this is a real major step forward. Like O1 is going to be a big deal for any programmer or engineer who is building an

AI application. The interesting thing is, will this cycle repeat, where it will give OpenAI a temporary lead, their market share will just, like, go, you know, back up towards 100%, but then within six months,

Llama will be updated, Claude will come up with its new release, Gemini will keep getting better, and there'll just be four different models that have equivalent levels of reasoning. Or will this be the first time OpenAI has a true breakthrough? And I would just define a true breakthrough as something that's actually defensible. If no one else can replicate it, then that puts them in a really powerful position. But we don't know. And I think that's what's interesting. It's like OpenAI seems like it is continually the one pushing the envelope, but they always seem to be

the first ones to make major breakthroughs, but they have never been able to maintain the lead so far. I think the other thing that's interesting about O1 is that it makes the GPU needs even bigger, because it's moving a lot of the computation to inference, since it's taking a lot more time to do the inference. So I think it's also going to change a lot of the dynamics underneath for a lot of the companies building AI infrastructure as well,

which is some food for thought. It seems like there are two different types of use cases. I believe they did just enable distillation from O1 into 4o. And so it's conceivable that for relatively rote and repeating use cases, you could just sort of use O1 for the difficult ones, and then you distill it out, and then you pay 4o or 4o mini prices from there.

And then again, there's this other type of problem that is very specific. I mean, I imagine many code gen situations are a little bit more like that, where you need to pay for the full O1 experience because it's fairly detailed and specific. MARK MANDEL: It depends on who you're building for too, right? If you're an enterprise software and you can pass the cost on to your customer and they can tolerate high latency and don't care as much about it being instant,

then you can maybe just use O1 a lot. If you're building consumer apps, probably not. But talking of consumer apps, the other thing that was striking about OpenAI's releases is this real-time voice API. MARK MANDEL: Super cool. MARK BLYTH: It's pretty remarkable. And I think the most telling thing to me is that the ongoing usage-based pricing is $9 per hour.

And it sort of points to a powerful thing. Like, if I were a macro trader, I would be very, very bearish on countries that rely very heavily on call centers right now, because your $9 an hour is sort of right there at what a call center would cost. This is another thing we're definitely seeing within the batch, right? Like, it's clear that voice is

almost like a killer app, arguably. There's a company I just worked with in this batch, or in my group at least, Domoove, they do AI voice for debt collection, and their traction is just phenomenal. It's working incredibly well. A whole bunch of the voice apps in S24 were just some of the fastest-growing, just explosive companies. It was a clear trend for S24. And I remember working with companies...

in the prior two batches that tried to do voice, and it just wasn't working well enough. The latency was too high. It got confused with interruptions and things like that. It's just turned the corner where it's finally working. There's another company I work with, Happy Robot, that landed on this idea that was doing a voice agent for coordinating all these phone calls for logistics. Think of a truck driver that needs to go from point A to point B. These are all like

people just calling to check where you are. There's no, like, Find My Friends for it. And they started getting a lot of usage on this. And I think we talked a bit about this before, that at this point, AI has passed Turing tests and is solving all of these very

menial problems over the phone. That's pretty wild. I guess one thing that is maybe under discussed is to what degree the engineering teams that are in these sort of incumbent industries, it feels like it's pretty binary, either, you know, the vast majority of companies and organizations, especially the ones that were maybe founded four or more years ago, they actually don't

take any of this seriously. Like they have literally no initiatives on this stuff. And I sort of wonder how generational it is. Like I'm realizing that Eng managers and VPs of Eng are probably my age now. I'm 43 now. And, you know, if I wasn't here seeing exactly what was happening, I would be tempted to say, this is just the same old thing. AI, yeah, yeah, yeah. But I think it's the rate of improvement that people don't get if they're not

as close to it as we are. I just think your average corporate enterprise person is certainly used to technology disrupting things, but over pretty long timeframes. And if anything, they become cynical because they're like, oh, the cloud, like cloud was such a buzzword for a long time. It totally did change how enterprise software is built and delivered, but it took like a decade or so. And so I suspect everyone's feeling that way about AI. It's just your natural

default mode is to be cynical. Oh yeah, like it's not going to be ready for a while. And then probably if you looked at this stuff even six months ago, like we were just talking about, if you looked at an AI voice app six months ago, you're like, oh, this is...

this is years away from being anything that we need to take seriously. And it's like, actually, three to four months later, it's hit some real major inflection point. And I think that's what it takes. Even people within tech, it's surprising all of us how quickly this stuff is moving. MARK MANDEL: It's the fastest any tech has ever improved, I think. Certainly faster than processors, certainly faster than the cloud. LESLIE KENDRICK: And it's kind of fun to actually watch. It's been remarkable to actually see another example of this in the batch.

So a lot of the technical founders, sometimes I sit with them and I just watch how they code, the before and the after of this whole wave of AI. Before, it's just standard: you have your IDE and things on the terminal, people ship, fine. But the demos and products that we're seeing founders build during the batch are at a next level of polish. And when you see them code, it's like they're living in the future. At this point, GitHub Copilot is already kind of old news.

They're using the latest, greatest coding assistants. A lot of them are using something like Continue or Cursor, right? This is something Jared pulled out as well, when we asked the founders about all the IDEs, right? Yeah. Yeah. We surveyed the Summer 24 founders, and half the batch is using Cursor, compared to only 12% using GitHub Copilot. That was surprising to me. They're not even using the fully agentic coding agents like, uh,

So, for example, Replit. These are still sort of copilot-phase tools, but even just going from GitHub Copilot to Cursor, which is the next step up in terms of how much the actual AI does, is this incredible breakthrough. They shift very quickly. I mean, this is evident today in the hackathon. I was impressed with what they built. I was looking at their editors, like, Cursor. It's like, cool. It's another sign of why the founders have the advantage, right? It feels to me, when GitHub Copilot first came on the scene, it seemed...

It's GitHub plus Microsoft. It has all the developers, plus it has all the capital, plus it has the inside track with OpenAI. How could any coding IDE compete with them? It would just get subsumed. And Cursor has come out of nowhere and is putting up numbers five times the size of GitHub Copilot's within the batch. Which again, like you were saying earlier, the startup founders are actually usually the tastemakers on this kind of thing.

I think there are certain types of businesses where it doesn't make sense to go after startup founders as your early customers. But for developer tools, it definitely does. Stripe and AWS both wanted to own YC batches in particular, and that worked out really well for them. So it's probably a really good sign for Cursor, honestly, that they have such good penetration within the YC batch. Yeah, I would definitely say Cursor is pretty awesome. But AltaVista was awesome too. Yeah.

I remember using that as a search engine, and then another one came along, and the next version was 10 times better. And so this is the way it's going to go. I mean, the only people who win are actually developers, because of all this competition. So I think, again, it takes us to the optimistic view of all of this stuff, which is that as the models get more powerful, the winners will just be whoever builds the best user experience and gets all these, like,

nitty-gritty details correct. And so that's why Cursor can beat GitHub Copilot, which has all the advantages. AltaVista is a great example: Google still came along and crushed them, right? So there's still room for someone to do to Cursor what Cursor has done to GitHub Copilot. So let's get back to 10 trillion parameters. What world do you think we will live in with this made real, with ASI or something approaching it, you know?

What will humans actually do and how much more awesome will it be? Well, I'll give a steel man for a really bullish case, which is that the thing that is holding back the rate of scientific and technological progress is arguably the number of smart people who can actually

analyze all the information that we already know about the world. There are millions of scientific papers already out there, an incredible amount of data, but try reading all of it. It's far beyond the scale of any human's comprehension. And if we make the models smart enough that they can actually do original thinking and deep analysis with correct logic,

then you could let loose a near-infinite amount of intelligence on the near-infinite amount of data and knowledge that we have about the world. You can just imagine it coming out with crazy scientific discoveries: room-temperature fusion, room-temperature superconductors, time travel, flying cars, all the stuff that humans haven't been able to invent yet.

Like with enough intelligence, maybe we'll finally invent it all. Sign me up for that future. Sounds great. I totally agree with you. I think, you know, what this might be is not merely a bicycle for the mind. It might actually be a self-driving car, or even crazier, maybe a rocket to Mars. So with that, we'll see you guys next time.