Today, my partner Jordan and I have a special episode of Unsupervised Learning, a crossover with one of our favorite AI podcasts, Latent Space. If you're not already a listener, Latent Space is a technical newsletter and podcast by and for AI engineers. It had over 2 million downloads in 2024, and it's become a go-to resource for anyone who wants to understand the cutting edge of AI infrastructure, tooling, and product.
If you like this show, it's definitely worth checking out. Given we've all spent a lot of time talking to some of the sharpest minds in AI, we thought it'd be fun to interview each other. In this episode, we dig into the questions we're constantly thinking about: What surprised us most last year? What are we paying most attention to right now? How do we think about defensibility at the app layer? And which public companies are we long or short on? It's a different kind of episode, and I think you'll really enjoy it. Now here's my conversation with Swyx and Alessio from Latent Space.
Well, thanks so much for doing this, guys. I feel like we've been excited to do a collab for a while. I love crossovers. Yeah, this is great. Like the ultimate meta of podcasters talking to other podcasters. Yeah, it's a lot. Podcasts all the way up. I figured we'd have a pretty free-ranging conversation today, but I brought a few conversation starters to kick us off. And I figured one interesting place to start is, you know, obviously it feels like this world is changing every few months. I'm wondering, as you guys reflect on the past year,
What surprised you the most? - I think definitely reasoning models. We were kind of talking about this on the ride here. We were like calling-- - What was this? - Oh, well, there's what surprised us in a good way and maybe in a bad way, I would say. In a good way, reasoning models, and the release of them right after Ilya's NeurIPS "scaling is dead" talk. I think there was maybe a little,
it's so over and then we're so back in such a short period of time. - It was really fortuitous timing, right as pre-training died. Obviously, I'm sure within the labs, they knew pre-training was dying and had to find something, but from the outside, it felt like one right into the other. - Yeah, exactly. So that was a good surprise. - I would say if you want to make that comment about timing, I think it's suspiciously neat.
that, like, we know Strawberry was being worked on for two-ish years, and we know exactly when Noam joined OpenAI, and that was obviously a big strategic bet by OpenAI. So for it to transition so nicely, right when pre-training is kind of tapped out, into "oh, now inference time is the new scaling law" is very convenient.
If there were an Illuminati, this would be what they planned. Or a simulation or something. Then you said open source as well? Yeah, well, no, I think open source we're discussing on the negative side, I would say: the relevance of open source. Specifically open models. Yeah, I was surprised by the lack of
adoption. I mean, people use it, obviously, but I would say nobody's really a huge fanboy. I think the local Llama community and some of the more obvious use cases really like it. But when we talk to enterprise folks, it's like, it's cool. And people love to argue about licenses and all of that, but the reality is that it doesn't really change the adoption path of AI. So...
The specific stat that I got from Ankur from Braintrust in one of the episodes that we did was, I think he estimated that open-source model usage in enterprises is at 5% and going down. It feels like all these enterprises are in use-case discovery mode, where it's like, let's just take what we think is the most powerful model and figure out if we can find anything that works. So much of it feels like discovery of that. And then right as you've discovered something, a new generation of models is out, and so you have to go do discovery with those. Yeah.
I think obviously we're probably optimistic that open-source model uptake increases. It's funny, I was going to say my biggest surprise in the last year was open-source related, but it was just how fast open source caught up on the reasoning models. It was kind of unclear to me over time whether there would be a compounding advantage for some of the closed-source models, where in the early days of scaling there was a tight time loop, but over time, would the gap increase? And if anything, it feels like it shrunk. And I think
DeepSeek specifically was just really surprising in how, you know, in many ways, if the value of these model companies is that you have a model for a period of time and you're the only one that can build products on top of that model while you have it, like, God, that time period is a lot shorter than I thought it was going to be a year ago. Yeah. I mean, again, I don't like this label of how fast open source caught up, because it's really how fast DeepSeek caught up, right? And now we have, I think, some evidence that DeepSeek is basically going to stop
open-sourcing models. So there's no "team open source"; there are just different companies, and they choose to open source or not. And we got lucky with DeepSeek releasing something, and then everyone else is basically distilling from DeepSeek. And those distillations catching up is such an easier, lower bar than actually catching up, which is,
you're training something from scratch that's competitive. On that front, I don't know if that's happening. Basically the only player right now... we're waiting for Llama 4. I mean, it's always an order of magnitude cheaper to replicate what's already been done than to create something fundamentally new. And so that's why I think DeepSeek overall was overhyped, right? Obviously it's a good open-source new entrant, but at the same time, there's nothing fundamentally new there other than executing what's already been done really well.
Yeah, right. Well, but I think the traces are maybe the biggest thing. Most previous open models were the same model, just a little worse and cheaper. R1 is the first model that shipped the full reasoning traces, so I think that's a genuinely unique thing in open source. But yeah, we talked about DeepSeek in our end-of-year 2024 recap, and we were mostly focused on cheaper inference. We didn't really have...
DeepSeek V3 was out then, and we were already talking about fine-grained mixture of experts and all that. That's a great receipt to have, to be like, yeah, end of year '24. Let's go. That's an impressive one. You follow the right whale believers on Twitter. Yeah.
It's pretty obvious. I used to be in finance, and a lot of my hedge fund and PE friends called me up. They were like, why didn't you tip us off on DeepSeek? And I'm like, well, it's been there. It's actually kind of surprising that NVIDIA fell 15% in one day because of DeepSeek. And I think it's just: whatever the public market narrative
decides is the story becomes the story. But really, the technical movements are usually one to two years in the making before that. Basically, these people were telling on themselves. So they didn't listen to your podcast? It's been in the end-of-year recaps. No, no, no, like, yeah, we weren't banging the drum. So it's also on us to be like, no, this is an actual tipping point. And I think, like, as people whose function as podcasters and industry analysts is to
raise the bar or focus attention on things that you think matter. And sometimes we're too passive about it. And I think I was too passive there. I'd be happy to own up to that. I feel like over time, you guys have moved into this more interesting role of taking stances on what is or isn't important. And, you know, I feel like you've done that with MCP of late and that kind
of things. Yeah. So the general push is AI engineering, you know, like, got to rep the shirt. And MCP is part of that. But the general movement is: what can engineers do above the model layer to augment model capabilities? And it turns out it's a lot. And it turns out we went from making fun of GPT wrappers to now, I think, the overwhelming consensus that GPT wrappers are the only interesting thing. Yeah. I remember Aravind from Perplexity came on our podcast, and he was like, I'm proudly a wrapper.
It's like, for anyone, talking about differentiation pre-product-market fit is a ridiculous thing to do. Build something people want, and then over time you can worry about that. Yeah, I interviewed him in 2023, and I think he may have been the first person on our podcast to proudly be a GPT wrapper. Yeah. And obviously he's built a huge business on that.
Now we all can't get enough of it. I have another one. So that was Alessio's one. We came up with individual answers just to be interesting. In the same Uber on the way up? Yeah. Oh, I was driving too. So it was actually... I mean, it was a Tesla, so it mostly drove itself. Mine was actually: it's interesting that low-code builders did not capture the AI builder market.
AI builders being Bolt and Lovable, low-code builders being Zapier, Airtable, Retool, Notion, any of those. When you're not technical, you can build software. Somehow, all of them missed it. Why? It's bizarre. They should have the DNA. I don't know. They already have the reach. They already have the distribution. Why? I have no idea. The ability to fast follow, too. I'm surprised. Yeah, there's just nothing.
Yeah. What do you make of that? It seems, not to come back to the AI engineering picture, that it takes a certain kind of founder mindset, or AI engineer mindset, to be like: we will build this from whole cloth and not be tied to existing paradigms, I think. Because, you know, you know Wade, or who's the Zapier person that you know? Mike. Who has left Zapier. The Zapier person.
Yeah. Zapier, when they decided to do Zapier AI, they were like, "Oh, you can use natural language to make Zapier actions." When Notion decided to do Notion AI, they were like, "Oh, you can write documents or fill in tables with AI." They didn't do the next step, because they already had their base and they were like, "Let's improve our baseline." The other people,
the ones who actually tried, came at it with no prior preconceptions. Like, let's see what kind of software people can build from scratch, basically. I don't know, that's my explanation. I don't know if you guys have any retros on the builders. Yeah. Or did they kind of get lucky, starting that product journey right as the models were reaching the inflection point?
There's the timing issue. Yeah, yeah, yeah. Yeah, I don't know. To some extent, I think the only reason you and I are talking about it is that both of them have reported ridiculous numbers, like zero to 20 million in three months, basically, both of them. Jordan, did you have a big surprise? Yeah, I mean, some of what's already been discussed. I guess the only other thing would be on the Apple side in particular. Oh, wow.
I think, you know, for the last... Those text message summaries, like, whew. But they're funny. They're funny. They're great. Very viral. Yeah, I mean, so, like, for the last couple of years, we've seen so many companies that are trying to do personal assistants, all these various consumer things. And one of the things we've always asked is, well...
Apple is in prime position to do all this. And then with Apple Intelligence, they just totally messed up in so many different ways. And then the whole BBC thing, the summary saying that the guy shot himself when he didn't. There are just so many things at this point. I would have thought that they would have ironed out their AI products better, but
just didn't really catch on. - You know, second on this list of generally overly broad opening questions would be: anything that you guys think is over-hyped or under-hyped in the AI world right now? - Over-hyped: agent frameworks. - Not naming any particular ones. - I'm sorry. Yeah, exactly. It's not... I would say, overall, it's
a chase to try and be the framework when the workloads are in such flux that I just think it's so hard to reconcile the two. What Harrison and LangChain have done so amazingly is product velocity. The initial abstractions were maybe not the ending abstractions, but they were just releasing stuff every day, trying to stay on top of it. But I think now we're past that. What people are looking for now is something that they can actually build on and
stay on for the next couple of years. And we talked about this with Bret Taylor on our episode, and it feels like it's the jQuery era of agents and LLMs. It's single-file, big frameworks, with kind of a lot of developers, but maybe we need React. And I think people are still just trying to build jQuery. I don't really see a lot of people doing React-like things.
Yeah. Maybe the only modification I'd make to that is: maybe it's too early even for frameworks at all. Until there's enough stability in the underlying model layer and patterns to support one, the thing is the protocol and not the framework. Yeah.
Because frameworks inherently embed protocols, but if you just focus on a protocol, maybe that works. And obviously MCP is the current leader there. And I think the comparison there would be, instead of jQuery, it's XMLHttpRequest, which is the thing that enabled Ajax. And that was the inciting incident for JavaScript becoming popular as a language.
I would largely agree with that. I mean, on the React side of things, I think we're starting to see more frameworks go after that. Mastra, for example, on the TypeScript side... Mastra? Yeah, yeah, yeah. The traction is really impressive there. And so I think...
We're starting to see more surface area there, but I think there's still a big opportunity. What do you have for an over- or underhyped? On the underhyped side, I know I mentioned Apple already, but I think the private cloud compute side, PCC, I actually think that could be really big. It's under the radar right now, but in terms of basically bringing
the on-device security model to the cloud, they've done a lot of architecturally interesting things there. Who's they? Apple. Oh, okay. On the PCC side. And so I actually think that... See, negative on Apple Intelligence, but positive on Apple cloud. On more of the local-device side, sort of,
I think there will be a lot of workloads still on device, but when you need to speak to the cloud for larger LLMs, I think Apple has done a really interesting thing on the privacy side. We did the seed of a company that does that. Especially as things become more consumerized. I was like, let's go, Jordan. Tell me about that company after. But yes, I think that's the unique...
The thing about LLM workloads is you just cannot have everything be single-tenant, because you just cannot get enough GPUs. Large enterprises are used to having VPCs where everything runs privately, but now you just cannot get enough GPUs to run in a VPC. So I think you're going to need to be in a multi-tenant architecture, and you need, like you said, single-tenant guarantees in multi-tenant environments. So yeah, it's an interesting space. Yeah. What about you, Swyx?
Under hyped, I want to say memory, just like stateful AI.
As part of my keynote, every conference I do, I do a keynote. And I try to do the task of defining an agent. Evergreen content for a keynote. But I did it in a way that was like what a researcher would do. You survey what people say and then you sort of categorize and go, "Okay, this is what everyone calls agents, and here are the groups of definitions, pick and choose."
And then it was very interesting that the week after that, OpenAI launched their Agents SDK and kind of formalized what they think agents are. Cloudflare also did the same. And none of them had memory. It's very strange. Pretty much like the only...
Obviously, there's conversation memory, but there's not memory memory, as in: let's store a knowledge graph of facts about you and exceed the context length. If you look closely enough, there's a really good implementation of memory inside of MCP. When they launched with the initial set of servers, they had a memory server in there, which I would recommend as the place to start with memory.
If there were a better memory abstraction, then a lot of our agents would be smarter and could learn on the job, which is something that we all want. And for some reason, we've all just
ignored that, because it's convenient to. But do you feel like it's being ignored, or is it just a really hard problem? I feel like lots of people are working on it; it's just proven more challenging. Yeah, yeah, yeah. So Harrison has LangMem, which I think he's now relaunched. And then we had Letta come speak at our conference.
And there's Zep; I think there's a bunch of other memory guys. But something like this, I think, should be normal in the stack. And basically, I think anything stateful should be interesting to VCs, because it's databases, and you know how those things make money.
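The "knowledge graph of facts about you" idea behind these memory layers can be sketched in a few lines. This is a toy illustration of the pattern, not any particular memory server's actual implementation, and every name in it (FactMemory, remember, recall) is made up for the example.

```python
from collections import defaultdict

class FactMemory:
    """Toy long-term memory: (entity, relation, value) triples kept
    outside the model's context window."""

    def __init__(self):
        # entity -> list of (relation, value) facts
        self.graph = defaultdict(list)

    def remember(self, entity, relation, value):
        self.graph[entity].append((relation, value))

    def recall(self, entity):
        # Render only the relevant facts as short strings to prepend
        # to a prompt, instead of replaying the whole chat history.
        return [f"{entity} {rel} {val}" for rel, val in self.graph[entity]]

memory = FactMemory()
memory.remember("user", "prefers", "TypeScript")
memory.remember("user", "works_at", "a seed-stage fund")
print(memory.recall("user"))
```

The point of the abstraction is the recall step: an agent pulls back a handful of facts per turn, so what it "knows" about you can grow well past the context length.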
I think on the overhyped side, the only thing I'd add is I'm still surprised how many net-new companies there are training models. I thought we were past that. I would say they died at the end of last year, and now they've resurfaced. That's one of the questions that you had down there: "Is there an opportunity for net-new model players?" I would have said no. I don't know what you guys think. I don't have a reason to say no, but I also don't have a reason to say this is what's missing and you should have a new model company do it.
I'm like, all these guys want to pursue AGI, you know? They all want to be like, oh, we'll score on all the benchmarks, and they can't all do it. Yeah, I mean, look, I don't know if Ilya has some secret approach up his sleeve, something beyond test-time compute. But it was funny, we had Noam Shazeer on the podcast last week, and I was asking him, you know, is there some sort of other algorithmic breakthrough, what do you make of Ilya? And he's like, look,
I think what he implicitly said was test-time compute will get us to the point where these models are doing AI engineering for us. And so, you know, at that point, they'll figure out the next algorithmic breakthrough, which I thought was pretty interesting. I agree with you, Swyx. I think that...
We're most interested, at least from our side, in foundation models for specific use cases and more specialized use cases. I guess the broader point is: if there's something like that that these companies can latch onto and be known for being the best at, maybe there's a case for that. Largely, though, I do agree with you that I don't think there should be, at this point, more model companies.
I think it's these unique data sets, right? I mean, obviously robotics has been an area we've been really interested in. It's an entirely different set of data that's required, you know, on top of a good VLM. And then, you know, biology, material science. More of the specific use cases. Yeah, but also, a lot of these models are super generalizable, but, you know,
finding opportunities where, you know, for a lot of these bio companies, they have wet labs; they're running a ton of experiments. Or, you know, same on the material science side. And so I still feel like there are some opportunities there, but the core LLM agent space is tough to compete in with the big ones. Yeah. Yeah. But they're moving more into product. So I think that's the question: if they could do better vertical models, why not do that instead of trying to do deep research? Yeah.
An Operator, and these different things. I think that's what... In my mind, it's coming up too. Well, yeah, in my mind, it's financial pressure: they need to monetize in a much shorter time frame because the costs are so high. But maybe it's not that easy to do.
- You think it would be a better business model to do a bunch of verticals? - Well, it's more like, why wouldn't they? You make fewer enemies if you're a model builder. Now with deep research and search, Perplexity is an enemy, and Gemini Deep Research is more of an enemy, versus if they were doing a finance model,
you know, or whatever. They would just enable so many more companies. And they always have, like, customer case studies for GPT search, but they're not building a finance base model for them. So is it because it's super hard and somebody should do it? Or is it because the new models are going to be so much better that the vertical models are useless anyways? That's the question exactly. It still seems to be a somewhat outstanding question. I'd say all
the signs of the last few years seem to be that a general-purpose model is the way to go. And, you know, tailoring, training a hyper-specific model in a domain, maybe it's cheaper and faster, but it's not going to be higher quality. But also, I think it's still... I mean, we were talking to
Noam and Jack Rae from Google last week, and they were like, yeah, this is still outstanding. We check this every time we have a new model, whether that still seems to be holding. I remember a few years ago, it felt like all the rage... the BloombergGPT model came out, and everyone was like, oh, you've got to take a massive data set. Yeah, I had the BloombergGPT guy present on that. Yeah, that must be a really interesting episode to go back on, because I feel like very shortly thereafter, the next OpenAI model came out and just beat it on all sorts of the...
No, it was a talk. We haven't released it yet. But yeah, basically they concluded that the closed models were better, so they just stopped. Interesting. I feel like that's been the... But he's very insistent that the work that he did, the team he assembled, the data that he collected are actually useful for more than just the model. So basically everything but the model survived. What are the other things? The data pipeline, and the team they assembled for fine-tuning and implementing
whatever models they ended up picking. Yeah, it seems like they're happy with that and they're running with it. He runs like 12, 13 teams at Bloomberg just working on GenAI across the company. I mean, I guess we've all kind of been alluding to it, but since it's a natural transition, you know, the other broad opening I have is just what we're paying most attention to right now. And thinking back on this, the model companies coming into the product area, I mean, I think that's going to be...
I'm fascinated to see how that plays out over the next year, and these frenemy dynamics. And it feels like it's going to first boil up with Cursor and Anthropic, and the way that plays out over the next six months, I think, will be... What is Cursor Anthropic? You mean Cursor versus Anthropic? Yeah. I assume, you know, over time, Anthropic wants to get more into the application side of coding. And, you know, I assume over time, Cursor will want to diversify off of
just using the Anthropic model. It's interesting that Cursor is now worth like $9, $10 billion. And they've made themselves hard to acquire. I would have said you should just get yourself to $5, $6 billion and join OpenAI, and all the training data goes to OpenAI, and that's how they train their coding model. Now it's complicated. Now they need to be an independent company. Increasingly, it seems that model companies want to get into the product layer. And so seeing over the next 6, 12 months: does just having the best model, you know,
let you kind of start from a cold start on the product side and get something in market? Or do the companies with the best products win, even if they eventually have to switch to a somewhat worse model? Where do the developers ultimately choose to go? I think that'll be super interesting. Yeah. Don't you think that Devin is more in trouble than Cursor?
I feel like Anthropic, if anything, wants to move more towards... I don't think they want to build the IDE. If I think about coding, you can look at it like a cube: the IDE is one way to get the code, and the agent is the other side. I feel like Anthropic wants more to be on the agent side and then hand you off to Cursor when you want to go in-depth, versus trying to build the cloud IDE. I think that's not... I don't know how you think about it. The existence of Claude Code
doesn't support what you say. Maybe they would, but I assume both just converge eventually, where you want to have, uh,
So we're talking about coding agents, whether it's... what is it? Inner loop versus outer loop, right? Inner loop is inside Cursor, inside your IDE, inside of a Git commit. And outer loop is between Git commits, on the cloud. And I think to be an outer-loop coding agent, you have to be more of a "we will integrate with your code base, we'll sign whatever security thing you need us to sign" kind of schlep.
I don't think the model labs want to do that schlep; they just want to provide models. So that would be my argument for why Cognition should still have some moat against Anthropic: simply because Cognition will do the schlep and the bizdev and the infra that Anthropic doesn't really care about. I don't know, the schlep is pretty sticky though, once you do it. It's very sticky, yeah. It's interesting. I think the natural winner of that should be Sourcegraph.
Another unprompted mention of Sourcegraph. They're big supporters. I'm very friendly with both Quinn and Beyang. They've done a lot of work with Cody, but not much work on the outer-loop stuff yet. But any company where they can say: we've been around for 10 years already,
We have all the enterprise contracts. You already trust us with your code base. Why would you go trust Factory or Cognition as two-year-old startups who just came out of MIT? Yeah.
I don't know. I guess switching gears to the application side, I'm curious, for both of you: how do you characterize what has genuine product-market fit in AI today? And I guess I'll ask more on your side, the investing side: is it more interesting to invest in the category of stuff that works today, or where the capabilities are going long-term?
- That's hard. - I was asking you to do my job for us. - You were like, man, that's a lay-up. - Tell us all your investment theses. - Yeah, yeah, yeah. - I would say, well, we only really do seed investing mostly, so it's hard to invest in things that already work. - Yeah, that's fair. - Because it means you're already late. But we try to be at the cusp.
Usually the investments we like to make, there's really not that much market risk. It's like: if this works, obviously people are going to use it, but it's unclear whether or not it's going to work. So that's kind of what we seek to do. We try not to chase as many trends. And I don't know, I was a founder myself, and sometimes I feel like it's easy to just jump in and do the thing that is hot. But becoming a founder to do something that's underappreciated, that doesn't yet work, shows
some level of grit and self-belief, like you actually really believe in the thing. So that alone makes me lean more towards that.
And you do a lot of angel investing too, so I'm curious how. Yeah, but I don't put that in my mental framework of things. I come at this much more as a content creator or market analyst. It really does matter to me what has product-market fit, because I have to answer the question of what is working now when people ask me. Do you feel like, relative to the obvious hype and discourse out there,
you know, do you feel like there are a lot of things that have product-market fit, or just a few things? A few things. Yeah. So I have a list. Two years ago, I wrote the "Anatomy of Autonomy" post, which was the first "what's going on in agents"
and "what is actually making money" piece. Because I think there are a lot of GenAI skeptics out there who are like, these things are toys, they're not reliable, and why are you dedicating your life to these things? And I think for me, the product-market-fit bar at the time was $100 million,
right? Like, what use cases can reasonably hit a hundred million dollars? And at the time it was Copilot. It was Jasper, no longer, but, you know, in that category of "help you write," which I think was helpful. And then Cursor, I think, was on there as a coding agent plus-plus. I think that
list will just grow over time with the form factors that we know to work, and then we can just adapt those form factors to a bunch of other things. The one that's most recently added to this is deep research. Yeah. Right. Where anything that looks like a deep research, whether it's the
Grok version, Gemini version, Perplexity version, whatever. He has an investment that he likes called Brightwave that is basically deep research for finance. Anything where it's long-running agentic reporting, it's starting to take more and more of the job away from you and just gives you a much more reasoned report. I think it's going to work. And that has some PMF, I think. Obviously it has PMF. I would say I went through this exercise of trying to handicap how much money
OpenAI made from launching deep research. I think it's billions. The sheer upgrade from $20 to $200, it has to be billions in ARR. Maybe not all of it will stick around, but that is some amount of PMF that is... Didn't they almost immediately bring it down to the $20 tier? They expanded access. I wouldn't say... Which I thought was really telling about the market, right? It's like where you have a...
I think it's going to be so interesting to see what they're actually able to get in that $200 or $2,000 tier, which we all think has a ton of potential. But I thought it was fascinating. I don't know whether it was just to get more people exposure to it or the fact that Google had a similar product, obviously, and other folks did too.
But it's really interesting how quickly they dropped it down. I think that's just a more general policy of no matter what they have at the top tier, they always want to have smaller versions of that in the lower tiers. Yeah, just get people exposure to it. Yeah, just get exposure. The brand of being first to market and the default choice is paramount to OpenAI. Though I thought that whole thing was fascinating.
- Google had the first product, right? - Yeah. - And like, you know. - We interviewed them, and I said it straight to their faces, I was like, "OpenAI mocked you." And they were like, "Yeah." - Well, actually, this is totally off topic, but whatever. What is it going to take for... Google just released some great models a few weeks ago. I feel like... - It's happening. - The stuff they're shipping is really cool. - It's happening, yeah.
I also feel like at least in the broader discourse, it's still like a drop in the bucket relative to... Yeah. I mean, I can't write fun on this. But I think it's happening. I think it takes some time. But my Gemini usage is up. I use it a lot more for...
anything from summarizing YouTube videos to the native image generation that they just launched to Flash Thinking. The multimodal stuff's great. Yeah, and I run a daily news recap called AI News that is 99% generated by models, and I do a bake-off between all the frontier models every day. Every day? Does it switch? Yes, it does switch, and I manually do it. And Flash wins most days.
So I think it's happening. I was thinking about tracking myself, like the number of opens of ChatGPT versus Gemini, and at some point it will cross. I think that Gemini will be my main. And I...
that will slowly happen for a bunch of people, and then that'll shift. I think that's really interesting. For developers, this is a different question. It's Google getting over itself on having Google Cloud versus Vertex versus AI Studio, all these five different brands, and slowly consolidating them. It'll happen, just slowly, I guess.
Yeah. I mean, another good example is like you cannot use the thinking models in Cursor. And I know Logan Kilpatrick said they're working on it, but I think there's all these small things where like if I cannot easily use it, I'm really not going to go out of my way to do it. But I do agree that
When you do use them, their models are great. They just need better bridges. You had one of the questions in the prep. What public company are you long and short? Mine is Google versus Apple. That was also my combo. I feel like, yeah, it does feel like Google is really cooking right now. Yeah.
So, okay, coming back to what has product market fit, now that we're back from my complete, total sidetrack. There's also customer support. We were talking in the car about Decagon and Sierra. Obviously, Bret Taylor is the founder of Sierra. And yeah, it seems like
There's just these layers of agents. I think you just look at the income statement or the org chart of any large scaled company and you start picking them off one by one, what is interesting knowledge work, and they just kind of eat things slowly from the outside in. Yeah. I mean, in the episode with Bret, he's so passionate about developer tools.
And yet he did not do a developer tools company. We spent like two hours talking about developer tools and all of that stuff. And he's like, I did a customer support company. I'm like, man, that says something. You know what I mean? It's like when you have somebody like him, who can raise any amount of money from anybody to do anything. Yeah. To pick customer support as the market to go after, while also being the chairman of OpenAI. Like...
that shows you that these things have moats and have longstanding, they're going to stick around, otherwise he's smarter than that. So yeah, that's a space where maybe initially I would have said,
I don't know if it's the most exciting thing to jump into, but then if you really look at how workforces are structured and where the cost centers of the business really end up, especially for more consumer-facing businesses, a lot of it goes into customer support. The whole AI story of the last two years has been cost-cutting. I think now we're going to switch more towards growth. Revenue. You've seen Jensen. Last year at GTC he was saying, the more you buy, the more you save. This year, it's the more you buy, the more you make.
Hot off the press. We were there. We were there. I do think that's one of the most interesting things about this first wave of apps where it's like almost the easiest thing that you could get real traction with was stuff that, for lack of a better way to frame it, like some of the people had already been comfortable outsourcing to BPOs or something and kind of implicitly said like, hey, this is a cost center. We are willing to take some performance cut for cost in the past and
The irony of that, or what I'm really curious to see how it plays out is, you could imagine that is the area where price competition is going to be most fierce because it's already stuff that people have said, "Hey, we don't need the 100% best version of that." And I wonder, this next wave of apps may prove actually even more defensible as you get these capabilities that actually are increased top line or whatnot, where you're like, take AI Go to Market, for example.
People will pay twice as much for something that brings in revenue, because there's just a very clean ROI story to it. And so I wonder ultimately whether this next set of apps actually ends up being more interesting than the first wave. Yeah, I think a lot of the voice AI ones are interesting too, because you don't need 100% precision and recall to actually
you know, have a great product. And so, for example, we looked into a bunch of, you know, scheduling intake companies, for example, like home services, right, for electricians and stuff like that. Today, they miss 50% of their calls. So even if the AI is only effective, say, 75% of the time, yeah, it's crazy, right? So if it's effective 75% of the time, that's totally fine because that's still a ton of increased revenue for the customer, right? And so you don't need that 100% accuracy. And so as the models evolve,
and the reliability of these agents is getting better, it's totally fine, because you're still getting a ton of value in the meantime.
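The revenue math in that example can be sketched directly. A minimal sketch; the 50% miss rate and 75% effectiveness are the illustrative figures from the conversation, not measured data:

```python
def answered_fraction(miss_rate: float, ai_success_rate: float) -> float:
    """Fraction of inbound calls handled when an AI agent picks up the
    calls a business would otherwise miss. Humans handle (1 - miss_rate);
    the agent recovers ai_success_rate of the missed remainder."""
    return (1 - miss_rate) + miss_rate * ai_success_rate

# Miss half your calls, add a 75%-effective agent:
print(answered_fraction(0.50, 0.75))  # 0.875, i.e. 87.5% of calls handled
```

So an agent that is right only three quarters of the time still takes the business from answering 50% of calls to 87.5%, which is why the "you don't need 100% accuracy" argument holds here.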
I don't know how related this is, but one of my favorite meetings at AI Engineer Summit. This was our first one in New York, and I just met a different crew than you meet here. Everyone here loves developer tools, loves infra. Over there, they're actually more interested in applications, which is kind of cool. I met this bootstrap team that they're only doing appointment scheduling for vets.
And they're like, this is an anomaly. We don't usually come to engineering summits because we usually go to vet summits and talk to the... They're literally... I'm sure it's a massive pain point. They're willing to pay a lot of money. But this is my point about saving versus making more. It's like if an electrician takes...
2x more calls. Do they have the bandwidth to actually do 2x more in-house visits? Well, yeah, exactly. That's the thing. I don't think today most businesses are structured to just overnight 2x, 3x demand. I think that's a startup thing. Most businesses can't do it. Do you make an electrician agent?
Well, no, totally. Do you do a recruiting agent for electricians? Or electrician training? Do you do Lambda School for electricians? I don't know. It's like-- It's very whack-a-mole with the bottlenecks in these businesses. Yeah, exactly. Like, oh, now we have a ton of demand. Cool. Where do we go? Yeah. So just to round up the PMF thing, I think this is relevant in the sense that it's pretty obvious that the killer agents are coding agents, support agents, and deep research.
Roughly. We've covered all those three already. Then you have to sort of turn to offense and go like, okay, what's next? Also, just like summarization of voice and conversation. We actually had that on there. I didn't put it as agent because...
It seems less agentic, you know? But yes. Still a good AI use case. That one I've seen. I would mention Granola. And what's the other one? Monterey? I think Abridge was the one you wanted to mention. I was going to say Abridge. Yeah. Abridge? OK. So I'll just call out what I had on my slides for the agent engineering thing. So it was screen sharing, which I think is actually kind of underrated. Like, people watching you as you do your work and just offering assistance.
Outbound sales, so instead of support, just being more outbound. You're saying outbound sales has product market fit? No, it's coming up. Oh, on the come-up. Yeah, I totally agree with that. Yeah. Hiring, like the recruiting side. Education, like the sort of personalized teaching, I think. I'm kind of shocked we haven't seen more there.
Yeah. I don't know if that's like... It's like Duolingo is the thing, Khanmigo. Yeah, I mean, Speak and some of these, you know, practice apps. Yeah, interesting. And then finance; there's a ton of finance cases that we can talk about. And then personal AI, which we also had a little bit of, but I think personal AI is harder to monetize. I think those would be what I would say is up and coming, in terms of what I'm currently focusing on. I feel like this question has been asked a few different ways, but I'm curious what you guys think. It's like,
If we just froze model capabilities today, is there trillions of dollars of application value to be unlocked? AI education, if we just stopped today all model development, with this current generation of models, we could probably build some pretty amazing education apps. Or how much of all this is contingent upon just, okay, people have had two years with GPT-4 and six months with the reasoning models. How much is contingent upon it just being more time with these things versus the models actually have to get better?
I don't know, it's a hard question, so I'm just gonna throw it to you. - Yeah. Well, I think the societal thing is maybe harder, especially in education, you know? Like, can you basically DOGE the education system?
Probably you should, but can you? I think it's more of the "can you," man. But people pay for all sorts of get-ahead things outside of class. And certainly in other countries, there's a ton of consumer spend. It feels like the market opportunity is there. Yeah, and in private education, I think. Public is very different. One of my...
My most interesting quest from last year was reforming Singapore's education system to be more sort of AI native. Just what you were doing on the side while you were waiting for our answer. Yes. That's a great side quest. My stated goal is for Singapore to be the first country that has Python as a national language.
Anyway, the pushback I got from the Ministry of Education was that the teachers would be unprepared to do it. It was really interesting: the immediate pushback was the de facto teachers union being resistant to change. And I'm like, okay, that's par for the course. Anyway, not to dwell too much on that, but yeah, I think
education is one of those things that everyone has strong opinions on, because they all have kids and have all been through the education system. But I think it's going to be the domain-specific plays. Speak is such an amazing example of top-down: we will go through the idea maze and we'll go to Korea and teach them English. It's like, what the hell?
I would love to see more examples of that. Really focused; no one trying to solve everything, just do your thing really, really well. On this trend of difficult questions that come up, I'm going to ask you the one that my partners like to ask me every single Monday, which is: how do you think about defensibility at the app layer? Oh yeah, that's great. Just give me an answer I can copy-paste and, you know, auto-respond with. Honestly, network effects. I think people don't prioritize those enough, because they're trying to make the
single-player experience good, but then they neglect the multiplayer experience. I always think about load-bearing episodes. You know, if you do one a week, some of those you don't really talk about ever again, and others you keep mentioning on every single podcast, and this is obviously going to be the latter. I think the recap episodes for us are pretty load-bearing; we refer to them every three months or so.
And one of them, I think, for us is Chai. For me, it's Chai research, even though that wasn't a super popular one among the broader community outside of the Chai community. For those who don't know, Chai research is basically a character AI competitor.
They were bootstrapped, they were founded at the same time, and they have outlasted Character de facto. It's funny, I would love to ask Noam Shazeer a bit more about the whole Character thing. Good luck getting past the Google comms team. He doesn't have his own models, basically. He has his own network of...
of people submitting models to be run. And I think that is, short term, going to hurt him, because he doesn't have proprietary IP, but long term he has the network effect to make him robust to any changes in the future. And I want to see more of that, where he's basically looking at himself as kind of a marketplace and he's identified the choke point, which is the app or protocol layer
between the users and the model providers, and then makes sure the money flows through. And that works. I wish more AI builders or AI founders emphasized network effects, because that's the only thing you're going to have at the end of the day. And brand leads into network effects. Yeah.
Yeah, I guess it's harder in the enterprise context, right? But it's funny, we do this exercise, and we talk a lot about velocity and the breadth of product surface area you're able to build, and then there's the ability to become a brand in a space. I'm shocked, even in six, nine months, how an individual company can become synonymous with an entire category. And then they're in every room for customers, and the other startups are clawing their way to try and get into one twentieth of those rooms.
There's a bunch of categories where we talk about them in IC and it's like, oh, pricing compression is going to happen, it's not as defensible, and so ACVs are going to go down over time. In actuality, for some of these, the ACVs have doubled, we've seen. And the reason for that is just that people go to them and pay that premium of being the brand.
Yeah, I mean, what I'm struck by is there was such a head fake in the early days of AI apps, where people were like, we want this amazing defensibility story. And then what's the easiest defensibility story? It's like, oh, a totally unique data set, or train your own model or something. And I feel like that was just a total head fake; I don't think that's actually useful at all. You sound much less articulate when you're like, well, the defensibility here is the thousand small things that this company does to make the user experience, design, everything just delightful, and the speed at which they move to both create a really broad product but then also...
every three, six months when a new model comes out, it's kind of an existential event for any company. Because if you're not the first to figure out how to use it, someone else will. And so velocity really matters there. And it's funny, in our internal discussions, we've been like, man, that sounds pretty similar to how we thought about application SaaS companies.
that there isn't some revolutionary reason. You don't sound like a genius when you're like, here's why application SaaS company A is so much better than B. But it's a lot of little things that compound over time. What about the infrastructure space, guys? I'm curious how you think about where the interesting categories are today, and where you want to see more startups, or where you think there are too many. - Yeah, we call it kind of the LLM OS.
But I would say-- - Not we, I mean, Andrej calls it the LLM OS. - Well, yeah, well, we have-- - You, you, and Andrej, the three of you call it the LLM OS. - Well, we have this Four Wars of AI framework that we use, and the LLM OS is one of them. But yeah, I mean, code execution is one. We've been banging the drum; everybody now knows we're investors in E2B. Memory, you know, is one that we kind of touched on before. Super interesting. Search,
we talked about. I think those are more, not traditional infra, not the bare-metal infra; it's more the infra around the model. - Tools for agents. - Which I think is where a lot of the value is gonna be. - The security ones? - Yeah, yeah, and on the security side, I mean,
There's so much to be done there. And it's more like basically any area where AI is being used by the offense, AI needs to be applied on the defense side, like email security, identity, all these different things. So we've been doing a lot there, as well as having you rethink things that used to be costly, like red teaming, and maybe used to be a checkbox in the past. Today, they can be actually helpful to make you secure your app.
And there's this whole idea of semantics, right? That now the models can be good at. You know, in the past, everything was about syntax. It's kind of like very basic constraint rules. I think now you can start to infer semantics, from things that are beyond just simple recognition to understanding why certain things are happening a certain way. So in the security space, we're seeing that with
binary inspection, for example. There's kind of the syntax, but then there are semantics of understanding what is this code overall really trying to do, even though this individual syntax is saying something specific. Not to get too technical, but yeah, I think infra overall is like
a super interesting place if you're making use of the model. If you're just serving the models, I'm less bullish. Not that it's not a great business, but it's a very capital-intensive business. I think that infra is great, people will make money, but yeah, there's not as much interest from us. How do you guys think about what OpenAI and the big research labs will encompass as part of the developer and infra category? Yeah, that's why I would say
Search is the first example of one of the things we used to mention. You know, we had Exa on the podcast, and Perplexity obviously has an API. - The basic idea is, if you go into the ChatGPT custom GPT builder, what are the checkboxes? Each of them is a startup. - Yeah. And now they're also APIs. So now search is also an API.
We'll see what the adoption is. In traditional inference, everybody wants to be multi-cloud. So maybe we'll see the same, where ChatGPT search, or the OpenAI search API, is great with the OpenAI models because you get it all bundled in. But their price is very high. If you compare it to Exa, I think it's like five times the price for the same number of searches, which...
Makes sense if you have a big OpenAI contract, but maybe if you're just picking best in breed, you want to compare different ones. Yeah. They don't have a code execution one. I'm sure they'll release one soon. So they want to own that too, but...
Yeah, same question we were talking about before, right? Do they want to be an API company or a product company? Do you make more money building ChatGPT search or selling a search API? - Yeah, the broader lesson, instead of going, we did applications just now and then, what do you think is interesting in infrastructure: it's not 50/50, it's not equal-weighted. It's just very clearly that the application layer has
been way more interesting. Yes, there are interesting infrastructure plays. And I even want to push back on the whole GPU serving thing, because Together AI is doing well, Fireworks is doing well. I was going to say, that's the stuff that's worked. It's like data centers and inference providers. I think it comes down to the capital intensity. Oh, I see.
Again, you guys have much larger funds, so I'm sure you have GPU costs. Yeah, so that is one thing I have been learning: I think I historically had a DevTools and infra bias, and so has he, and we've had to learn that applications actually are
very interesting, and also maybe kind of the killer application of models, in the sense that you can charge for utility and not for cost, right? Whereas most infrastructure reduces to cost-plus. Yeah. Right. And that's not where you want to be for AI.
So that's interesting for me. I thought it would be interesting, as the only non-VC in the room, for me to say what is not investable, because then I won't be canceled for saying your whole category is not investable. This thing is not investable... and then three months later, we're desperately
chasing it. Exactly, so you don't want to be on the record. It changes so fast. Yeah, every opinion you hold, you have to hold quite loosely. I'm happy to be wrong in public, you know; I think that's how you learn the most. So fine-tuning companies are something I've struggled with, and I still don't see how this becomes a big thing. You kind of have to wrap it up in a broader
enterprise AI company, a services company, like a Writer AI, where they will fine-tune for you and it's part of the overall offering, but that's not where you spike.
Yeah, it's kind of interesting. And then I'll just mention AI DevOps. There's a lot of AI SRE out there. Seems like there's a lot of data out there that should be able to be plugged into your code base or your app to sort of self-heal or whatever. It's just, I don't know if that's been a thing yet, and you guys can correct me if I'm wrong. And then the last thing I'll mention is voice real-time infra. Again, very interesting, very, very hot. But again, how big is it? Those are the main three that I'm thinking about for
things I'm struggling with. Yeah, I guess a couple comments on the AI SRE side. I actually disagree with that one. I think the reason they haven't sort of taken off yet is because the tech is just not there quite yet. And so it goes back to the earlier question: do we think about investing towards where the companies will be when the models improve, versus now? I think that's the case in the short term. We'll get there, but it's just not there just yet. But I think it's an interesting opportunity overall.
Yeah. My pushback to you is, well, it's monitoring a lot of logs, right? And it's basically anomaly detection rather than, like, there's a whole bunch of stuff that can happen after you detect the anomaly, but it's really just anomaly detection. And we've always had that.
This is not a transformer LLM use case; this is just regular anomaly detection. It's not going to be an autonomous SRE for a while. So the question is, how much can the latest AI advancements increase the efficacy of bringing your MTTR down? Even if it's a 10% improvement over the status quo, that's still potentially a lot of revenue.
That's the way, at least, I think I would think about it now. And then a few years from now, if it's actually an autonomous SRE just replacing altogether, then that's a totally different thing. Cool. I'll look out for it.
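The pre-LLM baseline being described here, plain statistical anomaly detection over a metric stream, is simple enough to sketch. This is a generic rolling z-score, not any particular vendor's approach:

```python
from collections import deque

def zscore_anomalies(values, window=20, threshold=3.0):
    """Flag points that sit more than `threshold` standard deviations from
    the mean of the trailing window: classic, non-LLM anomaly detection of
    the sort the conversation says monitoring tools have always done."""
    history = deque(maxlen=window)
    flags = []
    for v in values:
        if len(history) >= 2:
            mean = sum(history) / len(history)
            var = sum((x - mean) ** 2 for x in history) / len(history)
            std = var ** 0.5
            # Small epsilon so a flat history still flags a genuine jump.
            flags.append(abs(v - mean) > threshold * std + 1e-9)
        else:
            flags.append(False)  # not enough history yet
        history.append(v)
    return flags

# A flat metric with one spike: only the spike at index 30 is flagged.
metrics = [10.0] * 30 + [50.0] + [10.0] * 5
print([i for i, f in enumerate(zscore_anomalies(metrics)) if f])  # [30]
```

The point of the exchange is that this part has existed for decades; what the LLM could add is everything after the flag is raised, triage, root-causing, and the remediation that actually moves MTTR.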
Yeah. I guess switching back to overly broad questions, what do you feel like is the biggest unanswered question in AI today that has large implications for the ecosystem? Yeah, I've been banging the drum on RL. And I think it's clear that you can do RL successfully on verifiable domains.
I would say, whether or not we can figure out how to do that in non-verifiable ones. So law is a great example. Can you do RL on contracts and documents? Marketing, sales, going back to outbound sales: can you do RL to simulate what an outbound email and the ensuing conversation lead to? Yeah, it's unclear. If not, then I think we'll be stuck where you're going to have agents in the more verifiable domains, and then you'll just have co-pilots
in the non-verifiable ones, because you'll still need a person to be the tastemaker. I'm trying to think of the implications. If it doesn't work, the world could be weird: you have fully autonomous AI coders, and no one does any software or math or even some areas of science, but then to write the most basic sales email is still, like...
It's always so hard to predict how the world... That is such a weird... Of all the sci-fi that was written 50 years ago, I don't think anybody foresaw that future. That is a really weird future. Did either of you have a different one for that? We'll go back and forth. Biggest unanswered question. I guess...
I don't know if this is a good answer, but Bob McGrew, who we had on the podcast, was talking about the rule of nines they have at OpenAI: to go from 90% reliability to 99%, it's an order of magnitude increase in compute, and then 99% to 99.9% is another order of magnitude increase, and that happens every two to three years. And so how are we going to scale accordingly for this next part? I think there's a lot of unanswered questions just from a hardware perspective. And then I think as part of that,
from the availability perspective, is NVIDIA just going to continue to be dominant? Obviously AWS is going hard into, what's theirs, Trainium chips? I'm blanking on it, thank you. And so I think there's a big ecosystem around CUDA that's obviously allowed NVIDIA to remain dominant, but just
what's going to happen and is there anyone that's going to come sort of combat that to increase the availability of GPUs or are we just going to be constrained going forward when we actually need way more compute going forward? Yeah.
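The rule of nines mentioned a moment ago can be put into a quick back-of-envelope sketch. This formula is just an illustration of the scaling as stated in the conversation, not OpenAI's actual internal model:

```python
import math

def compute_multiplier(reliability: float, base_reliability: float = 0.90) -> float:
    """Relative compute needed to reach `reliability`, assuming each extra
    'nine' of reliability costs an order of magnitude more compute, per the
    rule of nines described in the conversation."""
    nines = lambda r: -math.log10(1 - r)  # 0.90 -> 1 nine, 0.99 -> 2, 0.999 -> 3
    return 10 ** (nines(reliability) - nines(base_reliability))

print(compute_multiplier(0.99))   # roughly 10x the compute of a 90%-reliable system
print(compute_multiplier(0.999))  # roughly 100x
```

Under that assumption, an agent reliable enough to run unattended at 99.99% would need on the order of a thousand times the compute of today's 90%-reliable demos, which is the hardware question being raised.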
My quick thoughts. I'm the only individual named as an investor in MatX, which is kind of really funny, because everyone else was funds and then it was just me. And there are all these NVIDIA challengers, sorry, dedicated silicon startups, that are coming up and trying to challenge that. And the simple answer is,
these GPUs are the most general things possible, by design. That's why they do gaming and crypto and AI. And as long as the architecture seems stable, it seems like there's a case to be made for dedicated silicon. The only question is who will win it. And obviously there's a whole bunch of competitors, including AMD, which is trying to make a play for it. But so will AWS, and so will everyone else: Microsoft has a chip, Facebook has a chip.
So who knows who will win that? It's just very interesting that this seems to be such a valuable prize, it's freaking NVIDIA that you're competing with, and no one has really made a real dent there yet. So I kind of agree with you. But I think basically it's all about stability of workload. Dedicated silicon is basically a bet against the death of transformers. And if you're fine with that... even with a state space model, I think people would agree that it wouldn't really change that much. And probably
I think the overall consensus is that you don't even use state space models individually; you would use them in a mixture with transformers anyway. So then, yeah, just go bet on transformers, bake it into the chip, and you will basically have ASICs
for transformers, and that's fine. And so, prima facie, there should be a company that wins that. I don't know who will win it. - I wish we knew. - I think that you have to have started basically after 2019 or 2020, because anyone who started before that will still be too general,
because transformers hadn't won yet at the time. I have one more. I think the most emergent one that came out of the New York conference that I did was agent authentication. I think The Information literally just published that this is something they're worried about: when Operator, or whoever, accesses a website on your behalf, how does it indicate that it's not you, but an agent of you? And I think
my general philosophy on agent experience, or any of this sort of reinvention of every part of the stack for agents, is that almost none of it is necessary except for this agent auth thing. Like, we really need
to be able to have, effectively, a new SSO for agents. Is it going to be crypto? I thought crypto people were really amped about that... You know, it's really frustrating when Sam Altman is right, but maybe you have to scan your eyeballs. Maybe you just have to. Maybe he saw this five years ago and he was like, "You've got to scan your eyeballs." And the rest of us are just behind him, as usual.
I love it. Well, okay, now I'll move to the quickfire round where we'll go around the horn and get quick takes on things. So the first is going to be dream podcast guest. John Carmack.
John is like six steps away from solving AGI, apparently, so we'd just ask him how far along he is. For us it's Andrej. For me it's Andrej. He's a listener and supporter of the pod. And basically, when I launched the whole AI engineer push that we have, he was the first one to legitimize it. He was like, "There will be more AI engineers than ML engineers." And I think that made everyone else pay attention.
Latent Space only exists because he and other people helped to promote it. I also had Andrej, so I guess we're thinking the same thing there. Mine's a little bit of a cheat, but I think somebody's writing a book about OpenAI now. And at some point, somebody, probably Acquired, will get to do the Acquired OpenAI episode. But if Unsupervised Learning could... There are so many amazing stories from the last five, six years. So do you know about Doomers?
It's a play. I'm actually going to it this Saturday. Yeah, someone made a play about the board drama of last year. Really? Wow. That's cool. I don't know how it is. Yeah, let us know. We should have the director of that on the podcast. No, I think it's a lot of fan fiction, basically. But someone will write the accounts, and it will be interesting and fascinating. And a lot of it will be fake, because it's a complex beast, right? You're just getting an oral history of what happened. Yeah. Yeah.
All right. For the next one, I figured you could shout out either a new source you use to stay up to date or a startup that you're not invested in that you're excited about. My new source is Sean. That's what I was going to say. I literally wrote Swix's Twitter. We have a Latent Space Discord. Any link that ever matters on the internet, Sean is going to post it in the Discord. So all I do, I open Discord.
And we have, you know, 40, 50 different channels by topic. - That's very true. - I open Discord and I'm like, okay, AI, then I go developer tools, then I go creator economy, then I go stock and macro, and they're all there, so thank you. - We actually met because of the Discord. It was like a COVID thing, 'cause everyone was at home, and we just started the Discord. And yeah, that was the origin of Latent Space, just chatting on the Discord. - It used to be called dev/invest. So it was all about developer tools investing.
And then we were at OpenAI in October of 2022. We were like, maybe we should do a podcast. And then OpenAI was the first guest. Yeah. I was not prepared for the news sources thing. I think maybe...
It's hard. It's really shitty to say, but just in-person conversations. Yeah. And I think the reason I have to be here in SF is because I make friends with people who know things and are smarter than me. And we go for chats, and they're nice enough to share some stuff. And so sometimes I worry that I am being used in order to
put things out there that are maybe not true. So I have to exercise my own judgment as to what that is. I think one of the cool things about the podcast in general is just the opportunity to take these conversations that happen in closed rooms and try and bring them on to the airwaves. I'm curious, how much do you feel like the private discourse is similar to the public discourse?
In many ways, it is surprisingly similar. As in, people at OpenAI learn about things about OpenAI from us.
which is interesting. And then there are some ways in which it is drastically dissimilar. And those are the things I just cannot repeat until they're public. This has been super fun. I feel like it lived up to it; we were looking forward to this for a while. We want to make sure everyone around the horn gets an opportunity to plug whatever they want to plug. So we'll leave the last word to all of us, I guess. Where can folks go to learn more about Latent Space and all the exciting things you do?
We want to make sure our listeners have a good sense of everything. Yes. So we have a Substack; latent.space is the website. And then please subscribe on YouTube. We're doing a lot of YouTube. We're trying to do better video and all that. He sets our OKRs. It's basically all YouTube.
Come watch us on YouTube. It's very important for me personally. Even if you don't care, just open it. We have to increase our production value. Look at this. I know. We only have three cameras. Yeah, and then Sean does a lot of the writing outside of the podcast on the newsletter. Yeah, so it's like trying to be newsletter and community and podcast and
and whatever else we do. Yeah, so for me, I guess there's Latent Space, but then there's also the other big piece, which is the conference that I run. And the idea is that I think sometimes you just get the good stuff out of people if you put them in front of a lot of people. And that's really, like, I'm mining people for content. And sometimes you put a mic in front of them and they yap for an hour. Other times you have to put them in front of a prestigious conference, and then they drop some alpha.
And so the next one for us is going to be in June. It's the AI Engineer World's Fair, and it should be the largest technical conference for AI. And ours is simple. We just run a humble podcast. So subscribe to Unsupervised Learning on YouTube. Thanks so much. This was awesome. Thanks for having me. It was good to see you guys. Thanks for coming on.