
Ep 62: CEO of Cohere Aidan Gomez on Scaling Limits Emerging, AI Use-cases with PMF & Life After Transformers

2025/4/15

Unsupervised Learning

People

Aidan Gomez
Jacob Efron

Topics

Jacob Efron: I'm curious which model of enterprise LLM deployment will win out over the next five or ten years: the consulting model, the Palantir model, or the out-of-the-box product model. Aidan Gomez: In the long term, I think a middle-ground model will win. The technology is complicated, and enterprises need some degree of support to adopt it effectively. AI agents can only drive automation effectively when they have access to the same information humans do. That demands a lot of access, which raises privacy concerns, since very little software needs that degree of access. Every company runs a different software stack, so custom setup is required to bring all that context into the model. We are working to make our new agent platform, North, easier to customize and integrate into each company's stack. In the future we will be able to automate parts of the problem but not all of it, because mistakes involving sensitive data such as salary data are costly. Jacob Efron: How do you categorize or list the enterprise generative AI use cases that have already reached product-market fit? Aidan Gomez: I think large language models have reached product-market fit in two areas: customer support and augmenting human research. In healthcare, we want to make doctors' note-taking and form-filling easier. Across telecom, healthcare, and financial services, the need for customer support is large and the technology is ready. The other use case is augmenting human research, for example letting wealth managers do research faster and more effectively.

Transcript

I'm Jacob Efron, and today on Unsupervised Learning, Aidan and I hit a bunch of things, including why he thinks the best application companies will also build their own models. We also talked about the type of data labeling that will matter to make models better going forward. And we hit on why Aidan hopes transformers aren't the final architecture in AI. One ask before we dive in. If you're listening to the show on Spotify, please consider leaving a rating. Ratings help boost the podcast, which in turn helps us get the best guests. Now, without further ado, here's Aidan. You know, Aidan, thanks so much for coming on the podcast. I really appreciate it.

Yeah, stoked to be here. Thanks for having me. Amazing. Well, I figure there's a lot of different places we can start, but one of the most interesting, I think, that everyone's thinking about today is the future of enterprise AI adoption and usage. And obviously, I know this is something you think about a lot. One thing that I feel like I see is there's these different models for enterprise deployments of Gen AI today. It feels like

Some are just straight up consulting, like you have a big Accenture team come in and build something. Others feel like they're maybe more Palantir-esque, where there's some forward deployed engineering, some product being built. And then others are trying to just sell an out of the box product, right? Say here's something you use, go make it work. Obviously, it's a rapidly evolving space, but I'm curious, as you think about which of these models ends up winning out in five, 10 years, what do you think it looks like for folks working with enterprises in the long term in Gen AI?

In the long term, I do think that something in the middle is going to win out. It's a new technology. It's very complicated.

And I wish it weren't the case. I wish it were much easier for enterprises to adopt the technology. But sometimes they need help. So some degree of support is just going to be necessary to integrate it into the economy, into all of these different companies and targeting all the different applications that they might need to go after. I also think

AI is unique and these agents are unique in that an agent can only do its job effectively at driving automation if it has access to the same things the humans have access to. And so that demands a lot of access.

It needs to be able to see into your emails, your chat, your calls, your CRM, your ERP, your HR software. It needs so much context. And so that presents a couple of different problems. The first is privacy.

There's very little software that needs that degree of access. And so privacy is a way bigger issue in AI and agents than it is in other types of enterprise software. So that's something that I think we do uniquely well. The second piece is each company uses a different tapestry or like mosaic of software for their humans, right? Like there's no...

standard setup of a company, of what stack they run on. And so each one is different and each one requires some degree of custom setup to bring all of that context together and integrate it into the model. Another thing we're spending a lot of time on is making our new agent platform, North, much easier to customize and integrate with

whatever that tapestry might be for each particular company. I mean, obviously today there's a lot of like custom integrations you have to build, rules around, you know, who has access to what, obviously, you know, a bunch of the VPC deployments as well. In the future, do you have like AI agents that kind of obviate some of the complexity behind this? And it's like, hey, just enter the tools you want to integrate with, enter kind of some of the basics. Or is that kind of a little bit fantasy? No, I mean, definitely, that would be extremely helpful if

setup could be completely self-serve and the agents just go off and do what's necessary. I think we'll probably have some middle ground. It won't be the extreme of, yeah, there's no humans involved and actually we can just install a copy of it and you tell it what you want it to do and it'll install itself. There will be some middle ground where

you know, we'll be able to automate parts of the problem, but not the entire thing. So the stakes of a mistake on access to like salary data or something or whatever it is are high enough that you probably want a fair amount of guardrails there. Yeah, I think so. Yeah. And definitely with

customer data, patient data, these sorts of things. You just can't make mistakes. One thing I think that's so interesting about the seat that you have at Cohere is obviously you work with a ton of different enterprises doing different things. It feels like the debate in Silicon Valley these days is what's experimental budget, what's real budget? You have this great lens into the stuff that's actually working today. How do you categorize or even list the enterprise use cases for Gen AI today that you feel have product market fit? It's really hard to categorize. I think it's

There's some vertical applications. For example, in healthcare, we really want to make note-taking and form-filling much easier for doctors. So, you know, having passive listening mics and being able to listen to that doctor-patient interaction and pre-populate things so that they're not spending

half their time, you know, typing notes and that type of thing. I think that's very vertical specific, but much more of what we see are these general categories. Customer support is something where the technology is ready and the need is very much there. And so that's moving quite quickly across verticals: in telco,

in healthcare, in financial services, everybody needs this capability and the technology is there for it. The other one, I don't know really how to categorize this, but I guess like research, augmenting humans with an agent that can go off and do

a month's worth of research in an hour or two. We're seeing a lot of demand for that sort of capability. So you can imagine if you're in a bank and you have wealth managers, the wealth manager might manage a pool of...

20, 50 different clients. And those clients might call that wealth manager up and say, hey, I want to hedge against this geopolitical event that might happen in the future. And that wealth manager then needs to go and do research and come up with a strategy of how do I actually even hedge that thing? And it's super time sensitive because that might be days away. Especially in this current moment, certainly. Yeah, exactly. It's very relevant right now. But if we can make that

dramatically more effective because these models can read hundreds of thousands of times as much as a human and come back with a very robust piece of research with citations back to all the source documents so the human can audit it. I think we can just make knowledge work

10x more effective and productive. I guess one question that folks debate a lot is, obviously we're so early in exploring the capabilities of the current generation of models, right? We've only had GPT-4 for a few years, the reasoning models since the fall. Does it feel like there's a trillion dollars of value just waiting to be unlocked if we froze model capabilities today? Or does it feel like we still need to continue on this curve of model improvement to really realize the full vision here? Reasoning is so obvious.

It's insane that anyone would be content with non-reasoning models. Because what is the input space? It's language. It's everything. It's everything from "hey, what's one plus one?" to "prove Fermat's Last Theorem." And so the input space is everything. And we shouldn't expect the model to spend the same amount of time to answer those two questions.

So the fact that reasoning unlocks a capability for the technology to spend different amounts of energy on different complexities of problem is so obvious and intuitive. And we know we have to have that capability built into the technology. There's more that's missing for sure. For instance,

the status quo with models. We spend all this money training them. We spend $100 million building a model. And then we get our final checkpoint, final weights, and we just distribute that to the world. And they're frozen. And everyone is speaking to

the same version of the model, it doesn't remember your chats from months ago. And so there's no notion of learning from experience. And so that's a clear capability that humans have, right? Like we can start dumb at something new and then become an expert over the course of four or five years of practicing that. These models should have the exact same ability to learn from their experience out in the world and learn from feedback from the humans that they interact with.

So there are these obvious properties of intelligence that are missing in the technology and that will need to come. But I do think there's a change that's happening right now, which is the whole "scale is all you need" hypothesis is breaking. We're very heavy into diminishing returns of capital and compute. And we're just seeing...

we're going to have to become smarter and more creative in order to unlock the next step up in the technology. But I think that's good. It's good pressure on us as innovators to go out and explore and try things out. The old strategy was boring and dumb.

So I'm excited for the new era. Yeah, it's no longer just deploying more money into compute, a kind of understood game, versus trying to figure out how in the world to make test-time compute work for not-easily-verifiable domains. And obviously some of the algorithmic breakthroughs other folks are working on. I think it seems like an exciting time on the research side right now. Definitely.

Definitely. I mean, I guess you kind of alluded to it, but I'm curious, like, I think another question folks are asking is, you know, are we on a path to kind of one model to rule them all and you use it, you know, if you're a bank or a healthcare institution or, you know, a world of specialized models for different use cases? You know, obviously, I'm sure you have a strong opinion on this, both on the fine tuning, but also, does it ever make sense to have, you know, different kind of pre-trained models for these different domains? And how do you kind of think that develops over time?

I mean, I used to think much more that we needed specialized models. With MoEs, models are kind of able to self-develop experts, and these are just like sub-networks within the model. And so that's alleviated some of the pressure. I would say custom models are still important. There's still fundamental context about a particular business or a particular domain that is missing

from models that are built off of the web. The web contains a lot of information about humanity, history, culture, science. But there's stuff that is not on the web, and these models need to be able to get good at that to provide value in the world.

So I think that's where custom models make a lot of sense is closing that gap. There's not a lot of manufacturing data out on the web or customer transactions or detailed personal health records. So for these sorts of modalities or types of data,

What Cohere does is we partner with the organizations that do have this data to create a custom model that only they have access to that gets very, very good at working in those domains. But the general models are extraordinary and synthetic data is able to close the gap considerably. And so I think certainly there won't be tens or hundreds of models operating within an organization. You might have a handful.

But I don't see it being like every single team is going to have their own fine-tuned model. Yeah, basically, if they focus it on some sort of different type of data that the model hasn't been exposed to, you may do some fine-tuning or even some basic pre-training on it. But otherwise, it doesn't make sense to have throughout the organization. I guess you kind of alluded to the data side there. And I think I'm curious, obviously, in this first wave,

of post-ChatGPT moments, it was like a bunch of RLHF data. Now it obviously seems like it's moved more toward expert data labelers in coding and more reasoning tasks. Synthetic data, which you alluded to, has obviously played a big role. What role does data labeling still play for folks like yourselves that are building model providers? And where are the interesting areas going forward? Is it still relevant in the synthetic data world, or... Eval. Humans are still just far and away

the gold standard. Well, I guess definitionally, if you're building models for people, they're probably best suited to evaluate their usefulness. Exactly. Yeah, eval is where you just can't take a human out of the loop yet. I mean, the way to get the human out of the loop is to have an expert that is better than your current model observing it.

but that assumes you can build that expert in the first place. So how would you? So there's, I think, a hard dependency on humans within eval. On the data gen side, it's too expensive. Yes, you're right. Like we definitely still need human data, but it's too expensive to go find 100,000 doctors and have them teach the model to do medicine. That is not a viable strategy. Yeah.

It was a viable strategy to teach the model general things about how to converse, chit-chat, that type of thing, to find 100,000 average people and they can teach the model to do that stuff. So we've had to become much more creative. But teaching the model to chit-chat and do all that stuff unlocked a certain degree of

freedom in terms of synthetic data generation, which we can then apply to these specific domains like medicine. And we can use a much smaller pool of human data. You know, maybe I go to 100 doctors and get them to provide me some lessons and teach my model, and then I use that pool of known-good, very trustworthy data to generate a thousandfold synthetic look-alike data.
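The expansion step Aidan describes, seeding from a small pool of trusted expert data and generating many synthetic look-alikes per seed, can be illustrated with a toy script. This is a minimal sketch, not Cohere's actual pipeline: the seed examples, the `paraphrase_templates`, and the `generate_lookalikes` function are all invented for illustration, and a real system would use a model to paraphrase rather than string templates.

```python
import random

# Toy sketch of seeding synthetic data from a small trusted pool.
# Everything here (seed examples, templates, function names) is invented
# for illustration; it is not Cohere's actual pipeline.

# A tiny pool of "known good" expert Q&A pairs (stand-in for doctor data).
seed_pool = [
    {"question": "What is a normal resting heart rate?",
     "answer": "About 60 to 100 beats per minute."},
    {"question": "What does BMI stand for?",
     "answer": "Body mass index."},
]

# Templates that rephrase a trusted question. A real pipeline would use an
# LLM to paraphrase, but the idea is the same: many look-alikes per seed.
paraphrase_templates = [
    "Q: {q}",
    "A patient asks: {q}",
    "Quick question: {q}",
]

def generate_lookalikes(pool, n_per_seed):
    """Expand each trusted seed into n_per_seed synthetic variants."""
    synthetic = []
    for seed in pool:
        for _ in range(n_per_seed):
            template = random.choice(paraphrase_templates)
            synthetic.append({
                "question": template.format(q=seed["question"]),
                # The answer stays anchored to the trusted seed.
                "answer": seed["answer"],
                "source": "synthetic",
            })
    return synthetic

data = generate_lookalikes(seed_pool, n_per_seed=1000)
print(len(data))  # 2 seeds x 1000 variants = 2000 synthetic examples
```

In verifiable domains, as the conversation turns to next, a checker can additionally filter these generations and keep only the ones that pass verification.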

And like you say, in verifiable domains like code and math, it's way easier because you can check the results and use that to filter the synthetic data and pull out the garbage and find the gold. In other domains, it becomes much more complex, but it's still viable. So at this stage, I would say in terms of the overall data that Cohere is generating for a new model,

an overwhelming majority is synthetic. - Super interesting. And you kind of alluded to, obviously, I feel like one of the questions in the field right now is how far test time compute gets us and what spaces it will and won't work in. How do you kind of conceptualize that today, both what works well today and then also on the near-term frontier? - For test time compute, for reasoning,

Cohere cares way less about solving Math Olympiad. We want these models. You don't have enterprises coming to you begging for models that can do well in the Math Olympiad? No. But we do have them begging us to help them automate processes within the back office that use this piece of software and also go out on the web and do some research and then do this.

And so for us, it's very much about teaching these models to reason through solving the problems that exist within business using the tools that humans and businesses use to get them done today. That's what we care about. And it has been a complete step change in terms of improvement. There were tasks pre-reasoning. We just couldn't get a model accurate enough.

to accomplish that. It was like impossible. Models could not solve the problem; they would almost always fail. And with reasoning, they almost never fail. Pretty much everything you throw at the thing, it'll figure out a way to actually accomplish it. So reasoning's ability to unlock reflection, understanding why the first attempt failed and then using that to find another path, an alternative path to the same outcome,

That's been a real, real unlock. Yeah. Was there like some sort of crazy moment for you when you were initially playing around with these reasoning models and saw that on one of these impossible tasks? Yeah, like things that just don't work start working. And complex tasks, there's many delightful moments where you're like, how did it know to look there? And it's, oh, it's because it failed looking in the obvious places. And then you read the reasoning traces, you're like, oh, it's actually pretty smart. Yeah. I mean, the reasoning traces are just, it's like,

jaw-dropping. It's so beautiful that these models organically problem-solve and think through stuff, and they have their own little epiphanies, and they're like, oh, you know what? I should check over here, and maybe this happened, and that's why it's not over there. That

has been super rewarding to see. I guess when you're talking to enterprises, how do you kind of contextualize where Cohere fits in the ecosystem today? Obviously, you've got the really large AGI labs like OpenAI and Anthropic. You've got a bunch of open source model providers that are out there and folks working with them. How do you kind of contextualize that? And also, it seems like

you help these enterprises build a lot of use cases. How far into the application layer do you want to go? Do you want to have a support product? Let me hear you riff on that. Yeah, so with North, we're pushing into the application layer. And that was totally motivated by the fact that we were selling models and we just kept seeing our customers building the same thing again and again and again. And it would take them like a year to build an application that

took models, presented a UI, integrated with all the different tools and data sources, did the privacy and ACL inheritance stuff well.

And then it was usually built by the AI team inside of these enterprises. And so they're not product teams. And so it's actually not that nice of a user experience. And so the people within the company wouldn't want to use it. And so we said, what if we just solve that entire problem for the enterprise? We're going to create a truly consumer-grade product experience that people love to use. It's super intuitive, extremely low latency, but ships with all the...

features that an enterprise needs, like the ability to customize it completely. You can customize the UI, you can rebrand it however you want, you can customize the data connections and the sources that your model is able to pull from and the tools that it's able to use. You can even, if you want to, let's say you've trained some Llama fine-tune or something on your data,

you can plug Llama in there and expose it through this application. And so it's really just setting a company forward 12 to 18 months in its product roadmap, and they immediately can distribute this tech into the hands of their employees. So that's been the push for us. Definitely Cohere has some very nice strategic attributes, like the fact that we're not locked into one ecosystem.

We're not within one hyperscaler's ecosystem. We're able to deploy anywhere. We also release our weights non-commercially. And so we're happy to deploy in VPC, because we're not worried about whether you can access our weights. We're very happy for you to do that. So I think just far and away, we are the best platform

partner for enterprise. I guess I'm curious, because obviously one of the cool things about Cohere is that you have both the model development side, with a great research team doing cutting-edge work, and then you also have the applications that you build on top of the models. Obviously there's some folks that would say, hey, we'll just take the best models that are out there and we'll build the support application, or whatever it is, and we don't do anything on the model side; we'll just do some fine-tuning on an open source model.

How do you articulate the advantage you get from building both the models and the applications versus these folks that are just building applications on top of the underlying models? I think we have far more levers to pull to deliver the experience that the customer needs. Because we're vertically integrated, the next version of command, our generative model, is going to be optimized. Actually, the latest version already is optimized for the use cases that we know our customers need in North.

So it knows how to use an ERP. It knows how to work with a CRM. And so that integration between the technology, the models, and then what the actual customer needs is something that I think is crucial to product quality. And it's part of the reason why

A lot of the existing applications are, in my opinion, missing or not fulfilling the product promise that they're making to their users. They're a consumer of the technology. They can grab some llama weights or something, but they're not able to change that technology for the needs of their customer. And so that presents a fundamental barrier to product quality.

I'm curious, one question I feel like gets debated a lot in the venture world is the right teams to build these AI applications and how much AI expertise you need. Obviously, you could imagine a spectrum of just smart Stanford grads that know basically how to fine tune these models but are not super deep on the research side, all the way up maybe to a Transformer paper co-author like yourself.

Do you think generally, as you think about the right teams to build these AI applications, across the board, do you think you need this kind of deep level of model knowledge? Or how do you kind of think about that? I mean, if you look at the ones that are building the best products, they do have that knowledge.

They do. There's just like a deep familiarity with how to build language models. And even if they're not training models by themselves, they're trying to find ways to like approximate that and have as much of an impact at the model layer as they can. So I definitely think that's a crucial piece to a company building on these models succeeding. We try to play that role for our partners. So we power...

a bunch of AI features within the Fusion application suite with Oracle and a bunch of other large enterprise SaaS companies. And so what we can do is because we can intervene at that level,

Whatever they need, we're going to make it happen and we're going to make it work for them. I'm curious what you see coming next. Obviously you mentioned the ones that are working today, but if you think about the next 12 months, both the current model capabilities and where it seems the direction is going, what do you think? If we were having this conversation then and I asked the same question about what has product-market fit, any that you'd say seem pretty close? I think for any of the deep research style use cases,

the tech is ready and the market is just being blown away by it. They can't believe it. It's surreal. The model comes back with a report that would have taken a month and a half from someone who costs a lot of money, and it comes back in 30 minutes, an hour. So I think that very much is ready for prime time. This is going to be integrated into every single enterprise on the planet. It's something that we're pushing super hard.

Beyond that, looking forward, I actually see more of the mundane back office stuff starting to come online. It gets easier for an enterprise to automate with this technology when the infrastructure for doing that automation is in place, like North: a system that has access to all the different tools and pieces of context it might need,

as well as a user experience to build those automations, to ask the model to accomplish some task for me. As that infrastructure gets installed and plugged in at these enterprises, I think we'll start to see a lot of the back office tasks get unlocked. So that's stuff in finance, in legal. Certainly sales is very much

ripe for this sort of thing. Yeah, like if you think about what a seller does before they go into a meeting with a customer,

It's research. You're going through what is this company, what's their strategic imperatives, who's leading the efforts that I care about selling into. And then you're also doing research internally at the company. What previous conversations have we had with them? How did that go? What were the sticking factors?

And you need an intelligence briefing on the person you're about to meet, everything they care about, everything they've heard about your company. And that makes you dramatically more effective at your job.

Yeah, so I think sales is going to be another big one. And I guess, you know, obviously you take support, sales, there's a whole host of companies where that's all they're focused on, right? Whether it's, you know, Sierra in support or, you know, Clay and folks on the sales side. I assume on the one hand enterprises would love to work with one vendor that can kind of help them across all of these. On the other hand, you know, if all a team's thinking about all day is one of these problem areas, maybe they build better workflows.

Do you think we kind of federate into, you know, hey, I'm an enterprise and I choose best in class solution for finance, for sales, for support? Or like, do I go to kind of one place that has pretty good general knowledge across all of those things? I think there will be like a scattershot phase that then consolidates. So I think initially it will be federated into all different teams are buying their own little applications. And then they're going to realize, wow, okay, like, you know,

sellers don't just exist within their CRM. So I have to integrate that agent into all the other pieces of information within the company. Same thing with the finance team. They don't just exist within their ERP. So I have to integrate. So now you have...

like this insane maintenance burden of all of these data source connections into all of these disparate apps. And so there's going to be a strong push towards consolidation: I want one platform, plugged into everything, on top of which I can accomplish all the different automation objectives that I have.

So the long game is building that platform, which is what we're doing with North. Yeah, that makes sense. I mean, obviously all these things need to speak to each other to really be able to ultimately answer the questions you want to answer on the research side or the enterprise insight side. I guess, reflecting back on the Cohere journey so far, what sticks out as some of the key decision points that you had to make that really tipped things, that maybe were, you know, 51-49 decisions along the way? It's been a lot of

consequential strategic decisions, everything from like starting up in Toronto, right? That's one that has really paid dividends. And it was not a given. We,

really did not know if it was the right idea early on. We had some arguments for why it was. You know, Geoff Hinton is in Toronto. The Canadian AI ecosystem, you know, Ilya's come out of it and so many of us have. Starting up there has given us access to really the world's best talent, and

it's made us a darling of the Canadian AI community or the Canadian technology community. I'm curious, you know, to hear you riff on this a bit. It feels like, you know, obviously there's this really interesting interplay between, you know, uh, you guys have obviously gotten a ton of support from, uh,

the global ecosystem in Canada. The government obviously seems super excited about what you're building. And I think there's this larger question about it does feel like a lot of countries want to have their own foundation model companies. I'm curious if you can just riff on that intersection of national politics and priorities and foundation models. And do we end up with a couple dozen because each country wants to have their homegrown player? We probably end up with a few. I think the

buy versus build question or partner versus build question is a hotly debated one. For Cohere, we're very international. My mom is British, my dad is Spanish. I have passports in both of those places, also in Canada. The leadership team is from all over.

So we try to be a partner to these countries in adopting the technology and making sure that it works for them. Our partnerships with companies like Fujitsu in Japan, we're deeply, deeply investing in Japanese in our models. And we will forever now. We have a deep commitment to Japan and making sure that our tech works as well in Japanese as it does in English. We recently announced a partnership with LG in Korea, and we're doing the same thing for Korea. So we're...

We're certainly in favor of supporting everyone to make sure that their economy can adopt this technology, especially in places where the majority aren't native English speakers or the data that you're dealing with in that jurisdiction isn't in English.

A lot of our work on the open source side with Cohere For AI, the Aya project: that was the largest data collection effort for any machine learning project. Thousands of people, native speakers of over 100 different languages, contributed data. And we open sourced it so that not just Cohere's model gets better in those languages, but so does every other language model. Yeah, we have a

belief that the technology will not be useful and huge swaths of the global population will miss out if it doesn't speak their language and it doesn't understand their culture. So we've invested very heavily and will continue to. It's super powerful. You kind of alluded to it, but I'm curious, the kind of state-of-the-art AGI labs, are

How many do you think we'll have in like five years and what actually ends up making one better than another? We're all starting to do different things. So there's this like differentiation or like the diversity of strategies emerging and

OpenAI is right now clearly pushing the consumer front. Gemini is coming for that crown. I know that Meta is planning the exact same thing. Anthropic is very good at code. That's clearly the thing that they are the best at. And we're focused on enterprise and business back office applications. So everyone's finding their lane and their niche.

I don't know. I think among foundation model companies, there will probably be a couple handfuls. And that will cover a lot. And then we'll see new generations of foundation models come out, right? Like foundation models for biology, for chemistry, for materials science. We'll start to see new generations of this stuff. Yeah, I sit on an advisory board for CAIA. It's a

cancer data and compute sharing alliance with like the Fred Hutch and a bunch of other large American cancer research and treatment centers. And that is data that is not out on the web. Right. And there's this like question of, okay, what if we had like a GPT moment where massive capital flooded in and

that was applied on top of that data or those modalities that count for treating cancer. And how do we build conviction about that? How do we de-risk that in the offset? If we do need to deploy 10 or 20 billion dollars of capital into this thing and we could solve cancer, let's say that that's the outcome that we're driving towards. What are the steps along that that we can show? So I can already see

those ideas emerging. And as LLMs and these sort of like general image, video, audio, language models emerge, there are other quite different domains like protein sequence modeling, material science, etc. that need a similar global effort.

um, to yield the same sort of just incredible advancements. Yeah. And it seems that the big thing there is, is ultimately like you might need data generation, right? Like the, you don't have, you don't have showing tokens on the internet. And so even, I mean, obviously a lot of problem with cancer data is that it's not particularly well structured or linked or tied to, to outcomes, but we've also, you know, for a lot of these bio foundation model companies, you know, they all spin up labs, right. To actually generate a bunch more data or, you know, the robotic side, it feels like there's a, there's a similar issue too. And so I think it'll be, you know, I think maybe that ends up needing to be the first area, uh,

of investment. And then, you know, also you obviously don't get the benefit of getting to eval it right away. Sometimes it takes five, 10 years to figure out if it actually worked. But seriously, a fascinating set of problems that I hope the best and brightest are going after. Yeah, yeah. No, me too. There is a lot of data out there in these domains.

I don't know if there's like a token scarcity. It's just that it's siloed and locked up in a hundred different places that refuse to share with each other or talk to each other.

The data absolutely exists in sufficient quantity. Interesting. So a human problem more so than a data-generation problem. Yeah, exactly. Sounds about right. I guess I'm curious also what you think of some of these newer foundation model efforts, like SSI and Thinking Machines. It felt like for a while we might be done with net-new LLM providers, and then obviously there have been a few in the last year. Do you think we'll see more of that? Any reaction there?

Yeah, I mean, I hope that tons of new companies come out with new takes on stuff and try their hand at building something valuable for the world. I think that's great.

On the technological front, there's a lot of debate about the architectures that could come next, and whether better architectures are out there. I feel like you'd have a good take, given you invented the current one. Listen, I think I've said this publicly before: I'm like the first person to ask, why the hell is the transformer still being used? What's going on? Yeah, I've been waiting for it. In our New York office, we even named a meeting room SSM, because I was like, this is it. The transformer's dead. Let's go SSM.

But then it turned out you can just steal the good bits of an SSM and put them into a transformer, and then the need to swap to an SSM goes away. Now there are these discrete diffusion models coming out, and they're a super cool UX. It's like normal diffusion, right? You start with a wall of noise, a wall of noisy text tokens, and then out of the ether emerges the response. And I mean, that's cool.
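The decoding loop being described, starting from all-masked noise and progressively committing tokens, can be sketched in toy form. This is a hypothetical illustration only; a real discrete diffusion LM would replace `predict` with a trained denoiser network:

```python
import random

# Toy sketch of discrete-diffusion decoding: begin with a "wall of noise"
# (every position masked) and let the text emerge over several denoising
# steps. `predict` is a hypothetical stand-in for a trained denoiser that
# proposes a token for every masked slot.

MASK = "<mask>"
TARGET = ["the", "transformer", "is", "still", "here"]  # stand-in output

def predict(tokens):
    """Propose a token for each masked position (fixed here for the demo)."""
    return [TARGET[i] if t == MASK else t for i, t in enumerate(tokens)]

def sample(length, steps, rng):
    tokens = [MASK] * length
    for step in range(steps):
        proposal = predict(tokens)
        # Commit a growing number of positions each step; whatever stays
        # masked is re-predicted on the next iteration.
        quota = max(1, round(length * (step + 1) / steps))
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        for i in rng.sample(masked, min(quota, len(masked))):
            tokens[i] = proposal[i]
    # A final denoising pass fills any positions still masked.
    return predict(tokens)

print(" ".join(sample(len(TARGET), steps=4, rng=random.Random(0))))
```

The key contrast with a transformer decoder is that tokens are produced in parallel and refined over steps, rather than strictly left to right.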

Is it actually a better language model than a transformer? I don't know. I don't see why that would be the case. So it's tough, man. How likely do you think it is that we do have a new architecture in the next five, ten years that becomes predominant? Oh, God, please. Please. If you had asked me in 2018, a year after the transformer paper got published...

what was the likelihood that in seven years we'd still be using transformers, I'd put it at pretty close to zero. So I'm not going to make an estimate. But yeah, the longevity of this thing has really surprised me. You've been doing, obviously, cutting-edge AI research for a long time now. I'm curious, what's one thing you've changed your mind on in the last year? Maybe the scaling hypothesis, right?

I feel like I was pretty loyal to that the past few years. Good reason to be loyal to it. Maybe that's something I changed my mind on. I mean, in the back of my head, I've always thought: really? Are all the capabilities we need going to fall out of scaling? Really? It seems unlikely. Yeah.

But the evidence just kept showing up. It was just: yeah, we made it bigger, and now it can do math, all that sort of stuff. So I was like, yeah, I guess this is it. But now we've just been shown, really beaten over the head with it, that scaling alone is not going to get us there.

Yeah. Though some would say all the capabilities will fall out from scaling test-time compute, right? That it's just another scaling vector. Do you subscribe to that, or is there still some algorithmic breakthrough needed on the horizon? I mean, then you're really muddying the definition of scaling, because a lot of what scaling is now is actually data: getting diversity in the data, finding demonstrations for the model of how it can problem-solve in specific domains.
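For reference, the pre-training scaling law being contrasted here is typically summarized, in the Chinchilla-style form (the symbols below are the generic ones from that literature, not anything Cohere-specific), as a power law in model size and data:

```latex
L(N, D) \approx E \;+\; \frac{A}{N^{\alpha}} \;+\; \frac{B}{D^{\beta}}
```

where $N$ is parameter count, $D$ is training tokens, $E$ is the irreducible loss, and $A$, $B$, $\alpha$, $\beta$ are fitted constants. "Build a 2x larger computer and press run" pushes $N$ and $D$ up and the loss predictably down, which is exactly the trivial lever the conversation says no longer suffices on its own.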

I don't know if it's as trivial as the previous scaling law, which was literally like, "I'm going to build a 2x larger computer and press run on the same thing, and the loss will go down." Does anything change on the hardware side now, as you think about needs for hardware going forward, given that we're maybe out of the pre-training era and into more test-time compute and other approaches? Yeah. Test-time compute still requires tons of compute.

It makes inference, whatever it is, three to ten times as expensive. And for training, it still requires just tons and tons of compute. But I think compute is going to get cheaper and more abundant per flop. There are multiple options now for training compute, which there weren't before.

And so that's a very positive thing for the world and for the industry. We can wire together more than just one type of chip and get a very effective supercomputer for building models with.

So all the trends on the compute layer of the stack are very good for those of us who have to consume a lot of it. What are the future model milestones that are meaningful to you? When the latest state-of-the-art model comes out, is there something you try? I thought Claude playing Pokemon was a fun eval, not that any of your enterprise customers particularly care about that. But is there anything you try whenever one of these new models comes out, to gauge whether there's been a meaningful capability breakthrough?

For my own models, like for Command, the first thing I do is try to use it as part of my day-to-day work. I try to get it to automate stuff that I have to do. What's the latest thing you've automated? I do a lot of prep for meetings using the model. Basically, we have an internal deep-research capability in North, but it's plugged into not just the web but everything. It can see every single call transcript, everything that I've said, every conversation that I've had in DMs all over the place. It just sees as much of my life as I can possibly make it see.

And it's able to give extremely compelling responses. So yeah, that's most of my day, right? As a CEO, I'm meeting customers, investors, talent, and I need to be ready for those meetings. So that's where I plug it in. I can't do that with other models.

So usually when I'm playing with a new third-party model, I do riddles and stuff. The reasoning models were very good at those. Do you have a go-to riddle? I do. It's one of those where you have to come up with a matrix and figure out who is what based on all this information, who checks what boxes. And it's about spelling words, so the ability to break a word down into its characters is involved, similar to the strawberry thing, which we know models suck at. I usually go to that one.

And then math stuff: can it solve math problems? But it'll shift, right? As models become able to do all of that, like they are now, I've stopped using the math ones, because they solve all of them. It's not interesting; there's nothing to ask there. Yeah, most of it is about trying to break it, right? You're trying to find the boundary of the application of this model.

Well, shifting gears: you were part of the original transformer paper crew and have been part of a ton of really interesting AI research work. People like to write these thought pieces about the culture that enables these things, and I'm sure you think about this a lot in the context of building the Cohere research team. What have you learned, or what lessons do you take away from the most successful AI research groups you've been a part of?

With Google Brain, what was so incredible was that it was a home for really brilliant people to do their best work. You got pretty much complete research freedom to do whatever you wanted, tons of compute, tons of software infrastructure to use that compute, and really great people to do it with. And I think that setup led to some really interesting things. But it's not the setup that delivers an incredible product. And so for Cohere, the needs are different. We have a very targeted focus. There is something to accomplish. We have an objective, which is driving this automation, making this technology good at using all of this different software.

And so the problem set is quite focused. You can't work on whatever you want; there isn't time for that. We have to be much more focused and targeted in terms of where we invest. But what we preserve are incredible people, tons of compute, and the same sort of world-changing ambition that I felt back at Google. And also a warm culture. Maybe things have changed in different labs, but when I was there, it was a deeply warm, meaningful, welcoming place to be that cared about something. And that's something I think we preserve quite well. Yeah, no, that's awesome. Is there an impact of continued model improvement that you think we're under-thinking right now as a society? Learning from experience.

Once models can learn from interacting with the user, then just from a what-you-can-build perspective, that capability unlocks so much. Right now, with a model that doesn't do that, I give feedback, but when I start a new chat, it has forgotten all that feedback. It's just wasted time. If it can't do something, I'm just annoyed, and then I close the window and go do it myself.
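One speculative way to build the kind of persistent feedback loop described here: store corrections in a queryable database and pull the relevant ones into context at the start of each new session. This is an illustrative sketch, not Cohere's or North's implementation, and `call_model` is a hypothetical stand-in for any chat completion API:

```python
import sqlite3

# Speculative sketch of retrieval-based "learning from experience": user
# corrections are written to a queryable store and prepended to the prompt
# of every later session, so no chat starts from scratch.

def call_model(prompt):
    return f"(model response to: {prompt!r})"  # placeholder for a real API

class FeedbackMemory:
    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS feedback (topic TEXT, note TEXT)"
        )

    def remember(self, topic, note):
        # Save a correction the user gave during a session.
        self.db.execute("INSERT INTO feedback VALUES (?, ?)", (topic, note))

    def recall(self, request):
        # Fetch prior corrections whose topic appears in the new request.
        # A real system would use embedding similarity; substring matching
        # keeps the sketch self-contained.
        rows = self.db.execute(
            "SELECT note FROM feedback WHERE ? LIKE '%' || topic || '%'",
            (request,),
        )
        return [note for (note,) in rows]

    def ask(self, request):
        # Prepend everything previously learned before calling the model.
        notes = self.recall(request)
        preamble = "".join(f"Previously learned: {n}\n" for n in notes)
        return call_model(preamble + request)

memory = FeedbackMemory()
memory.remember("meeting prep", "Always lead with open action items.")
# A later session on the same topic automatically sees the correction.
print(memory.ask("meeting prep for the Fujitsu call"))
```

The design choice is that "learning" lives in the retrieval layer rather than in the weights, which matches the database-lookup option discussed just below.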

But if a model is kind of like an intern, where, hey, it's the first time it's doing this thing, it's going to mess up a little bit, I'll teach it, I'll guide it, I'll give it some feedback, and then, hey, look, you did it, and it never makes those mistakes again.

How does that happen? Do you have your own personalized model, or is it just stuffing that into the context window, or putting it in some database that you tie in via RAG or something? How does that actually get actuated? Well, it doesn't exist yet, so I think people are experimenting with all of the above. I think it probably looks like the last one you said: putting it in a database that can be queried and looked up, so there's always the previous history of interactions available to the model when it's generating. Probably looks something like that.

But think about what that unlocks. I'm so much more invested in interacting with this model because it's learning from me. It's growing with me. It starts to know me, know my preferences, know how to do the things I need it to do, because I've spent so much time teaching this thing, and I've turned it from my little intern into me 2.0. And so that connection with the system is something I'm very excited for. Right now, you're right, it feels like every time you're working with a model, it's a new batch of interns that just came in for their first week. And then you do it for a week, and then it's, again, someone else's first week. It's like Groundhog Day to some extent. Obviously, in the course of your work with models and AI research generally, you've probably come to believe that advancements in AI are going to happen sooner than you previously expected. Has that changed at all the way you live your life or think about the future?

Not remotely. Not at all. I'm super excited. My dad is a cancer survivor, and I really hope he lives for a lot longer and that treatments continue to get dramatically better. I hope that we're able to drive costs down and that supply increases in the world. Because I think we are supply constrained; we're not demand constrained, which is why I don't think there's going to be any sort of mass unemployment. I really don't believe in that. So long as we're able to move people into the spaces where we need them and we can retrain folks, humanity has infinite demand.

And so we are fully supply-side constrained. This technology isn't going to displace people; it's going to augment them, let them do more, and deliver more of what the world is wanting and needing. So I'm really excited about the future. But I don't think it's a utopia. I think it's a much better world, but it's not a utopia. It's not like you should be liquidating all your assets and just going on vacation permanently or something like that. It's just going to be a way better world.

Yeah. To what extent do you worry about some of the AI risks that folks talk about, both in the short term and the existential ones? I worry about them. I worry about what sorts of capabilities bad actors will be able to access, especially at the state level. I want to make sure that liberal democracies are the first ones to get it and that they establish an advantage. I worry about whether we have the infrastructure in place to facilitate moving people to new careers if particular jobs are impacted. I want to make sure that we have the capacity to move them, retrain them, and put people onto new, more fulfilling work.

But I'm not afraid of the x-risk stuff, like the Terminator breaking out of the box and getting all of our nukes, or nefariously manipulating people to do what it wants. I really think there's so much to worry about with this technology that the doomsday scenarios aren't where we should have the public and policymakers focused right now. There's more than enough we've got to get right in the near and mid term.

It's been a fascinating conversation, and I'm sure there's a ton of threads folks will want to pull on. I'd love to just leave the last word to you: where can folks go to learn more about you and the exciting work you're doing at Cohere? Anything you'd like to leave our listeners with, the mic is yours.

Yeah, I'm not that interesting. I think Cohere is the interesting one. And so you can go to Cohere.com. You can follow me on Twitter and you'll get all the alpha on AI stuff. Love it. Thanks for having me, Jacob.