AI agents are systems that act as intermediaries between users and their computers, enabling users to program their computers intuitively without thinking of it as coding. Imbue focuses on agents because they aim to empower individuals to create and own their own software, rather than relying on centralized, rented software.
The delegation model is tricky because it places the burden on the user to define the problem and scope clearly. Often, the agent's output may not align with the user's expectations, leading to negotiation and additional effort to refine the result.
Imbue redefines agents as collaborative systems that work alongside users, empowering them to create and shape their digital environments. Instead of being autonomous personal assistants, agents act as a layer of abstraction over programming, allowing users to interact with their computers more intuitively.
Imbue envisions a future where most software is agent-based, allowing individuals to create bespoke software tailored to their specific needs. This democratizes software creation, enabling users to build and own their digital environments rather than relying on centralized, corporate-controlled software.
Imbue faces challenges in ensuring reliability and robustness, especially in a collaborative model where users may need to check and refine the agent's output. Additionally, creating intuitive user experiences that align with human mental models is crucial for adoption.
Imbue focuses on verification and robustness, ensuring that agents can verify their own outputs. This involves improving the models' ability to self-check and correct mistakes, which is a key research direction for the company.
Reasoning is integral to Imbue's agents, particularly in the context of verification. While models can perform reasoning, they often lack the ability to verify their own outputs accurately. Imbue's research aims to enhance this capability, making agents more reliable and user-friendly.
Imbue's vision is to democratize agent creation, enabling everyone to build their own agents. This would shift the focus from automation to empowerment, allowing individuals to automate parts of their own jobs and create bespoke software tailored to their needs.
Imbue envisions a future where individuals can create their own software to defend against intrusive or harmful automated systems. This includes building bespoke software to disrupt spam, scams, and other unwanted automated behaviors, empowering users to take control of their digital environments.
Imbue sees scalability in terms of a future where software is less hard-coded and more interactive, allowing for a vast ecosystem of bespoke software. This would enable individuals to create their own automated systems, leading to a more personalized and less centralized digital environment.
Hello, and welcome to the NVIDIA AI podcast. I'm your host, Noah Kravitz. One of the big transformations being enabled by AI is the way we create software. From coding co-pilots to in-development systems built to translate plain language requests into fully functional applications, generative AI is fueling a new wave of tools to help us create software faster.
Our guest today is Kanjun Qiu. Kanjun is co-founder and CEO of Imbue, a three, three and a half year old company, somewhere in there. They were founded in 2021, and they're building AI agents that work with us to translate our ideas into code and bring them to life. There's a lot more to it than that, but why hear it from me when you can hear it directly from Kanjun. Kanjun Qiu, welcome. Thank you so much for taking the time to join the NVIDIA AI podcast.
Thank you, Noah. It's great to be here. So let's talk about software development and AI and Imbue and all of that good stuff. But maybe let's start with agents. Agents are kind of a, I mean, I don't want to use the word hot because I don't want it to sound, you know, fluffy, right? But agents are a thing right now.
We've had some guests on recently talking about agents in different contexts. And Imbue's approach to agents is something worth delving into. So maybe we can start there. What are AI agents? Why do we need them? Why is Imbue working on them? Yeah, agents are really popular right now. So Imbue was founded in 2021, early 2021. And
At that time, our goal was to figure out how to make and work on general AI agents. At that time, people thought we were totally crazy. Like, what are agents? Everyone's working on AGI. You know, AGI is going to rule everything. But what we were really interested in is how can we have systems that we as people can kind of mold and shape to what we're wanting, as opposed to, oh, this external intelligence that knows all the answers. And so we started as a research lab at that time because the technology was certainly not good enough to build general agents that could reliably do anything that we wanted them to do.
And, you know, in the very beginning, we thought of agents in a similar way to how a lot of people think about agents today. Think about what an agent is: often people think about kind of an autonomous personal assistant type thing. You ask it to do something. It does it for you. It comes back to you. Now everyone has their own personal assistant.
And actually, a lot of our learning has been that that's a really tricky user experience. And we experience it ourselves with human agents, which is that often when I delegate something, it comes back, it's not quite what I wanted. And now I have to negotiate how to get what I wanted.
I was listening. I told you before we recorded, I was listening to a little bit of your fireside chat with Bryan Catanzaro from GTC this year, which, listeners, go check that out. It's a great listen. And you were talking about the difficulty. I can relate to this so much. The difficulty inherent in delegating work to someone else, right? And to your point, thinking of it as humans, you have to break the problem down. You have to sort of figure out, well, how much do I tell them exactly what to do? Yeah. All that. Yeah. What context does it need ahead of time? What instruction should I give? Delegation is actually a really tricky paradigm because it actually puts all the onus on the person who's delegating to define the problem, define the scope. Of course, the person who's being delegated to, the agent, might come back with some questions
and stuff like that. But it's a very tricky thing to trust. So what we've learned over the years working on general agents is we've actually started to think about agents in a very different way. And this is both from a kind of pragmatic business and user perspective, and also from a mission perspective. The way that we think about agents is if you think about...
what an agent is, what this personal assistant is doing. What it is, is it's kind of this intermediary layer between you and your computer, and you're kind of telling it stuff or interfacing with it in some way, whether it's a UI or natural language, and it's interfacing with your computer. And the most effective way to interface with your computer is by writing code. That's what your computer is made of. And there are really kind of two types of code. There's everything hard-coded, which is the default for software today. So now your agent can only do a subset of things.
Or now with language models, you can generate code. You can write code that does new stuff that's not hard-coded. And now you have an agent that's much more general. It's able to do sets of tasks that I didn't program into it ahead of time, but now it's able to do. And so the way that we think about what agents are is actually as this intermediary layer between you and your computer. It's essentially a layer of abstraction on top of programming that allows us as regular people to be able to program our computers without even thinking about what we're doing as programming. And so we think of our mission as essentially trying to reinvent the personal computer and really deeply empower everyone to be able to create in this computing medium of the future, because this digital medium is becoming more and more important in our lives.
And what I want, at least personally, is actually not a super centralized assistant that someone else has decided is able to do this, integrate with that. What I actually want is something that I can make and own, that is mine, and that does what I want it to do, kind of serves me. And today we're kind of in a world where all of our software is rented. It serves other people.
So that's kind of what we think of what agents are: this layer of abstraction on top of programming that makes it so that it's very intuitive for everyone to program. And that actually requires quite a bit of invention. So we can get into that, and the historical nuance as well. Well, the way you were describing it, I noticed you weren't describing the sort of AI, the assistant, the agent layer.
All the A words: AI, assistant, agent. You weren't describing the agent layer as a replacement for a user interface. You mentioned, you know, UI, but that's kind of what I was thinking, that abstraction layer that's kind of in between. So how is Imbue approaching it, sort of on the ground? Are you working primarily with enterprise clients? No, primarily with, I would say, prosumers. So the way to think about what we're doing is, instead of thinking of agents as automation systems, right now we're kind of in the agent-as-automation paradigm, we think of agents as collaborative systems. So how do we enable a system that empowers me and helps me do more of what I want to do, and that I can work with?
And so in the beginning, we're actually, you know, enabling you to write code. Right now, these models are not that good at writing code. As they write code, you actually have to go check it. So you have to understand how to write code to go see, oh, did the agent do a good job? But over time, as these models get better at writing code, you don't have to check it anymore. And so in the beginning, we start with software engineers as a primary user, or we call them software builders. So you don't have to be an engineer anymore.
I'm a software builder. I'm not a software engineer. We start with software builders who can... I'm a prototype builder, then. I wouldn't go as far as software. Okay. So you could probably be a soon-to-be user, once we get to a place where you don't have to read and touch the code so much.
Right now, we're targeting software builders who can read and touch the code and be like, okay, well, it's not quite right. We want to adjust it in this way. And over time, as the models get better and you don't have to be so low level in the code, now more and more types of creatives, types of builders can use these systems to really mold and shape their computer to what they want it to be.
And what level of complexity, what level of depth, is the actual software that users are building with Imbue? There's the issue of accuracy, obviously, as you were saying. None of the models are creating 100% perfect code yet. But I also wonder, how complex can these things get? And that kind of brings us to talking about reasoning. We don't have to get there yet, but kind of edging towards that conversation. Yeah, I think one of our biggest learnings actually has been that
If as a user, I have a code base that is fairly modular and I've kind of broken things down, then the model is actually pretty good at dealing with a very complex system because it doesn't have to load that much stuff into its head and it doesn't have to like cross-check all these dependencies. So just like with humans building software where you don't want a ton of dependencies, also, you know, if you have a slightly more isolated system, it'll do a better job.
Similarly, there's a lot of kind of user expertise in using this kind of product. So our product, it feels very collaborative. It's almost like a document editor and kind of like interfacing and interacting with an agent in that way. And so as a user, you can basically, we learn to give it tasks it's more likely to succeed at. And we learn to structure our work so that we can delegate it to agents. And we've seen this with other AI tools as well, like Copilot.
our team definitely writes code in a slightly different way so that Copilot can work well for them. Right. To your question of complexity, it really depends on the user. Some of us can make it work with really complex things. Yeah. Where are you seeing agents being used the most or perhaps it's more that they're having a dramatic impact where they are being used? How does that translate into businesses remaining competitive, having a competitive edge?
Folks that I've talked to, and I keep talking about, you know, 2022, last year, was kind of the year of the models coming out and the mainstream taking notice of Gen AI. And perhaps this year has been the year where people are trying to figure out, what apps do I build to leverage these things, you know, as an app, as a software building business, or as a business that does other things that wants to leverage this. So where's Imbue seeing that
impact being made or even, you know, looking to areas in the near future? You know, the interesting thing about agents is it's such an ill-defined term right now. And
People are calling all sorts of very trivial things agents, and that's fine. But I think there's a spectrum of kind of agent usefulness, effectiveness. There's like a system that scrapes the web and then aggregates the data and pulls the data out in some way. This is kind of like basically just a Gen AI model. Like, you know, it's kind of very similar to ChatGPT, but you like put something on top of it to like dump the output into a different system. You could call that an agent. So some people call that an agent. We see that kind of thing being implemented in all sorts of places. Right.
But I think the much more exciting thing when it comes to agents is these more general agents that enable people to kind of start doing things that they previously didn't even imagine that they could do. You know, I think like some of the really simple examples right now are for us, like...
Some researcher or scientist, a biologist has a ton of data that they need to process and they're not a software engineer, but they're technical enough that they can kind of like pull in the data and then get something out of it, get some analysis out of it. If they're able to use something like this, that
lets them work at this slightly higher level. Or, you know, kind of over time, a very exciting thing is as we start to build the tools that we need, like an example is my grandmother gets a bunch of scam calls in Chinese. But all of her calls are in Chinese. And if I want to build a piece of software that filters out her scam calls from her other calls, like this is
very hard right now, even for me as someone who knows how to build software. And it's such a niche market. No one else is going to build that software for her. We've tried to find software like that in the U.S.; it doesn't really exist. And so, exactly. So right now we're in this world where software is built by... Not to interrupt you, but if it exists in the U.S. for English language spam, it doesn't work that well either, for my... Exactly, exactly.
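As a sketch of the kind of bespoke tool being described here, a hypothetical call filter might look something like this. Everything in it is an assumption for illustration: the contact allowlist, the scam-phrase list, and the idea that a transcript of the call is available from some speech-to-text step. It is not anything Imbue has built:

```python
# Hypothetical sketch of a bespoke scam-call filter, assuming an
# incoming call's transcript is available (e.g. from speech-to-text)
# and the user maintains a personal allowlist of known numbers.

SCAM_PHRASES = [  # illustrative markers, not a real dataset
    "wire transfer", "gift card", "urgent payment", "tax authority",
]

KNOWN_CONTACTS = {"+15551234567", "+15559876543"}  # family and friends

def classify_call(caller_id: str, transcript: str) -> str:
    """Return 'allow', 'block', or 'review' for an incoming call."""
    if caller_id in KNOWN_CONTACTS:
        return "allow"  # known contacts always ring through
    text = transcript.lower()
    hits = sum(phrase in text for phrase in SCAM_PHRASES)
    if hits >= 2:
        return "block"  # multiple scam markers: drop the call
    return "review"     # unknown caller, let a human decide
```

The point of the sketch is how personal it is: the allowlist, the phrases, even the language of the transcript are specific to one user, which is exactly why no one ships this as a product.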
So, you know, right now we're in this world where other people build software for us. We have to rely on other people to build software for us. And it's actually really strange. Like, we don't really own our digital environments in that way. Like, everything is kind of built by someone else because it's too hard for us to, like, build our own things. And I think there is a future in which, like, I could actually pretty easily build something for my grandmother. Yeah.
Or for my community or for my group of friends or for my church to manage registrations or whatever it is. Right. And that can be really tailored to my particular use case and to me, my community, my friends. And so I think the really exciting thing about this future is like...
all of this like bespoke software as opposed to today where we have this kind of centralized software. It's almost like people don't often think of their digital environment in this way, but the digital environment is like the physical environment. And today it's as if we all live in corporate housing. Mm-hmm.
I used to be so excited that I could listen to any music I wanted for, you know, 10 bucks a month. And now I'm thinking like, but I don't, I don't own any of it. They could take it away from me in a second. Yeah. And honestly, I think a lot of the kind of frustration people have about big tech, about technology is that we don't feel like, and I don't feel like I have control over these things that are a huge part of my life. And so that's what,
At Imbue, what we want to do is give that control and power back to the people. And we do that by creating these tools and systems that collaborate with you to help you be able to create stuff for yourself.
So how hard is it to build these things for folks to use? As you mentioned, there are many different, you know, voices, individuals, companies talking about agents, agentic AI, and a lot of them are defining it, talking about it, at least slightly differently. I'm sure taking, you know, different approaches kind of under the hood.
What are the challenges? What are the things, and we can get a little bit technical here as you like, some of the problems that, you know, you and your teams are solving to make it easier for the rest of us to translate our ideas into software? Yeah. So some problems I would say are kind of universal across all of these different types of people who are building agents. And some problems are unique to us and what we're trying to do. So I would say, you know, most people are building agents in this kind of workflow automation paradigm I mentioned earlier. And so for that paradigm, robustness, reliability is really important. Like, okay,
you know, I built a thing that responds to customer service tickets. But if, 3% of the time, it says something really terrible to the user, this is not a usable agent. Yeah. For us, reliability and robustness is important, but it's actually a little bit less important. As it gets better, the user experience just gets better. As a user, I don't have to check stuff as much. Right. But even if it's not the best, it's still okay. I can still use it as the user, and I'll fix the bug that, you know, the model will produce, and that's okay.
So a lot of what we think of is kind of like, how do we get agents to meet both the model capabilities and also users where they are today? So that expectation is built in that we're not at the stage yet where it's error-free. And as a user, you need to know that. And it's not just like,
okay, you have to accept that, but it's actually like your experience is going to wind up being better, right? Because you know your part in it. And it's, again, as you said, it's not sending it off to do something and, you know, giving us a final result we had no part in. Yeah, exactly. I think, you know, people often think of agents as a research problem, but we think of it as both a research problem and a user experience problem. And this user experience part is really about setting the right expectations with the experience, so that it's like,
I'm not expecting it to go off on its own for 10 minutes or 10 hours or 10 days and come back with something magically correct. Instead, I'm kind of iteratively working with it and seeing, oh, it's kind of like clay. You know, I'm molding it, shaping it, shaping the output.
I think for the workflow automation agents, the bar is higher for how accurate they have to be, because what we have found is that as a user, the longer it takes to come back with an answer, the more I expect the answer to be correct, and I'll get frustrated if it takes a really long time. So we're very much on the highly interactive side: don't take a super long time to come back with something, be an agent that really works with the user. Thinking about the idea of moving from what
I'm expecting from a computer, right? The utmost in accuracy. My calculator always says two plus two is four. Kind of moving from that to just a different frame of mind as an end user, saying, okay, we're prioritizing, you know, I don't want to put words in your mouth, but kind of the speed and experience, and you know going in it's not going to get them
all right. Is that something that, you know, because this is the nature of AI and gen AI and people are kind of used to that by now, people are accepting of? Or is there still kind of a, I don't know, maybe I'm just old. Is there still kind of a mental hurdle to getting past that expectation? Yeah, so
One of our core philosophies is that we need to meet people where they are, in terms of what mental models we have in our heads as people. And so actually a good historical analogy is back before the personal computer, people were really excited about the supercomputer. When the first personal computer came out, everyone made fun of it.
They were like, this is a hobbyist toy. And the supercomputer, you know, you accessed it with a terminal. You were time-sharing on these supercomputers. It was not especially usable. And so a very small set of people were able to use it. But as time went on, a small group of people at Xerox PARC actually invented a lot of these primitives that led the personal computer to be usable by people. They invented the desktop, files, folders. These are concepts that we understood as humans at that time.
And so for us, you know, part of actually building a good user experience around agents requires invention. It requires inventing concepts that are able to map this technology to what we currently understand as humans. So earlier I was saying, it's kind of like a document editor right now, our current product experience. And it may not...
ultimately be that way. But a document is something that I, as a person today, understand how to edit and work with. And it's almost like an interactive editor that helps me accomplish my tasks. And to your question of how users receive it, one thing that's really interesting that we observe with software builders is that the feedback so far has been, wow, this is really interesting. It lets me work at this higher level. I don't have to dig so far into the code all the time. I can kind of stay thinking at this higher level. And it's actually able to go down into the code and surface things to me, do the task.
And then I check it. And that's pretty cool. It lets me move a lot faster. And that's really interesting. You know, for us, that's kind of the primary thing. I want people to be able to work at our human, problem level of abstraction with software, instead of having to get so deep
into the weeds. Yeah, yeah. No, I can relate from the standpoint of learning how to use these tools when I'm writing, when I'm not on the podcast asking people like you questions. You know, a lot of my work is writing. And if I'm working from a document, a transcript, source materials, it's that same thing when I can
use the tool to just kind of surface things to me. You know, did anywhere in the document or in the transcript, did Kanjun talk about pineapple on pizza? And just it being able to surface that back, right, it saves all the time of going through the document. I don't need the exact words. I don't need it a hundred percent, we'll get to that later, but I can just kind of go back and check, like, oh, right, she said she isn't a fan of pepperoni. You know? Yeah.
It's incredibly helpful. You know, it's not just about saving time, but I think you said it really well. It allows you to stay at that level of thinking. Exactly. Yeah. I think our core, the thing I really care about, is helping people be able to be creative in this way. And often there's such a barrier between our ideas and the execution of those ideas that we never get to do a lot of things that we really want to do with our lives.
And so I think the true power of computing as a medium hasn't really been unleashed yet, and we want to unleash it. And what that looks like is that people are able to kind of take their ideas. You can take your ideas about writing this piece or aggregating all of the pieces you've written and being able to draw insights out of them in order to create your book. Take these ideas and aggregate them.
actually be able to work with them at this higher level so that you're not always bogged down in the weeds. And I think the true power of AI, the true power of computers, that's what it enables. And we're not there yet, but we can get there. And it's not just about automation, or business automation, business workflow automation. Right, right. Now, in your conversation with Bryan from GTC, what was it that he said? I've got all these ideas, and I sit down to code, and I'm like,
import what I want to import, right? And that is great, because you're derailed immediately, and I can relate in the work that I do, you know. Yeah, 100%. Yeah, one of our users said, wow, I never realized how much context switching I do when I'm writing code, from high to low level. Same with when you're writing normal stuff. I'm speaking with Kanjun Qiu. Kanjun is co-founder and CEO of Imbue. And we have been talking about Imbue's recent work
on AI agents that help people code and really fascinating approach that, you know, as we've been talking about, I think goes beyond just expressing in code, but code being the way that we interface with computers and get them to do the things we want them to do. Well, I want to ask you about AI models and reasoning. And then I also want to ask you sort of about scale and what goes into, you know, building an agent for,
yourself and then what goes into building agents and multiple agents and agents that collaborate for users at scale. Is there an order we should go in? Should we talk about reasoning first? Is there a relation? That's interesting. Let's talk about scale first. Okay, cool. Yeah. So one way that people think about scale with agents is a lot of agents interacting with each other and kind of what that looks like. And some people do that by giving different agents different prompts. So they have different personalities and things like that. And
Honestly, I think that's interesting, but a little bit limited, because we already have agents today. All software is agentic. The whole point of what an agent is, is that it's something that takes action. And it uses your computer to kind of execute something. And so almost all software is executing something. It's kind of changing the state of your computer, website, data, etc. Now, the difference between most software and AI agents, what we call AI agents today, is that AI agents can process stuff in a way that's not fully deterministic. But even so, we still had AI agents in the Facebook newsfeed, for example. The recommendation engine is an agent that non-deterministically decides what to show you. So we've had agents since forever, since we had software. And so, you know, kind of the way I think about scale and agents is actually the same as the way I think about scale of software. So in the future, the next 10 years, I think there's going to be this explosion of software, and the software is going to be slightly less hard-coded than before. It's going to be able to work with
more ambiguous input. It might be more interactive. Hopefully, a lot of people are going to be able to create it, if we succeed. And so now we end up with this giant, ecosystem-like world of software that's far beyond what we have today. And in that world, what happens is now you have a lot of different automated systems interacting with each other. And that actually could be super exciting. Every person could have their own automated systems that do things for themselves, for their lives. You know, I'm supported by the software that surrounds me, as opposed to today, maybe being bothered by it. Right.
I was listening to you and I was thinking of how to phrase, how to try to phrase the question, getting back to your point about, you know, sort of, I was thinking of it as sort of one size fits all software, you know, that's deterministic and it does what it does versus, you know, me being able to create and reshape things as we go. And you answered it for me. So that's
great. I love this. I love this one size fits all software today versus future bespoke software. That's a great term. Are you worried at all about, I think, I don't know if the term AI slop applies to code, but this idea of, you know, AI models creating text that's kind of meaningless or valueless, but it's being automatically put out onto the web and stuff.
From a very sort of, you know, relatively ignorant point of view, the notion of model-generated software that can run itself being deployed out on the public web, you know, is a little scary to me, but also I'm sure there's stuff out there. But how do you think about that? Yeah, the way I think about kind of AI slop for software is automated systems that infringe on us. So scam calling or spam calling is a good example of an automated system that infringes on us. Or apps that have lots of notifications, or, you know, games that are meant to be extractive. Those are systems that infringe on us. And, you know, I actually think the default path of where things are going with centralized AI systems, and the kind of returns to scale of improvements of the underlying models, is that we will kind of, as humans, be a little bit disempowered and beholden to whoever controls the automated systems. Right. It's not necessary. You know, I've kind of told you about this beautiful vision of a future where everyone can create software and everyone creates bespoke software. And
Like, that's the future we want to build. But it's not necessarily the future that is the default path. The default path could be that there's a lot of, like, even more software out there than today that's trying to get our attention and trying to extract from us. And what we need, I think, what I want, is for people to create defenses and to kind of get rid of that stuff and disrupt it. You know, hopefully when I can build my own software,
I can actually disrupt a lot of the stuff that's being built today so that I have software that's serving my own interests. I have agents that are serving my own interests and helping me do what I want to do.
So to your question of AI slop, I think there are definitely going to be people making more automated systems that are bots and that bother people. And just like in cybersecurity, I think there's an attack-defense dynamic. And what we want is to enable people to create opposing systems for themselves that help defend their environment and themselves, and help kind of protect us so we can do what we want and live the lives we want.
And hopefully there's also kind of some, you know, there's already some regulatory aspect of this that exists and, you know, there hopefully will be more in response to what we see. So that effect is real. Right. All right.
There's been a lot in the generative AI news cycles, it's a thing, that's a thing, this year in particular, about models and reasoning, and future models being trained that have reasoning capacity, that kind of thing. Recently, OpenAI launched a new iteration of a model talking about reasoning capabilities.
Is reasoning something that does or can happen in an LLM? Is it something that the agents bring to the table, so to speak? How do you think about reasoning? How does Imbue approach reasoning?
building reasoning capabilities into your products? Yeah, that's a great question. So reasoning is a buzzword right now, and models definitely do reasoning. The underlying LLMs do something like the reasoning humans do. It's not exactly how we do reasoning, maybe, it's not always perfectly correct, and it often keeps justifying its own reasoning, although humans do that too. I was going to say, that sounds familiar. It's unclear how similar or different it is to human reasoning, but the underlying LLM definitely does some reasoning. One key difference we observe right now is that the underlying LLM isn't
necessarily as good at verifying whether its own answer is correct. As a person, when I'm doing a task, we don't notice this, but we're constantly checking: is this right? Did I do that right? Is this what I expected? The model doesn't have that loop, so that loop is added by the agentic piece. Okay. Yeah. We actually think a lot about our research direction as being around this kind of verification. Is it correct? Did I do it right? If I have an agent that's writing code for me, I do want it to check itself: hey, did I do that right? Oh, I didn't. Let me fix this mistake and then come back to you. And so the better it is at verifying its own answers, the better the user experience is. So when we talk about reasoning, we mostly talk about this kind of verification and robustness. Is it able to verify what it's doing? We've actually learned some pretty interesting things when working on verification.
It turns out that in software development, when you write a function, you also often write software tests. You're checking: did the software have the behavior I expected? And given really good tests, the underlying models are actually pretty good at creating the function, at creating the piece of software. Right.
But given the piece of software, the underlying models are not very good at creating good tests, which is kind of interesting. Yeah. Any idea why? Yeah. One, it's partly because the models are probably not trained that much on this particular task of creating tests. Two, though, it's possible, we don't know, we're not 100% sure, but it's possible that verifying whether something is correct is actually a harder reasoning problem than creating the thing in the first place. Yeah. It requires this kind of analysis and judgment. And so our research direction is primarily focused on verification: how do we get to models that are actually able to properly verify that the output is correct and is what the user wanted? We think of that as the hard problem
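The asymmetry described above, tests as the spec, can be made concrete. In this hypothetical example, the test suite pins down edge cases that a prose description might omit; the observation in the conversation is that, given tests like these, current models are usually good at producing a passing implementation, but given only the implementation, they are worse at writing tests this sharp. The `slugify` function and its behavior are illustrative assumptions, not anything from Imbue.

```python
import re

def test_slugify():
    # The tests act as the specification, including edge cases.
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  spaced  out  ") == "spaced-out"
    assert slugify("") == ""

# An implementation a model might produce to satisfy the tests above.
def slugify(text: str) -> str:
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)

test_slugify()
print("all tests passed")
```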
in reasoning for agents. At the same time, Imbue, did you pre-train a model? Did you build a foundation model? You didn't build the model from scratch, but it's a 70 billion parameter model. That's right. We actually did pre-train a 70 billion parameter model from scratch. It is from scratch. Okay. Yeah. We learned a ton from that process. And one of the things we learned was, actually, we don't know if we need to do tons of pre-training going into the future. We'll see. But we got a lot out of post-training on that model.
And so for a lot of the verification work, we're actually really interested in post-training: fine-tuning and reinforcement learning on top of the underlying models. That seems like a good place to ask this question. What does the future look like? I almost want to leave it just there, but that's not fair. What does the future of AI agents look like? What is Imbue's approach? Do you have a roadmap you can share any of? Where is this heading? And I know that's an impossible question to answer in many ways.
But I'm also guessing you have something of a vision. Yeah, that's a great question. So I talked today about trying to enable everyone to build software. But internally, the way we think about it is that all software in the future will basically be agents. I mean, not all software, but most software. It'll be a little bit smarter, kind of like living software. And what we want to do is enable everyone to build agents,
so that in the world, we're all able to build our own agents for ourselves, or use each other's, copy someone's and modify it a little bit for myself. And that's what our product is meant to do over the long term. Actually, alongside our 70 billion parameter model, we released a set of blog posts that taught people how to set up the infrastructure to train such models. We expect most of us won't train our own models, but it's
part of this desire to democratize a lot of these capabilities. We also released a toolkit for people doing evaluations of their models, with clean data and everything. And so, in terms of the future of building agents, my hope is that agent building, unlike software building, is not something that only a few people are able to do well. My hope is that it's actually something that's widely democratized, where everyone is empowered to create their own agents.
And I think right now we have this very scary view that somebody else is going to create a system that automates my job. And that sucks. That's really disempowering. I don't want that for my job. But the thing I love doing is automating parts of my own job. Yeah. I love making it better in all these different ways. And that's what we want to enable people to do. By giving you the tools to make your own agents, you can make your own things that automate parts of your job. And now your job can be higher leverage and higher level, and you can do a lot more.
And so that's what we want to give people. Someone else automating my job is very disempowering to me, but someone giving me the tools so that I can make my own tools for myself, that's very empowering. And I think this mentality shift is actually really important. Amen.
Kanjun, for folks listening who would like to learn more about Imbue, you mentioned a blog as well. Where should they start online? Website, social media, podcast? There's a podcast, I think. Where should they go? Great question. So imbue.com is where we are on the internet. And you can follow our Twitter account, imbue.ai.
And as we start to release the product a little more publicly, we'll have announcements and things where you can start experimenting with what we're doing. So please do follow us. There's also a newsletter signup where we send
extremely rare emails, because we mostly focus on building. You're not an application trying to extract attention constantly? No, no. Not trying to get your attention, just trying to make a useful product. Good, good, good. Kanjun, this was delightful. Thank you so much for taking the time to come on the pod. And best of luck with everything you're doing. Maybe we can check in again down the road. Definitely. Thank you, Noah. This was super fun. Thank you.