My name is Fausto Albers. I'm the co-founder of the AI Builders Community, a really cool community that has meetups in Amsterdam and Berlin every month. We get together with a bunch of AI nerds and people working on really cool stuff, and we share ideas, problems, and some cool talks. I do research and consulting in the AI space from a background in behavioral science.
I've owned restaurants, I was a flair bartender in London, and I get to use all of that in this amazing, super interesting space. I always love talking to you, Demetrios, about the things that are possible and may not be possible and that sort of thing. And let me get some coffee, because I do drink a lot of coffee.
And that is actually because I swear, I am a carrier of a gene that is associated with a very fast metabolism of caffeine. When people ask me how I drink my coffee, it's a lot. I literally drink a liter of coffee before lunch. I can drink a double espresso before I go to sleep and I will sleep like a baby. Drinking coffee is like making love in a canoe. It's fucking close to water.
A quick little tidbit for you all before we dive in. I met Fausto and instantly fell in love with this guy because he was running a restaurant business. He's also super creative, and he doesn't let any of that stop him from being an engineer, constantly iterating on AI and AI products. He brings so many diverse perspectives to the table whenever I talk to him that I'm in the middle of trying to con him into coming back monthly to chat with me about what he's been seeing. It was an absolute pleasure talking to him, and I hope you enjoy. Let's get into it. What were you just telling me? And I said, hold on, don't tell me that yet. Let's hit record. You're doing job interviews and you created something. What is it?
Yeah, that almost feels as uncomfortable as being announced as the Fausto Hour. But no, I'm doing these job interviews at the moment, right? And of course, I'm recording these conversations. I also have some other tricks up my sleeve that I'll reveal later. But I'm recording these conversations, and I built my own AI analyzer to analyze the transcripts, and sometimes the video as well.
And it's very confronting to see yourself, or hear yourself, drift off on these paths to nowhere, while you're thinking, man, she just asked a question, answer it. But it does help, because being confronted with your own mistakes, with shame, is a good way to learn. And what are you doing? The analyzer, I imagine, is just grabbing things.
The audio, converting it into text, and then throwing that into ChatGPT. There are so many different tools out there that can record your conversation: Google Drive, Gemini, Supernormal, many of them. Basically all you need is the audio, or sometimes it's already transcribed for you. A lot of them do it. That's true. Yeah, but to have something that's really useful...
Your goals may differ from call to call, and for an AI, having as much context as possible of course gives a better result. So you can do it in ChatGPT, or you can let a tool just do it for you. But you can also use, for example, Instructor, which is a really great open source library, you might know it from Jason Liu, which can help you extract the information, the data that you really want. You can use rich descriptions, Pydantic schemas, et cetera, to do it. Much like structured outputs from OpenAI, but it has a bit more to it. Yeah, that's how I do it.
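For readers who want to try this themselves, here is a minimal sketch of what that kind of extraction could look like with Instructor and Pydantic. The schema fields, model name, and file path are illustrative assumptions, not Fausto's actual setup.

```python
# Minimal sketch: structured extraction from an interview transcript with Instructor.
# Assumes the OpenAI backend; the schema below is illustrative, not the real setup.
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field

class InterviewAnalysis(BaseModel):
    rambling_moments: list[str] = Field(
        description="Passages where the speaker drifts away from the question that was asked"
    )
    unanswered_questions: list[str] = Field(
        description="Questions from the interviewer that never got a direct answer"
    )
    improvement_tips: list[str] = Field(
        description="Concrete suggestions for the next interview"
    )

client = instructor.from_openai(OpenAI())  # patches the client to return validated Pydantic objects
transcript_text = open("interview_transcript.txt").read()  # hypothetical transcript file

analysis = client.chat.completions.create(
    model="gpt-4o",
    response_model=InterviewAnalysis,  # Instructor validates and retries against this schema
    messages=[
        {"role": "system", "content": "You analyze job-interview transcripts."},
        {"role": "user", "content": transcript_text},
    ],
)
print(analysis.model_dump_json(indent=2))
```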
And you're doing that, throwing it into some kind of a model and saying, where could I be better? Is that the gist of the prompt? Yeah. Analyze this. Yeah. But it's fun just to play around with. And I think what's really useful is for an AI to really understand the personas that are in the conversation, which may not always work, of course, with the example I just gave you, a job interview. But when I have a meeting with a team, for example, I also record the conversation, just put a device on the table. And I always have everyone introduce themselves to the AI: I am so-and-so, this is my background, et cetera. So when we analyze the conversation, everything that is said is analyzed taking into account who said it.
And in general, I think one of the interesting things with AI, and where we're going, is that personalization also really means knowing who is talking to the AI. Imagine that you have this really cool, sophisticated RAG system for a company, and all the information of the company is in there. Maybe you have layered access: the CEO can access everything, all the containers, or access is scoped per container. But then there's maybe a legal worker from the company accessing this information. The AI, or the intelligent intermediate layer between the RAG system and the user, should understand who's asking the questions first.
Because the next step will be that the AI can also suggest questions to us because you don't know what you don't know, right? It's really hard to ask good questions if you don't know all the information available in the database. And so it's so funny that you mentioned that because this is something that we've almost been doing with recommender systems.
for a while, but in a different way. The interactions on a web page build up a certain profile; they build up features about someone. But you're talking about: can we already know the features about someone? Like, this person works in the legal department. They're actually an intern. They've only been around for six months.
So they should be doing these things or they've interacted with these files over the last three days. So those are the highest priority. Probably they're working on this project X, Y, Z. But then what other stuff do you want the AI to know about you to give it more context? Because like we mentioned before we hit record, context is key.
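To make that concrete, here is a rough sketch of what a "who is asking" layer in front of a RAG system could look like. Every name, field, and filter convention here is an illustrative assumption rather than any particular product's API.

```python
# Hypothetical sketch of user-aware retrieval context; all names and fields are illustrative.
from dataclasses import dataclass

@dataclass
class UserProfile:
    name: str
    role: str                 # e.g. "legal intern"
    tenure_months: int
    recent_files: list[str]   # what they touched in the last few days
    access_level: int         # which "containers" this user may see

def build_query_context(profile: UserProfile, question: str) -> dict:
    """Attach who-is-asking context to every retrieval call."""
    return {
        "question": question,
        # restrict retrieval to containers the user is allowed to see
        "filters": {"access_level": {"$lte": profile.access_level}},
        # boost documents the user interacted with recently
        "boost_docs": profile.recent_files,
        # give the LLM the persona, so it can tailor answers and even suggest questions
        "system_hint": (
            f"{profile.name} works as a {profile.role}, "
            f"{profile.tenure_months} months at the company."
        ),
    }
```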
Take one step back, to the blank canvas problem, right? You sit down with a blank canvas, whatever it is you're going to do, make a drawing, write a story, learn, and it's really hard to start. When you help people use ChatGPT, one of the first things you help them with is what to ask, right? Because it's really hard to be super imaginative, and it's a lot of cognitive load. Yeah.
And yeah, so where should the AI step in? And based on whatever knowledge it has of this user,
For example, with ChatGPT, they recently, I think, extended their memory function. Did you notice that? It used to say the memory is full, and now all of a sudden that's gone, so I assume they updated it. I didn't verify, but I'm pretty sure. So that AI, so to say, already knows a lot about you, right? And is therefore better able to serve you, to respond in a certain way. But I assume we're going to see more and more personalization, where the AI is also going to suggest what you should ask, right? One way of looking at all of this:
I'm going to go all the way back to when I was introduced to GPT-3.5, before ChatGPT. I was having a walk in the woods with my girlfriend and I just couldn't stop talking about it. I was like, this is so cool. And that day I made a drawing, an image that I still use when I give talks. The way that we manipulate the digital world is through binary code, which is essentially a language that we don't speak, right? We don't understand it. So we have to work with an abstraction around it, and that abstraction is programming languages. And there are different programming languages, because each abstraction comes with an upside, which is that you can actually manipulate the underlying complexity, and a downside, because there are always trade-offs, right? With compression comes information loss, unless you're a zip file. Yeah.
But I think it's a pretty universal thing. I'm probably going to get destroyed by anyone with more wits on physics, but in this case, we can say that programming languages are an abstraction around binary code. And now, all of a sudden, we had language, natural language, as this new abstraction around those existing abstractions, as a way to manipulate the digital world and therefore have an effect in the real world. And I thought that was fascinating. Being in this space over the last two, three years, there's still a lot of programming involved and a lot of abstract thinking, which is great, but we're increasingly going to see more abstractions.
Sam Altman recently said that he hated the user experience of ChatGPT with all these different models, which is something we can all agree on. But you can also see this in RAG. There are all these different options, and there might still be a lot of choices to make for a user, or even the engineer. More and more, we are going to see those hard decisions being abstracted away, with the upside that we can then work with even more power and complexity, and the downside that, of course, there is always loss, information loss. And we'll see. It reminds me of someone who gave a presentation back in the day on using Vertex AI
as their main platform. And they said, the best part about Vertex AI is that it's a managed service and it does a lot of things for you. And the worst part about Vertex AI is that it's a managed service and you can't get under the hood. And so it's that double-edged sword in a way that we want that abstraction, but at what cost? What are we willing to give up for that abstraction? Human behavior here is funny as well because...
It would be pretty preposterous, or as you said, that's a nice word for arrogant, but I don't mean arrogant. Pompous. Yeah, to state that you happen to be at the perfect level of abstraction just because of when you were born, whether that's now or 40 years ago.
As if the programming language that you're really good at is the perfect level of abstraction. That's bullshit. How would you know? And in the end, it's all languages. Whether it's human languages or programming languages, they're ways of communicating, and I'm going to go off the philosophical deep end here, this raw thing that we call creativity, or the human condition. I have language to understand myself, right? I have words, and of course feelings, there's a weird mix there. And when I want to transfer that to you, my condition, my ideas, I use language, right?
So in the end, we're transferring creativity, the human condition, into whatever it is that we do, like communicating with other entities. I heard a term last week and it clicked with me, given the way I use AI of course: it's called vibe coding. Yeah, I saw that too. That's funny. It's the new buzzword. Yeah. And a new abstraction. Yeah.
A hundred percent. But going back to this idea of, okay, we have language to basically encapsulate feelings and thoughts and things we are trying to do and get done in this physical dimension. The language is almost like a signpost toward the symbols inside of us, right? Yeah.
Right, because when you define something, whether symbolically or with language, you also make it real. It is so because we call it so. It's only there when we observe it. As we learned from the cat. Exactly. Schrödinger. Yeah. I have a very nerdy sweater that says Schrödinger's cat, and then it says dead and alive, depending on how you look at it, dead or alive. Nice. Going back to it, one thing that I love about talking to you is how differently you attack the problems that we can get into when dealing with AI. And a lot of it comes down to
this context idea and how we can get more context for the models, because if we can give them more context, then they are going to be better able to
carry out the tasks that we are asking of them. An example of that from this call is how you ask people on your team to introduce themselves to the AI so it has more context on who they are and what they're doing. Another one that we talked about a lot was: when people join the MLOps Community, could we figure out a way to have them go through an onboarding call with some kind of AI persona that could get a download on what the person wants to do, why they're joining the community, what the biggest things on their mind are, like challenges they're working through. That way we can better create different paths and suggest different activities that we do in the community, because we have so many different activities, right? So how can we make sure that each person is guided toward finding out about the activities that will best suit them? Essentially, it's a paradox of choice problem, right? In a world where information and options are abundant, we get paralyzed by the number of options we have. That's a personal experience kind of thing, but logically it's just this:
Take a community, right? The MLOps community, or the AI Builders meetups that I organize. There are a hundred people in a room, and they're there with some objectives. They may not even be fully aware of their objectives, but that's another discussion; let's say, as rational humans, we have objectives. And there's limited time, that's a given. You can only speak to so many people, you can only listen for so long, there are constraints, right? So how do we optimize, how do we make this, and that always sounds so economic, the best experience for any given individual? If we wanted to use AI for this, and I think it's
like what a lot of AI does, whether it's AlphaGo within the constraints of a game or ChatGPT in a conversation: what AI does is make representations of the world, of a world. And so that got me thinking. When we onboard people and we have a goal, let's say the goal is that we want to connect the people who have a lot in common, or who have some overlapping interests, or whatever, then we would need to have that information. So we would onboard these people into the event. You could actually do that with a human-to-human interview, but you could also imagine a voice AI interview, or anything that is basically not a form.
Yeah. Because with forms, it's hit or miss. Some people will fill them out, others not. Yeah, it's Yuval Harari in his latest book, Nexus, great book. He used that example when explaining bureaucracy: a form is forcing you, as a complex human being, to fit these slots. However complex or multi-dimensional you may be, you have to fit the slots, you have to tick a box, that sort of thing. And you can see that a lot of AI
is a much more flexible form in production use cases, right? Whether it be a sales application or maybe onboarding as we're talking about here. It's like a dynamic form, but it's a better experience. Anyway, we would end up with
a database with all these individuals, and there's all this information about them, not in a super structured format. So we might then have to use some AI with structured extraction, and maybe a knowledge graph, that sort of thing, for information storage. We could even augment that information. I saw these guys lately giving a presentation at one of my meetups, they're called Airweave. And they explained a pipeline like that, where you could onboard a user and then augment the profile by connecting it to the Perplexity API, doing research, scraping LinkedIn, et cetera.
Anyway, what you end up with here is a lot of information on each individual. Then, and I actually made a POC of this, a very interesting thing to do is to automatically create a virtual entity for each person, like an agent. It's basically an agent with a complex set of instructions, built from the information that we've gathered. And then we make it embody, in a manner of speaking, strong word, this person, with the goal of joining this virtual arena, the meetup floor if you will, to chase their goals and to connect and communicate with other agents there.
Now, there's some really interesting research. My background is in sociology and the social sciences, and there's a lot of experimental research there where you put two people in a room and give $100 to one of them, with the task of sharing that $100. The person who receives the $100 can make an offer to the other person, right? It can be anything, and the other person can say yes, and then they share it at that split, or say no, and then they both get nothing, right? Oh. Now, rational theory would say: let's say I've got $100, let me give you $1.
I'd rather have nothing than have you walk away with $99. Because the other person knows it's $100. Yeah. So there's this sense of fairness in negotiation, that sort of thing. And it turns out that AIs are actually really good at mimicking this. I found a repo that did this, and I took it and made it a bit more complex, to give the agents a sort of extra reflective brain, thinking out their strategy.
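As a rough idea of what that kind of experiment looks like in code, here is a toy sketch of the ultimatum game between two LLM personas. This is not the repo Fausto mentions; llm() is a placeholder for whatever chat-completion call you prefer.

```python
# Toy sketch of the ultimatum game between two LLM "personas" (illustrative only).
def llm(system: str, user: str) -> str:
    ...  # placeholder: call your chat model of choice here

def ultimatum_round(proposer_persona: str, responder_persona: str, pot: int = 100):
    offer_text = llm(
        system=f"You are {proposer_persona}. You must split ${pot} with a stranger. "
               "If they reject your offer, you both get nothing. "
               "Reply with only the dollar amount you offer them.",
        user="Make your offer.",
    )
    offer = int("".join(ch for ch in offer_text if ch.isdigit()) or 0)

    decision = llm(
        system=f"You are {responder_persona}. Someone controlling a ${pot} pot offers you ${offer}. "
               "If you reject, you both get nothing. Reply ACCEPT or REJECT.",
        user="Do you accept?",
    )
    accepted = "ACCEPT" in decision.upper()
    return (pot - offer, offer) if accepted else (0, 0)  # payoffs: (proposer, responder)
```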
I was experimenting with all this psychological research, things that were actually done with humans, to find out how to set up this arena in order to get good outcomes. Because if you just let the AIs go, they're going to find common ground with every other AI. That's just how they are, really nice, you know what I mean? Before you know it, they're building a thousand startups, which is not really realistic, right? And also, it might be that there are 100 attendees, 99 of them super smart machine learning engineers and MLOps people, and one investor. Then there's one person who's really desirable in that room, because the others might not have that much to add to each other. So this all depends on context. So I thought, why not use the constraints of a game
and apply some rules? I'm not going to go into detail, but you can imagine that if there are rules, and the reason games are really good for this type of research is that they're constrained, then you set an end goal, and this is maybe confusing, but it's taken from game theory: finding a Nash equilibrium. That's the position where the actors are placed in such a way that if one of them moves, everyone loses; it's the optimal interdependent reward that you're trying to find between actors. And yeah, given that, you're basically going to give every attendee advice like: all right, go to this or that person, and here are some opening words to start the conversation, we think you might have something to talk about. And the whole idea, I remember when you were walking me through this, is that for a meetup, wouldn't it be cool if
we had this information on the folks who signed up for the meetup, and then we could simulate out a few turns on this game, and we could see, oh, our game is suggesting you talk with these three people because you all have these things in common. You might want to have a deeper conversation about X, Y, Z. And so it cuts through. Like you were saying, it would be great if we could all talk to everyone at a meetup, but...
Sometimes you talk to someone and you don't want to be in that conversation. Other times you just... Sometimes I've been in the situation where I don't want to talk to anybody because I feel really shy and I feel like I don't have anything to talk to anyone about. And I imagine I'm not the only one who's felt like that at a meetup. Absolutely, yeah. It can go two ways. When I attend a meetup I've never been to before, I find it hard to even get to speak to people. And when I organize the meetup and I've been hosting it, I find it hard too; there's something, people are queuing up. It's funny, because we try to do this at some of our meetups by just saying, hey, we're going to have a game at the beginning of the meetup, and we put people into teams by their birthday month.
And so you don't really have a choice. You're just going with everybody that's in the same birthday month as you. And then you have this opening. There's this bond that forms because you're on the same team. Whether you win or lose the game, you have your team. You raise an interesting point there, because this whole idea builds upon the assumption that there is such a thing as the perfect outcome.
Now, I used to own a restaurant. And back during corona, we started building a QR ordering application for restaurants, with the goal of having people spend as little time as possible on their mobile phones. Because to me, as a restaurant owner, it was very clear what the benefits were to us, but as a visitor, I hated it. So how do we make this experience better for people? What if we predict what people want? Basically, they can order from the menu, they have access to the menu through their mobile phone, but the menu is also ordered so that it's in the right composition for them. Yeah. Long story short, in the very beginning, I thought there was such a thing as the perfect match between a given menu item, a given context, and a user.
And the more I learned about it, the more I realized, and I could have known this because I've been on the sommelier and cocktail bartender side a lot, that it isn't really there until the match is made, so to say. The entity that gives you the recommendation, whether it's a nicely dressed-up sommelier or a digital recommender system, an AI, the trust that you have in that system is affecting your experience. Also known as selling ice to Eskimos, right? If you're a good sommelier, you can sell a cheap wine for a lot of money, as long as you give the person who buys the wine social recognition. Yeah. And actually, for some reason, I'm thinking about Schrödinger's cat again, because you're not getting that match until the moment. So there isn't a perfect match that you can show someone; it's really in the moment that I decide that it becomes actually real. Yeah. And then there are so many factors at play there.
There's a lot of 'the medium is the message' here. Another presentation I saw lately was much more about the interface, the UX and UI of building AI applications. And that's also fascinating, because AI, gen AI, is an interface, and it needs an interface, because voice is an interface. It's a way of communicating. Chatting is one of its abilities; the communication part is what it can do. But it also very much affects your experience whether you communicate through chat, or through voice, or through predefined buttons, or buttons that are generated on the spot for you but still have to be clicked. There's so much there. I've spoken to so many people building products over the last few years, and I think this is still one of the very much unsolved problems: you can have this amazing application, but how do you get people to use it? One of the first rough lessons I learned is that people don't give a shit about chatbots. Yeah. And ideally, we do not want to use them. I'm one of those people. So to take it back to the abstraction from the beginning of the conversation, what we really want is something
that knows what we want now. Oedipus. Exactly, where are you when we need you? And it's so funny, because that goes back to what we were saying about being able to suggest something at the right time, in the right moment, as opposed to us having to carry that cognitive load and figure out what it is we're trying to do in this moment, and then type it out in a way that the AI is going to understand. So maybe we don't get what we want the first time around, and we think, is it because my prompt isn't good enough? Do I have to rephrase this? And then it's more cognitive load. It just creates this
poor user experience. So I've been all about that, man. The UI piece of this feels like the most important part of everything. The way some folks are doing it, where you can just click around and you have... We're so used to that. It has been around for a long time. Yeah. And you know those pictures, you probably know what it's called, where you have two sidewalks, and then there's a trail worn in between the two sidewalks because it's the shortest path. There's a whole paradox around that too, but that's what it feels like: we're still trying to figure out what that short path is. Right now we just have the sidewalks, and we're trying to figure out where the path is so we can make a shortcut.
Are you referring to what we call in Dutch an olifantenpaadje, an elephant path? Where people will always find the shortest route. Exactly. With that poor design in the physical world, we see: wow, that's not how we as humans want to use this park, or whatever. It's a form. Yeah, this form. Yeah.
Picking up on what you just said about expressing what you want: in general, my advice on using AI is, do use AI. Don't refrain from using it. But if you use it, don't expect it to be this miracle thing. You have to put in some hard work, and often the hard work is
to understand what it is that you want. Yeah, great point. Good prompting is just basically explaining what you want and all the facets that are part of that.
But it's hard. In the beginning, we saw all these naive RAG systems, where a user query gets embedded, finds the nearest vector in a database, and then the response is augmented with that information. But of course, we know that the query is not always semantically similar to the information that is needed to answer it. So then, what do we expect? Do we expect the user to ask good queries? Because forget it. Or are we going to... No. I think in 2024 we saw
this huge explosion of agentic RAG frameworks, multi-hop agentic RAG frameworks, right? One very important part of that is query decomposition: query understanding, intent recognition. It's as old as NLP itself, right? And I've been building a lot of those kinds of things as well. So there's this query decomposition, and then there's query routing: where are we going to send these queries, to which databases? Then there's also, of course, the choice of embeddings, because not everything should or can be done with off-the-shelf embedding models.
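A hedged sketch of what that decomposition and routing step might look like as a structured output. The schema and the source names are made up for illustration, reusing the Instructor pattern shown earlier.

```python
# Hypothetical query decomposition / routing schema; field names and sources are illustrative.
from pydantic import BaseModel, Field

class SubQuery(BaseModel):
    query: str = Field(description="A self-contained question we can retrieve for")
    target_source: str = Field(description="Which index to hit, e.g. 'manuals', 'sensor_logs', 'web'")

class DecomposedQuery(BaseModel):
    intent: str = Field(description="What the user is actually trying to accomplish")
    sub_queries: list[SubQuery]

# With Instructor (see the earlier sketch), you would ask the model to fill this schema
# before any retrieval happens, then route each sub-query to its target source.
```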
But the complexity of a RAG framework, I think, is centered there, because this is true, right: given that a decent LLM has access to the information required to answer your question... You ask the LLM, what did I have for breakfast yesterday? Obviously, that's not part of the training data, but if the retrieved chunk gives the answer to that, then it's going to answer correctly, and you can generalize that to more complex problems. So the problem of RAG is retrieval, and retrieval is basically precision and recall.
Precision: of the chunks we retrieve, what percentage is relevant to the question, is there not too much noise? And recall: did we retrieve all the chunks required to answer the question? That's actually the harder problem, because you don't really know what you don't know, right? There might be some hidden chunks that, if not present, make your model hallucinate. Now, all these agentic frameworks, I mean, they're useful, but it gets so complex, and it's still prompt engineering. You're not updating the model; you're adding layer upon layer, and it's really hard to actually check whether you're doing a good job. Those concepts I just mentioned, precision and recall, you have to measure them, otherwise you don't know what it is you're doing. I think people often try to optimize their RAG system by looking at the user query input and the generated output, and if that's not good, they start with prompt engineering. But the real problem is retrieval, because if the model has access to the right information, it's probably going to be able to answer your question.
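Measuring those two numbers against a chunk-ID ground truth is a few lines; the IDs below are made up for illustration.

```python
# Retrieval precision and recall against a chunk-ID ground truth.
def retrieval_metrics(retrieved_ids: set[str], relevant_ids: set[str]) -> dict[str, float]:
    hits = retrieved_ids & relevant_ids
    precision = len(hits) / len(retrieved_ids) if retrieved_ids else 0.0  # how much of what we fetched is relevant
    recall = len(hits) / len(relevant_ids) if relevant_ids else 0.0       # did we fetch everything that was needed
    return {"precision": precision, "recall": recall}

# Example: the question needs chunks 12, 15 and 20, but we retrieved 12, 15 and 99.
print(retrieval_metrics({"12", "15", "99"}, {"12", "15", "20"}))
# both precision and recall come out to 2/3
```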
Now, I was working on a consultancy job, helping an organization build an AI, a virtual service engineer, for an industry that makes complex machinery, think aircraft motors, that sort of thing. You can imagine that those asset manuals, those technical manuals, are insanely complex. Yeah. So the goal is to help an engineer find the right information in there. I started playing around, doing some multi-hop retrieval and other approaches, super complex. And then DeepSeek just dropped like a fucking bombshell. Yeah. And we all had our own paradigm to look at it through. People with money started running from Nvidia stock and stuff. But in the technical community, I think
everyone I speak to has been so stoked about this algorithm, DeepSeek's GRPO algorithm, Group Relative Policy Optimization, right? A real step up from the PPO algorithm that was, I think, originally introduced by OpenAI in 2017 to train models with reinforcement learning, which used to be done with human feedback, or at least with a reward model. And now with this new algorithm, we can do this without all of that, without human feedback, without having an external model to check our results. And it was such a cool insight that this was possible, right? But this is reinforcement learning, so it works with domains that have a definite answer, like mathematics and coding: does it execute, is the answer correct? And then I thought: what if we look at RAG as a closed-domain, closed problem? Because with the chunks... you have a question
and there's an answer. To give that answer, let's say there is this database, and the answer can only be provided when chunks 12, 15 and 20 are added to the context of the response model. Then we have a closed problem, because it's not about the answer, it's not about the information in the chunks, it's just the chunk IDs. And imagine that we would have a set of synthetic data. There's a bunch of cool pipelines out there that can help you create this sort of synthetic data, where you feed in this complex service manual of a thousand pages and you parse it with understanding: there are images, tables, headers, text. You make different chunks of all of them, you add metadata, you map relations. You have a model do this, right? Yeah.
And then you ask that same model, with access to all of that context, which is different from a RAG system, where the context is hidden, right? It's masked, it's in a database, it needs to be retrieved. But in this case, the context is there. It's like a puzzle that is already assembled, but still in pieces, and the model can see that. And then you have it generate questions that can only be answered given a certain chunk, or a set of chunks, or maybe even a set of chunks in a certain order. And then you're creating a source of truth that you can use to test your RAG pipeline.
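A hedged sketch of what one record of that synthetic source of truth might look like, and how you could score a retrieval pipeline against it. The field names and the pipeline.retrieve() call are assumptions for illustration.

```python
# Hypothetical shape of one synthetic "source of truth" record: a model that saw the whole
# manual generates a question plus the chunk IDs that are needed to answer it.
synthetic_record = {
    "question": "What connector joins part A to part B on the fuel pump assembly?",
    "relevant_chunk_ids": {"12", "15", "20"},  # the chunks that must be retrieved
    "reference_answer": "Connector C-7.",       # optional, for answer-level checks
}

def score_pipeline(pipeline, dataset):
    """Score a RAG pipeline on retrieval quality, not just on final answers."""
    results = []
    for record in dataset:
        retrieved = set(pipeline.retrieve(record["question"]))  # assumed to return chunk IDs
        hits = retrieved & record["relevant_chunk_ids"]
        results.append({
            "precision": len(hits) / len(retrieved) if retrieved else 0.0,
            "recall": len(hits) / len(record["relevant_chunk_ids"]),
        })
    return results
```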
Because you have this definite source of truth, you can measure precision and recall. But you can also use it to train the model, and this is the idea. We started talking about it with some really cool people from different backgrounds: a guy with a PhD who's building models for cancer research, a professor in computer science, all these different people from the community, which is amazing. The cool thing about communities is that we have all these different expertises. Yes, diverse. Yeah. And we started thinking, wow, man. If you want, we can go into the technical details of how this would work, but basically, imagine that this tacit knowledge that a domain expert has is often not directly related to what they know,
but to their skill set of understanding the domain and knowing where to find things. If you ask someone who's been working in a research lab as a chemist or that sort of thing to do a certain test, they may not even know yet exactly what compounds they need and where they are stored, but there are good odds that that person is going to be able to do this a lot quicker than you or me, right? Because they have all these sets of knowledge, how to ask the right questions, how to open the right drawers without knowing that the answer is in that drawer. You get me, right? It's that baseline understanding. Yeah. And in the case, for example, of a virtual service engineer: a real human engineer might have a question, and that virtual service engineer
needs to decompose that question into sub-queries, needs to decide where to find the data, how to interpret the state of the machine given sensor data and that sort of thing, and how to use that context to make queries and find the right information. Now, let's say we have this source of truth. Given this machine, we have questions that have correct answers, and we have the chunk IDs to get to those answers. And these chunks are just to give it more information that it can reason over, or is the answer inside these chunks? I didn't quite get that. Or it doesn't matter. All right, a very simple example. Let's say the engineer is missing a part, sorry, missing a connector. There's part A and there's part B, and the question is: what part connects part A and part B? Yeah. This is a simplified example, of course, but let's say there are three other parts on the left connecting to A, and three unique parts on the right connecting to B, and we're missing the part in the middle. Then the questions could be: what parts connect to part A, and what parts connect to part B?
And if in the retrieved information there is only one common denominator, well, then we have the answer, right? That's a logical deduction.
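In code, the simplified example really is just a set intersection, the one common denominator; the part names are invented.

```python
# The toy deduction: parts connecting to A, parts connecting to B, one common denominator.
connects_to_a = {"C-3", "C-7", "C-9"}    # found in the chunks retrieved for part A
connects_to_b = {"C-7", "C-11", "C-14"}  # found in the chunks retrieved for part B
print(connects_to_a & connects_to_b)     # {'C-7'} -> the missing connector
```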
You can imagine way more complex cases, but this is what I think can be the beauty of reinforcement learning. Given that synthetic dataset, we'd give these queries to a solid model, say Qwen 32B; you need some solid hardware to be able to run this, and there have been experiments out there with way smaller models as well, TinyZero is a really cool one, I highly recommend checking it out. But let's say we have a model with a bit more body, and we give it these queries. For each query, we let it generate 10,000 reasoning paths. We give it access to function calling, we give it access to a database, it can retrieve data in a cyclic manner, whatever, do what you need to do. The only thing that counts is the answer at the end. And then we have this GRPO reward setup with certain reward functions; in this case, we can imagine that the final answer needs to be correct, but more importantly, the retrieved chunks need to be the right chunks. Then we have this new subset with correct answers and correct chunks. And maybe we want to further refine this subset with a function that, I don't know, optimizes for the shortest reasoning path, or the longest, there are different things you can do there. But you maybe end up with the 10 best answers for each query. And with that, the algorithm updates the parameters of the model. And you do that many times.
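Here is a hedged sketch of that reward-and-selection step. It is not DeepSeek's actual GRPO code, just the shape of the idea, with a made-up ReasoningPath structure standing in for one sampled rollout.

```python
# Sketch of group-relative reward shaping over sampled reasoning paths (illustrative only).
from dataclasses import dataclass

@dataclass
class ReasoningPath:
    answer: str
    retrieved_ids: set[str]  # chunk IDs the path pulled in along the way
    num_tokens: int

def reward(path: ReasoningPath, gold_ids: set[str], gold_answer: str) -> float:
    r = 1.0 if path.answer.strip() == gold_answer.strip() else 0.0            # final answer correct
    hits = path.retrieved_ids & gold_ids
    r += len(hits) / len(gold_ids)                                            # recall: the right chunks
    r += len(hits) / len(path.retrieved_ids) if path.retrieved_ids else 0.0   # precision: not too much noise
    return r - 0.001 * path.num_tokens                                        # optional: prefer shorter reasoning

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """GRPO's core trick: advantages are normalized within the group of samples
    for the same query, so no separate value or reward model is needed."""
    mean = sum(rewards) / len(rewards)
    std = (sum((x - mean) ** 2 for x in rewards) / len(rewards)) ** 0.5 or 1.0
    return [(x - mean) / std for x in rewards]
```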
Now, maybe I'm way off the charts here, and I'd love to hear it if someone thinks so. But can't it be that we're squeezing out something that's hidden somewhere in this model already, the ability to use the correct reasoning path to find the right information? Not necessarily to return the correct answer, but to optimize the reasoning path. So let me play this back for you and make sure that I'm understanding it correctly. It's basically saying that we need to give the right context to the model. What we don't necessarily need to do is
give it only the right context in one shot. It can be various iterations of giving it context. And since the model can reason, we can now say, figure out if that information is there, because we've already done a bit of training or fine tuning on it to show it what the right
context looks like and how to know if you're retrieving the correct chunks with reasoning models. And mind you, we start here with a non-reasoning model. This is referred to in the DeepSeek paper as the aha moment, right? The model starts to reason, self-reflect on that reasoning, and then take another path because it's not sure. If you watch these reasoning models reason, like o1, although that one doesn't reveal all its reasoning tokens, but DeepSeek does, it's a very anthropomorphic, human-like way of reasoning. And it shows a lot of doubt, right? 'Perhaps' and 'maybe' and 'or'. This makes it, first of all, very understandable for us. But it's also important, because when there's too much doubt and the problem we want to solve is multi-hop, for example, we want to know how many atoms a compound has given the combination of compound 1 and compound 2, and then do this and that to it, then we have to be sure that we understand the atom count of compound 1 and compound 2, right? If we make a mistake there, a false assumption, then our final answer is never going to be right.
So if you have a reasoning model reason over this, you'll see a lot of 'or maybe', 'perhaps', et cetera. Now, there's this great paper called o1 Search. It was, I think, published just before the DeepSeek thing, or at least it doesn't refer to it, and they open-sourced the whole code as well. It's a way to have reasoning models recognize their own ambivalence, when they're not sure enough, and then pause the reasoning process and basically perform a search for more information, like connecting to a RAG system. It does some other things as well, and it's not function calling. It's actually formulating a question encapsulated between string tags, together with a special token that gives the script
the signal to stop the process and find this new information. That query is actually sent to a parallel chain that has access to all the steps so far; it cleans the retrieved chunks for better precision, adds them back into the main loop, and then continues. And it can do that many times. So this is using a reasoning model to be more sure. And you could imagine improving this ability a lot, and I think you could imagine using that previously mentioned reinforcement learning approach there as well.
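A rough illustration of that loop, in the spirit of the paper but not its actual code. The <search> tag convention and the generate and retrieve callables are assumptions.

```python
# Illustrative "search when unsure" loop: the model emits a tagged sub-query when it wants more context.
import re

def answer_with_search(question: str, generate, retrieve, max_rounds: int = 5) -> str:
    """generate(prompt) -> reasoning text; retrieve(query) -> cleaned context string."""
    prompt = question
    reasoning = ""
    for _ in range(max_rounds):
        reasoning = generate(prompt)
        match = re.search(r"<search>(.*?)</search>", reasoning, re.DOTALL)
        if not match:
            return reasoning                 # the model was confident enough to finish
        query = match.group(1).strip()       # the sub-query the model formulated for itself
        context = retrieve(query)            # RAG, web, or even a human in the loop
        # feed the cleaned, retrieved chunks back into the main reasoning loop and continue
        prompt += f"\n\n[retrieved for '{query}']\n{context}\nContinue reasoning."
    return reasoning                          # give up after max_rounds, returning the last attempt
```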
I'm just trying to make sure that I understand this. Basically, the reasoning model recognizes when it doesn't know if it's right or wrong, or rather, it recognizes when it does not have the right information, or enough information, to confidently make a decision. Yeah. And it can then go and find that information. And you're saying that, like with the o1 Search, the potential we have in a RAG system is that instead of it going out and trying to find that information on the web, it can just go and try to find that information in our embeddings. Yeah, in any external source. It could be a human, it could be the web. So it asks another question. The important part is that any RAG system, and I've been going very much into the hypothetical solutions here, but if you take a step back, any RAG system needs to know whether it needs to find more information. Is it sure that the information it has access to, in the training data, is enough? Do I need more information? Then I get more information. Is this enough? Is this what I need to answer the question? This is something you can't possibly do rule-based for real-world scenarios, right? Yeah. Then part of that, more technical, is that the system needs to be able to take the query from the user, decompose it in the optimal way, and make the search queries optimal, whether for web, human-in-the-loop, or vector search. But that's more a technical thing. It boils down to: how well does this system understand its own limitations, and how can it be sure of things? We cannot just prompt for that. You can improve it with prompting, but how could you ever do this in a rule-based manner, right? What you could do is use reinforcement learning, given that you have a set of correct answers and correct chunks, and then just squeeze it out like blood from a stone. I think the fascinating piece here, and one of the hardest pieces about this, is that
you're trying to coerce the model into this lack of confidence. And as we all know, anyone who's worked with LLMs knows, their confidence level is off the scale. And so they... But that is different with reasoning models, right? I think what we see with scaling inference compute is this sort of not being so confident. Yeah, not thinking it knows everything. But still...
I think the big unlock here is being able to say: because of the way this question was phrased, I can only get this far, so maybe we need to go back and retry one of these steps. That low confidence score is huge.
Yeah, and because it's so hard to express this in rules, I think that if this sort of approach works, it would probably be limited to certain domains. Just like with human domains, where, you know, given this, given that, good old Bayesian, given this information, this would be the right question, right? And this is what the best information looks like. I can generalize that: when I'm a bartender and I know my way around the bar, I can maybe generalize that to the kitchen, but I can't generalize it to the lab, right? There are so many smart people working on reinforcement learning and on making better RAG systems in all sorts of ways at the moment, but
we're not going to see AGI just yet in that sense, whatever that means; that's one for another conversation. Yeah, totally. But we will see these closed, smaller domains where a combination of reasoning models and agentic frameworks work together to solve problems within a closed setting. Ethan Mollick said it well, I think. He gave a good example: we now have the OpenAI browser agent, where you can do anything, let's say that's the AGI kind of thing, and we have Deep Research, right? Both are combinations of agents and reasoning. The browser agent is nice, but it still really sucks; it's not useful. Deep Research is amazing. Not the holy grail yet, but I'm using it, and I'm not using browser agents. That's what we're going to see, whether it's domain-specific RAG intelligence or tasks. Yeah, it's fascinating, and I bet we're going to see a lot of progress there. So take me back to what you were doing with the manuals for all of these. I think it was the airplane manuals, engineering diagrams and all that stuff, machine diagrams that are very complex. How does this play into that? I know you gave the very simple example of a connector and those two pieces, and we have to figure out which one can connect the two, and you're assuming that there's only one answer there. Yeah, but there might be more answers, of course. What are other examples of this? How do you build the RAG system to make it work?
What does that all look like, I guess, is what I'm trying to figure out. I was asked in this project to look at how we could test the knowledge of engineers. And that's a contradiction in terms, because try asking someone. The whole point of tacit knowledge is that you can't really explain it. You just do what you do because you know how to do it, because it feels right. And yeah, the idea, however complex it may be, and I think we should really start very simple to verify it, because maybe we're chasing a pot of gold at the end of the rainbow, is that if there is an answer, and you could indeed use human knowledge for this as well, if there is an answer to a complex problem and there's external information needed to come to that answer, then we should be able to use reinforcement learning to find the right path. Because there are many possible paths, right? Give it enough tries and you might just end up on a very good path; do that many times, and
then you have examples, or you train a model to behave that way. And what it comes down to, what is so fascinating, is that people say models cannot answer questions outside of their training data. Well, that's a truism, of course. Yeah, of course. But what is in their training data? Is it a set of discrete words? Or are there emergent capabilities from all this information stored in this network? One of those emergent capabilities seems to be reasoning; that's the aha moment of the DeepSeek paper with reinforcement learning, right? If reasoning is already present in this set of information, in this set of data, what else could be there? And I don't know, I don't find it a weird idea that there's much more in there. So yes, models cannot answer questions outside of their training data, but that's only really relevant for real-time facts, like what I had for breakfast yesterday. When it comes to capabilities, I think much more is possible than we've seen so far. I like that you say that if there is an answer and we can correctly
make that answer known, there's a yes-or-no answer, or a task-complete, task-not-complete, binary way of looking at it, then we can figure out how to get AI to try and do that. I love talking to you about this stuff. I don't know if they actually use this, but shout out to Greg Kamradt. He launched this snake game where you can compare models playing Snake against each other, a great way to test the capabilities of the models, their problem-solving skills, et cetera. And then someone in the comments on LinkedIn said: I wonder how long it's going to take before each game is over instantly. Oh, wow. Just a roll of the dice, because they've all played it out. And in this sort of constrained space, a relatively simple game like that, you can imagine that some dream version of the game could get to that point. I'd love to hear from people who know more about it whether that's possible or not. I don't know.