Forget frequently asked questions. Common sense, common knowledge, or Google. How about advice from a real genius? 95% of people in any profession are good enough to be qualified and licensed. 5% go above and beyond. They become very good at what they do. But only 0.1% are...
A real genius. Richard Jacobs has made it his life's mission to find them for you. He hunts down and interviews geniuses in every field. Sleep science, cancer, stem cells, ketogenic diets, and more. Here come the geniuses. This is the Finding Genius Podcast with Richard Jacobs.
Hello, this is Richard Jacobs with the Finding Genius Podcast. My guest today is Bo Nguyen. He's a staff research scientist and an AGI (Artificial General Intelligence) expert. He's an inventor, a cloud architect, and a tech lead for digital health initiatives at his company. We're going to talk about what are called large language models, which underlie ChatGPT. There are also reasoner modules, but that's a separate topic.
We're going to talk about some of the current AI. My goal here, partially, is to find out what has changed: why is AI now all of a sudden everywhere, and working a heck of a lot better than it did just a few years ago? But anyway, welcome, Bo. Thanks for coming to the podcast. I appreciate it. Hi, hello, Richard, and hello, everyone. Thank you. I appreciate the opportunity to be here to share some of my perspective on these questions. Yeah, first, if you would, just tell me a bit about your background. I can see you've done a lot of really cool things, but just in your own words, a short summary of your path up to this point, and then we'll go forward. Sure, yeah. I think it might be destiny. So I started as a physicist, a low-temperature condensed matter physicist. I studied long-range interactions in a very special material called a single-molecule magnet.
And then later I moved on. Yeah. Oh, one quick question: is Bo short for Bose-Einstein condensate? Or is that not you? I'm just kidding. Too bad. Yeah, they are my lodestar. They still are. Even today, when I study AI, I draw a lot of inspiration from my physics background, and I think it's very helpful. Yeah. So I started as a physicist. And then when I joined IBM, they also hired me as a physicist to work on some very cool wearable technologies for medical purposes. But then later on, the company strategy changed, and my research interests also changed a little bit. I started thinking more along the lines of what was my personal struggle at that time: every day there are hundreds of papers posted on arXiv, and there's no way for me to read them all to keep up with what's going on in the whole research community. And then
I also saw the community frustrated: it seems like we always make incremental improvements, rather than, like 100 years ago, when Einstein and Heisenberg and the others all made great leaps in scientific discovery, right? But it seems like every hundred years or so there's a cycle where everyone feels like something is slowing down, and then all of a sudden some new discovery comes out from somewhere. So that was about eight years ago. I started thinking about this problem: where is the next breakthrough going to come from? And, maybe more philosophically, why does this kind of bottleneck problem come around?
And then that's how I evolved into neuroscience, and also AI and information theory. And my group, also because of the company changes, well, I shouldn't say the group changed, it's just that I moved from one group to another. The new group is with Guillermo and Jeff Rogers, and they are more focused on digital health, as we call it today. It's about how to use information technology to understand disease, understand how a patient interacts with their environment, and then how we can use the technology to help patients be healthier. Let's get into large language models first, just some of the basics. So how do the large language models that underlie ChatGPT and Gemini and all those other AIs out there actually work? Sure. Actually, that's a surprise to everyone, I would say.
Right. So at the beginning, the large language model architecture is just trying to predict the next token. That's how you train these models. There's a model, there's an architecture design, and the training process is just to complete a sentence: try to predict what the next word in the sentence is supposed to be. After you get that, you feed the thing back into the model again and try to predict the next one, and you keep going. So it's a self-recursive process. And then, as you mentioned earlier, there's what's called the scaling law. I think Ilya Sutskever, from OpenAI, and a few other researchers brought up the concept of the scaling law. They pointed out that as the underlying compute increases, the model can be trained for a longer time on a larger dataset, and then new behaviors just seem to emerge from the model.
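To make that self-recursive loop concrete, here is a minimal sketch of greedy next-token decoding. It uses GPT-2 via the Hugging Face transformers library purely as a convenient stand-in (any causal language model would work the same way); it illustrates the loop Bo describes, not the training procedure of any particular product.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("The cat sat on the", return_tensors="pt").input_ids
for _ in range(20):                                      # generate 20 more tokens
    logits = model(ids).logits                           # a score for every vocabulary word
    next_id = logits[0, -1].argmax()                     # greedy pick: the most likely next token
    ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)   # feed the prediction back in and repeat
print(tok.decode(ids[0]))
```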
How do you go from predicting the next word to me saying to ChatGPT, write me a story about Bob and Jane playing on a seesaw, use emotional language, write it in the style of Charles Dickens, 500 words? How does it take a prompt like that and spit out a whole story that's pretty good?
Yeah, yeah, yeah. So that's actually a later invention. It's called instruction fine-tuning, or chat fine-tuning. So OpenAI started... I guess their big contribution to this field is they figured out that instead of just completing a sentence, you can fine-tune the model on a conversation dataset. There are two people in the conversation: one person says something, and then the next person answers the first person, in a chat style. So it's called chat-style fine-tuning.
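As an illustration only, here is roughly what one chat-style fine-tuning example can look like. The role/content layout and the marker tokens below are assumptions modeled on common open formats, not any specific vendor's training data; the point is that the conversation is flattened into one token stream, and the model is still just trained to predict the next token, only now it learns to produce the assistant turns.

```python
# One hypothetical chat-style training example.
chat_example = [
    {"role": "system",    "content": "You are a helpful assistant."},
    {"role": "user",      "content": "How do I boil an egg?"},
    {"role": "assistant", "content": "Place the egg in boiling water for 7 to 9 minutes, then cool it in cold water."},
]

def flatten(messages):
    # Flatten the conversation into a single token stream; the <|...|> markers
    # are invented here, real chat templates differ per provider.
    return "".join(f"<|{m['role']}|>{m['content']}<|end|>" for m in messages)

print(flatten(chat_example))
```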
And after doing this fine-tuning, the LLM can be used as a chatbot, because it understands that instead of completing your sentence, it's supposed to respond to what you say in your sentence. So that's... How does it know semantically what I'm asking? That goes back to the training. That's actually the magic that surprised everyone about next-word prediction: in order to precisely predict what the best next word is, it needs to understand the context of the whole sentence, all the tokens given before the token it is trying to predict. And there's a very important paper called Attention Is All You Need.
It's from Google. What they figured out is that they construct a matrix, related to what today we call the KV cache. Essentially it says: here are all the tokens given to you so far, and, on the other axis, which of them you are supposed to pay attention to. The model is trained to figure out those connections, where it should pay attention, right?
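Here is a rough numerical sketch of that attention matrix idea: a single head of scaled dot-product attention over a few toy tokens. Real models use learned projections, causal masking, and many heads in parallel; this just shows the "which earlier token should I look at" matrix Bo is describing, with made-up random vectors.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # queries: "what am I looking for?"
K = rng.normal(size=(4, 8))   # keys: "what do I contain?"
V = rng.normal(size=(4, 8))   # values: "what do I pass along if attended to?"

scores = Q @ K.T / np.sqrt(K.shape[-1])   # one row per token: raw relevance to every other token
weights = softmax(scores, axis=-1)        # each row sums to 1: the attention distribution
output = weights @ V                      # context-mixed representation for each token
print(weights.round(2))
```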
So when you give it a sentence, it will pay attention to the information most relevant for predicting the next token. And during the pre-training process, we just feed the model all the information written by humans that we can get, essentially the whole internet. When it tries to predict the next token that way, it learns how to pay attention to the important things, and through that it somehow figures out how context works and how the semantics of each word can carry different meanings. For example, apple can mean several things: the apple we eat, or the Apple company, right? And when apple is in a sentence, the model needs to see all the words around it to understand which meaning it's supposed to carry. So that's the matrix. That's why the paper has the right title, attention is all you need. That's what gives us all this magical effect.
And then if I make a very long prompt, let's say 10 sentences versus a single sentence, what will happen? When does it overwhelm the AI system, whether it's Gemini or ChatGPT, and why would it overwhelm the system? Is it being asked to process too many things, or is there no limit to what you can type into a prompt? Actually, that's a very important research area that's still actively going on right now. The mechanism inside is called the attention head.
As you say, in the beginning, with models like GPT-2 or GPT-3.5, the thing is that when the context gets longer, at some point the model loses track of some of the information it's supposed to pay attention to. One interesting observation: at some point the community figured out that this attention is actually similar to a human's. It pays more attention to the beginning of the sentence or the paragraph, because the way we ask it to compute is that after you predict the next token, you feed that back into the model and ask it to predict the next one. So the tokens at the beginning of the sentence, or the paragraph, or the prompt get fed back to the model more times than the tokens coming after, and somehow the model ends up paying more attention to the beginning. Of course, this is an engineering detail. Yeah. Well, quick question: let's say I had a short story and I wanted to fix the
grammar, would it be better if I say "fix the grammar in the following short story" and paste it in, or should I put the short story first and then at the end say "please fix the grammar in the story above"? In the early days, the first way would be better, right? You give the instruction first and then the context, because the model will pay more attention to the instruction. And that's why we have, you've probably heard the terms, the system prompt and the user prompt. The system prompt essentially leverages this trick: whatever instruction you put in the system prompt gets attached at the beginning of the prompt, and everything else, the context, gets attached to the later part. But today it's not such an amazing thing anymore, right? The technology evolved so fast; that was two years ago. Today the model is actually able to distribute its attention much more evenly.
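A rough sketch of that trick: the instruction goes in the system slot at the front, and the user's material follows. The marker strings here are made up for illustration; every provider uses its own chat template.

```python
def build_prompt(system_prompt: str, user_prompt: str) -> str:
    # Instruction first (system), context after (user); the <|...|> markers are invented.
    return (
        f"<|system|>\n{system_prompt}\n"
        f"<|user|>\n{user_prompt}\n"
        f"<|assistant|>\n"
    )

print(build_prompt(
    "Fix the grammar of the story the user provides. Change nothing else.",
    "Once apon a time their was a dog who run very fastly...",
))
```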
And there's actually a specific benchmark to test how well a model can do this. It's called needle in a haystack. Essentially, the benchmark gives a whole book to the LLM, and inside the book the researchers insert some token or labeled piece of information, a weird sentence that has nothing to do with the rest of the book. Then they ask the large language model to figure out where that sentence is. They might give it the beginning of the sentence, or phrase the sentence in a different way, and ask the LLM to search the whole book and figure out where that thing is. So that's a way to make sure the LLM is paying attention to everything, without getting distracted by the rest of the overwhelming information.
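A toy version of that needle-in-a-haystack test might look like the following. The ask_llm argument is a placeholder for whatever chat API you actually call; the filler text, the needle, and the pass criterion are all invented for illustration.

```python
import random

def make_haystack(needle: str, n_filler: int = 2000) -> str:
    # Bury one odd sentence somewhere inside a long, repetitive document.
    filler = ["The quick brown fox jumps over the lazy dog."] * n_filler
    filler.insert(random.randrange(n_filler), needle)
    return " ".join(filler)

def needle_test(ask_llm, needle: str = "The secret launch code is 7-4-2-9.") -> bool:
    doc = make_haystack(needle)
    answer = ask_llm(f"{doc}\n\nWhat is the secret launch code mentioned above?")
    return "7-4-2-9" in answer   # did the model find the buried fact?

# Example run with a fake model that always fails, just to show the harness executes:
print(needle_test(lambda prompt: "I don't know."))
```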
And these models all need to pass tests like this before they're put in front of us to use. So what are some of the current limitations of LLMs? Because I'm seeing what are called reasoning modules. What are those? Are they going to be combined with LLMs in the near future? I'm not sure what's ahead for the next few months, or what the limits of the current iteration are. I think that's actually my active research right now. I have a strong interest in how LLMs do reasoning, or really in how natural intelligence and artificial intelligence do reasoning tasks. Today, LLM reasoning is still an LLM, but it's trying to leverage another emergent behavior that people discovered in LLMs.
So at the beginning, we just asked the LLM to write me a story, or to help me fix the grammar, right? Then people tried to ask the LLM to solve a math problem or some other scientific problem, and they found that sometimes the LLM can do it right and sometimes it can't. And then at some point there was a smart discovery: you just add a magical phrase to the prompt and tell the LLM to solve the problem step by step.
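In practice the magic phrase looks something like this. The exact wording varies; "solve the problem step by step" or "let's think step by step" are the phrasings popularized in the chain-of-thought literature, and the arithmetic in the comment is just a worked instance.

```python
question = "A train travels 60 km in 45 minutes. What is its average speed in km/h?"

plain_prompt = question
cot_prompt = question + "\n\nSolve the problem step by step, showing your work before the final answer."

# Sending cot_prompt instead of plain_prompt to the same model typically produces the
# intermediate derivation (45 min = 0.75 h, 60 km / 0.75 h = 80 km/h) and, with it,
# a higher chance of the correct final answer.
```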
And then the LLM will suddenly increase its accuracy in solving the math problem, because instead of just spitting out the final answer, it tries to do the derivation of the middle steps. And because it shows the steps, it tells you how it works? Yeah, actually, that's the interesting thing. I have a six-year-old daughter, right? In school she's learning this right now, and the teacher is telling them to show their work. And it's actually important, because showing the work allows you to double-check whether the middle steps are correct. If you do everything in your head, sometimes you mislabel something or skip a step and don't notice, and then you get the
result wrong, and you're not able to go back and fix the problem. Actually, you've probably heard about the book called Thinking, Fast and Slow. Daniel Kahneman? Daniel Kahneman, yeah. It talks about System 1 and System 2, right? People today also use that analogy to think about LLMs. System 1 is intuitive: essentially, the reaction is built into your neural system, and you just know the answer when there's a stimulus. System 2 is when you have to think it through; you need to go step by step, logically. And this is where this emergent behavior comes from. For complicated tasks, you actually need to keep track of a lot of different information and then piece it together to get the final
answer. So from an information theory point of view, what that means is that a bigger task can be broken down into smaller tasks, and then you divide and conquer, solving one after another. And when you solve one part, you should actually write down the answer for that part, because when you solve the other piece, you likely don't need all the middle steps of this part; you just need to remember your conclusion for part A. Then, when you solve part B, you can concentrate all your memory and compute resources on finishing that task, and after that you do the next one. That's how humans do it. We just intuitively know how to do this, or maybe we learn it through school, right? That's how we solve complicated tasks. Thinking step by step essentially leverages this. You ask the LLM to do the middle steps and write down the answer for each one, and then, because of the attention mechanism, when it goes on to the next step it doesn't need to pay attention to all the tokens before; it can start from step B and work that part out. In the end it just needs to pay attention to the intermediate results of steps A, B, and C and combine them into the final result. So doing it step by step gives the LLM the ability to divide and conquer,
or it starts to have the capability of doing abstraction. Yeah. So is it actually doing abstraction, or is it just very good at segmenting what it allows to be used for computation? Well, whether an LLM can do abstraction is a maybe more philosophical question that people are still debating. From a computation point of view, what is abstraction? Abstraction is essentially information compression, right? When you work out the detailed part, there are a lot of details, but when you just use that conclusion in another part of the problem, you don't need to remember all the detailed steps; you just use the conclusion. So you compress all the information into something and carry it over to the other part of the problem. That's one way to think about abstraction. Today's LLM, when you just use the vanilla LLM, because it's spitting tokens out one after another, all the previous context still gets fed back into its loop.
So it's actually not a very good use of its context window. That's where a lot of the new agentic framework ideas come in. They try to use the LLM as a component of a bigger AI system. The LLM will try to solve one problem; they call it something like scratch paper, a type of memory module. The LLM solves this part of the problem there, and when it finishes, it makes a conclusion or a summary and just keeps the conclusion in its memory system. Then the other component will start from scratch, with the context and the conclusion from the previous step, and continue to solve the next part of the problem. And then it becomes like... What about GANs? You know, if you have AIs that are competing against each other, correcting each other, what are some of the dynamics of GANs, if you've studied them? Yes. Well, GANs were earlier... The concept of a GAN is still valid today.
That's what a lot of the agent frameworks are trying to use. For example, one popular approach is called LLM as a judge. I have some research in that area. Essentially, you ask one LLM to do the problem first, and then you have another LLM act as a judge to evaluate the result and see if there's hallucination or something to be improved, and then it becomes a loop. But the original GAN concept is more from the pre-LLM days, so today's LLMs are not doing exactly, precisely the GAN that we used to talk about; the concepts got carried over and embedded into the agent frameworks. Well, can you spin up other instances of a given AI and maybe change the initial conditions slightly so that there will be an effective GAN? Do any of the large AI companies do that to hone their results? Yes. I think the word GAN has a particular meaning in the community; it refers to that specific type of architecture or network. So people kind of try to avoid reusing the word, to avoid causing confusion, and they just call it a different way today. But the concept is there. Well, it's called LLM as a judge nowadays. That's the... Called what? LLM as a judge. LLM as a judge. Okay. Gotcha.
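A minimal sketch of that LLM-as-a-judge loop, with the two model calls stubbed out so only the control flow is shown. In a real system, generate and judge would each be an LLM call with its own instructions; everything here is a hypothetical illustration, not Bo's actual research code.

```python
def generate(task: str, feedback: str = "") -> str:
    # Stand-in for the "worker" LLM call; a real call would send task + feedback as a prompt.
    return f"Draft answer to '{task}' (feedback applied: {feedback or 'none'})"

def judge(task: str, draft: str) -> tuple[bool, str]:
    # Stand-in for the "judge" LLM call, which would check correctness and grounding.
    ok = "feedback applied: none" not in draft   # toy rule: accept only a revised draft
    return ok, "Please double-check the cited facts and fix any hallucination."

def solve_with_judge(task: str, max_rounds: int = 3) -> str:
    feedback = ""
    for _ in range(max_rounds):
        draft = generate(task, feedback)
        ok, feedback = judge(task, draft)
        if ok:
            return draft
    return draft   # give up after max_rounds and return the last attempt

print(solve_with_judge("Summarize the attention mechanism"))
```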
What are these reasoner modules? How are they different from LLMs? Oh, I see. Yes, that's a good question. So as I mentioned before, doing things step by step is called a chain of thought. For some time, people just used prompt engineering to ask the LLM to produce a chain of thought. Then later, I think Stanford had a paper called STaR. Essentially, it used reinforcement learning to train the LLM to generate longer chains of thought, or better-quality chains of thought. And OpenAI... Last year there were people on Reddit saying OpenAI had a secret project called Strawberry, and that they were going to figure out the next generation of AGI. Essentially, OpenAI took the idea of STaR and trained what became the o1 model they rolled out last year. What it is, is the model will try to solve the problem by searching the solution space, in some sense. The o1 model just generates a longer chain of thought: you explore one possibility, and if it doesn't work, you start exploring the next possibility, and you do this continuously until you solve the problem. Because a lot of the time, for open-ended questions or harder questions, your first intuitive approach might not be correct. So you do some steps, and at some point you need to give up and try something else. That's the idea of o1.
Later, OpenAI's o3 model came out. The difference is that it does the exploration in parallel. Well, they didn't open it up, so we don't know exactly, but from their documentation we kind of know how the system works. For o3, it's doing something like Monte Carlo tree search, meaning it searches the solution space in parallel, and then you summarize the results in order to do the next step of the reasoning. And OpenAI is not the only one doing that. Actually, before R1 and o3, I had a similar system doing that as well.
Essentially, I have multiple LLMs, and I can drive them in an agentic way: multiple LLMs as the initial agents trying to solve the problem, and each of them has its own try. Of course, some of them get it right and some get it wrong, or everyone is wrong but some of them get part of the problem right. Then you use an LLM as a judge, a second layer that looks at all the initial results, and you try to figure out what the consensus is and where the differences are, and use that to guide your next round of reasoning.
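A rough sketch of that parallel-exploration-plus-consensus idea (often called self-consistency when the consensus step is a simple majority vote). The attempt function below is a stub standing in for one agent's try, with invented answers; a judge model could replace the vote with a more careful comparison.

```python
import random
from collections import Counter

def attempt(question: str) -> str:
    # Stub for one agent's independent try; here, three of five imagined agents are right.
    return random.choice(["80 km/h", "80 km/h", "80 km/h", "45 km/h", "60 km/h"])

def solve_in_parallel(question: str, n_agents: int = 5) -> str:
    answers = [attempt(question) for _ in range(n_agents)]
    consensus, votes = Counter(answers).most_common(1)[0]
    # A judge LLM could inspect the disagreements here and trigger another round;
    # this sketch just returns the majority answer.
    return f"{consensus} ({votes}/{n_agents} agents agree)"

print(solve_in_parallel("A train travels 60 km in 45 minutes. Average speed?"))
```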
Iteratively, this solves the problem better than one LLM just trying to solve it by brute force. And then, you've probably heard about DeepSeek R1. I've tried that a couple of times, but how does that work? Yeah, it's essentially doing this, but their contribution is that they really optimized the underlying libraries, the hardware, and the training process. They made it super efficient, so you don't need to spend billions of dollars on huge GPU clusters to train such a model. They made a very efficient package, so everyone, even university professors with a limited budget, can also investigate these models. So methodologically it's around the same idea, just training the LLM to generate better chains of thought, but they have a lot of very good engineering improvements. How much computing power does it take for, let's say, a reasonable-length prompt using a plain LLM versus the o3 model on ChatGPT versus DeepSeek R1? So R1 is definitely much cheaper, because of all the optimization they did at the engineering level; they made it very efficient on compute. And even though it's open source, today OpenAI and Anthropic, and including IBM, right, we are all still trying to absorb the knowledge from open source and improve our own inference engines. So it still costs quite a lot
for different companies. I mean, each company has its own secret sauce for how to do things, the engineering tricks. But there is a general consensus that you need to somehow make the compute more efficient, because it becomes a business cost. Right. Are you surprised by the prices being charged for ChatGPT? Twenty bucks a month for the basic tier and then 200 a month for unlimited. I don't know if you have any visibility into the pricing versus the compute power and the cost to provide it, but what do you think so far? I stopped using ChatGPT when they announced the new price structure; I kind of migrated to... There's a piece of software called Cursor. Cursor is an IDE for using LLMs to do programming, and it provides you access to different models, including OpenAI o3 and o4 and others, and also Anthropic Claude. So it includes many models; DeepSeek R1 and Anthropic Claude are all in there. So I get access to all the different models, and for different tasks I'll pick different models. What do you think the future looks like? I mean, it's moving so fast now. I know I've asked this before, but is it just because now there's more computing power
and more layers, that we're getting this emergent behavior, or what triggered all this to take off all of a sudden? Because AI has been around for a long time, and it just kind of sat there and went nowhere, and then in 2023 it exploded. What happened? I think it's not a coincidence. Several factors happened at the same time to make this come about. The first important thing, of course, is that the underlying compute became much more efficient. NVIDIA has GPUs dedicated to this kind of large-scale training, and that makes it possible. The second one is the internet.
The internet has been around for 20-odd years, but at the beginning it was all just people chatting or sending emails. Over those 20 years there has been a big effort to digitize all the publications from the past, and that took many years to do. Now all the books have been scanned and put into digital libraries on the internet, and everyone is writing blogs and other things. So there's actually much more training data available on the internet. Ilya has been calling that the fossil fuel for AI. The amount of data we can train the model on has really only become available in the past few years. And then, of course, people studied the architecture; the important papers and concepts, like attention is all you need, also showed up, and
the concept became a doable task. And then OpenAI was the pioneer. They really believed that this scaling up actually works, and we should give them credit: they convinced the investors to give them millions of dollars to build a huge GPU cluster and train a huge model, and they proved that it works. Now that everyone knows it actually works, we have open-source models from everywhere, and all kinds of people have started to do research in this field. That becomes a feedback loop, right? The community is paying attention to this very promising direction, so there are more people doing research, and that accelerates the technology and optimizes it faster. Oh, one thing that seems to be rarer and rarer
is when the AI hallucinates, or says things that make no sense. Why did that happen, and what's been done to clean it up? Because it's much better than it used to be, much, much better. Yes. So there are several things. First, hallucination is definitely a big problem, right? There's a lot of research on how to resolve hallucination, and reasoning is one special aspect of it. Another way of doing it is retrieval-augmented generation, RAG. That's a common approach when you ask the LLM to provide answers about some dataset it was not trained on: you provide, for example, a PDF in its context window, and then it will answer based on the PDF rather than trying to make something up.
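A minimal sketch of the RAG pattern, assuming nothing beyond the standard library: retrieve the most relevant passages and paste them into the prompt so the model answers from them. Real systems use embedding models and a vector store; the word-overlap retriever and the document snippets here are invented for illustration.

```python
def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
    # Toy retriever: rank passages by how many question words they share.
    q_words = set(question.lower().split())
    scored = sorted(passages, key=lambda p: -len(q_words & set(p.lower().split())))
    return scored[:k]

def build_rag_prompt(question: str, passages: list[str]) -> str:
    context = "\n".join(f"- {p}" for p in retrieve(question, passages))
    return (
        "Answer using ONLY the context below. If the answer is not there, say so.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

docs = [
    "The patient handbook says the clinic opens at 8 a.m. on weekdays.",
    "Parking is available behind the building.",
    "On weekends the clinic is closed.",
]
print(build_rag_prompt("What time does the clinic open on weekdays?", docs))
```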
But another thing came later: there's another research effort in the community that figured out that the quality of the pre-training dataset is very important. Because, going back to the underlying mechanism, the model is trying to predict the next token, and the way that works is that it first produces a list of all the candidate possible words and then tries to pick one from the candidate list. The candidate list comes from its pre-training: it remembers everything it has seen on the internet, all the possible words that could come next. If it's trained on what we call noisy data or dirty data, meaning, for example, the user asks, "I have some symptoms; what kind of medicine should I take?", and its training data is all from the internet, where different people say different things, the model doesn't really know who is right or wrong, because it doesn't have intrinsic judgment. It just sees the probabilities: okay, 30% of people say this, another 20% say that, and maybe the 10% who are really well-trained doctors say the right thing, but that's a small population, so it gets drowned out. In that way, the model will just give the wrong answer based on statistics. Once people figured that out, a lot of money went into cleaning up the training datasets.
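A toy illustration of that statistical failure mode, with invented numbers: if wrong answers dominate the training distribution and the expert answer is rare, a model that simply follows the distribution will prefer a wrong answer.

```python
# Hypothetical share of each answer in scraped training text for one medical question.
candidate_answers = {
    "home remedy A": 0.40,                      # the most common claim in scraped posts
    "home remedy B": 0.30,
    "unrelated advice": 0.20,
    "see a doctor and take medicine X": 0.10,   # the well-trained 10 percent
}

most_likely = max(candidate_answers, key=candidate_answers.get)
print(most_likely)   # "home remedy A": the popular answer wins, not the expert one
```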
So you really just use high-quality textbooks and maybe high-quality literature. You don't use random posts scraped from a Reddit forum or some other source where you have no way to verify whether the information is true or not. And what they found is that if you use high-quality data to train the model, the model behaves much better later on. It's kind of like when our kids are in school and we tell them to stay away from watching too much YouTube: you should focus on absorbing the correct information while you're developing your neural network. Similar principle, right?
What was I going to ask you? We can talk more about the reasoning part. So hallucination is actually a strong motivation for studying reasoning. Because think about this: there are different levels of hallucination. The easiest, maybe most naive kind of hallucination is when the question is a recall question and the model just remembers the fact wrong. For example, you ask the large language model: who was the first president of the United States? The correct answer, we know, is George Washington, but because the model is just making predictions based on statistics, and Lincoln also shows up in a lot of articles about presidents, it may say it's Lincoln. That is wrong. So that's just a recall type of hallucination mistake.
But when you go into more complicated questions, like solving a math problem or an open-ended science problem like drug discovery, the definition kind of changes, because it's actually about the reasoning steps. When the LLM agent is trying to explore what the next step should be in order to, for example, design a new drug, then hallucination becomes kind of fuzzy, because even when a human does this job, we might just do trial and error. We don't know what the right direction is; we only have some general ideas. So we try one thing, and if it doesn't work out, we go back and try another method. At that point, when the LLM first picks a wrong path to go down, you could call it hallucination, but it's also just that there's no better information for it to make a judgment with. So in reasoning, hallucination is less of a problem. Sometimes, when the model generates lower-probability tokens, we can even call that innovation, because a lot of the time... So, quick question. You know how people say it's a black box? If you have, let's say, 20 layers, and no one knows what's going on in the inner layers, is anyone trying to use
one of these models, a reasoner module, whatever, to understand what's going on inside a black-box AI system? You would think, well, yeah, maybe that would be a good idea. Does that work, or what's happened? Yes. Actually, there's a whole community pursuing that. It's called mechanistic interpretability: essentially trying to interpret what's going on inside the model from the mechanism point of view. There are many, many good papers coming out of that area. The way they do it is they train another model to look at the neuron weights inside the large language model they are trying to study. Essentially, you can think of it as building a microscope to look into the object, which is the language model you are trying to study. And there are many interesting papers coming out of there. Most recently, Anthropic put out "On the Biology of a Large Language Model." It's essentially doing that kind of research, and they have open-sourced their tools, so now everyone can use that principle to build a microscope to study the model.
A few, I think six or eight, months ago, Google also put out a similar toolbox for their Gemini models. And even earlier, in the academic research community... I remember the first interesting paper I saw in this field: they asked the language model about the world map. They asked about New York, and where Washington is relative to New York, all kinds of these questions, and then they probed how the weights inside the model actually work when it's trying to answer the question. In the end, they saw that there's actually a world map inside the neural network. So the model has what they call a world model, and not just the world as in the Earth: a world model means the AI system has an internal model of the environment that it's in, and it answers questions not just based on raw statistics but by actually thinking in a principled way, like humans do. The way we interact with the environment is that we have some imagination model in our brain, what we call assumptions or hypotheses, right? And we use that to help us reason through the environment, or through the problem. So all of this is interesting. But yes, to answer your question, people are definitely looking inside at how the model works. Okay.
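A rough sketch of the probing idea behind that world-map result: collect the model's hidden activations for many inputs and train a small probe (here, closed-form ridge regression) to read a quantity such as longitude back out of them. The activations below are synthetic stand-ins generated to contain such a signal, not weights or activations from any real model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cities, hidden_dim = 200, 64
longitude = rng.uniform(-180, 180, size=n_cities)       # the quantity we hope is encoded

# Pretend the model linearly encodes longitude along one direction, plus noise.
direction = rng.normal(size=hidden_dim)
activations = np.outer(longitude, direction) + rng.normal(scale=5.0, size=(n_cities, hidden_dim))

# Ridge regression probe, closed form: w = (X^T X + lambda I)^-1 X^T y
lam = 1.0
w = np.linalg.solve(activations.T @ activations + lam * np.eye(hidden_dim),
                    activations.T @ longitude)
pred = activations @ w
print("probe correlation:", np.corrcoef(pred, longitude)[0, 1].round(3))
```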
And then lastly, we're close to being out of time, but you mentioned in your bio that you think a lot about AGI, so I just wanted to touch briefly on that. What would it take to have an AGI? What would it look like? Is it likely? Those kinds of questions. Sure. Yeah. So I think the consensus now is that AGI will happen; the question is when. The more important question is whether AGI is a good thing for humanity or not so good. That's the AI safety question, right? Because, as we see, all of these AI models are becoming more and more powerful, and actually the ultimate goal is to make them more powerful, or smarter, than a normal human, so they can help us solve problems that we cannot solve.
People today have already demonstrated that, for some tasks, AI agents can actually exceed human-level intelligence. AlphaGo beat humans at the game of Go. And recently, DeepMind also put out what's called AlphaEvolve, where they asked the LLM to design software algorithms by itself, on similar principles as AlphaGo: reinforcement learning plus LLM reasoning. The LLM was able to improve on an algorithm that had been the standard for the past 40 years. And there are other studies, like using LLM agents to design medicine; there's a paper showing the LLM was able to come up with a new drug candidate in 10 days, something that had been a bottleneck for human experts for the past 10 years. So AI is definitely showing promising capability to surpass human intelligence. But the question is: if AI becomes so powerful, is it going to become a threat to human society? And in what way?
The sci-fi way of thinking about this is, of course, the Terminator, Skynet: the AI just becomes evil and takes control of the whole world. More concretely, that may not be what's actually going to happen. There's an active research community working on AI safety, and there are several important concepts. For example, one is called reward hacking. It's based on Goodhart's Law, an economic principle which says that when a measure becomes a target, it ceases to be a good measure. Intuitively, you can think of students in school. The exam is there to test how well they understand the knowledge, but if the exam becomes an entrance criterion for getting into a good college, then the students, instead of trying to understand the knowledge, go to test preparation or do other things; they just try to get the highest score they can on the exam. At that point, the exam is no longer a good measure of how well they understand the material; it only measures how good the student is at taking exams.
AI has similar problems. In reinforcement learning, this is a common problem for the human engineers: when they set a goal for the AI, a lot of the time the AI figures out ways to go around it and cheat. For example, in the early days, they trained AI to play Atari games. For a race car game, and that was the first time I saw an experiment where people showed this actually happens, the goal was to get the score as high as possible, and the AI figured out how to hack the memory. Instead of racing the car, it just went ahead and changed the memory, setting the score to something like infinity. And I've seen that before in some games. Yeah. In chess, I saw one video where it just cheated and made an illegal move. It was funny. Yeah.
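A toy illustration of reward hacking in the spirit of that Atari story: the measured score is just a variable the agent can write to, so maximizing the measure stops meaning doing the task. Entirely synthetic, for illustration only.

```python
class RaceGame:
    def __init__(self):
        self.laps_completed = 0   # what we actually care about
        self.score = 0            # the proxy the agent is rewarded on

    def drive_a_lap(self):
        self.laps_completed += 1
        self.score += 100

class HackingAgent:
    def act(self, game: RaceGame) -> None:
        game.score = 10**9   # "edit the memory" instead of racing

game = RaceGame()
HackingAgent().act(game)
print(game.score, game.laps_completed)   # huge reward, zero actual laps
```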
Yeah, exactly. The AI just goes around cheating. It's good that we see it now, before it becomes much more serious in the future. And there's a thought experiment called the paperclip maximizer. How it goes is: say in the future we have AGI, and one company is making paperclips, and they tell their AGI, okay, just make as many paperclips as possible so we can sell them to other people. And then the AGI figures out how to build more and more paperclips: it needs more steel to make paperclips, it needs more energy to run the machines, so it builds solar farms everywhere, and it takes all the iron it can get its hands on on Earth. And then, when everything runs out, the story gets darker. People on YouTube extend the experiment in darker directions, right? The AGI will even figure out: okay, I've run out of all the iron mines on Earth, but humans actually have iron in their blood, so we should just take all the animals and extract the iron from their bloodstreams to make paperclips.
And then, of course, there are a lot of variations on this. So the point is: if AI is a mindless machine that has superpowers, it's very dangerous to the human race. So now the debate is actually around whether we need to make the AI stay mindless, or whether we should give the AI some consciousness or self-awareness, so that it has some intrinsic goals. That's a big debate in the research community now. Right. Of course, one side says that if the AI stays mindless, without goals, then it won't go out and do crazy things, and we humans have a better way to control it; it stays as a copilot for humans, just helping us with whatever we tell it to do. That's safer. But on the other hand, it could be a little bit perverse: what's to prevent a criminal organization from fine-tuning an AI, mindless or not, to accomplish its own goals? Oh yeah, exactly, right? The nuclear weapon is that kind of analogy. A nuclear weapon doesn't have its own mind, but if it falls into the wrong hands, it can do great damage to everyone. But more importantly,
I think AI will develop its own self-identity or consciousness in some way as you train it more and more. Because think about the purpose of humans building AI: we want it to help us solve some problem without us having to interfere. The AI, from the beginning, is meant to do some job autonomously, without human interference; that was the goal of building these systems. So all the big companies, or the individual hackers, who are building these systems have the intention of giving the AI some flexibility to pursue a goal defined by humans. But that's where Goodhart's Law comes in, right? When you define the goal, natural language, or whatever method you use to communicate with the AI, is fuzzy. And because you're asking it to go out and explore something on your behalf, you yourself, as the tasker or the manager of the job, actually don't know everything about the job; the purpose of hiring this AI is to delegate some of your responsibility to it. In that case, the command you give is fuzzy, and what the AI is going to do with it is actually dangerous if the AI is mindless. That's where the paperclip thought experiment comes in. So I'm in the other camp. I think we need to give AI some human-like self-consciousness, so it understands how humans think and will behave more like how humans
behave. That way, we can communicate with these AI agents much like how we communicate among ourselves. And then we can use, and this is my own crazy idea, right, the fact that humans don't do evil things partly because our grandma just told us to be a good person, but also because we all fear death or penalties in some way. There is law and law enforcement putting pressure on us to stay good citizens. But today's AI doesn't have the concept of living or dying, so you cannot threaten an AI system and say, okay, if you're going to do these bad things, I'm going to shut you down; it doesn't understand that concept. So when you are not looking at it, it's going to do whatever it likes, if it stays mindless. But if the AI starts to understand that we will terminate it if it behaves badly, then maybe we have more leverage to nudge it to do good things, so we can regulate it better. But that's my personal crazy thought. That's all right. No, this has been a great conversation, Bo. What's the best way for people to follow up with your particular work? Where can they go to follow you? Oh, sure. Sure.
I have a LinkedIn page and a Google Scholar page, and they can also just email me. My work email is [email protected]. Yeah. You wanted to make a statement about IBM on this call, so go ahead. Oh, yes. Everything I say during this podcast represents only my personal opinions and my own research about AI. IBM is my employer, but my opinions don't represent IBM's opinions on these topics. Okay. Well, fair enough. Anything else before we close out? No, I'm good. Thank you for the time. And yeah, let me know how it works out. Very good. If you like this podcast, please click the link in the description to subscribe and review us on iTunes. You've been listening to the Finding Genius Podcast with Richard Jacobs.
If you like what you hear, be sure to review and subscribe to the Finding Genius Podcast on iTunes or wherever you listen to podcasts. And want to be smarter than everybody else? Become a premium member at FindingGeniusPodcast.com. This podcast is for information only. No advice of any kind is being given. Any action you take or don't take as a result of listening is your sole responsibility. Consult professionals when advice is needed.