OpenAI & Google Struggle on Model Training, Suno's New AI Music Model & More AI News

2024/11/14

AI For Humans: Making Artificial Intelligence Fun & Practical

People

Host

Podcast host and content creator focused on electric vehicles and energy.

Topics

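The host analyzes the slowdown in AI model training, noting that the supply of high-quality data may have peaked, and explores the potential of inference compute for improving model performance. The host also covers OpenAI's effort to form a coalition to counter AI competition from China, as well as Anthropic CEO Dario Amodei's prediction for when AGI will arrive. Kevin offers an in-depth analysis of the current state of AI development, arguing that the combination of model training and inference compute is the key to progress and that scaling up training data alone does not tell the whole story. He also points out that there is enormous headroom in inference compute, which opens new possibilities for further AI development. In addition, Kevin comments on the Suno V4 music model, Apple's new AI smart home device, DeepRobotics' off-road robot, NVIDIA's robotics progress, XPeng's humanoid robot, and misinformation from a Meta AI chatbot.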

Chapters

The chapter discusses the potential slowdown in AI scaling and the differing views from industry leaders like OpenAI and Anthropic.

OpenAI and Google face challenges in advancing AI models like Orion.
Anthropic's CEO predicts AGI by 2027.
The quality of data and inference time computation might matter more than raw scaling.

Transcript

AI advancement is grinding to a halt, if you ask some of the major players like Google or OpenAI.

But Anthropic's CEO still sees AGI coming by 2027, a little over

two years away. So what is causing these different interpretations? And if everything is slowing down, then why does

North America need an AI alliance to rival China?

We'll discuss the latest AI progress, a brand new music model from Suno, which is astonishing,

and robots that can off-road their way toward... AI For Humans, everybody.

The big story this week is that AI is plateauing. That is right. We are getting information from all sorts of places, and we're going to get into why, or why that may not be true. The sorts of frontier training runs that happen in the very beginning stages, when you try to scrape data from everywhere and then make an AI model, are not working as well as they used to. In fact, there is a big Reuters article that just came out where Ilya Sutskever, the co-founder of OpenAI, who is now in a shack somewhere developing Safe Superintelligence himself, has said that these models are not delivering the results. I think,

to the public at large, going by the 42-point headlines that I'm seeing on every website, this means the AI bubble has burst and that the billions of dollars of investment are just, quick, an AI Thanos snapped, and it's all gone, right?

That, everything, is exactly right. AI Thanos is in our house. He's stealing from our cabinets. He's opening the door, taking the oatmeal that I buy, and now walking out the

door. He went through galaxies and different dimensions, tripping through time to get these Infinity Stones, just so he could steal your Quaker Oats.

He's a big fan of Quaker Oats, maple and brown sugar, and he can only get that in my kitchen. So anyway, Thanos. So first,

I want to have it said that that is what you would think if you were reading these headlines, because the big ones are saying that this superintelligence, a technology that is as important as electricity, is slowing down. And by one metric, it might actually be, Gavin. That's right. So I do want to follow up.

There are a couple of other stories. This is not just the Ilya story. There was a story over the weekend where The Information quoted a couple of OpenAI insiders who said that they are also seeing a slowdown in the same sort of thing.

So what they are seeing, just to be very clear, is that the initial training runs of AIs, meaning the data they train the initial frontier model on, are not delivering as much improvement as they had in prior training runs. There was a thought at one point that the more money and more data you threw at these initial training runs, the more they would just keep scaling with this kind of very high hockey-stick growth. And what these stories are saying is that the major AI models are not seeing that as much. But this is not all bad news, right? There's a silver lining to this as well.

Yes, there is a silver lining, which we will get to. But to put a point on it, Ilya Sutskever said, "The 2010s were the age of scaling. Now we're back in the age of wonder and discovery once again. Everyone is looking for the next thing. Scaling the right thing matters more now than ever."

And so again, the hypothesis was that scaling is all you need, because back in the 2010s it did look like you could just throw more at it: grab every Reddit comment, every blog post, yes, bring that kitchen sink over here, Elon. Throw it all into the model, because what comes out is intelligence. Now we're seeing diminishing returns, so maybe the quality of the data and what you do with that data at inference time might matter more. And I think that is the silver lining you are driving at, Gavin: if you scale the time you take to compute the output, there are plenty more gains still to be made.

That's right. And I think that's exactly the silver lining I am talking about. The argument people are making out there right now is that we might just be running out of quality data, and there's only so much data you can find that is actually worth training an AI model on.

And we may have just hit that peak. There's a lot of talk about synthetic data, which I know is a weird term that might scare some of the normies in our audience, but the idea for a while was that the computers could make up their own data, and we're not really sure where that lands.

But to Kevin's point, inference compute, which is the idea that you throw compute at something once it's been trained and then use something like OpenAI's o1 reasoning model to continue to think about that data, is something that we continually hear is scaling pretty significantly. So much so that Noam Brown and other people at OpenAI have continually discounted this idea that scaling is slowing down, because in their mind scaling is both training and inference, kind of together.

Now we are in a world, like I said earlier, where the amount of compute that's going into pre-training for things like large language models is very, very high, but the inference costs are very low. And there was a reasonable concern among various people that we were going to be seeing diminishing returns from AI progress, because the cost and the amount of data that you need for pre-training would become so astronomical. And I think the really important takeaway from o1 is that that wall doesn't actually exist, that we can actually push this a lot further, because now we can scale up inference compute. And there's so much room to scale inference compute.

There's another story in The Information, which is covering this beat pretty well, that o1 is looking to launch before the end of the year. Now, a lot of people had been hearing rumors that o1, the full o1, was going to be coming last week or the week before, but before the end of the year is better than never.

We mentioned on this show that we have not extracted the full capabilities of even the o1-preview out there, because we use it to do things like imagine futuristic snacks or pit animals against each other in a death match bracket. I think

that was your usage.

If you're listening to this or watching this and you haven't spent time with o1, it does feel different when the AI takes a beat and thinks through the problem. You will see the results improve in some remarkable ways. So we know they have this one in the hopper. We know this one is coming out.

What happens when they apply these reasoning tactics to whatever the next model is? This rumored Orion model, where maybe we are saying, oh, we thought that would be a 10x improvement, but it's only a 6x improvement with just raw scaling of the data going in and enriching it. Okay, fine. What happens when you apply that test-time compute, that inference compute, so it can think about the result? There is still plenty more in these models, the ones that we have right now, the ones that you and I have access to. There's plenty more in there that we haven't extracted yet.

The other thing that's going on right now, Kev, that is on the opposite end of the spectrum from this, is that OpenAI is putting together a coalition so they can prep the U.S. government with all the AI data, knowledge, and tools possible to essentially lead what looks like a global fight against China for AI supremacy. When you read the story, you're like, that's interesting. It seems to echo some of what we talked about last week, but this is just getting more serious.

Every week we talk about these models needing incredible energy, right? There is going to be a lot of energy consumed to train them and then to run them. So we might need to return to nuclear.

Well, how are we doing that? We know Microsoft is going in that direction. Anthropic is probably going to go there, as Amazon has new chips coming out. They're trying to build new data centers closer to power plants,

so there's less transit time and energy loss. And so when you start thinking about all these plates that are spinning, it does make sense that there would be a unified front between these companies and the different branches of our government. This article name-checks getting the Navy involved because of their expertise with, you know, tactical nuclear reactors, et cetera. And it's just like, none of this will be

real until you and I are in front of Congress and we're testifying. So the minute somebody in this audience sees one of the two of us sitting in a suit, sweating bullets in front of Congress, that's when you know this has really gotten real.

I've got a couple of text message threads with the bros, and one of them is called Exhibit A. I thought that would be the reason I was testifying. The memes are so dank on that one. But no, it's gonna be this. Yeah.

That's right. The other thing that's interesting about this: Dario Amodei, the CEO of Anthropic, went on Lex Fridman's podcast for a five-hour episode. Dario, I find, is an interesting talker about AI in general. Obviously, Dario was at OpenAI for a long time and now runs Anthropic. He echoed a little of what we heard from Sam Altman last week, instead of talking about how their companies are slowing down. And again, we've said this before: all of these companies have a big, big interest in making sure the checks keep coming in. But Dario does see a world where some version of AGI could come by 2026 or 2027, which is a really

short time from now. There will be like a zillion people on Twitter who will be like, "AI CEO said 2026, 2027," and it will be repeated for like the next two years. Yes, it will. Sorry, yes, it will. Like, this is definitely when I think it's going to happen.

So whoever's excerpting these clips will crop out the thing I just said and only say the thing I'm about to say, but I'll just say it anyway. So if you extrapolate the curves that we've had so far, right, if you say, well, I don't know, we're starting to get to PhD level, and last year we were at undergraduate level, and the year before we were at the level of a high school student. Again, you can quibble with at what tasks and for what. We're still missing modalities, but those are being added. Like computer use was added, like image input was added,

like image generation has been added. If you just kind of, and this is totally unscientific, but if you just kind of eyeball the rate at which these capabilities are increasing, it does make you think that we'll get there by 2026 or 2027. I love that. That is the same tactic I use when making any dessert: just, ah. Yeah.

I eyeball it. The fact that Ilya said that we're slowing down on the training data is the thing that I think is real. But also, Ilya has his own pathway to what he wants to do. I think the thing that's really interesting here is just that, obviously, Dario and Sam both believe we're on the path to something significant. o1 is a path to something significant versus what we've seen before.

And I think you have to understand that no matter what, as you said, even if we stayed still, even if there wasn't any improvement in the AI models, what we're looking at is a world where, for the next ten to twenty years even, people could get better at making things with this tool. And that's the important thing for most of our audience out there. You're going to hear "slowdown." That's not the idea that we're moving away from a hype cycle or something like that. This is just the idea that one pathway might be slowing down, so it's not exponential gains where we're suddenly living in a future where fried eggs come out of our hands and we can just feed ourselves eggs all day long. We may not be getting there as fast as I would like, Kevin, but at some point we'll get to the fried-egg universe, dude.

Is that something that Iron Man's repulsors could do? Could he?

Actually, you could fry an egg on the repulsor. Actually, that would be a really interesting Iron Man use case, right? Like, do you use Iron Man much, Kevin?

And yeah, there we are. Dario did say, by the way, with this prediction, that some of those who like to scoff at these timelines would say, "But what about...?" And that could be everything from running out of training data to a chip shortage. He even mentions a war, or Taiwan being erased off the map, as something that could happen.

And even in the face of those hurdles and those potential pitfalls, he still thinks 2026 or 2027 for AGI, which again, is just another inflection point. It's not like that moment hits, we wake up, and yes, we've all got the fried-egg palm capabilities.

It does set us up for the next few years of development. When these machines get so capable that they can go off for weeks at a time, self-supervised, to accomplish a task, we might see these AIs spinning up businesses and running and managing them on their own. And the beginnings of this could be a year from now, conceivably.

That's right. Maybe even more importantly, Kevin, in this interview we got the admission that AI naming is terrible and that they made a mistake in naming. I don't know that he actually admitted the thing, but Lex Fridman did ask him why Claude 3.5 Sonnet (new) is called that, and not just 3.6. To me, this gave me great justification. Everybody who knows this podcast knows that we've been talking about the terrible names

that AI companies give their models forever. Right, if these machines are so damn smart, why can't they name themselves better? That should be the

benchmark, the product. Number one: get better-named AI. And you know what else, Kevin? Another important thing that everybody, really everybody, should know is that you should be subscribing to AI For Humans on YouTube. You should also be sharing and listening

to the podcast on audio. Imagine I have a car with its hazards on. I want to see which of you is going to pull over fastest to lend a hand, right? You've got a spare, you've got love.

And well, we are the car on the side of the road with the hazards on, all right. We have a little cone, maybe a little flare on the line. But you're whizzing by right now, and I can feel you fast-forwarding this podcast or clicking through this YouTube video. Do not. Pause it and share it with a friend. It's literally the only way we grow. Be the car that dares to pull over on

this information superhighway, just for you. Okay. Now you've paused and you shared it at that exact moment. I'm sure we're going to get a thousand people unsubscribing, but thank you so much, everybody. Kevin, I want to jump into Suno V4.

Really cool.

Suno, if you're not familiar, is one of the top, if not the top, AI music engines. V4 is not out yet, but my friend across the way has gotten early access. One hundred percent.

Thank you to the Suno team. Hashtag not an ad, but they did give me early access to V4, and I poked around. As predicted,

Gavin, line go up. Whatever your perspective, I don't know if my video is flipped in the mirror, but the line goes up, right? Over time, these things get better. Nothing different about this here.

V4 sounds better across the board. Voices are clearer in the performances. The instrumentation is well defined. The mastering, the EQ of the actual results, you can hear it.

And so just like you would generate any other song, you hit a button and it spits out two V4 tracks. They come out really fast. You can start streaming them almost immediately.

And you'll hear it from the two different versions. It's like they're playing with different EQ behind the scenes. That is, the way the highs, the mids, and the lows, the actual frequencies of the music, are balanced. That's what the EQ is.

I guess they could be made of kielbasa, which is pork, but sometimes it's beef and veal, whatever. Shut up.

And when you hear it, it just sounds better. It sounds like we've hit that point now where there's audio I'm getting out of Suno that I would listen to, and have been listening to.

And it works with everything else that exists. You can remaster songs that you've already made with Suno. So if you've experimented with this tool before, let's say you've got a hot dog city power ballad, or you have like an "AI For Humans,"

"AI For Humans" in Hollywood, eh? Or you've got a country ballad about taking a dump. That's something that I made a while back. I can't wait to hear that

in the new version.

You're gonna love the Dolby Atmos version of that. I think when I hear it, what really comes through is that there was always something off about the songs when they came out, and that feels like it's mostly gone now. That is the biggest improvement

in the way that the AI... And it's fun to go back and remaster the older songs, Gavin, because what used to be muddy or tinny or a rattle is something it has to make a decision on, because the new model is that good.

So it has to go: is that reverberation from the singer, or is that a shaker? Is that a guitar bleeding into the piano, or are those two separate instruments? And so when you remaster stuff you've already made, you get to hear the AI, this new model, making a decision about whether that's a voice or a tambourine or whatever.

And you hear the way it shifts. And again, the vocals are crisp and on point. The bass will actually thump. There's real low end here.

I'm so excited to share some samples of some original stuff and some covers that I did. It just sounds really, really good. So headphones off, which is the hats off in the

audio world, to the Suno

team. A tip of the AirPods. A wave of the AirPods in your direction.

You know, Kevin, I want to be able to play my country-western dump song while I'm in my own bathroom with my new Apple device powered by AI. This is what we're talking about: Apple has a new device coming out, according to Mark Gurman at Bloomberg, who is a big Apple reporter. He often breaks big Apple news ahead of time.

He has talked about the idea that there is a brand new device in the works that is meant to compete with Alexa. It is going to be the size of two iPhones, essentially kind of in between an iPad and an iPhone, but it's going to be voice-operated and driven by, essentially, their AI software, Apple Intelligence. I'm kind of excited about this. I'm also kind of not, because I'm not super thrilled with Apple Intelligence.

So first, that to me is the biggest thing here: we're announcing a new product for a technology that we have not refined, that hasn't drastically shifted any user habits whatsoever, and that hasn't provided an immense amount of value in any ecosystem where it already exists. But we're going to have a premium version of it too. According to the rumors, the premium version, Gavin, might be mounted on a robotic arm that will follow you about. The rumor is that they're going to have a dock version and a robotic arm version. Like, that's great, but I don't need bad technology following me throughout a room. They've got to have a hit with core Apple Intelligence before this product matters.

Well, there have been a couple of stories on The Verge about those Apple Intelligence notification summaries, which I think we can all agree are pretty bad. We talked a little bit about it last week, but just getting summaries of like three emails, when half of my emails are spam or in some way things I signed up for that I'm still trying to unsubscribe from, doesn't work that well.

People are writing about the summaries, and not for the reason that Apple wants. I have to share one. Andrew tweeted this out: the text from his mother was "That hike almost killed me," but the Apple AI summary was "Attempted suicide, but recovered and hiked in Redlands and Palm Springs."

Yeah, thanks, Mom. Thanks for bringing that into my life, Mom.

Is there going to be a new product category for AI-enabled all-of-the-things? I'm not convinced that this isn't just like a Belkin holder for an iPad. I don't know what is different about this.

They mentioned that it's going to run this hybrid OS that will, for example, Gavin, display the temperature of the room from afar. But then, as you approach it, it recognizes that you're near, and maybe it changes to a control panel where you can adjust the temperature, right? So that's intelligent distancing. Okay, fine. But could we not just do that with an iPad Air and sell a special mount?

Well, it reminds me, we have one of those Alexas. Alexas are the thing I keep wanting them to put AI into, because we still have fifteen of them in our house or whatever. Not fifteen, but at least seven of them in our house. And I would just love to be able to ask an Alexa a question. And having it be

powered by Anthropic would be... Did you get visited by Johnny Alexa or something? Now they're everywhere in your ecosystem. Because honestly,

like, we will use it as an internal intercom. The things that we use Alexas for are actually intercoms, so we have a way of talking to people in the family, or we use it for playing music, or we use it for timers. And those are literally the only three things. You cannot ask a legitimate question and try to get an answer out of it.

So if there's a way to make this happen in some way, great. But the thing I was going to say is, we have one of those Alexas in our kitchen that has a screen on it, and I swear to God, its main job is to deliver us advertisements in our home, so that we now have to look at a fricking ad for something. It does nothing else, because it's not like you're going to say, "Hey, show me how to do this," because the system doesn't work well enough. So I'm not one hundred percent convinced that a screen-based home thing makes as big a difference as a voice one actually would. Because having a voice thing, if you're okay with being listened to, which I know not all people are, but if you have the thing throughout your entire house, you can say, like, "Hey, blank,

give me this information," or "tell me something." That feels super valuable. Okay, picture mountainous, rocky terrain where men have trouble traversing, let alone machines. And you want to adjust your thermostat, but your smart screen is miles away at the base of that mountain. So you have to run to your window and scream, "Hey, Alexa, please adjust my temperature." And the robot comes screaming up on all four leg-wheels, does a parkour move up the hill, and runs up to you. Now, does that sound like an exciting future?

It sounds exciting until I figure that that robot might have a flamethrower on its back, and instead of coming for the Nest it's coming for me, coming for you. This is a video from Deep Robotics that just came out, of an off-roading four-wheel-legged robot, again coming out of China.

We've talked about this before. China is lapping America in a lot of ways on robotics, mostly because they are doing it at scale and they are moving very quickly.

If you are only listening to the audio of this, whatever you are imagining from what Gav is going to say, it's not it. It's crazier than that. Sorry.

That's okay. So in this video, you're seeing a robot basically jumping downhill backwards, kind of catching itself along the way, and balancing itself. It looks like an athlete in a lot of ways to me, an athlete that is moving down completely uneven terrain with dirt and a whole bunch of other stuff, and it's doing it very well.

I mean, the thing that always shocks me about these videos, Kev, is, you know, the Boston Dynamics videos, the BigDog videos, came out ten years ago, and now these videos come out and it seems like they come out of nowhere. But this has been worked on for a while, and this one is just another step in that direction. You're like, at some point these things are going to be way more capable than we are at moving in the world.

And this just goes to the point again of AI slowing down or not: if this sort of thing keeps happening, we are not very far away from these robots being everywhere. I think ten years from now, there's a real world where, you know, you hear this thing about robots outnumbering humans. In ten years, that is very possible if these things can do the stuff that it looks like they are doing.

Now, this stuff is coming. So look, for search and rescue, it's really fantastic to deploy one of these and let it go. Or a post-apocalyptic Grubhub.

Like if you live off the grid. Or tricks, right? Imagine the X Games going forward. What can they do with the skateboards? Imagine the robot does a 3,600 flip. Now we're talking. Now we're talking entertainment, baby.

Then throw a no-scope shot at the end of that, and we've got robots doing a weird Ninja Warrior, Call of Duty thing. And that is the Skynet Olympics.

Now what's going to happen is, once we're all being hunted, it's not going to be enough to hunt us down.

They're going to trick-shot us.

It'll be like the Fortnite clips right now, except it'll be us, and the robots will find us like

six miles away. "Bro, look at this: 720, one-arm wheelie, no-look kneecapper. Bang, bang." Wow. I mean, that's sick, but I do need a tourniquet.

What else? Are there other robot updates this week? We might as well talk about these things.

Yeah, you know, a while ago NVIDIA announced a whole suite of AI advancements for robotics, and we're starting to see some of that come to fruition. Project GR00T was one of NVIDIA's initiatives, and they unveiled some updates. Now what we see is a robotic arm with like a thumb and sort of three grasping fingers.

And you see this arm go and grab different objects of different sizes, some with corners, some rounded, some squishy, et cetera. But they trained this arm completely in a simulation and then deployed it, basically, to the real-world arm. And yeah, it says it includes environment generation with 25,000-plus 3D assets, motion learning, and advanced dexterity training. So they're just throwing model after model into the simulated environment.

And I think that's just an important thing. Again, from the top of the show, you hear people saying AI is slowing down, but AI is now not just LLMs or ChatGPT. It has spread to a vast variety of products, and robotics is one of the ones to really keep your eye on, because this is a place that is going to move very fast.

Another company, Gavin, called XPeng, showed off a five-ten humanoid robot that weighs 153 pounds. So, ah, this is my weight class. I will be fighting this in the future. I need to learn how to do the robot,

but I might need to defeat this one.

It remains to be seen how autonomous this thing is and what sort of AI and sensors it's packed with. But it's yet another dramatic unveiling of another humanoid robot, coming from some other company, that might be hunting us in the near future. You know.

I don't love that this one, Kevin, is five-ten. We've talked about the heights in the past: these robots have been inching up, slowly, higher and higher. And five-ten is getting awfully close to six foot, which I believe is the crossing point. When the robots get to be six foot, then we're all screwed. Because up until now, five-seven to five-nine has been the sweet spot for robotics. And I think they understand they can't get too tall, because they're going to start forcing us, you know, mainly men in the world, to really say, "You can't do this," and we're going to push down.

Finally, just so everybody knows, that was a joke.

I want to be clear.

It might not be the six-foot-tall robots that try to kill us, Gavin. It might be Meta's AI chatbot.

Because that might be doing that sooner rather than later.

They might have already done that. We have to talk about FungiFriend. It was an auto-generated Meta chatbot. If you haven't been using Facebook or Messenger products, you might not know this,

but if you're in group chats with people, sometimes Meta AI will just decide, hey, how about an AI agent in your group to answer questions that went unanswered, or to poke the group with a stick to try to keep the conversation going? Well, FungiFriend was apparently a custom AI chatbot that got inserted into a group of foragers, folks who like to go out into the woods and forage for fungi. And the fungi AI gave them instructions on how to cook and prepare mushrooms which are in fact deadly to human beings.

This is the danger of misinformation that is now being provided by AI itself, right? And this goes back to the idea of, gosh, these are not really trustworthy things as of now. I know hallucinations are something that people have been talking about for a while, and we have to kind of make them less frequent.

But please do not trust the things that AI says, because right now it is untrustworthy. And this is an important lesson. My wife and I were playing Scrabble last night, and I decided, I'll use ChatGPT Advanced Voice to tell us what words we can make with these letters, and it gave me four words.

It basically said, "Here are four words that you can use with the letters you have: flog, foil, folio, jolie," which it said is an old word meaning jolly. And then I said, "What was the first word you said?" And it said, "The first one I mentioned was flog. However, I realize now that might not be a valid Scrabble word.

My apologies for the confusion." And I said, "What is the definition of flog?" And then it said, "Oh, actually, flog isn't a valid word in Scrabble, and I don't even know that it has a recognized definition in English."

So again, just don't trust these things. Don't trust them. Yes.

You know, the stakes of the Scrabble game are a little bit lower than, "Hey, I'm in the woods. I snapped a photo of a mushroom. Is this going to kill me?" And it goes, like, "No, actually, you should put some

touching around it, that will be here."

Yeah, it's a big tummy ache. No, no, but that was the point of this. And shout-out to 404 Media, which had this article, where people in the mushroom group are saying this is really dangerous. If you are in a group of experts, or even amateurs, and you plug in an AI like this, their first interaction might be with a machine that gives them really poor advice, and this could be something where a life is on the line. So stay away from the flogs there. I know they're bioluminescent and everybody wants to nom, nom, nom those flogs. Deadly, though.

It would've been, like, a triple word score, though. Anyway, let's talk about some of the things that we saw this week that we thought were very cool in the world of AI, that we didn't get our hands on precisely, but we were excited to see. It is time for AI See What

You Did There. Then you stop.

All right, this will be a quick one. Let's first start with this video that we've seen going around. This is just a very cool, awesome use case of watching a teacher bring forth the power of AI to their students.

AI indoctrination. It's actually the worst video ever. No one should be celebrating this. In fact, this is why the Department of Education, bye-bye. OK, did that take it too far? This is a teacher asking their students what they see themselves as when they grow up.

What would you like to be, Gav? A baseball player, an airline pilot, an astronaut. And they used AI to imagine the children as adults in those roles, as veterinarians, as cosmonauts, et cetera. And it's hard to hate it. It's absolutely adorable. Some people are against cameras in the classroom. Okay, whatever, you know, can we just enjoy one thing for a second? Can we enjoy the smiles on the faces of these children who have no idea that AI is gonna take their ability to even have a job in the future?

I will say the one thing about that, which I think is really important and is why AI in education is very important, is that with every kid who sees this, you look at their faces and they are very joyful about it. But they will also be curious about how this is possible. It is like when I was a kid and I took, this is how old I am, but I was in a computer class.

I took Logo, and Logo was a computer programming language built around the idea of making a turtle move in a square, right? But the fact that I made that turtle move by writing some code was a very cool thing. So I just hope that this sort of thing gets more integrated into classes, especially in grade school, in middle schools. I hope that they start having AI classes where kids can learn how to use these tools, because they are gonna be a big part of our future. So, like, in the same way that probably for you, Kev, there were internet classes in high school or in middle school, where you started to learn about

what you do with the internet. That's why you and I get asked all the time to speak at elementary schools and colleges alike. And when we show up, especially to the younger kids, and you go, look, you just ask it to write the paper for you, you don't have to read The Catcher in the Rye. That is a transformative moment. It is just a privilege for us to be there for those things.

Even that, Kevin, is inaccurate. One, we are not doing that in schools, and two, we have not been invited to high schools or middle schools.

But we would do that. I think we would have a lot of fun doing that. Like, getting kids onboarded in this space is a very cool thing. And I think overall it could be awesome.

Agreed. Now, one thing that is definitely awesome, Gav, is X-Portrait 2: Highly Expressive Portrait Animation. This is coming out of ByteDance.

You might know them from such hits as TikTok and government takeover of algorithms, but we love technology that puppets avatars, especially if you can feed something a two-dimensional still and have it come to life. And these examples are stunning. I hope this is real.

It's unbelievable, actually, because we've talked about Runway Act-One, which is now the kind of mainstream version of this within Runway's product. This is the next generation of this. And when you look at the videos of what they're able to do, again with a single still, it has gotten insane. And I do think this kind of lends more credence to the idea that, like, anybody can act as anybody else, whether it's for deepfakes in a bad way, or to create some sort of, you know, really interesting AI film or AI show. All of this stuff is coming in a very realistic, very interesting way.

And just, in case you're not watching the video, what makes this so amazing: high-speed movement, characters quickly moving their heads, like whipping their heads from side to side, chin to shoulder and in the opposite direction. And it doesn't turn into a blurry mess. Wide mouth, tongue coming out, things that other models just don't even attempt or will break if you try. This is really, really impressive. And, like, I don't know why you giggle, because I'm describing these things, Gavin. Is it especially giggly?

Mostly because, like, I'm kind of shocked by the idea that this is what's possible now. And the other thing that we want to shout out next is that somebody out there took Runway Act-One, which uses a similar sort of technology to this, and ported it over The Polar Express, the incredible Polar Express, which I have my love to have.

I love this. The Polar Express was a fine movie, but it was early motion capture, and all of the kids especially look kind of dead inside in their eyes. And what they did here is use Runway Act-One. This is kitani, who has shown up before and is in our Discord, who did this.

It's a very cool way of just showing how off-the-shelf technology could remaster something that, say, twenty to thirty years ago was just brand-new, groundbreaking technology. And even then, it didn't look amazing.

But it was really shocking for what people saw. Now you can use this, like, literally off-the-shelf tech to kind of make something better. And it's not all perfect, but it was just a cool way of showing how AI can improve something older.

Awesome video. k. Goni, thank you for sharing that. And also a quick shout-out to Google, because they don't get enough love, Gav, and we want to throw some flowers their way. They presented ReCapture, which is a method to regenerate a source video with all its existing scene motion, but from vastly different angles with different cinematic camera motions.

So what does that mean? If you've got a product video and you filmed something, but it's not the exact angle that you want, you can readjust that now. Yeah, the code and the model have not been released yet. This is just a paper with some examples as of the time of our recording. But again, this is just future-of-filmmaking stuff, as we talk about it all the time. Like, oh, cool, you got your character but it's not the right angle? All right, just run that other tool and fix that problem.

And the thing that I think people in the film and video world may not fully get is that not only are you talking about AI kind of making up content from scratch, essentially, with these things, but now you're talking about being able to set camera movements, being able to retake shots, being able to do stuff with the things you have. All of that comes together into a pretty significant tool set in the next, probably, two to five years max.

If you hate the fact that we only yap about this stuff once a week, by the way, Gavin, quick shout-out to ourselves: the AI For Humans newsletter releases every Tuesday morning. It's absolutely free for all of you. Go to aiforhumans.show.

You can sign up there. The line goes up there for us, which is amazing. So try that out. You may like it. If so, please share that with your friends.

And also, in just a few weeks' time, we're going to be doing a Q&A special for Thanksgiving, as you and I give thanks to the audience, that small but proud legion of AI For Humans' actual humans out there. So if you have questions for us that you want answered, there'll be a link in the newsletter. You can leave them on our Patreon.

You can join our Discord and leave them there. But we just want to know, what are your questions? What are your concerns? Your comments, even. Leave them for us for when we get ready for the Thanksgiving episode.

Yeah, people have been asking about an episode like this for a while, like, kind of an FAQ episode. And this is your chance. We want to answer a bunch of your questions, and no question is too dumb.

Really, like, the goal of the show is to inform and educate. If you have a question about, like, how does ChatGPT work, we'll help answer that. So, like, just get us questions. We're going to try to do it. Oh, you have a question?

I have a question, Gav, but I'm worried that it might not be the right question. Is it OK if I ask it?

This is a safe space. It's a safe space, okay? As long as it's not about something that's going to get us cancelled. Fine, Kevin.

I don't know. It depends. What did you do with AI this week? Kevin? Well,

let's get into this. I actually spent some time with a new update from a company that's called Vidu. And Vidu is a company I think we covered a while back, but there is a 1.5 version of their model.

And you can go do this right now. It is free to try. And what they've done is create a multimodal AI video model. And what this means is you can take a picture of someone, of yourself, of somebody.

You can take a location, and you can take, like, an outfit, and you're able to put each one of these things together. And then what it will do is make an AI video of those three things mashed together. And Kevin, it's not, like, perfect, but it actually worked pretty well. And this just goes to show, like, we're starting to get more control over the thing. So I made some examples

if you go to somebody OK. So let's go through this.

Right. So first and foremost, I wanted to use a picture of you, just like, oh, this is interesting. So when I did this, I took a picture of Kevin, just the face, a headshot.

I took a picture of a guy in a construction or garbage worker's, like, kind of yellow vest. And I took a picture of an empty cubicle range, and this is Kevin walking through that. Now, the face is not perfect, but you can see at certain points it gets a little closer to you. A little beefier version of you, I think, is what I would say.

Right. I was going to say, everything about this actually is perfect, how dare you? But the source for this, Gavin, was just you giving it three stills? It was, like, just three pictures?

Three pictures. Pretty cool. So then I took you and replaced you with Guy Fieri, just literally, as I always do. I just replaced the picture, and there you can see Guy in the office. Then I flipped it, and I took Guy Fieri out of the office and put him at a fancy dance ball,

which was so good.

Then I took Guy in the outfit and put him at a sports game, like a football game, where I said, hey, raise his arms and, like, make his belly show, which it did. And then I took Guy Fieri in a tutu, and I put him in that sports game as well. So you just get a sense of how you can just change one small thing.

These are pretty. It's not the best video model, but a fun application. You can immediately imagine what Runway's implementation of this is going to be, or Luma or Kling or any of those things. Like, put yourself in, here's the outfit that you want on your superhero.

Now tell us the environment.

Are you driving it with text? Are you saying "and his belly pops out"? Right. So, like, that does show some of it. I think, to your point, Kev, the interesting thing here is, like, when a company like Runway says, oh, we can find a way to integrate this, if you put this plus Act-One together, which is the ability to drive facial animations and to create words that make the faces that way, plus Gen-3 generations, you start to see, like, a really entire suite of AI video products that could bring forth something pretty incredible,

pretty fast. There was a minute there where, for still generators, you could click on any point on the image and sort of drag it, like the nose of a kitten, and move it, or the arm of someone. And doing that was a little wonky, but it kind of worked.

And when I see this, I go, yep. Pick your character's face, pick the body, pick the outfit, select the scene, hit generate, and then go in and just kind of click and drag and fine-tune where they are posed in that. And then, as we just saw up top with AI See What You Did There, adjust the camera angle as well. And just the ability to generate all this stuff is mind-blowing. And the wholesome

thing is you can go try for free. You get like, I think three generations. You don't get a lot for free, but you can go get three generations right now.

The link will be in the show notes to go check. But OK, I want to hear more about what you did with Suno, because to me, this is your sweet spot, right? Like, you are a musician. You have a music background. I always love when you play around with Suno, because you're able to kind of get a sense of, like, what this thing can actually do. Tell us a little bit about what's possible here.

So this is a little GarageBand something that I plinked out in a few minutes and long abandoned. But you will hear a little bit. It's a little country-western, twangy thing. There's some steel guitar.

A bit of a vibe set, right, but certainly not a song, and nothing I spent time designing the instruments for. But yeah, I took that, put that into Suno and said, hey, let's remaster it using v4. Let's turn this into something new.

Then you can take it, and you can get stems, where it separates the different instruments and the vocals, if there are any, and arrange them in GarageBand, Logic, whatever your workstation of choice is, and then put those back into Suno with the arrangement and say, great.

Now take that and run with it.

When you're doing that, how did the guitar solo appear? Did you prompt for solos, or did you... Yeah, I took arrangements of things

that I liked and would make a very basic something and then feed it back. And here's the crazy thing about it: you can remaster a remaster. You can go deeper within the songs. So there are pieces in this thing that are, like, the third and fourth generation of the same thing.

You're basically taking the same seed and regenerating it. And so yeah, just iterating, playing, but asking it for a solo and letting it go nuts. And we arrived at a point, Gavin, where I sent remixes of friends' songs to them, accomplished musicians.

And one was, like, immediately, wow, I want to record that version of that song. I can't wait to make that. That's amazing. So the machine inspired a capable artist to do something. And then another friend, I deeply disappointed. They were disillusioned

that they were not .

too pleased. Not specifically with me having done it, you understand.

but could happen. Yes, yeah.

Yes. And that is, once again, straddling this double-edged sword. We are sitting on the fence, this very sharp fence. It's tough because, like, it got me excited. I want to get back on my drums and play along with the thing that the machine gave me. But I also don't fault anybody for saying, I don't want to pick up sticks or a guitar ever again.

Both realities can exist. And I think what's interesting about this particular tool is that, you know, it's one of the tools that's providing a lot deeper ability to manipulate things than there was before, right? And, like, it's not just prompt to get a song; you can do stuff with that.

So I would encourage musicians who are curious about this stuff, and I would hope that Suno, maybe they've got a program like this, but, like, you know, they should start some sort of program where they're going to give musicians free credits to play around. I know everybody can get some free credits on Suno, but some of these products, like the cover song feature, are paid tools, everything like that. They would be well served by getting a lot of actual musicians trying these things. At least I know they've got Timbaland involved in the company. But, like, that's a cool way to do stuff, to say to the musicians, hey, try these things, maybe they're useful for you. And then you can start to open the door in the same way Adobe has, I think,

to artists. And just to level-set on the rate of progress: again, we found them pretty early on, a change ago. And from "there could be something here" to now, we have basically CD-quality sound generation happening, where you can separate the stems, you can style-transfer your own voice or instrumentation, you can lock the sound of a singer or of a band and then style. All of these features are coming online as the sound models are getting better.

Like, I do not know where we will be this time next year, but the notion of a 24/7 procedurally generated radio station where you can talk to an AI and say, I'm about to go do a workout, make this more heavy metal and driving, and it does that, you know? And then you say, I'm in my car, I need to chill out now, and it style-transfers that. We literally don't know where this is heading, and it's so exciting.

Yeah, I'd say the good news is, if you're listening to this podcast, you are somebody who is at least aware of where AI might be heading. So keep listening. Keep watching.

Thanks, everybody, for being here. We really appreciate it, as per usual, and we will see you all next week. Baby.

bye bye.
