This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Will OpenAI run away in the large language model race in 2025?
Or has Google already caught them? Or maybe Anthropic's Claude will wake up from its late-2024 nap of not too many updates, come back, and retake the spotlight. We're going to be talking about that today, and a lot more, on Everyday AI.
What's going on, y'all? My name is Jordan Wilson, and welcome to Everyday AI. Before we get started, I have to give a quick shout-out to our partners at Microsoft. Why should you listen to the WorkLab podcast from Microsoft? Because it's the place to find research-backed insights to guide your org's AI transformation. Tune in now to learn how shifting your mindset can help you grasp the full potential of AI. That's W-O-R-K-L-A-B, no spaces, available wherever you get your podcasts. Another place you can get your podcasts? Well, right here, on our website. So if you're new here, thank you for tuning in. My name is Jordan Wilson, and this is Everyday AI. We do this every day: your daily livestream, podcast, and free daily newsletter, helping us all learn and leverage generative AI to grow our companies and our careers. You could spend, I don't know, hours a day trying to keep up, trying to see what it all means. Or you could let us do that. Tune in every day, subscribe on the podcast, and go to our website, youreverydayai.com. On there, it's like a free generative AI university. Hundreds, hundreds,
of episodes. You can go back, watch them, listen to them, read all the important insights on our website, all for free. So that's your new home away from home. So make sure you go check that out. All right, before we get started, now I'm excited today to talk about the large language model race in 2025. Not quite a prediction show. Going to have that coming for y'all in two weeks. Been spending like dozens of hours on that one. All right, but before we get started, let's talk about, as we do almost every day, the AI news.
So Microsoft has announced a $3 billion investment to boost AI and cloud services in India. This significant investment highlights the country's growing importance in the global tech landscape, particularly in AI.
The company aims to train an additional 10 million people there in AI skills, which could enhance job prospects and career growth for many in the tech sector. CEO Satya Nadella emphasized the exciting diffusion rate of AI in India, indicating a strong market potential for AI technologies.
Microsoft operates three data center regions in India already and is preparing to launch a fourth aiming to develop a scalable AI computing ecosystem for startups and researchers.
All right, next in AI news: Google is reportedly building a new AI team that aims to simulate the physical world with advanced models. Yeah, more world model updates. These are important. So Google is making headlines with the formation of a new team at Google DeepMind focused on developing AI models that simulate the physical world. Here's why it's pretty noteworthy: the person leading it is a former OpenAI employee. Tim Brooks, formerly a co-lead on OpenAI's video generator Sora, will lead the new team, which aims to tackle
quote-unquote critical new problems in AI modeling. The team will collaborate with existing projects such as Google's Gemini, Veo, and Genie, enhancing capabilities in image analysis, text generation, and video production. Google's Gemini series is already recognized for its versatility across AI tasks, while Veo focuses on video generation and Genie simulates games and 3D environments in real time. So
this development of world models could revolutionize different sectors, including visual reasoning, simulation, and interactive entertainment, potentially impacting how video games and movies are created. All right, last but definitely not least: you're going to be hearing a lot more about CES this week, and CES is off to a big bang. The biggest tech conference kicked off hours ago with NVIDIA CEO Jensen Huang delivering the keynote address. Like I said, we'll be covering this a lot more in today's newsletter and throughout the rest of the week; we'll probably have a dedicated episode Thursday or Friday recapping everything new. But here's what NVIDIA announced, and why NVIDIA keynoted one of the biggest tech shows in the world, the Consumer Electronics Show: because everything they announce impacts everything in technology, right? They are literally powering the generative AI movement with their GPU chips.
So, speaking of that: some new GPU announcements. The RTX 50 series GPUs were announced with Blackwell architecture, four new models, priced almost unbelievably. The RTX 5070 starts at $549. How is that even possible? Even if you're not a GPU dork, trust me: a new RTX 5070 for $549 is baffling. Also announced: the GB10, which is the Grace Blackwell superchip for desktop AI computing, plus updates to the Cosmos platform for physical AI and robotics training. And probably one of the biggest updates we saw last night: NVIDIA is getting into the desktop computer game with Project Digits, its first personal AI supercomputer. It's priced at $3,000, but it is ridiculously powerful, more powerful for AI workloads than just about any computer you can buy right now. They also announced a partnership with Toyota for autonomous driving systems, and Uber and NVIDIA partnered to enhance AI technology in autonomous vehicles. So yeah, a lot more AI news in our newsletter; please go to youreverydayai.com and sign up for that free daily newsletter. All right.
Fred just said, Project Digits. Wow. Yeah. And Juliet, the keynote, I believe you can go replay that; we'll leave that in the newsletter today as well. All right, I'm excited for this: will OpenAI run away with the large language model race this year? This is actually your show. In the newsletter yesterday, I said, hey, we've got our Hot Take Tuesday coming up. What do you want to hear? So,
If you are a longtime listener to the podcast, I literally started this thing for you guys, right? I noticed that when I was trying to learn more about generative AI like four or five years ago, I noticed it was only for highly technical people. So I wanted to create something that's for all of us. So sometimes in the newsletter, I'm like, yo, what do you guys want to hear tomorrow? I'll stay up all night putting together a show just for you guys. So this is technically a user request. All right, but let's talk about it.
So right now, I'm really just focusing on OpenAI, Google, and Anthropic for this who's-going-to-run-away-with-it episode. Here's why. Obviously, some of the biggest names in the game are playing different games. Microsoft, in their Microsoft 365 Copilot right now, is using OpenAI's GPT-4o to power their system. So a lot of people are like, oh, what about Microsoft? Well, even though they are developing their own models, and their Phi models are fantastic small language models, they are
reportedly going to be offering new models or new choices in the future aside from OpenAI's GPT-4. They're not a player in this game, not in the large language model race. Also Meta, Meta is going open source-esque, not truly open source, but I think they're playing a different game as well. And there's all the Chinese companies as well that I think are going to be really thrusting themselves into the conversation for 2025.
And like I said, we'll have our whole 2025 prediction shows. I think we're going to break it up this year in a couple of weeks. But this is just about the large language model race. Who is going to win it? Well, first, let me start by saying, why does this matter? Well,
You probably are using large language models every day if you're listening to this show. That's why it matters, right? Probably in every aspect of your business, your company, your career, your personal life, you're probably using AI a ton. So this kind of frontier model race, right, it impacts us all.
And even if you are not a huge large language model user right now, there are thousands of name-brand pieces of software out there that you probably don't even know are leveraging these technologies. So that's the other thing. At least when we talk about the quote-unquote big three here, Google, OpenAI, and Anthropic, their APIs, their backends, are powering just about everything. There are very few enterprise softwares right now that are not using AI, that are not using large language models. And in the overwhelming majority of cases, they're using one of these three companies' models.
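To make that concrete, here's a minimal sketch of what "powered by one of the big three" typically looks like under the hood. This assumes the official openai and anthropic Python SDKs and uses model names that were current around the time of recording; treat the specifics as illustrative, not as any particular vendor's actual integration.

```python
# Minimal sketch: an app calling two of the "big three" LLM APIs.
# Assumes `pip install openai anthropic` and that OPENAI_API_KEY /
# ANTHROPIC_API_KEY are set in the environment.
from openai import OpenAI
import anthropic

def ask_openai(prompt: str) -> str:
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o",  # flagship model name as of early 2025
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def ask_claude(prompt: str) -> str:
    client = anthropic.Anthropic()
    resp = client.messages.create(
        model="claude-3-5-sonnet-latest",  # alias; swap in your preferred model
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

# This one-call pattern is what sits behind countless enterprise features.
print(ask_openai("Summarize this support ticket in one sentence: ..."))
```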
So even if you're not logging into the front end of these tools, you're probably benefiting from them, right? They're starting to seep into every aspect of our daily lives. That's why this large language model race is incredibly important. All right, livestream audience, let me know: who do you think is going to win this race?
So is it going to be A, OpenAI? Is it going to be B, Google? Is it going to be C, Anthropic? Might it be D, Meta? Or you can leave E, other, in the comments. I'm curious what everyone else thinks, right? Part of doing this Everyday AI thing together with you all is learning from you.
I come at it from my perspective. I'm lucky enough, or maybe dorky enough, to spend the majority of my day playing with large language models, testing new features, and helping large enterprise companies. People are always like, oh, Jordan, how does this Everyday AI thing make money? Well, we're lucky enough to have great sponsors and partners like Microsoft, but enterprise companies also hire us. They're like, hey, Jordan and team, we have 5,000 employees, or 500 employees, who need to learn ChatGPT, or who need to learn Microsoft Copilot, and then we go help them. So I'm lucky enough to be able to interview people from these companies, but also to help enterprise organizations and small and medium-sized businesses actually learn these tools. But
I want to learn from you guys. So yeah, a lot of people weighing in so far: Marie said A. Kathleen said A, which is OpenAI, in a landslide. Fred is team Google. Douglas says OpenAI and Microsoft.
Jackie says it's a two-horse race between OpenAI and Google. A lot of responses, and no single vote for Anthropic. That's interesting, right? Because depending on where you look, where you read, you would think Anthropic is the only large language model maker out there. People on Twitter, for whatever reason, are very, very bullish on Anthropic; you would literally think no other large language model exists. So that's why I like asking you all.
But let's go ahead and talk about it. Yeah, this is Hot Take Tuesday, so I'm going to accidentally go on a rant. I'm going to keep Fred on the treadmill longer than normal, sorry. Or if you're walking your dog. All right. I'm going to give you three reasons why OpenAI might win this race, and three reasons why they might not. At the end, I'm going to give you my honest take on who is going to win. And again, let me remind you of the importance: your company is probably making long-term, maybe six-, seven-, or eight-figure financial decisions
on what large language model they use, whether you're using it on the front end in a team or an enterprise account, or maybe you're building something on the back end with the API.
There's a good chance your company is making major investments in large language models and in how you use them to change knowledge work. So keep that in mind; I always like to reframe things and tell you why they're important. So let's start with the reasons OpenAI might not win
the large language model race. Yeah, you might be curious about this one, because you're like, Jordan, you talk about OpenAI all the time. Well, yeah. OpenAI is the company that technically started this generative AI wave. Yes, the transformer technology behind GPT originated with researchers at Google, but OpenAI, in November 2022, technically started this whole generative AI race with its AI chatbot ChatGPT, even though there had been many different large language models available for developers before that. So, reason number one they might not win the race, even though they kind of started it:
it's because I think the race is going to make way for reasoning and agentic models. Or, in other words, think about the way we judge this race: who's winning? We look at benchmarks, things like MMLU or MMLU-Pro or HumanEval. All these dorky benchmarks.
But then we also look at head-to-head scores. Probably the most popular one out there is LM Arena, the Chatbot Arena, previously under the Hugging Face umbrella but now on its own domain. This is where millions of users have gone on, put in a single prompt, gotten two outputs from two anonymous models, and judged which one is better. And that gives us what's called an Elo score. Think of it like the blind taste tests they ran for Pepsi and Coke; that's kind of what this is. Dozens of frontier models go head to head on blind votes, and the model that wins the most essentially gets the most points.
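For the dorks, here's roughly how that Elo scoring works. This is a minimal sketch of the classic Elo update; note the Chatbot Arena has actually moved toward a Bradley-Terry-style statistical fit for its rankings, but the intuition is the same: beating a higher-rated model moves your rating more than beating a lower-rated one.

```python
def elo_update(rating_a: float, rating_b: float, a_wins: bool, k: float = 32.0):
    """One classic Elo update after a single blind head-to-head vote."""
    # Expected score for model A, given the current rating gap.
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    score_a = 1.0 if a_wins else 0.0
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b

# An upset moves ratings the most; a favorite winning barely moves them.
print(elo_update(1200, 1300, a_wins=True))   # underdog wins: A gains ~20 points
print(elo_update(1300, 1200, a_wins=True))   # favorite wins: A gains ~12 points
```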
I think in 2025, these big companies are going to care less about that. In 2023, benchmarks and Elo scores largely drove the conversation; I think those two metrics alone influenced decision makers on which model they should try. And part of that was rightfully so, because up until the latter part of 2024, those two metrics, the Chatbot Arena and the benchmarks, told pretty much the whole story, and that's what companies raced toward. I don't think it's going to be like that as much in 2025.
I think companies like OpenAI are going to stop caring as much. And you can't say that they didn't care, right? Because any time one of these companies released a model and was crowned the top of the Chatbot Arena board, literally a day later another company would release an update to a model they'd been sitting on, because they got overtaken at the top of the board. So you can't say this wasn't a driving factor for releases in 2023 and 2024. It was. But in 2025,
we're going to be talking much, much more about business value. I don't think benchmarks and Elo scores are going to matter as much, because we're already topping out. If you're a dork and follow MMLU like me, you know all the new models are going to score 88, 89, 90, 91, 92. They're going to be in the high 80s and low 90s, which is better than the smartest single human, and it's not even close. So we've already gotten past the point where raw capability is the bottleneck, as long as your company knows what it's doing, and it probably does, because you're investing in the technology. (A majority of the people talking about AI or sharing about it online, meanwhile, literally don't know what they're talking about.) There's a certain point of diminishing returns: once you hit a certain score on these benchmarks, a certain MMLU, a certain Elo score or head-to-head win rate against the other models, I think it turns into a simple yes or no: is this model good enough? And I think the smart companies have already figured this out. Overfitting a model to go from an 88.7 to an 89 on the MMLU is not going to be a driving factor anymore.
What I'm trying to say is: OpenAI may not sit atop the benchmarks and the leaderboards for the majority of 2025. They've probably spent 80% of the time at the top since these benchmarks became widely used and the Chatbot Arena became a main discussion piece, but I don't think it's going to matter anymore, or at least not as much. All right.
Fred said people do keep leaving OpenAI. That's the truth. All right, let's keep going. Because OpenAI still has their quote-unquote old model, GPT-4o, which was announced in May 2024, and it is still sitting at or near the top of these Chatbot Arena boards. Which is another reason I think they might not technically win the race: I think the race is just going to be redefined, in terms of how we measure it. It's going to be more about creating business value than about those other metrics. So, reason number two OpenAI might not win the race: ChatGPT search is seriously flawed.
Seriously. Okay, so without going into too much of a side tangent: most large language models are now connected to the internet, aside from Claude, for whatever reason. Being cut off from the web is problematic, because internet connectivity is a huge part of using these models on the front end. In many cases, it comes down to the training data. Essentially, every large language model has a training cutoff.
So you have all these smart researchers, they gobble up all the information on the internet, a lot of it copyrighted. They have smart humans train these models and then release it to us all. And there's usually a knowledge cutoff, but that knowledge cutoff is generally somewhere from nine months to 18 months in the past.
And for most things you're working on, you need up-to-date information. So a large language model's ability to connect to the web is huge.
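Mechanically, that is all a search feature really does: fetch fresh text from the web and stuff it into the model's context so it can answer past its cutoff. Here's a hedged sketch of that pattern; web_search is a hypothetical stand-in for whatever search backend a product actually wires up, and the prompt layout is illustrative.

```python
from openai import OpenAI

client = OpenAI()

def web_search(query: str) -> str:
    """Hypothetical stand-in for a real search backend (Bing or otherwise)."""
    raise NotImplementedError("wire up your search provider of choice here")

def answer_with_fresh_context(question: str) -> str:
    # Fresh snippets the model was never trained on get injected into context.
    snippets = web_search(question)
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": f"Answer using these up-to-date search results:\n{snippets}"},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content
```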
All right. So previously, OpenAI had a feature called Browse with Bing. Then, I believe in late October or early November of 2024, OpenAI rolled out ChatGPT search. From a UI/UX perspective, a user interface and user experience perspective, there are great things: it brings Google Maps-like elements, it brings rich snippets. As far as we know, ChatGPT search is still using Microsoft Bing technology somewhat on the backend; they haven't really described it in detail. But ChatGPT search is how OpenAI and ChatGPT stay connected to real-time, up-to-date information past the knowledge cutoff. It's seriously flawed, though.
Browse with Bing did not have these problems. And I think this is important to talk about. I keep saying, oh, maybe I'll do a dedicated show on this, but I know at any time OpenAI could just fix it; they have to know it's a problem. Here's what's seriously flawed right now. Yes, it's great: you can ask ChatGPT what's going on this weekend in Chicago, the city I live in, and it'll give you nice Google-esque search results with rich snippets, lists, little photos. Nice to use, right? Now we're seeing rollouts on mobile that give you essentially map results. Very nice and intuitive to use. However,
iterative prompting is broken when you are using ChatGPT search. There's a little globe icon, and sometimes ChatGPT will call this tool on its own, even if you don't invoke it. And so much of using a large language model is what happens after the first prompt. It's the iterative nature, the going back and forth. As an example, our Prime, Prompt, Polish method, our PPP method: it's not just putting in one giant prompt, it's having a conversation, going back and forth after your first response.
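If you've only ever used the chat window, here's what iterating actually means at the message level: each follow-up rides along with the full conversation history, so the model is supposed to refine its earlier answer rather than start over. A minimal sketch in the OpenAI chat-message format; the content strings are placeholders.

```python
# Iterative prompting: the follow-up carries the whole history with it,
# so the model should build on its first answer, not repeat it verbatim.
messages = [
    {"role": "user", "content": "What's the biggest AI news right now?"},
]
# ...the first response comes back and gets appended to the history...
messages.append({"role": "assistant", "content": "<model's first answer>"})
# The refinement is the whole point of iterating:
messages.append({
    "role": "user",
    "content": "No, please give me the top AI news for January 2025 only.",
})
# Each new API call sends `messages` in full; a broken search tool that
# ignores the refinement will return essentially the same answer again.
```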
For whatever reason, since it came out, ChatGPT search gets stuck in a loop. You can't really iterate on or build upon a result. I don't know why. It's been broken for many months.
And that's concerning, when a feature that big gets stuck in a loop, not 100% of the time, but a good chunk of the time. Let's say you ask, as an example, what's the biggest AI news? ChatGPT is going to use ChatGPT search, because it knows it needs real up-to-date information for that. It might spit out some trends. Then you go back and refine it: no, please give me the top AI news for January 2025. Guess what? In most cases, it's going to spit back the exact same response. I don't know why OpenAI hasn't fixed this. It's a little concerning, because there are hundreds of millions of people using ChatGPT and ChatGPT search.
It's concerning that they haven't fixed this. I know I'm not the only one complaining about it, but I've been complaining about it pretty loudly. That's a reason they might not win: they haven't fixed this yet, and it has been a terrible user experience for multiple months. I get it, in December they shipped like two years' worth of features and updates, but it looks like ChatGPT search just got glossed over. The core functionality is broken, especially if you compare it to Browse with Bing, which, after its updates in the latter part of 2024, was already essentially a Perplexity lite. All right. Reason number three OpenAI might not win the large language model race: they're burning cash.
And they're facing increasing costs. This is all reportedly, right? Reportedly, OpenAI lost billions of dollars in 2024. According to reports, OpenAI experienced a $5 billion loss in 2024. So that's another reason OpenAI might not win the large language model race: they're burning cash, reportedly, and they need to turn a profit. They did just release their new and more expensive plan.
That's $200 a month. And OpenAI CEO Sam Altman went on Twitter and said, oh, we're actually losing money on this. So I'm sure investors weren't thrilled to see that tweet, which led to a lot of news and media coverage on, hey, OpenAI is losing more money. So why does that matter?
One reason they might not win is that the large language model advancements they should be working on might not get the resources they need. They're also losing key people: like we talked about at the beginning of this show, Google's new AI team is now led by Tim Brooks, formerly a co-lead on OpenAI's video generator, Sora. So they're burning cash; they're reportedly losing money.
And between that and their increased focus, at least from an external perspective, on AGI and ASI, artificial general intelligence and artificial superintelligence, they could in theory be pulled away from their internal daily driver: improving their two classes of models. They have their GPT class of models, with GPT-4o, and they have their reasoning class of models: o1, o1-mini, o1 Pro, and then o3, which may or may not get released in 2025. We'll see. But they could lose their focus on actual large language models while chasing agentic AI, chasing AGI, chasing artificial superintelligence. It could keep them from that day-to-day race.
All right. Before we get into the three reasons I think they might still win the large language model race, let me tell you a little bit more about Microsoft WorkLab. So why should you listen to the WorkLab podcast from Microsoft? Because it tackles your burning questions about AI at work, like how can I guide my org's AI transformation? How can AI help maximize value and create new products and business models?
What mindset shift do we have to make if we want to tap into its full potential? Find the answers on WorkLab. That's W-O-R-K-L-A-B, no spaces, available wherever you get your podcasts. All right, straight into it. Now, three reasons why OpenAI might win the large language model race of 2025. Number one: they got the users, baby. They got everyone.
They got the users, and with the users comes the data, and with the data comes better models, right? Still, I don't think people realize how good of a deal $20 a month is to use, say, a Pro plan of Anthropic's Claude, even though, if you look at Claude the wrong way, you hit a rate limit. I saw someone in the comments here on our livestream, I think it was Michael, talking about rate limits. Yeah. But even at these $20-a-month plans, from Microsoft Copilot, from ChatGPT, from Gemini, from Anthropic's Claude, from all these other large language model makers: if you're not opting out of data training, you are the product. And so many people don't know any better.
So many people don't know how to turn off their training data. And there's more protection over your data, I would say, as you move to higher plans. I have normal paid accounts, Team accounts, and Enterprise accounts, because we advise companies on how to use this at scale in their organizations, so I know the different data controls you get at different levels. At the base level or the free level, so many people are on free plans and they're just dumping in all their company info. That's why I think OpenAI is probably still going to win the large language model race: they have the data, they have the users. All right.
So let's look at this here on my screen for our livestream audience. This is just a Google Trends comparison, and it isn't overall searches; it's interest over time, comparatively. It compares ChatGPT to Gemini to Perplexity to Claude, just as an example, some of the popular AI systems out there. And yes, Perplexity is more of an answers engine, which is why I didn't really include them in this conversation: with Perplexity you're just using one of these underlying models, and its technology is more of an answers engine. But what this graph shows is that the interest, the search volume, and the users for ChatGPT are greater than all other competitors combined. It is not even close. OpenAI is synonymous with AI, right?
Which is weird, because artificial intelligence has been around for many decades. But ask the average person on the street, hey, have you heard of AI? Not all of you, right? You guys, like me, have probably used dozens of large language models. But ask the average non-Everyday-AI listener, hey, do you know anything about AI? They're going to say, oh, like ChatGPT, right? My mom uses ChatGPT. I didn't even tell her to; she probably did it from, I don't know, maybe listening to the show. So, hi Mom. But most people don't know anything about AI. We live in a bubble here on this show, on social media, in our own echo chambers of artificial intelligence. Most people, when they hear AI, just think ChatGPT.
Not the fact that traditional machine learning and neural networks have been widely used for decades. It is synonymous. That is one of the benefits of OpenAI's go-to-market strategy. They made a huge splash at the end of November 2022, and it became especially heightened because we were still kind of in that COVID phase: people were spending more time indoors using technology, more people were working at home, and it just came at the perfect time and blew up. They have more users, more interest, more name-brand recognition than everyone else combined, and it is not even close. More users and more data means you're probably going to win. Let's keep it going; I'm going to wrap this one up and
try to go quickly. Reason number two they might win. Actually, good question here first: Cecilia is asking, doesn't Google inherently have the potential users? Kind of, right? Yes, Google has hundreds of millions of users of their technology.
But what people don't know is that, for Gemini, you generally have to be on a paid plan; it's an additional add-on right now. Or if you are on a Gmail plan, you can use Gemini for free. But for the most part, it is extremely hard to use Gemini on the front end, and it is hard for organizations to roll it out. You literally need like a degree sometimes to give your organization a pro version of Gemini. So yes, I do believe Google will catch them eventually, but right now, in terms of active users, ChatGPT is blowing everyone else away. All right. Reason number two OpenAI still might win the LLM race: they are the only front end with a reasoning model, projects, internet access, code rendering, and tools.
All right. Google Gemini is catching up. Their front end was essentially, you know, the redheaded stepchild of AI until December of 2024. No offense against redheads or if you are a stepchild or if you are a redheaded stepchild. I'm just, you know, using analogies here. Sorry. But yeah.
The front end of Gemini, gemini.google.com, was largely ignored until December. Google tucked away all of its best technology inside developer platforms: inside Google AI Studio, inside Vertex. And people don't know that.
I did a whole ranty episode on this a couple of weeks ago, so go listen to that if you want. But Google, I think, ended up losing out on trillions of dollars in market value because they didn't understand that it is non-technical people making decisions for Fortune 500 companies, and that what everyone does to test out AI, to test out large language models before implementing them in their organization, happens on the front end.
I've literally talked to dozens of Fortune 500 companies that do it this way. Nothing wrong with it. Usually an individual, or a group of individuals or teams, will start using a large language model on the front end, usually before their company has an official AI policy. Then they'll go to leadership and show them something. They'll log on to chatgpt.com or gemini.google.com or claude.ai or copilot.microsoft.com and be like, oh wow, look at this. What I'm saying is:
I won't say it's very rare, but it's not commonplace to have your technical people, your CTOs, your CISOs, your CIOs, making these evaluations on the backend. The front end is where decisions are made, and ChatGPT has a stranglehold on front-end features. It's not close. Google Gemini is catching up, finally, but until four weeks ago, Google Gemini, sorry, was trash on the front end. They didn't put their most recent models there. Five months ago, it even had problems
using Google Search, right? Yeah. And Claude, I mean, Claude's great at some things, but it's not connected to the internet, and they don't have a reasoning model yet. So OpenAI is the only one that has everything. They have everything you need on the front end. Are there improvements to be made? Absolutely. Do other models excel in areas where OpenAI doesn't? Yes. Google Gemini
technically has the better model right now, by a thin margin, but they do have a better model. And Claude's Artifacts feature can render code way better than OpenAI's Canvas, even though they're kind of two different things. So yes, these front ends from Google and Anthropic have advantages, but OpenAI has it all.
All right. And just like we saw, OpenAI came out with Projects, which tells you: anything good that a competitor has on their front end, OpenAI is going to implement, or it's probably already been in the works.
And I mean, we haven't even talked about what else might come to the front end of OpenAI. Hopefully we'll see an updated DALL-E, or maybe image generation will just be Sora; right now, Sora is a different front end, so maybe we'll see Sora inside the ChatGPT interface. Maybe we'll see the new Operator, their agentic system,
inside the ChatGPT interface too. There's also something that's been rumored called Tasks, where you can essentially schedule prompts to run. So the front-end interface is only improving, and I think, for whatever reason, their two biggest competitors have been too slow, too stagnant, in bringing the features consumers want to the front end.
All right. Business leaders, at least across the board, are not making decisions based on the API, even though that may ultimately be where they use these large language models. Where they're testing them out, where they're making their decisions, is on the front end: chatgpt.com, gemini.google.com, claude.ai. And OpenAI is running away with it. All right, reason number three; I saved the best for last, y'all.
OpenAI is actually crushing one of the most important games that I don't think anyone's paying attention to, the small language model game. All right, let me share. And I'm on record saying this back in 2023. I've said the future of large language models is small language models as hardware becomes more powerful.
Okay. The AI chips are getting better: your GPUs, your NPUs. Edge AI, meaning using a language model on your device, on your phone or on your computer, is going to become more and more commonplace in 2025. Why, you might ask? Well, number one, it's faster. It's also more secure: right now, when you use all these models on the front end, you are sending all that information to the cloud, which makes it more expensive, worse for the environment, and less safe. Are we going to be able to run OpenAI models, or Google models (well, Google already has some), or Claude models locally? I don't know. But OpenAI is winning the game of smaller models that are more powerful, right?
No one's paying attention to that. And that is, I think, one of their biggest wins they have going for them right now. Let me quickly explain here. All right. There was a Microsoft research paper. We covered this yesterday in our AI news that matters. That's our weekly Monday wrap up.
Livestream audience, do you guys catch it or is it boring to you guys? Let me know. But we talked about this yesterday, a Microsoft research paper that essentially somehow uncovered model sizes for some of the most popular proprietary models. All right. So these proprietary models, for the most part,
They're secret, right? No one really knows how big they are, how many parameters they have. Think of parameters as the model's size. Your open models, on the other hand, say it right in the name, because you can download them, fork them, build off of them, et cetera. So with Meta, you have the Llama 3.2 models at 1B, 3B, and 11B, that is, 1, 3, and 11 billion parameters, and the Llama 3.1 line has a 405B, 405 billion parameters. That's the size: how big the model is, the weights, the training, everything that makes that model special. For the proprietary models, for the most part,
we don't really know. All we've really had are reports that the original GPT-4 was 1.7 or 1.8 trillion parameters. Giant, right? So this new Microsoft research paper shed some light. Like I said, GPT-4 reportedly had about 1.7 trillion parameters, and if we talk about its benchmarks (sorry, non-dorks, stick with me here for a second), it scored an 86.4 on the MMLU.
GPT-4o, the successor or updated version of GPT-4, the Omni model: 200 billion parameters, per this paper. What does that mean? One-tenth the size, and performance went up. But you're still like, all right, Jordan, a 200 billion parameter model, you can't really run that locally. Well, yes, you could. Not GPT-4o itself, because you can't download it, but look at what NVIDIA just announced. You can literally chain two of these new Project Digits machines together and run a 405 billion parameter model, Meta's Llama 3.1 405B, locally. That is mind-blowing, right? If you don't follow this stuff, I can't even explain how big a deal it is. The fact that you can run a model that big locally is huge.
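Here's the back-of-envelope math on why chaining two of those machines works, as a sketch. It assumes the reported 128 GB of unified memory per Project Digits box and the standard bytes-per-parameter figures for common precisions; real deployments also need headroom for activations and the KV cache.

```python
# Rough weight-memory footprint: parameters x bytes per parameter.
BYTES_PER_PARAM = {"FP16": 2.0, "INT8": 1.0, "FP4/INT4": 0.5}

def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    # params_billions * 1e9 params * bytes, converted back to gigabytes
    return params_billions * bytes_per_param

for precision, bpp in BYTES_PER_PARAM.items():
    print(f"Llama 3.1 405B @ {precision}: ~{weights_gb(405, bpp):,.0f} GB")

# FP16 ~810 GB, INT8 ~405 GB, 4-bit ~203 GB. Two linked 128 GB boxes give
# 256 GB total, so a 405B model fits only at roughly 4-bit precision,
# which matches how the two-machine configuration was pitched.
```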
So: GPT-4o, a tenth the size of GPT-4 and more powerful. Why does that matter? Well, look at their quote-unquote small model, their small language model, GPT-4o mini: reportedly 8 billion parameters. That is tiny. The next iteration of smartphones, within a year, would be able to hold an 8 billion parameter language model on-device. Edge AI. Right now, most edge AI smartphone models are between 1 billion and 3 billion parameters. No one's been doing this math since the study came out. That was the first thing I saw; maybe it's because I'm a dork, but I'm like, wait, GPT-4o mini is only 8 billion parameters?
And it's still highly capable, with an 82 on the MMLU. I know more people are looking at MMLU-Pro or other benchmarks now, but I like MMLU; it's a nice standard that's been around for a long time. You might think that's a big drop-off, to go from an 88.7 with GPT-4o to an 82 with GPT-4o mini, but remember: it is an 8 billion parameter model. That is tiny. Let's look at some of the other models in that same size class, at least where we know the parameters and have an MMLU score. Llama 3.2 11B, technically a bigger model, scores a 73 on MMLU. Microsoft's Phi-3, a 7 billion parameter model, scores
a 65 on MMLU. All right, if you don't know anything about MMLU scores: these labs fight over a tenth of a percentage point. When you're getting into the 88s and 89s, a 0.1, 0.2, or 0.3 improvement is huge. OpenAI
is silently crushing the small language model game. Why does that matter? One, I told you, Edge AI, right? In theory, you would be able to run something like that locally. Who knows? Maybe OpenAI will allow that one day. Maybe that will lead to us actually having a state-of-the-art model on our devices, right? Who knows? Maybe the iPhone 18 might have GPT-5 Mini, right?
on it running locally, which in terms of what that means for humans, what that means for society, what that means for work is crazy because then there's really zero reason, right? For anyone in the world to be like, nah, our organization's not gonna do this AI thing. They're silently crushing the small language model game. No one is paying attention. Why else does that matter aside from edge AI? Well, I believe in the future,
we're going to be using thousands of small language models. I think your o1s, your o3s, these reasoning models, are going to take over the training loop. Let's just say, hypothetically, there's a GPT-5o mini, and let's say we have o3. I believe o3 is going to start taking the place of the human in the loop: reinforcement learning from human feedback is going to become reinforcement learning from reasoning feedback. You are going to have these reasoning models doing the fine-tuning. And this is when we step into that line between AGI and ASI, right? AI is going to be creating thousands of versions of these smaller models.
I think we're going to have what's called a mixture of models. Make sure you tune into our 2025 prediction show in two weeks; I'm going to be talking about that. We've had this thing called mixture of experts for a couple of years now, and I think next we're going to have something called mixture of models. I think we're actually going to be using thousands of small language models, and all the frontier model is going to do is aggregate information and orchestrate those small language models to go out and do things, something like the sketch below. All right, I got a little dorky there at the end, y'all, but let me go ahead and end it this way, because this was a long episode. Sorry if you're still on the treadmill.
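Since I got dorky anyway, here's the sketch I mentioned. This is purely speculative pseudocode for the mixture-of-models idea, where a frontier model only classifies, routes, and aggregates; every model name is a made-up placeholder, and a real router would itself be a model call rather than a keyword check.

```python
# Purely speculative sketch of a "mixture of models" orchestrator.
# All model names below are hypothetical placeholders, not real products.
SPECIALISTS = {
    "code":    "tiny-code-8b",
    "math":    "tiny-math-8b",
    "general": "tiny-general-8b",
}

def route(task: str) -> str:
    """Stand-in for the frontier model's classification step."""
    if "def " in task or "bug" in task.lower():
        return "code"
    if any(ch.isdigit() for ch in task):
        return "math"
    return "general"

def run_specialist(model_name: str, task: str) -> str:
    # Stand-in for invoking a small local model.
    return f"[{model_name}] answer to: {task}"

def orchestrate(task: str) -> str:
    # The big model's only jobs: pick the specialist, then aggregate output.
    return run_specialist(SPECIALISTS[route(task)], task)

print(orchestrate("What is 405 * 2?"))
```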
I gave you three reasons why OpenAI might not win the large language model race. I gave you three reasons they might. I asked the audience who you think is going to win. So let me wrap it up by saying this: yes, OpenAI is going to win the large language model race in 2025. However, they now actually have competition.
Because if you look at the 24, 25 months from November 2022, when ChatGPT was released, until November 2024, it was a one-horse race. It wasn't even close. Then Google, I think, had the best month in AI ever in December 2024, right?
Not just from a large language model perspective, but generative AI. Tons of features that I think are going to be useful and actually used. For the first two years, OpenAI was running by itself. Yet, they still innovated and they still, I'd say, dominated.
Right? There's a reason Microsoft, the number one company, and Apple, number two, chose OpenAI to power the future of their devices, their technology, their software. Apple and Microsoft are smart. They build their own models, yet they said, ah, we're going to use OpenAI for a big part of our future. OpenAI earned that, and they knew the position they were in. If I'm Sam Altman, if I'm in leadership at OpenAI, I knew it was a one-pony race for two years. It's not anymore. So yes, we're going to see OpenAI go in different directions. Yes, they may get sidetracked going after AGI, ASI, their Operator agents, all these other things. But they know now that Google is on their heels. Poor Claude, poor Claude.
I think Claude could be one of those sad stories in 10 years, where everyone's like, oh, remember Claude? And I don't know, maybe they get acquired by Amazon, or acqui-hired by Amazon, or maybe they just fade into oblivion. At least going back to our original three: I don't think Claude is in the race. I don't think they are. But I think OpenAI is going to win the race; it's just going to be a lot closer than it was
for the first two years. I hope this was helpful, y'all. If so, please go to youreverydayai.com. Sign up for our free daily newsletter. Also on our website, there's a ton of information. Like I said, hundreds of episodes, no matter what you care about. Do you care about HR? We have a category for that. Go learn from HR leaders. Do you care about marketing? We have a category for that. Do you care about enterprise technology? We've talked to the experts.
Literally our website, youreverydayai.com is your new best friend. If it's one of your goals in 2025 to better learn AI, there is no better unbiased, no BS resource. It's all there. It's all free. Make sure you sign up for our newsletter while you're there. Thank you for tuning in y'all. I hope this was helpful. If so, please, if you're listening on the podcast, subscribe to the channel, leave us a rating, all that good stuff. If you're
listening here online, click that repost button, share it with someone who needs to know it. Thank you for tuning in. We'll see you back tomorrow and every day for more Everyday AI. Thanks, y'all. And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.