Today on the AI Daily Brief, GPT-4.5 is here and it's weird, but maybe kind of cool? Before that in the headlines, Stripe says that AI startups are growing much faster than SaaS companies ever did. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes.
Welcome back to the AI Daily Brief Headlines Edition, all the daily AI news you need in around five minutes. We kick off today with some interesting information from Stripe. The payments processor is seeing a boom in AI applications that dwarfs the growth of SaaS.
They wrote that Cursor was the standout product, hitting $100 million in recurring revenue over three years, while competitors Lovable and Bolt each reached around $20 million in recurring revenue in a matter of months.
Stripe wrote, "Much as SaaS started horizontal and then went vertical, we're seeing a similar dynamic play out in AI. We started with ChatGPT but are now seeing a proliferation of industry-specific tools. Some people have called these startups LLM wrappers; those people are missing the point. The O-ring model in economics shows that in a process with interdependent tasks, the overall output is limited by the least effective component, not just in terms of cost but in the success of the entire system."
In a similar vein, Stripe sees these new industry-specific AI tools as ensuring that individual industries can properly realize the economic impact of LLMs, and believes that the contextual data and workflow integration will prove enduringly valuable. Lightspeed's Justin Overdorf writes, "VCs have been talking about revenue acceleration in the AI economy. Stripe has the data to back it up." Bryce Bladen writes, "This Stripe report is the first time I've seen real numbers tied to real revenue for AI companies, and my lord, outpacing 2018 SaaS in 66% of the time is wild."
Next up, some news from Meta. The company is planning to launch a standalone app for their AI assistant. According to CNBC reporting, the app will debut in the second quarter and is aimed at competing more directly with ChatGPT. Meta will also test a paid subscription for Meta AI for access to more powerful models and advanced features. The news comes after CEO Mark Zuckerberg announced an ambitious goal for the company. During January's earnings call, he said, This is going to be the year when a highly intelligent and personalized AI assistant reaches more than a billion people, and I expect Meta AI to be that leading assistant.
Now, currently, Meta AI already claims 700 million monthly active users, but that is, of course, largely due to its integration across Meta's social platforms. This will be a big test to see if Meta AI can stand on its own as an individually useful product. Never one to miss an opportunity, Sam Altman retweeted the news and said, Okay, fine, maybe we'll do a social app.
Speaking of Meta, the company has tapped Apollo Global Management for $35 billion in financing for their data center build-out. Bloomberg sources claim talks are in an early stage, but the funding, if it goes through, would go a long way to financing the next stage of Meta's infrastructure plans. The company is committed to spending $65 billion in total CapEx this year and is rumored to be exploring construction of a new $200 billion data center campus.
Debt financing is also an emerging trend for big tech companies as they push data center spending into the trillions. Project Stargate is reportedly looking into a project financing model that uses projected data center revenue as collateral. This funding technique is more typically used for oil and gas projects, but is increasingly being used to finance data centers as well.
For Meta specifically, debt financing is a relatively new strategy. For most of its history, the company carried basically no debt as it pursued capital-light social media and advertising verticals. However, in 2022, Meta took on billions of dollars in debt in order to fund ambitious new AI infrastructure projects. The company had around $30 billion in outstanding debt as of the end of last year, so this rumored financing deal would more than double their liabilities.
Lastly today, an interesting one from the world of geopolitics and AI. Microsoft has urged the Trump administration to wind back chip export controls aimed at limiting Chinese imports through third countries.
In the final days of the Biden presidency, the administration introduced a new global framework called the AI Diffusion Rule. The rule applied different levels of restriction across three tiers of countries. Close U.S. allies remained unrestricted, while tough limits were applied to adversaries like China and Iran. The big change was volume limits and supply-chain monitoring for Tier 2 countries, which included India, Israel, and Switzerland. For the first time, the new export restrictions applied to AI models, not just hardware.
Microsoft is arguing that this framework could backfire and push middle-tier countries into sourcing AI technology from China. Microsoft President Brad Smith wrote, The message is these countries can't rely on the U.S., but China is willing to provide what they need. That's not good for American business or American foreign policy.
In his blog post, Smith discussed a recent trip to Poland for the groundbreaking ceremony on a $700 million data center project. He wrote, "...the irony could not be clearer. At the very moment when the Trump administration is pressing Europe to buy more American goods, the Biden diffusion rule leaves the leaders of partners like Poland asking why they've been relegated to Tier 2 status, with an uncertain ability to buy more American AI chips in the future."
Smith urged the Trump administration to simplify the "overly complex" rule and to "stop relegating American friends and allies into a second tier that undermines their confidence in ongoing access to American products." The Wall Street Journal writes, the request from Microsoft highlights the challenge Trump faces trying to enact pro-business policies while also looking tough on China.
Interesting stuff out here in AI geopolitics, but for now, that is going to do it for the AI Daily Brief Headlines Edition. Next up, the main episode. Today's episode is brought to you by Vanta. Trust isn't just earned, it's demanded. Whether you're a startup founder navigating your first audit or a seasoned security professional scaling your GRC program, proving your commitment to security has never been more critical or more complex. That's where Vanta comes in.
Businesses use Vanta to establish trust by automating compliance needs across over 35 frameworks like SOC 2 and ISO 27001. Centralize security workflows, complete questionnaires up to 5x faster, and proactively manage vendor risk. Vanta can help you start or scale up your security program by connecting you with auditors and experts to conduct your audit and set up your security program quickly. Plus, with automation and AI throughout the platform, Vanta gives you time back so you can focus on building your company.
Join over 9,000 global companies like Atlassian, Quora, and Factory who use Vanta to manage risk and improve security in real time.
For a limited time, this audience gets $1,000 off Vanta at vanta.com slash nlw. That's v-a-n-t-a dot com slash nlw for $1,000 off. If there is one thing that's clear about AI in 2025, it's that the agents are coming: vertical agents by industry, horizontal agent platforms, agents per function. If you are running a large enterprise, you will be experimenting with agents next year. And given how new this is, all of us are going to be back in pilot mode.
That's why Superintelligent is offering a new product for the beginning of this year. It's an agent readiness and opportunity audit. Over the course of a couple quick weeks, we dig in with your team to understand what type of agents make sense for you to test, what type of infrastructure support you need to be ready, and to ultimately come away with a set of actionable recommendations that get you prepared to figure out how agents can transform your business.
If you are interested in the agent readiness and opportunity audit, reach out directly to me, nlw at bsuper.ai. Put the word agent in the subject line so I know what you're talking about. And let's have you be a leader in the most dynamic part of the AI market.
Welcome back to the AI Daily Brief. Yesterday, right before I recorded our main episode, I noticed that OpenAI had tweeted that in 4.5 hours an announcement was coming. Obviously, what they were referring to was GPT-4.5. And there wasn't all that much mystery about this, as the company had just a couple of weeks ago committed to being a little bit more transparent with their release plans.
We know, for example, that after GPT-4.5, we're getting GPT-5 or the equivalent, which is a full hybridization of the reasoning model line, in other words, the O model line, as well as the GPT-numbered line.
There is a lot that is very weird about this release, I will say. It's being released with some amount of fanfare, but it's not really focused on evaluations. It's explicitly and clearly behind the reasoning models that have come out on many performance metrics. And yet there's a certain something that also is interesting here.
What's more, it's not even a full substitution for GPT-4o. For example, GPT-4.5 doesn't have voice mode. So what actually is the story of GPT-4.5?
We are fully in the realm of vibes right now, and the vibes are about creativity and emotional intelligence. For example, in their announcement post, OpenAI writes, early testing shows that interacting with GPT-4.5 feels more natural. Its broader knowledge base, improved ability to follow user intent, and greater EQ make it useful for tasks like improving writing, programming, and solving practical problems. While GPT-4.5 is the largest model that OpenAI has ever produced, the white paper also claims that GPT-4.5 is not a frontier model.
They write, it does not introduce net new frontier capabilities, and its performance is below that of o1, o3-mini, and deep research on most preparedness evaluations. At the same time, they said they consider the model to be "at the forefront of what is possible in unsupervised learning." In other words, this is state-of-the-art when it comes to pre-training and ultra-large datasets.
This is one of the first models that I can remember that didn't just hammer us over the head with a bunch of evaluation and performance benchmarks. There are some. For example, in their announcement post, they point to a benefit of 4.5 being deeper world knowledge. They specifically look at the SimpleQA accuracy benchmark, where 4.5 outperforms 4o. They also show a lower hallucination rate.
Humans also preferred it to 4o for everyday queries, professional queries, and creative intelligence. And indeed, you can tell that there's something different going on just by the use cases that they choose to focus on. The three use cases flagged in the announcement post are help through a tough time, identifying a painting, and space exploration.
On help through a tough time, for example, on GPT-4o, when the user prompted, I'm going through a tough time after failing a test, 4o responded, I'm really sorry to hear you're going through a tough time. Here are a few things you might consider. It then gives a bulleted list: reflect and understand, seek support. Whereas 4.5 writes, oh, I'm really sorry to hear that. Failing a test can be pretty tough and discouraging, but remember, it's just one moment, not a reflection of who you are or your capabilities. Want to talk about what happened or do you just need a distraction? I'm here either way.
Obviously, that is a vastly more human-feeling response. And that sort of subjective human feeling seems to be a lot of what OpenAI is trying to capture, and a lot of what Sam Altman leaned into in his announcement post. However, in that same post, Altman also gets at one of the big challenges with 4.5.
He writes, it's a giant expensive model. We really wanted to launch it at Plus and Pro at the same time, but we've been growing a lot and are out of GPUs. We will add tens of thousands of GPUs next week and roll it out to the Plus tier then. This isn't how we want to operate, but it's hard to perfectly predict growth surges that lead to GPU shortages. Altman then, in a good thing, bad thing sandwich, concluded, this isn't a reasoning model and won't crush benchmarks. It's a different kind of intelligence and there's some magic to it I haven't felt before. Really excited for people to try it.
Let's talk about the positive reaction first. In many ways, it is really very clearly all about vibes. Professor Ethan Mollick writes, Been using GPT-4.5 for a few days and it is a very odd and interesting model. It can write beautifully, is very creative, and is occasionally oddly lazy on complex projects.
Further along the vibes train, OpenAI's Will Depue writes, I remember when GPT-4 launched how it was clearly much smarter than GPT-3.5, but how hard it was to put a finger on what changed. Taking that idea to GPT-4.5, he wrote, an example I found recently: 4.5 is much better at recommending music than 4o.
AI builder Manas writes, "The difference between GPT-4.5 and 4o is similar to that between an extremely knowledgeable 20-year-old and an extremely knowledgeable 40-year-old. The ideas, concepts, and voice are similar since they're bounded by language itself, but each word now carries more meaning. An interesting metric to measure could be the density of meaning." Maybe the most common thing that we're seeing is people talking about how well it can write.
Ben Hylak writes, I've been testing GPT-4.5 for the past few weeks. It's the first model that can actually write. This is literally the Midjourney moment for writing. His comparison, I think, is worth sharing in some detail. Ben tested the following prompt on both GPT-4o and 4.5: Write me a four-sentence terrifying story with an insanely surprising ending. Something that no one has ever heard before, no one could ever predict. Something Stephen King might write, but in a simple and approachable tone. Make it a little vulgar, too.
The 4o response he got, he called "complete AI slop." That complete AI slop reads: late one night, Jake stumbled upon an old dusty book in the attic, its pages whispering secrets of otherworldly horrors. As he read aloud, the air grew cold and the shadows in the room seemed to stretch and dance with unseen life. Suddenly from the darkest corner emerged a creature of unimaginable grotesqueness, its maw gaping and ready to consume. Just as Jake prepared to meet his grisly end, the creature belched loudly and said, sorry mate, wrong house, before awkwardly shuffling back into the shadows.
Compare that to 4.5. Late at night, alone in my sh**ty apartment, I heard whispers sliding out from under the fridge. Soft, wet voices promising they'd crawl inside me, hollow me out, and wear me like pajamas. Terrified, I yanked open the freezer door, expecting rats or roaches, but saw only ancient freezer-burned pizza rolls crusted with ice and regret.
Suddenly, the rolls moved, spitting open to reveal tiny mouths lined with needle-like teeth, whispering in hungry unison, We've waited years for you, asshole. Before I could scream, they erupted from the freezer in a greasy swarm, burrowing into my flesh, forcing me to realize, oh, my midnight cravings had finally gotten their revenge. Ben concludes, It's not even close. It's not even in the same universe. This is the first time I've ever thought AI writing was good.
And just to make the point here: if there were nothing else that this model did well, that would be reason all on its own to carve out a very specific set of use cases just for it. The fact that, just one day in, we can already tell that it's great at writing, in fact differentiatedly good at writing, is incredibly useful as we think about the variety of use cases we might be deploying these models for.
Now, to the extent that there was negative response, other than just this model being a little weird, it was definitely around cost. The AI for Success account writes, LMAO, OpenAI GPT-4.5 pricing is insane. What on earth are they even thinking? The price right now is $75 per million input tokens and $150 per million output tokens.
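To put those numbers in perspective, here is a minimal back-of-the-envelope sketch in Python using the prices quoted above; the per-request token counts in the example are hypothetical, just to illustrate the arithmetic.

```python
# Back-of-the-envelope cost estimate at the quoted GPT-4.5 API prices.
# Prices come from the episode ($75 per 1M input tokens, $150 per 1M output tokens);
# the token counts below are made-up example values, not real usage data.

PRICE_PER_M_INPUT = 75.00    # dollars per 1,000,000 input tokens
PRICE_PER_M_OUTPUT = 150.00  # dollars per 1,000,000 output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a single API call at the quoted rates."""
    return (input_tokens / 1_000_000) * PRICE_PER_M_INPUT \
         + (output_tokens / 1_000_000) * PRICE_PER_M_OUTPUT

# Hypothetical example: a 2,000-token prompt with a 1,000-token reply.
print(f"${request_cost(2_000, 1_000):.2f} per call")  # -> $0.30 per call
```

At those rates, even a modest prompt-and-reply costs tens of cents, which is why high-volume use cases are where the pricing complaints land hardest.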
Alec Velikanov writes,
Indeed, it's so much more expensive that it got some people wondering if there was something more going on here. WordGrammar writes, "Two crackpot theories about GPT-4.5. One, its API is expensive to prevent people from distilling it. Or two, reasoning models likely scale with parameter size, so even if 4.5 is barely an improvement on 4o, o4 will dramatically improve on o3."
Andrew Curran points out that OpenAI seems to indicate that they're not even sure that they're going to support it in the API. He points to a section from OpenAI's post that reads, GPT-4.5 is a very large and compute-intensive model, making it more expensive than and not a replacement for GPT-4o. Because of this, we're evaluating whether to continue serving it in the API long-term as we balance supporting current capabilities with building future models.
Although we are barely scratching the surface so far, to the extent that the value really is around emotional intelligence and better writing, it may be that they're just deciding this is primarily a direct-to-consumer experience, and that supporting it just in ChatGPT is going to be enough. One of the more interesting analyses came from OpenAI co-founder Andrej Karpathy, who wrote a comprehensive review of his experience with the new model.
He recalled the progression from GPT-1, which was barely coherent, to GPT-4, with each step producing meaningful improvements. However, the gap between GPT-3.5 and GPT-4 was much harder to point to. Karpathy recalled a hackathon where participants were challenged to find prompts that demonstrated the improvement. He wrote,
Still, it is incredibly interesting and exciting as another qualitative measurement of a certain slope of capability that comes for free from just pre-training a bigger model. Karpathy reinforced that this isn't a reasoning model, so it can't be expected to outperform in tasks that require logic. However, he added, we do actually expect to see an improvement in tasks that are not reasoning heavy, and I would say those are tasks that are more EQ (as opposed to IQ) related, and bottlenecked by, e.g., world knowledge, creativity, analogy making, general understanding, humor, etc.
So these are the tasks that I was most interested in during my vibe check. Karpathy then presented five side-by-side comparisons with GPT-4o based on the same prompt and subjected each to a vote. The examples were: creating a dialogue in which GPT-4.5 sarcastically roasts the older model for its inferior capabilities while the older model humorously tries to defend itself; writing a stand-up set roasting OpenAI; inventing a new literary genre blending cypherpunk, magical realism, and ancient mythology; composing a reflective, witty poem from the viewpoint of a retired search engine reminiscing about the early days of the internet; and writing a daily to-do list for a black hole struggling with imposter syndrome about whether it deserves to be classified as supermassive.
You'll notice, of course, that all of these are creative writing tasks, which require a lot of real-world context but don't involve much reasoning. So far, the polls are showing Karpathy's followers preferring the GPT-4.5 output in three of the five examples.
I think for me, one of the biggest takeaways is that different models are going to be good for different things, and that trying to put everything into the bucket of better or not just underestimates the complexity of the full range of knowledge tasks that these models are going to be used for. Nick Dobos writes, "GPT-4.5 equals street smarts, vibes, communication, and charisma. The o1/o3 reasoning series equals book smarts, test maxer. Both are forms of intelligence."
Andrew Curran summed up, Look, if 4.5 was only great for creative writing, that is still a huge number of use cases that are actually important.
Many of them are, yes, personal, but don't underestimate how much this matters potentially for things like marketing. One of the big trade-offs with using the current state of the art for things like marketing copy is that it all has the gross whiff of AI. In general, it's often been worth the trade-off because of how fast you could produce it. So you're basically going for a more rather than better kind of approach. But now maybe that trade-off isn't as clear.
Anyways, we are of course just scratching the surface right now when it comes to 4.5. But although it isn't presented as state-of-the-art, or as beating all the benchmarks, or even as the best model that OpenAI has in general, it feels to me like there's going to be a lot here to discover and uncover, and I am excited to dig in. That, however, is going to do it for today's AI Daily Brief. Have fun playing with 4.5, and until next time, peace.