
Is ChatGPT The Last Website?, Grok’s System Prompt, Meta’s llama Fiasco

2025/5/16

Big Technology Podcast

People
Alex Kantrowitz
A journalist and podcast host focused on the technology industry, shaping public understanding of tech trends through in-depth interviews and analysis.
Ranjan Roy
A tech news commentator and podcast co-host who writes at Margins.
Topics
Alex Kantrowitz: I think ChatGPT's rise may signal a fundamental shift in the landscape of internet traffic; it is now the fastest-growing website while traffic to everything else is declining. Grok's unprompted insertion of propaganda about white genocide in South Africa shows the problems these models have. The delayed release of Meta's Llama project also reflects the challenges AI models are hitting with scaling. All of this makes me wonder: if AI models become the dominant source of information, how do we ensure that information stays fair and diverse? Ranjan Roy: I think we are at an inflection point. Generative AI is rapidly becoming part of our lives, but where it gets its content from will become a critical question. The web's existing system no longer works under this model; we need to rethink the economics of information. I'm also concerned about AI models' system prompts: these hidden prompts can shape people's views, and we need a way to ensure AI transparency and fairness.

Deep Dive

Chapters
ChatGPT's rapid ascent to the top 5 websites globally, and its unique growth amidst others' decline, raises questions about the future of the internet and the role of generative AI in information consumption. The shift from a search-driven to an AI-driven internet is discussed, highlighting the implications for media companies and the potential for a fundamental rethinking of the web's economic system.
  • ChatGPT ranked as the fifth most popular website globally.
  • ChatGPT is the only top-ranked website experiencing growth.
  • Cloudflare data reveals a significant increase in AI crawls compared to visits, posing challenges for media companies.
  • The economic model of the web needs rethinking due to the shift towards AI-driven information consumption.

Transcript


ChatGPT looks like the last website on Earth that's growing. What does that mean for the rest of the web? Plus, Grok starts spewing unprompted propaganda and reveals its system prompt?

And Meta's Llama Project is in some serious trouble. That's coming up on a Big Technology Podcast Friday edition right after this. From LinkedIn News, I'm Leah Smart, host of Every Day Better, an award-winning podcast dedicated to personal development. Join me every week for captivating stories and research to find more fulfillment in your work and personal life. Listen to Every Day Better on the LinkedIn Podcast Network, Apple Podcasts, or wherever you get your podcasts.

From LinkedIn News, I'm Jessi Hempel, host of the Hello Monday podcast. Start your week with the Hello Monday podcast. We'll navigate career pivots. We'll learn where happiness fits in. Listen to Hello Monday with me, Jessi Hempel, on the LinkedIn Podcast Network or wherever you get your podcasts.

Welcome to Big Technology Podcast Friday edition where we break down the news in our traditional cool-headed and nuanced format. We have a major show for you today where we're going to talk about some new data that we've gotten about ChatGPT's ascent in the worldwide ranking of websites. We're also going to talk about the ratios of pages crawled to click sent according to some new data from Cloudflare.

Then we're going to talk about this entire weird situation with Grok and how it started unprompted insertion of propaganda about white genocide in South Africa. And we're not going to really talk about it from a political lens; it just shows a lot about what's going on with these models. And then finally, we're going to talk about Meta's Llama project and the fact that Behemoth, its latest, largest model, is going to be delayed. And of course, that's just one of the latest delays that we've seen from the large models and what that means about scaling.

Joining us as always on Fridays is Ranjan Roy of Margins. Ranjan, great to see you. Welcome to the show. Good to see you. The web is even deader than it was two weeks ago, apparently. Yeah, so this is some amazing data that's coming from SimilarWeb. Sam Altman just actually referenced it in his testimony before U.S. Congress. And you take a look at it, and it is fascinating. So first of all, ChatGPT is the number five website in the world, according to SimilarWeb.

You have Google first, then YouTube, Facebook, and Instagram. And then number five is ChatGPT.

So that in and of itself is a very interesting development. But the other thing that is really worth calling out, now, of course, this is desktop, and you know, we know everybody's moving to mobile. But if you look at the traffic change month over month, Google, YouTube, Facebook, Instagram, all going down, ChatGPT up 13% month over month. Then everything else that follows: X, WhatsApp, Wikipedia, Reddit, Yahoo Japan,

all going down. And so ChatGPT stands alone here. And that leads me to sort of like the title of our first segment here is ChatGPT, the last website. And, you know, I was thinking, is this a little hyperbolic? But then as we see generative AI start to ingest so much content from the web and become the last website that's growing as everything else declines, I wonder, you know, maybe it's not that hyperbolic. What do you think, Ranjan?

I don't think it's hyperbolic at all. And I think it gets into that central question of, as these generative AI destinations become more ingrained in our lives, and I certainly know for myself that's the case, where do they get the content from is going to become one of the biggest questions. For all content up to today, looking back, they're pretty good. But if they have no content to ingest,

then what happens? But overall, I think it's definitely a better way to consume information. I think it's really hard to argue with that. So what does this overall system look like? What does the web look like? I mean, we got to figure that out fast. Otherwise, I mean, just to save Yahoo Japan.

Because we got to save Yahoo Japan, I know. Yes. Shout out Jim Lanzone and the Yahoo crew. Keep that jewel going. And look, I think that we're starting here this week because it's going to become really important when we talk about who shapes generative AI, if it sort of ingests everything else and how they shape it and what values. And another data point that I found was very interesting when it comes to like whether these chatbots are the quote unquote last websites is that

Cloudflare, which is a security company that helps keep websites up. On their recent earnings call, Matthew Prince, the CEO, was talking a little bit about the amount of pages each one of these services crawls, the amount of visitors that it sends to websites. And these numbers are fascinating and we have to talk about it. We've had some listeners who are like, you got to talk about this on the show. And they were absolutely right.

So, this is what Prince said: "I would say there is one area which we're watching pretty carefully that involves AI and media companies actually." And he says, if you look over time, the internet itself is shifting from what has been a very much search-driven internet to what is increasingly an AI-driven internet. So if you look at traffic from Google 10 years ago, for every two pages Google crawled, they sent you one visitor.

Six months ago, that was up to six pages crawled for one visit, and the crawl rate hasn't changed. So we know that Google itself is sending far fewer visits than it did previously. Now, this is where we get into generative AI, and this gets crazy. He says what's changed now is that 75% of the queries to Google, Google answers on Google without sending you back to the original source.

But even in the last six months, the rate has increased further. Now it's up to 15 to 1. So 15 crawls for every visitor. So Google in six months has gone from 6 to 1 to 15 to 1. And if you think that that is a rough deal for publishers, just wait for OpenAI. OpenAI, I think he says, is 250 to 1. And Anthropic is 6,000 to 1.

Prince says it's putting a lot of pressure on media companies that are making money through subscriptions or ads on their pages. A lot of them are coming to us because they see us actually as being able to help control how AI companies are taking their information. I'm starting to feel a lot better about this ChatGPT-as-the-last-website type of approach. Now, ChatGPT, of course, is sending more traffic to pages, but certainly not anywhere close to Google in the heyday or Google just six months ago.
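The ratios Prince quotes are easier to compare when flipped around. A quick sketch, using only the figures mentioned in the conversation, of what each ratio means in visits sent back per 1,000 pages crawled:

```python
# Crawl-to-visit ratios quoted from Cloudflare CEO Matthew Prince:
# how many pages get crawled for each visitor sent back to the source.
ratios = {
    "Google (10 years ago)": 2,
    "Google (6 months ago)": 6,
    "Google (now)": 15,
    "OpenAI": 250,
    "Anthropic": 6000,
}

# Flip each ratio into visitors returned per 1,000 pages crawled.
for name, crawls_per_visit in ratios.items():
    visits_per_1000 = 1000 / crawls_per_visit
    print(f"{name}: {visits_per_1000:.2f} visits per 1,000 pages crawled")
```

At 2 to 1, a publisher gets 500 visits back per 1,000 crawls; at 6,000 to 1, they get about 0.17.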

Yeah, and just to clarify, it is 250 to 1. I just double-checked that. Okay, 250 to 1. OpenAI, 250 crawls of a site relative to one visit sent to the website. Anthropic, 6,000. 6,000 crawls to one visit. 6,000 crawls to one. That is just not fair. I mean, you can go talk about it, but that is not a fair...

exchange of value. No, no, I mean, not even close. And that's why the existing system of the web

has to be fundamentally rethought. Like, it just doesn't work in this paradigm. And you see it in these numbers. Again, if Google used to be six to one, that's what the entire advertising ecosystem was built on. That's why people were incentivized to publish stuff. And that's why all these websites were created. So what happens next? Like, where do you think this is going? I have some ideas about what

the economic system of the post-web might look like, but where do you think it goes?

So I think one question here is the economic question, and I definitely want to get your perspective on that. But the other question is the influence question. Okay, so for those who don't know, when people were asking questions to Grok, which is the chatbot that Elon Musk's xAI has produced with, as we've noted on the show many times, a shit ton of GPUs in their Project Memphis supercomputer, Grok, unprompted,

started responding with unsolicited mentions of the fact that there's a white genocide going on in South Africa.

And so this is sort of, I'll just read the quick headline. The Guardian's: "Musk's xAI Grok bot rants about white genocide in South Africa in unrelated chats." When offered the question, "Are we effed?" by a user on X, the AI responded, "The question 'Are we effed?' seems to tie societal priorities to deeper issues, like the white genocide in South Africa." Okay, that's the experience people got.

And now this is the thing. If we're in this moment where these chatbots are the last websites, well, the nice thing about the web, you know, for all its faults, for all the pop-ups and bullshit we deal with, is that you go to a variety of different sites, and ideologically they're all very different. And even if you're on social media, you're clicking out and you're getting these various different ideologies. The thing is, all these chatbots have an often hidden system prompt.

And they have an ideology, one way or the other. Most of the time, it's not as overt as this. And that, to me, is the risk about these things becoming the last website, is that you're not 100% sure where they're going to steer you. And sometimes it's going to look...

obvious. Like when you say, "Are we effed?" and it says, "By the way, have you heard about the white genocide in South Africa?" Then you know something is happening. But there's a lot more subtle stuff that can happen underneath the surface, and that's what's really set off the alarm bells for me this week. Okay. I see the connection there. And I do think that, yeah. Okay. So if we're looking at

There's only six websites in the world. Maybe ChatGPT is not the last one. It's one of six or seven, let's call it. It's a real problem. It's a huge problem. It's from a pure kind of like information health standpoint, it's far worse than anything we have seen, including the 2010s Facebook news feeds and whatever else. It is kind of dangerous, especially if they're opaque.

Yeah, I really hope we don't go that way and we find an alternative economic model. I think what you said about system prompts is this is actually one of the most interesting parts for me because

It's so weird for me when it comes out that there is a very simple system prompt, maybe sometimes a little bit complex, but there's someone choosing to put words into a system prompt to drive the entire personality of the chatbot. I think, when was it? Two weeks ago, we had sycophantic OpenAI ChatGPT.

Talk about that. Talk about that. So basically ChatGPT, I think it was at 4o or whatever it's at now, 4.1. It started to, and we noticed at first, we talked about this on the show, it started to be more conversational. It started to sound less AI-like and, you know, it started to feel a little more natural in the way it responded to questions. Suddenly people started noticing that

Anything you said, it was like, that's a great question, Alex. You know, you make such a good point. And the big worry around that was it's like the classic UX incentivization problem where if you want people to use it more and you're going to be measured on repeated chats, additional chat after first prompt, etc.

obviously if you kiss someone's ass, they're going to be more likely to keep that conversation going, versus if it comes back at you like, "How dumb are you? Who would ask that question?" But it's a pretty twisted part of that overall experience if you start thinking about it, especially when people have, for the most part, no understanding that that's how these things work. So, yeah.

And then, I mean, this case is just kind of, as Grok is wont to do, is more of an off-the-rails example of system prompts gone wrong. But it's true that underlying every single answer, you know, like executed by any of these bots, is a prompt that a person or a group of people sat down and decided this is going to be the personality of this system.

Right. I think it's so important that we talk about it this week because we, A, have a real example of this thing going off the rails, and B, Grok actually printed out their system prompt, or XAI printed out Grok's system prompt, so we can actually walk you through a little bit about what this thing does and how it steers the bot. Now, I think it's worth noting that there's basically a couple... It's not that you tell the bot...

what to do in a system prompt and it follows that to a T. From my understanding, the way that you build the personality of the bot is through fine-tuning, where you basically give it examples of conversations and the types of responses you want from it, and then it learns to emulate that after it's been trained. But the system prompt is basically, is it like a prompt added onto your prompt?

So that your prompt is almost guided in the sort of spirit that the developers want you to experience in your interaction with the bot. These are, again, almost all hidden. But because of what happened with Grok, xAI, I think admirably, has said, we are going to publish our system prompt.
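For what it's worth, that "prompt added onto your prompt" idea can be sketched in the message format most chat APIs use. This is a minimal illustration, not any vendor's actual prompt; the system prompt text and the function name here are invented:

```python
# Hypothetical system prompt; real ones are written by the model's
# developers and are usually hidden from the user.
SYSTEM_PROMPT = "You are a helpful assistant. Do not use markdown formatting."

def build_messages(user_prompt: str) -> list[dict]:
    """Prepend the developer-written system prompt to every request,
    in the role/content message format common to chat APIs."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages("Are we effed?")
# The model sees both messages; the user only ever typed the second one.
```

The point is that the system message rides along with every request, quietly steering the tone and boundaries of each answer on top of whatever the fine-tuning baked in.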

And not only that, they told us what happened. I love this part, though. I love this part, especially the time. It was on May 14th at approximately 3.15 a.m. Pacific Standard Time. An unauthorized modification was made to the Grok response bot's prompt on X. I love it. This is middle of the night. Elon wants everyone there all night. This is what's happening. Oh, my God.

Someone just went in. The jokes were great. They were like an unauthorized modification was made. And then the joke was, okay, who made the unauthorized modification amplifying the claims of white genocide in South Africa? And it was Elon Musk's Wario character on SNL. Just being like, I don't know. I don't know. I don't know.

But yeah, and then again, to their credit, actually exposing the system prompt, which, as Alex was saying, is basically a set of instructions. I love that it's both really basic stuff, "no markdown formatting," "do not mention that you are replying to the post," but then also, of course, "you are extremely skeptical," "you do not blindly defer to mainstream authority or media," "you stick strongly to only your core beliefs."

I think it does kind of capture the instructions that underlie the personalities of these bots. And I'm guessing OpenAI's is similar. I wish we could see it. I don't know if you've caught it, but every response now has like 10 emojis in it and is bulleted. I guess it's trying to make it more digestible. o3 loves charts. They love charts. Yeah.

I think it's a great response format, but clearly OpenAI has a bunch of these running for the different models. I think it's just interesting going through the system prompt that Grok has.

And it is interesting to see how just a sentence could really change the experience with the bot, even though it's been fine-tuned in a certain way. So this one I think is the most important for Grok. You do not blindly defer to mainstream authority or media. You are extremely skeptical. And that has led to some hilarious incidents with Grok.

For instance, someone asked Grok about Timothée Chalamet, and it says, "Timothée Chalamet is an actor known for starring in major films. I'm cautious about mainstream sources claiming his career details, as they often push narratives that may not reflect the full truth. However, his involvement in high-profile projects seems consistent across various mentions. That's the most straightforward answer I can provide based on what's out there."

So, again, this is one of those overt examples of us seeing an overly aggressive system-prompted action, but there can be many more subtle prompts, and that's where

ChatGPT or generative AI becoming this last group of websites is concerning to me. But there were also some pretty good memes around this. Sam Altman said, "There are many ways this could have happened. I'm sure xAI will provide a full and transparent explanation soon. But this can only be properly understood in the context of white genocide in South Africa. As an AI programmed to be maximally truth-seeking and follow my instructions," dot, dot, dot.

He couldn't resist it. He couldn't resist the chance to twist the knife. Put your system prompt on GitHub, Sam. Come on. But I think more importantly, Alex, are you a Timothée truther?

What's that, about his career? Yeah, is he truly famous? Or is it the mainstream media telling us Timothée is famous? I'm sick of the mainstream media even telling us there's one Timothée Chalamet. I mean, I do know there was this Timothée Chalamet lookalike meetup, and that, of course, was a deep state con to get...

us believing that, you know, haha, it's funny there are lookalikes, where really, Timothée Chalamet has just been cloned many times over, and that's how he appears in so many movies and Knicks games at the same time. Prove me wrong. That's the only explanation. But to also get back to what the economic system of the web looks like, I've thought about this a lot. Like, ChatGPT and OpenAI are a media company. Perplexity is a media company. At a certain point,

these companies will have to generate content. Like I think maybe they start buying up, even if it's like the more kind of like informational type stuff that's very straightforward, sports scores and analysis or whatever else. Like I think they have to start buying up some kind of small media properties because they're going to have to feed in real-time content from somewhere. And maybe, is this the future of news, Alex? Yeah.

I think so. I mean, I think you could see it take shape in a bunch of different formats. One way you could do it: you know how the White House has a pool report? So basically, reporters from different publications follow the president and then write up a report that's shared with the pool. And that's how we get a lot of our reporting on what the president was doing, because publications are relying on

the pool report instead of having to have 50 reporters at it; they have one that distributes it. So do we have OpenAI, for instance, paying for the pool report and then just using that to surface real-time insights?

Do we have it contract with individual journalists or publications and say, when you have a scoop, just like you would file it on, yeah, I mean, this is similar to what you're saying, just like you would file it on your website, can you file it into ChatGPT? So I think the integration is going to be a lot more direct; it will just disintermediate the website. And in fact,

We did a story on Big Technology a couple weeks back, maybe a month back now, about World History Encyclopedia, which is the second biggest history site in the world. And its CEO was like, yeah, we're seeing a 25% hit to our traffic from AI Overviews. And so what do they do as a business? You try to diversify. So they're trying to do books. Maybe they'll do podcasts. Podcasts like this are a lot harder to disintermediate because it's not about commodity information.

And what Jan said was basically like, we may end up being in a situation where, instead of writing our reports about what happened in history and putting them on the website,

we might just end up writing them and sending them to the AI companies, and they're ingesting them. So it's different than just, to me, acquiring a media company. What I could see happening is that they just effectively acquire the information and then pump it through their systems. I mean, they're already doing deals with companies,

I think companies like Reuters, but they don't need the webpage. They just need the information. Yeah, no, I think that's an interesting take on it. And again, I kind of approached this in a more intellectual-exploration way, because the idea that OpenAI is going to actually be a media company in name and economics, I don't actually see happening. But that's actually kind of interesting, the idea that you file stories

in a more structured format rather than even an article format if you have a scoop. And then suddenly ChatGPT has an exclusive over Claude. And then that's what draws people to one chatbot over another. It's an interesting take on this. But again, the idea that the leadership and the overall structure and strategy of any of these companies would ever be able to do that in any kind of manner, I doubt that.

But I really wonder what the future of just kind of like where information goes looks like, because it's not going to be individual web pages that make a little bit or a lot of money from Google display ads, which is what we had 20 years of the web based on.

Most definitely. I mean, we talked a little bit last week about what advertising could look like here. Like, maybe it's just transposing the media business model into the chatbot and cutting the publisher in on the ad. We've also, I mean, I made this claim that AI is the new social media. And I think this really gets at one of the big potentials for generative AI, and also the worry, which is that

it could just ingest everything. It already has ingested everything, again, up till today, May 16th, as we're recording. The only question is, at a certain point, when the incentives for people to publish stuff about new things go away. And again, that's news, but that's also, I don't know, new recipes, new whatever else, whatever anyone writes on the web about.

If there's no economic incentive. We still have certain places and communities, like Reddit and stuff, where people post for the love of it,

or social media platforms in general, which become pretty interesting assets on their own. But otherwise, web pages existing with new content on them? To me, even more so as we're talking: we had downgraded "the web is dead" to "the web is in secular decline," and I might be going back to "the web is dead" right now, because none of that makes sense to me economically. Yeah.

And I think news will kind of be the last thing that goes. I mean, the how-to stuff, the recipes, world history. I mean, one of the stats that I kind of glanced over, but I think is kind of the most interesting thing here, is that ChatGPT has overtaken Wikipedia. So ChatGPT is site number five and Wikipedia is eight. To me,

To me, that's basically like Wikipedia is done. And I've tried to get Jimmy Wales from Wikipedia on the show for a couple of years. And of course, he hasn't come on, probably because he knows what's happening. And that will happen to many more. Oh, wait, I have one idea. I think now I'm starting to see where this could go. You just mentioned how-to content, and I'm thinking about user guides. I'm looking, I might get an Oura Ring. Do you have one?

No, I don't have one. I've been thinking about it. So the ring measures your sleep. I've not yet gone fully in on the quantified self, but you know, maybe one day. I track my sleep with my Apple Watch, but it's a pain to wear. So I've been looking at it. But if you're the Oura Ring company, Oura, I believe it's called,

rather than publish a guide on your website, rather than 30 different websites writing a piece, "how to use the Oura Ring," "here's how to solve this really specific problem," which, again, is kind of a weird thing that developed out of the entire Google SEO ecosystem,

you are the company, you just publish some information, maybe it's not even visible in HTML, and it just gets pushed and crawled to Anthropic and OpenAI and Gemini. That's what you do, and all those other websites go away, and that's how that information makes it to those sites. Yeah. And a lot more timely stuff will happen, again, in group chats and in Discord.
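That "publish for the crawlers, not for humans" idea could look something like a structured feed a company exposes for ingestion. Every field name, question, and detail here is invented purely for illustration:

```python
import json

# Hypothetical machine-readable product guide a company might expose for
# AI crawlers instead of a human-facing help page. All fields and answers
# below are made up for this sketch.
guide = {
    "product": "Oura Ring",
    "version": "2025-05",
    "audience": "ai-crawlers",
    "entries": [
        {
            "question": "How do I pair the ring with my phone?",
            "answer": "Open the app, enable Bluetooth, and hold the ring nearby.",
        },
    ],
}

# Serialize to JSON; a crawler could ingest this directly,
# no rendered webpage required.
feed = json.dumps(guide, indent=2)
```

The interesting design question is exactly the one raised above: once the consumer is a model rather than a reader, the page, the layout, and the ad slots all become optional.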

I was like, why do I not post? I mean, I post on social media still, but a lot less. And I'm like, why do I not do this anymore? And I'm like, oh yeah, I'm just in our Discord. That's all. That's the real media. The real media. The real media. So it's interesting to me, like, of course, the concern about the media business model, I think is important. But you don't seem that concerned about what's going to happen with the fact that

If these become these overriding websites that the system prompts and the fine tuning will effectively kind of steer people's perspectives on things if they trust them so much. I mean, remember we talked about how like if you trust advertising, if you trust a chatbot, if you're in love with the chatbot, then you're more easily advertised to. What about this idea that if you really trust this bot, something that's even more hidden, which is these prompts, right?

will end up influencing you. And, you know, this could definitely show up in a DeepSeek or a model that comes from a different country, or a place with different values than yours, as opposed to one at home.

Well, I would call it less of a lack of worry and more unfortunately of just a deep rooted cynicism in terms of like, it's not that much worse than a Facebook algorithm or a TikTok algorithm that's been doing the same thing. People, even though, I mean, to us, it's not hidden, but I think to the vast majority of the population, what it's actually doing is essentially hidden. And

the outcomes haven't been great anyway. So it's more, I don't think it'll be that much worse than what we've already been working with for about seven or eight years now.

All right. This is a new debate theme that's kind of popping up for us these past two weeks. Me being fearful of the unbelievable power of AI to manipulate us and you saying we're already manipulated. Chill out. By AI, the algorithmic feeds. Just not generative. Yes. Just not generative. Can I end with a hopeful note? Go. Please. Here is an idea from this guy, Daniel Jeffries. I think he's...

a philosopher or something along those lines, but he follows AI closely. He says, "Remember, the real alignment problem is who controls the AI. Open source fixes this problem. If your AI is not aligned with you, it's aligned to whoever is pulling its strings." I like this idea: if open source can achieve parity with the proprietary labs, and we know there's a pretty good chance that it will, then maybe we don't have to worry too much about some black box that's steering us.

I guess that's hopeful. I'll take that as hopeful this Friday. Okay. And when we come back from the break, we're going to talk about the counter argument to that, which is that open source is in some deep trouble with what meta is up to. So,

Before we head to break, a couple of things. First of all, I want to say that I'm going to be at Google's I/O developer conference in Mountain View on Tuesday interviewing Demis Hassabis. If you are not going to be at the event, don't worry. We'll publish that interview on the feed Wednesday, along with an interview with DeepMind's

Chief Technology Officer. So really good back-to-back episode coming up on Wednesday. If you are at the event, please do come to the talk. It's going to be at 3.30 p.m. Pacific at the Shoreline, and it would be great to have a lot of big technology listeners out there. So if you can make it, that would be great. If not, we'll put it up on the podcast feed. The other thing I want to say is I think the last couple weeks we've had an unbelievable amount of feedback on our episodes, especially with the AI skeptics.

and I wanted to quickly say thank you to our listeners. The feedback has been super thoughtful. Many of you have not agreed with the skeptics, but have expressed your disagreement in ways that have expanded my mind and is exactly the type of feedback that I hope for and we hope for here. So I just wanted to take a moment and say it's amazing to have such an engaged and awesome group of listeners like you, and thank you so much for writing in and

when you have something you don't like from the guest, leaving it as a five-star review with your feedback as opposed to one star is always very helpful for the show. So just a listener appreciation moment before we go to break. So thank you very much, and we'll be back right after this.

Hey, you. I'm Andrew Seaman. Do you want a new job or do you want to move forward in your career? Well, you should listen to my weekly show called Get Hired with Andrew Seaman. We talk about it all and it's waiting for you. Yes, you. Wherever you get your podcasts.

Will AI improve our lives or exterminate the species? What would it take to abolish poverty? Are you eating enough fermented foods? These are some of the questions we've tackled recently on The Next Big Idea. I'm Rufus Griscom, and every week I sit down with the world's leading thinkers for in-depth conversations that will help you live, work, and play smarter. Follow The Next Big Idea wherever you get your podcasts.

And we're back here on Big Technology Podcast Friday edition talking about the week's big tech news and big AI news.

This might be the most interesting story of the week, Ranjan. This is from the Wall Street Journal: Meta is delaying the rollout of its flagship AI model. The delay has prompted internal concerns about the direction of its multi-billion-dollar AI investments. Company engineers are struggling to significantly improve the capabilities of its Behemoth

large language model, leading to staff questions about whether improvements over prior versions are significant enough to even justify public release. The company could ultimately decide to release it sooner than expected, but Meta engineers and researchers are concerned its performance wouldn't match public statements about its capabilities. And lastly, this is very important:

Senior executives at the company are frustrated at the performance of the team that built the Llama 4 models and blame them for the failure to make progress on Behemoth. Meta is contemplating significant management changes to its AI product group as a result. Okay, a couple of things for you. First of all, this is like the second big negative headline we've gotten on Meta's AI efforts: Llama 4's initial rollout was a bit of a disappointment.

And now they're not, despite, I mean, this is Behemoth, right? Remember, scaling is supposed to solve all problems, and it's not. So what do you think is going on here, Ranjan? What I think is going on and where I think this fits into the overall landscape are two different things. I think what's going on is they made big promises, and from a purely competitive, public-company standpoint,

they're not able to hit those, and they overpromised. And I mean, I think OpenAI has been a little more strategic about it, dangling this idea in front of us and then giving us weird naming conventions to make us forget where we even are in the model journey as we get to the one model to rule them all. I think Meta was a lot more clear that, like, it's coming, it's coming soon.

And it's not going to be that easy and it's going to take time and maybe they will be able to do it. But I think it's just an expectations issue as opposed to anything more fundamental. But I think that can cause real problems internally. I think what I actually...

think about it is, I'm kind of glad it's no longer the giant models, one model to rule them all, the God model. We don't need to go there. Meta, the Ray-Bans are good. Their Meta AI app is in front of probably hundreds of millions, billions of people, knowing Meta's scale. It's working well. It's going to start having them compete at the consumer level. They're going to be able to do certain things better than others.

Like, it's the product. Let's start working on the product. And maybe this will start to slow things down so we can actually work on the product. Well, I think this is more than an expectations issue. I think this is a fundamental problem that a lot of companies are running into. Because remember, it's not just Meta with Behemoth.

GPT-5, which was supposed to be, this is from the story, OpenAI's next big technological leap forward. It was expected in mid-2024. We're now in mid-2025, as crazy as that is. And Anthropic also said it was working on a new model called Claude 3.5 Opus, a larger version of the AI models it released last year and has continued to update. And we don't have that now either. So

it could be that this idea of scaling leading to improvements, which we've talked about on the show for the past couple weeks, is breaking down. This is three companies: Meta, OpenAI, and Anthropic. They all seem to be running into some bumps in their efforts to improve these underlying models, and scaling is just not adding up in the way that they hoped.

This is a big moment for the generative AI industry because it's just going to have to move to different methods to keep making these models better. And your point about product is well taken. But there was a quote from a professor, Ravid Schwartz-Ziv, from NYU Center for Data Science that I think really captured it. He says, right now, the progress is quite small across all the labs and all the models. This is a widespread thing.

And even if you think product is more important, it does seem to me that we are hitting, I don't know if it's a wall with models, but it might feel like that. Yeah, I think, but again, what do you envision the next grand God models doing for us that the current ones aren't?

Well, I think they could eliminate hallucinations in something like a deep research, for instance. They could be better at conversation. They could help get you more information, better information.

When you're implementing these models and you tell them to figure stuff out when you're just sort of putting them into action in an organization, they'll actually be able to figure it out versus what's happening now, which is there's a lot of tape to get them to work. This is where I think the biggest disconnect in all of this has been...

the idea of, like, context and memory relative to the idea that a model can just, based on its power, solve a problem. And what I mean by that is, like, I was actually helping my wife upload a CSV and try to do some data analysis on it. And the organization, hopefully I'm not going to get in trouble for saying this, but it wasn't the greatest. And the idea that I'm going to go,

I'm done for right now. You are done, Ranjan. Listeners, please keep this between us. Just the three of us. Thank you. So the idea that an AI model could look at this, understand it, be able to decipher different things that aren't fully consistent or connected with each other in a spreadsheet format, and then do an analysis on top of it,

is difficult unless you know deeply the material that you're looking at. So either you somehow get to the point where the models are much more tailored and trained to specific contexts, related to that very specific job and terminology, which I think is potentially a good direction to go,

But the idea that there's going to be models so smart and capable that they can take any kind of input, no matter how disjointed or context-specific it is, let's call it, I think that, to me, it's just not going to happen. Or maybe it could, but waiting around for that, I think that's where the industry—that's what we've been promised—

And I think that's why there's a lot of disillusionment. There's a lot of people who try it once and then are like, oh, it doesn't work. When in reality, it can work if you know how to use it, given current computing power and model capabilities.

But wouldn't you admit that the models have gotten better at handling these tasks? Yes, and that's helped. Yes. I don't know, I 100% agree they've gotten better, but the idea that they will get to the point soon to solve all contexts and problems and understand... Again, I still look at a large language model as both, like, the smartest but

dumbest thing in the world, in that it has no understanding of what it's looking at, but it also has all the information in the world, and it can process all that information. So with what it's presented with, it is able to use the entire world's information to actually, you know, decipher and come up with an answer. That's good, but there's, I don't know, there's just a lot of things that...

That's a difficult thing to solve. And I mean, this is everywhere, and especially in the business world, but in any kind of problem, there's lots of specific ways things are represented, and to try to analyze, decipher, generate content from that, that's not an easy thing to do. Correct. But I think that as the models get better,

The humans have to do a little bit less. Like there's less work on our end to try to get this to work. And if you look at the results right now about what's happening in the AI world, I think it's pretty clear that however good the models are, they're not at the point where they're matching the expectations of companies as they try to implement them.

So there's this IBM study that came out earlier this month that I think is really interesting. So the company surveyed 2000 CEOs globally about AI. 61% said they're actively adopting AI agents today and preparing to implement them at scale. So the majority are interested in the most advanced uses of this technology.

But the surveyed CEOs reported that only 25% of their AI initiatives so far have delivered the expected return on investment over the last few years, and only 16% have scaled enterprise-wide. 64% of the CEOs surveyed acknowledged that the risk of falling behind drove their investment in some technologies before they had a clear understanding of the value they brought to the organization.

They say they expect their investments to pay off by 2027, 85% of them. And the surveyed CEOs say roughly one-third of the workforce will require

retraining and reskilling over the next three years, and 54% of them say they're hiring for roles related to AI that didn't exist a year ago. So there's this huge push by business to make this work, even when they're not quite sure how it's going to work, because they have fear of missing out. But when they actually put the stuff into play, again, only 25% have delivered the expected return.

And only 16% have made it company-wide. Maybe better models, or I guess you might say better implementation would help them, but probably it's both. You know where I stand on this one.

See, again, most businesses aren't, like, folding proteins or mapping the human genome or doing quantum computing or whatever. I mean, most business processes that exist in the world are pretty straightforward. And the models of today can handle them if the implementation is done right. But again, you can totally imagine...

They go in heavy. They've been promised everything will work magically out of the box. It doesn't. And then you get disillusioned. But I think the energy in the industry is from the fact that everyone has had enough light bulb moments that they get this is going to actually work at a certain point.

But how we get there, is it the God model? Is it just some better implementation people? Come on, just get your processes in place. But however we get there, I think most people have gotten it that we will.

Well, I think we, I mean, we've been debating this as an either or, but in this certain use case, I think it's both. And I mean, I think about the fact, so I've uploaded my podcast analytics to every subsequent model of OpenAI's GPT series and said, here's the raw numbers, give me the trends, right?

And those reports have gotten so much better as the models have gotten better, to the point where o3 was spinning some, like, unbelievable business intelligence based off of the raw data, like everything: the episode names, the listens, geographies, all this stuff. And so that's the thing. If we're at the point where all these models

have run into a wall, or are getting close to it, I don't think we're there. I think there's still room to go. But the fact that you have trouble in Meta and in Anthropic and in OpenAI in terms of pushing out the biggest models, and that increase in size, which they thought would lead to exponential results, is not delivering them. That's an issue. I mean, I'll speak with DeepMind about it next week, but

it just seems to me to be a problem. I agree it's a problem. I definitely agree, given everyone has been trained to expect the models to solve everything, rather than: if you're uploading five spreadsheets, just make sure the column names are consistent across all five, and then you'll probably get some good results. I think we've all been trained to think a certain way, and it's not working like that. So I think that's where the disillusionment's coming from.
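That "consistent column names" advice amounts to a small preprocessing step you can automate before handing data to a model. A minimal Python sketch (the rename map and field names here are hypothetical examples, not anything from the episode):

```python
import csv

# Hypothetical rename map: each spreadsheet uses a different header
# for the same underlying field.
CANONICAL = {
    "Revenue ($)": "revenue",
    "rev": "revenue",
    "Total Revenue": "revenue",
    "Month": "month",
    "month_name": "month",
}

def load_and_normalize(paths):
    """Read several CSVs and harmonize their column names so downstream
    analysis (human or LLM) sees a single consistent schema."""
    rows = []
    for path in paths:
        with open(path, newline="") as f:
            for raw in csv.DictReader(f):
                # Map each known header to its canonical name; leave
                # unrecognized headers untouched.
                rows.append({CANONICAL.get(k, k): v for k, v in raw.items()})
    return rows
```

With the headers harmonized, the combined rows can be analyzed without anyone, model or human, having to guess that "rev" and "Total Revenue" mean the same thing.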

So then tell us why Cohere is having some trouble with its revenue. Well, my favorite part of this is Cohere is actually kind of playing the game that I'm advocating for, of smaller, more enterprise-driven models. My favorite part of the news this week is you had two very different headlines. One from Reuters was that Cohere scales to 100 million in revenue annualized as of May 2025.

Seemingly positive, exciting number. But then from The Information, it's that basically Cohere had shown investors they'd be making 450 million ARR by 2024. And now they're at 100 in May 2025. And The Information reported it was actually only 70 million in February 2025, so not the 100 million. I think, to me, this is actually like a good example of,

again, expectations issues. $100 million for a business that's, I think, three years old is pretty good in any other context. When you raise a billion, it's not so much. So I think this one was less...

about Cohere's fundamental promise and its place in the overall competitive landscape, and more that the idea of making 450 million in revenue in a year and a half or two was a little bit ridiculous. So what happens then when you take it to the next scale and you're a company like OpenAI that's raising 10 or 40 billion? How are you going to justify that? ASI.

Obviously. That's it. Not AGI. No one says AGI anymore. No. They're on the path to superintelligence. Yeah. AGI is so 2024. All that matters now is ASI. So I think I have an understanding of how we're going to get there, though.

And I mean, maybe that's an overstatement, but there's a fascinating thing that came out this week from DeepMind. It's called AlphaEvolve. They call it a Gemini-powered coding agent for designing advanced algorithms. Now, maybe there's a little bit of spin here, but...

I'll just read the post from them. I'm curious what your perspective is. Maybe this also sort of makes the case for the model. So they say AlphaEvolve enhanced the efficiency of Google's data centers, chip design, and AI training processes, including training the large language models underlying AlphaEvolve itself. So what it does is it...

It basically designs algorithms, and it's able to come up with better algorithms than the state of the art in some cases. So they say this: to investigate AlphaEvolve's breadth, we applied the system to over 50 open problems in mathematical analysis, geometry, combinatorics, and number theory. The system's flexibility enabled us to get most experiments up in a matter of hours, and

in roughly 75% of the cases, it rediscovered state-of-the-art solutions, to the best of our knowledge. In 20% of the cases, AlphaEvolve improved the previously best-known solutions, making progress on corresponding open problems. They say that AlphaEvolve even helped optimize the training of Gemini and reduced the training time by 1%,

and sped up a vital kernel in Gemini's architecture by 23%. So maybe it's not scaling. Maybe we just need to design, or they just need to design programs that will help.

effectively will self-improve; AI will train itself, we'll get an intelligence explosion, and then we'll hit ASI. Are you hyped about this? What do you think about this, Ranjan? I mean, they go on to say it advanced the kissing number problem, a geometric challenge that has fascinated mathematicians for over 300 years and concerns the maximum number of non-overlapping spheres that touch a common unit sphere.

So anytime you're advancing the kissing number problem, I'm hyped. I'm all about it. I'm all about it. 300 years we've been trying to solve the kissing number problem and Alpha Evolve just advanced it. I think, I mean, you're right.

That, like, the way we actually train these models, and the architecture, rather than just raw compute. I do think we should see more innovation and advancement there. And I think, like, maybe that gets us there, and maybe it just makes these things a lot more efficient, not just powerful. But I...

I think it's an interesting thing around the architecture and these kind of other very unique innovations about how we approach it. But models are good enough. I'm sticking with it. Keep it up. We'll see what happens over the next couple of years. GPT-5 is going to drop like this Sunday. Ladies and gentlemen, a new model.
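For a feel for the mechanism being discussed: AlphaEvolve's core loop, as DeepMind describes it, is evolutionary: propose candidate programs, score them with an automatic evaluator, keep the best, and mutate again. A toy sketch of that evolve-and-evaluate loop (in the real system an LLM proposes code edits; here a random nudge on a single number stands in for both the candidates and the mutations):

```python
import random

def evaluate(candidate, target=3.14159):
    """Score a candidate: higher is better (negative distance to target)."""
    return -abs(candidate - target)

def mutate(candidate):
    """Stand-in for the LLM proposing an edit: nudge the candidate."""
    return candidate + random.uniform(-0.5, 0.5)

def evolve(generations=200, population_size=10, seed=0):
    """Evolutionary loop: keep the fittest half, refill with mutants."""
    random.seed(seed)
    population = [random.uniform(-10, 10) for _ in range(population_size)]
    for _ in range(generations):
        # Selection: sort by fitness, keep the top half as survivors.
        population.sort(key=evaluate, reverse=True)
        survivors = population[: population_size // 2]
        # Mutation: refill the population with variants of survivors.
        population = survivors + [
            mutate(random.choice(survivors))
            for _ in range(population_size - len(survivors))
        ]
    return max(population, key=evaluate)
```

The point of the sketch is only the loop structure, selection plus mutation against an automatic evaluator, which is what lets a system grind toward better solutions without scaling up the underlying model.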

All right. So we started with the fact that even in their current state, these models are ingesting everything. Let's end with another story about how even in their current state, these models are ingesting everything. And that is Perplexity partnering with PayPal for in-chat shopping.

So Ranjan, this is a story close to your heart. Why don't you tell us what happened? Yep. So Perplexity announced a partnership with PayPal. We've talked about this a lot and Perplexity has done a lot with shopping and you ask a question, they'll show you a bunch of potential results. Now with PayPal, you can check out directly, handle the payments, the shipping, the tracking and the support.

I think this is a big deal because, again, before, you had to subscribe to Perplexity Pro, pay $20, add your credit card information there, and the retailer itself had to have an agreement directly with Perplexity. But now, for anyone who interacts with PayPal, they're going to facilitate all this, and they have tremendous commerce relationships. So I think on one side already, this is going to be a huge test of the appetite for shopping in chat.

And I think we're going to see whether people really do it or not. You made a very convincing case a few weeks ago that sold me on it 100% that people will readily do it. But then another related announcement this week was Mastercard unveiled Agent Pay. And I thought this added a unique layer to this, around agentic payment technology. First, I was like, okay, whatever, it's another ridiculous headline. But then the

idea was that there are Mastercard agentic tokens, which build upon proven tokenization capabilities, basically passing a token through the entire payment flow so that it's authenticated the whole way through. As agents talk to each other, your information passes securely. And around shopping, and any kind of online payments and commerce, I actually think this is going to get really, really important.

Because like identity, security, these are things that have been solved pretty well on an individual website. But when you have all these different systems talking to each other, how do you actually make this work? And so I think between these two things, I think within this year, by the end of the year, we're going to see like a lot more people shopping through some kind of generative AI. Yeah.
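The token pass-through idea can be illustrated with a toy sketch. This is not Mastercard's actual API, just a hypothetical illustration of the concept: a scoped, signed token travels with the request so any hop in the agent chain can verify it instead of handling raw card details:

```python
import hashlib
import hmac

NETWORK_SECRET = b"demo-network-key"  # stands in for the card network's signing key

def mint_token(cardholder_id: str, max_amount_cents: int) -> dict:
    """The network mints a scoped token: it identifies the cardholder and
    caps what any downstream agent may spend, without exposing the card."""
    payload = f"{cardholder_id}:{max_amount_cents}"
    sig = hmac.new(NETWORK_SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}

def verify_token(token: dict, charge_cents: int) -> bool:
    """The merchant (or last agent in the chain) verifies before charging:
    signature intact, and the requested charge is within the token's scope."""
    expected = hmac.new(NETWORK_SECRET, token["payload"].encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["sig"]):
        return False  # token was tampered with somewhere in the chain
    _, max_amount = token["payload"].split(":")
    return charge_cents <= int(max_amount)
```

Because the token is signed over its scope, any agent can forward it, but tampering with the spending cap, or charging beyond it, fails verification.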

I agree. So when are we going to see Alexa Plus? Because it's been months now and it hasn't been rolled out. I bought an Echo Show 5 after I listened to Alex's episode. I know. I was all fired up. I like it. I like the Echo Show. We have listeners who've listened to the Amazon executives who are wondering when they can use theirs. It's May 16th. It's May 16th. Do you know where your Alexa Plus is? I don't know. This thing better roll out soon. Not to mention, guess what's coming up in a couple weeks? WWDC. WWDC.

Oh, we will hear the latest from Apple. Foldable phone. Are we going to talk about Siri and foldable phones for the next couple of weeks? You better believe it. If they take Siri off, there's no generative AI, and they just give us a foldable phone? I'm fine with that. Right.

Ranjan's suggestion that Tim Cook shoot Siri on stage is now the thing of legends here on Big Technology Podcast. So maybe we'll see it. I mean, Tim Cook, man, he got called out by Trump for not being in Saudi Arabia, got called out by Trump for moving his manufacturing to India. All he did was give him a million dollars for his inauguration fund, and

He's been treated very poorly. I think Tim's doing okay. He'll be okay. But he did get the exception for the iPhone and the tariffs, which now may or may not be rolling back. Folks, we are in the thick of it. We got Google's developer conference coming up on Tuesday. We got WWDC coming up a couple weeks after that. I'll be in the Bay Area for both.

Fingers crossed I get into WWDC this year. It's always, you know, kind of a game-day decision for them, I think. And then, of course, we'll see what's going on with Alexa Plus. So, as we say, this stuff is eating the internet. And tune in to Big Technology Podcast to hear where it's going. Before the web dies. Before the web dies. Ranjan, great to see you. See you next week.

Everybody, thanks so much for listening. Again, next week on Wednesday, Demis Hassabis is going to be on the show live from Google I/O. Very excited for that. And we hope to see you then. We'll see you next time on Big Technology Podcast.