This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. I would say for the better part of two years, the large language model race was three teams. You had Anthropic, OpenAI, and
and Google racing for the lead and going back and forth, jab for jab, as the best AI model maker in the land. Obviously, you know, Microsoft's in there, but they're more of a system that uses other technology. But when it came to actual AI frontier labs, it's always been a three-team race. I don't know if it's like that anymore. I think right now,
OpenAI and Google are so far ahead of everyone else. And I'm left wondering what happened to Anthropic? What happened to Claude? Is it still a top tier large language model or has Claude completely lost its edge? And can they ever catch up with Google and OpenAI?
All right. We're going to be talking about that and a lot more on Everyday AI. What's going on, y'all? My name is Jordan Wilson, and I'm the host of Everyday AI. This thing, it's yours.
It's your daily live stream, podcast, and free daily newsletter helping us all not just learn AI, but how we can leverage it to grow our careers. Because you can try to keep up with AI news and developments and new large language model updates. You can try to keep up, but just hearing about them and reading about them doesn't do anything. You need to leverage it. And that is what Everyday AI is all about.
In our free daily newsletter, we recap each and every day's podcast episode. Sometimes I have guests on, sometimes it's just myself. So we bring you exclusive insights every single day. We're actually the only AI newsletter that does that, and we keep you up with everything else happening in the world of AI, so you can be the smartest person in your company or your department when it comes to generative AI.
All right, let's actually do that and go over a quick recap of what's happening in AI news for April 15th. So Apple is responding to criticism over its AI performance, particularly in areas like notification summaries with a timely pivot toward synthetic data and differential privacy. So yeah, Apple kind of responding according to reports by focusing a little bit more on AI
synthetic data, right? So, according to the report, the company is now generating synthetic data to emulate user information without using real content, enabling private testing on the data of users who opt into device analytics.
So this approach aims to ensure accuracy while safeguarding privacy. So yeah, Apple obviously has had a super, super slow rollout. And by super slow rollout, I mean they're years behind everyone else. And their Apple Intelligence, let's just say it has not been well received. So some new reports and information are showing Apple's kind of new or updated approach of using synthetic data and
tying it to those who sign up for this device analytics. So by polling devices with synthetic data comparisons, Apple is hoping to enhance its Apple Intelligence with better email summaries and other functions, signaling a broader commitment to addressing user concerns and advancing its AI capabilities responsibly.
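If you're curious what "differential privacy" actually means in practice, here's a minimal, hypothetical sketch of the classic Laplace mechanism. To be clear, this is a generic textbook illustration, not Apple's actual implementation; the function names and the "did a synthetic email resemble yours" framing are my own for illustration:

```python
import random

def laplace_noise(sensitivity: float, epsilon: float, rng: random.Random) -> float:
    # Laplace noise with scale = sensitivity / epsilon, built as the
    # difference of two exponential samples. Smaller epsilon = more noise
    # = stronger privacy guarantee.
    scale = sensitivity / epsilon
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    # Report an aggregate count (e.g. how many opted-in devices said a
    # synthetic email resembled their real mail) with noise added, so no
    # single user's answer can be pinned down from the reported total.
    return true_count + laplace_noise(sensitivity=1.0, epsilon=epsilon, rng=rng)
```

The idea is that Apple only ever sees noisy aggregates from opted-in devices, never the underlying content.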
All right, our next piece of AI news. NVIDIA has committed $500 billion to US AI manufacturing amid changing tariff policies here in the US. So NVIDIA announced plans to invest up to $500 billion in AI infrastructure manufacturing within the US over the next four years, marking a significant shift in its supply chain strategy to meet surging demand for AI chips and supercomputers.
So the move coincides with U.S. President Trump's ever-changing tariff policies, which initially imposed steep levies on imports from Taiwan and China, but recently exempted chips and other tech products, easing concerns for companies like NVIDIA and Apple that rely heavily on overseas production. So NVIDIA will partner with
Taiwan Semiconductor in Arizona for chip production and with Foxconn and Wistron in Texas for supercomputer manufacturing, aiming to achieve mass production at these facilities within 12 to 15 months. So by using digital twins of
factories and advanced robotics for automation, NVIDIA hopes and plans to streamline operations and enhance efficiency in its US-based facilities, demonstrating how AI technology can transform the manufacturing process. So yeah, if you're wondering, like, okay, what the heck does this matter? Well, so many big companies and all the AI systems that we use, like
ChatGPT, Google, Microsoft, everyone else, they're struggling to keep up with demand, right? So essentially everyone's looking for more compute. This is a pretty big move from NVIDIA to bring more kind of AI power to the U.S.
And then last but definitely not least, OpenAI has launched a new family of models with the GPT 4.1 series. Probably the big headliners there is it now has a million token context window, but right now, at least it is only available on the API end. So only for developers right now.
So OpenAI has launched its new family of models, GPT-4.1, as a major upgrade to its previous models, offering advancements in context processing, reliability, and cost efficiency. But like I said, you're not going to find it. If you go to chatgpt.com, it's not there. At least right now, OpenAI has not
announced any plans for it to live on the front end inside ChatGPT, and it is only available for developers on the back end. But let's talk a little bit about the model, because I have some pretty, pretty impressive specs here. So GPT-4.1 introduces a 1 million token context window, far surpassing GPT-4o's previous cap on the API end, which was 128K tokens.
So that's big. So, you know, Claude and Gemini and others were really beating OpenAI historically on context window, right? But not anymore. So pretty, pretty big news there.
And then unlike previous models integrated into ChatGPT, like I said, GPT-4.1 is exclusively available through OpenAI's API, making it a tool tailored for developers rather than general use. The performance is pretty impressive across coding, instruction following, and complex reasoning tasks.
What's also important is OpenAI has said some of those improvements have also been rolled out kind of under the hood to its GPT-4o model. I would assume it was the late March update that there weren't a lot of details about.
And there are now three new varieties. So there is GPT-4.1, kind of the full version; GPT-4.1 Mini, which is more affordable and compact; and then GPT-4.1 Nano. Yeah, the first time, you know, OpenAI has gone with a Nano, and that is their smallest, fastest, and cheapest model. Yeah,
as if it's not hard enough to already understand these models, now we have two varieties of small ones. Yeah, if you thought Mini was small? No, now apparently Mini is medium and Nano is small. And then some sad news for some old school... you know, if you like some of these old models, OpenAI is planning to phase out older models like the old
OG GPT-4 by April 30th. And then, also somewhat surprisingly, OpenAI announced they'd be phasing out GPT-4.5 Preview by July 14th to focus on the more efficient 4.1 lineup. Also, this release coincides with a delay in GPT-5's launch, now expected in a few months, as OpenAI navigates some integration challenges. And yeah, so...
FYI, obviously OpenAI has changed course a couple of times. They essentially said, hey, we're going to stop releasing non-reasoning models and GPT-5 is going to be more of a hierarchy or a system. So they said, yeah, we're not going to be releasing a lot of new models before GPT-5. And here we are. So, all right, let's get into it. A lot more on those stories on our website at youreverydayai.com. What's up?
live stream crew. Yeah. If you listen to the podcast, come join us sometime on a live stream. You know, when I have guests, we take questions. Sometimes I ask you all things. So thanks to everyone for joining in George from YouTube, big bogey saying GPT 4.1 is a coding powerhouse. Yeah, it is already early benchmarks trade printing here from YouTube. Thanks for joining us on the LinkedIn machine, Kimberly and Dennis.
Allison, thank you all for tuning in. But let's just get straight into it. Has Anthropic's Claude lost its edge? It's Tuesday, y'all. I'm going to take a sip of my coffee, and let me know: should I crank this up? It's been a while since I really brought it on a Hot Take Tuesday. I'm a little tired, but livestream audience, if you could, leave me an emoji or two. Should it be one fire emoji, should I be kind of nice?
Two fire emojis, should I bring the heat? Or three fire emojis, burn baby burn? I mean, I don't know. One thing, and let me tell you this: I tell you all the truth. I do, period, right? As an example, if you would have asked me 18 to 20 months ago, hey Jordan, what are your thoughts on Google Gemini? I'd say don't use it. Ask me today?
Google Gemini is the king of the hill, right? I do think it is Google and OpenAI now going jab for jab. But I tell you the truth.
So I'm not going to hold back if you all want a little bit, a little bit of fire. Rolando here is saying to crank it up. Fred, all right, Fred, Fred, thank you. Fred's like, all right, Jordan, be nice today. He wants me to be kind of nice. Allison here throwing in some dynamite. That's dangerous. All right. All right, we'll see. We'll see. I don't want to offend anyone because let me say this. Let me say this.
Claude is still one of the most impressive pieces of AI technology ever created. All right. Period. So I don't want to overlook that. All right. But what I've found is I've been using Claude less and less. I would say probably nine months ago, Claude probably accounted for about 25% of my usage.
It's probably down to about 5% now. I'm finding it hard
to find actual use cases for Claude. And I'm talking about on the front end, y'all, all right? So I'm not talking about on the back end. I know that Claude 3.5 has historically been, you know, one of the most used models if you look on, like, OpenRouter. I know that Claude 3.7 is still popular for developers, although it's not the most popular anymore. It's not the most popular anymore, not with Gemini 2.0 Flash and Gemini 2.5 Pro out there. It's really not.
But this has been a long time coming. So back in September, back in September, if you want to go listen to this, what episode was this here? 351. All right. So I told you all back in September, three reasons businesses shouldn't use Anthropic's Claude yet. And this was after like a year, right? This was a year of...
Me being hesitant. So what a lot of people don't know, people are like, okay, Jordan's just some random guy that, you know, jumps on a podcast and talks about AI. Well, yes, on the surface, right? On the other end, I do a lot of things that you all don't see on this show. Consult big companies, companies with tens of thousands of employees. I work with research organizations. They reach out to me, big ones, big name ones. And they're like, hey, Jordan, can you help us better understand generative AI? Right?
So it's much more than, you know, this little podcast. Although thank you all for listening, you know, and making Everyday AI a top 10 tech podcast in the US. But I'm talking with a lot of businesses, a lot of things that you don't hear. And it's not just me. Big enterprise companies have always been hesitant to use Claude at scale. All right.
And it's been a long time coming. I even said three big reasons. This was back in September that I said, Claude was in trouble and enterprises shouldn't be using it yet. Number one, there is no enterprise access. So I'm talking about on the front end, right? So keep that in mind. Everyday AI, it's for largely non-technical people, right? And I'm talking about logging on to, you know, claude.ai or I'm talking about logging on to gemini.google.com, chatgpt.com.
Using this on the front end with your team. One thing I'm a huge advocate for, if you listen to the show, is having your AIOS, your AI operating system. Your team needs one. In addition to whatever your company may be doing on the back end, you need a front end AI operating system where you and your team collaborate to get work done. No internet access.
Claude went the first two years without having internet access. They just rolled out internet access about a month ago. Okay.
Very limited third-party integrations, right? Google, technically, on the front end, doesn't have a lot of third-party integrations either, but that's because they're Google, right? Because anything that could be a third-party integration, they essentially have in-house, right? Google has like a trillion of their own products, right?
extremely limited third-party integrations. It's improved since September, since I had this show, episode 351. And then I said extremely restrictive testing tiers. So both free and paid, I'd say the one biggest thing that's been knocking Claude off from real business adoption is you can't even go and test it. If you have a paid, even a paid plan of Claude,
Right. And you're like, all right, you know, let's go test this. Let's see if this is right for our business. You know, you're paying the $25 a month or whatever. There have been, and I'm not exaggerating, hundreds of cases, because I use large language models, I mean, it varies, I don't know, anywhere from four to 12 hours a day. Recently, it's been a lot of 12-hour days using large language models. Right.
It's so easy on a paid plan to hit your rate limits on Claude. I kid you not, within 10 minutes. It's happened to me hundreds of times where I will hit, on a paid plan, the rate limit within 10 minutes. Yes, I'm generally working in multiple tabs. If I'm in Claude, I'm working with long context windows, yes. But I can't tell you the last time I've hit ChatGPT limits, right? Doesn't happen. Gemini? Doesn't happen.
Claude has been extremely restrictive, and I think that was a major misstep early on. How do you expect...
aside from appealing to your core audience, which we'll talk about because I think they're losing space there, right? Coding, development, software, engineering, et cetera, right? How are you going to appeal to the average business owner, to the average enterprise use case when a company pays and maybe they get a team plan? I think those rates are about double, but still you can't even use the thing when you pay for it. It is extremely restrictive, right?
All right. And another reason why I think that Claude has lost its edge: it's no longer innovating, right? In the early part of 2024, even midway through the year, I still think Claude was an innovator, right? They came out with Artifacts, which, when it came out,
was extremely impressive. So if you don't know Claude Artifacts, it's actually kind of hidden. You have to, like, enable it, and then you have to make a call to it, right? But it's still in there right now.
Let me be honest, because I still said Claude is still one of the most impressive pieces of AI technology. There's still great use cases, right? Even though I'm trying to, you know, y'all wanted the flame emojis. I'm not going to totally poo-poo on Claude. There's still some use cases. I said, maybe now it's 5% to 10% of, you know, my use. But the only thing I use Claude for right now is using 3.7 thinking on artifacts. That's it. Nothing else. Because everything else...
Claude is not a top model anymore. In many cases, it's not even a top five or a top 10 model, which sounds crazy to say because nine months ago, they were that tier one, right? If we go back to our ranking tiers, right? Like S, A, B, C, right? They were S. They've fallen. How the mighty have fallen. But that's all I really use it for. But
Claude and Anthropic were innovators early on. So Artifacts, that's something that can render code from natural language. You can have it build you a business dashboard, games, whatever, right? And you can run it in the browser. And then guess what? ChatGPT and Gemini said, all right, yeah, let's do this as well. So they came in with Canvas. All right. Similarly,
Claude was an innovator with Projects. Anthropic innovated with Projects, right? A good way to, you know, organize your chats. A good way to leave custom instructions and project knowledge, right? ChatGPT followed suit.
Computer use, right? Anthropic was innovating. Although back in October, when that came out, it was extremely clunky, extremely clunky, right? You know, one of the easier ways to run computer use was, you know, you had to download Docker, you had to go to GitHub, you know, work with their repo, which is fine, but for non-technical people, not that good. And the rate limits? I did a live show going over Claude's computer use, and,
Again, very hard to use with the rate limits. So I think Claude is no longer innovating. Now, I think they're clone chasing. Whereas before, others were copying their innovation. Now they're copying other people's innovation. So yeah, like now a lot of the things that you see and that are going to be rolling out, like as an example, according to some online sleuths right now, Claude is testing voice mode and all these things, right?
They're just now seemingly cloning features that were popular six months ago, a year ago. And one of the reasons I think is Anthropic dropped the ball, right? Back in September, when I gave those three reasons, those three different scenarios on why I thought enterprise companies and why I told countless enterprise companies don't use Claude for those three reasons, they didn't address those.
Those were not secrets. It was no secret that you couldn't, it was so hard to literally use Claude on the front end. They knew that, right? Their team is interacting with people online, on Twitter. Everyone's been complaining about rate limits and Claude's team has been saying, oh, we're working on it for years. It's too late. It's too late. One of the reasons why, right? I don't know this, but
We've heard stories that, as an example, OpenAI is losing money, right? Even CEO Sam Altman said on their new Pro $200-a-month subscription that they were losing money, even though it's been extremely popular. So I don't know. This is my hunch, but my hunch has been that Anthropic has maybe been more profitable with Claude, at least by percentage, than OpenAI, its main and closest competitor, has been with ChatGPT. But at what cost?
Because I don't think they're growing their user base. I don't. Think about it. I think sometimes, if you are an avid listener of this show, if you're an AI nerd like myself, we all live in an echo chamber as well. Outside of our little echo chamber, no one knows about Claude. But they could have. They could have a year ago, if Anthropic would have listened to its customer base a little more closely
and continue to innovate and improve the product, improve usability. I don't think we'd be having the same conversation today. Always have receipts, y'all. Always have receipts. All right, so on my screen, this is January 2025 web traffic. All right.
Are you still running in circles trying to figure out how to actually grow your business with AI? Maybe your company has been tinkering with large language models for a year or more, but can't really get traction to find ROI on Gen AI. Hey, this is Jordan Wilson, host of this very podcast.
Companies like Adobe, Microsoft, and NVIDIA have partnered with us because they trust our expertise in educating the masses around generative AI to get ahead. And some of the most innovative companies in the country hire us to help with their AI strategy and to train hundreds of their employees on how to use Gen AI. So whether you're looking for chat GPT training for thousands,
or just need help building your front-end AI strategy, you can partner with us too, just like some of the biggest companies in the world do. Go to youreverydayai.com slash partner to get in contact with our team, or you can just click on the partner section of our website. We'll help you stop running in those AI circles and help get your team ahead and build a straight path to ROI on Gen AI. No one uses Claude. Comparatively, no one uses Claude. They don't.
I know I'm going to catch some flak for that and be like, Jordan, you're a ChatGPT fanboy, or you're jumping on the Gemini bandwagon. No, I'm not. I've been using Claude since the day it came out. I've enjoyed certain features. I use them all. I've used dozens of LLMs, and like I said, for hours every single day. Claude's not good anymore. It's not.
I got more stats. I got more receipts. Don't worry, y'all. You said you wanted some flame emojis, right? So let's look at total visits in January 2025. Web visits. ChatGPT.com, 3.9 billion with a B. Yeah, with a B. Claude, 76 million. Yeah.
Gemini, 267 million. DeepSeek, 277 million. So essentially, Gemini and DeepSeek are right there with each other in terms of people visiting the front end. Perplexity, 99 million. Y'all, ChatGPT, let me do some quick napkin math here. ChatGPT has more than 10 times the users of Claude, Gemini, DeepSeek, and Perplexity combined.
Is my math right there? All right, almost. Sorry, my napkin math was a little wrong there. All right, so those four add up to about 700 million. So it's about five times. So ChatGPT has more than five times the traffic of Claude, Gemini, DeepSeek, and Perplexity combined.
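If you want to sanity-check that napkin math yourself, here's a quick sketch using the figures as quoted on the show (January 2025 monthly web visits; the dictionary keys are just my own labels):

```python
# Monthly web visits, January 2025, in millions, as quoted on the show.
visits = {
    "chatgpt": 3900,   # 3.9 billion, with a B
    "claude": 76,
    "gemini": 267,
    "deepseek": 277,
    "perplexity": 99,
}

# Everyone except ChatGPT, combined.
rivals_combined = sum(v for site, v in visits.items() if site != "chatgpt")
ratio = visits["chatgpt"] / rivals_combined

print(f"Rivals combined: {rivals_combined} million")  # 719 million
print(f"ChatGPT multiple: {ratio:.1f}x")              # 5.4x
```

So the on-air correction holds up: it's roughly five times, not ten.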
And Claude is, at least according to kind of online demographic or online website information, which is pretty accurate, right? I've been using these different SEO tools for 10-plus years. They're very accurate. No one's using Claude. Hot take, ready? It's been less than two months since Claude released its latest model, Claude 3.7 Sonnet, and it already feels antiquated.
So they announced Claude 3.7 on February 24th. And let me just call this one out right there. They made a big deal of Claude being, you know, the world's first hybrid model, right? So, you know, when you think of old-school transformers, and then you think of these, you know, quote-unquote new-school models that think and reason under the hood. I don't know. To me, that seemed like a marketing gimmick from Anthropic, right? Why? Well,
You have to actually, if you want to use that extra thinking, right? You have to go in and you have to click the button. So is it actually a hybrid model?
I don't know. I'd say not. So now I think you also have Anthropic falling into this trap that Google fell into in late 2023, where they're getting caught up in the marketing and not listening to their users when shipping new, capable, powerful models. But Claude 3.7 Sonnet? Feels antiquated.
Because since that time, we've already had multiple updates from OpenAI. We've had multiple updates from Google. We've even had multiple updates from models that I'd say never use, like DeepSeek, right? If you care about your privacy, don't use it. Unless you're downloading it and fine-tuning it locally, right? But don't use DeepSeek on the web or their API if you care about your data. If you are a business, don't do it, especially in the US, right? But anyways, how...
Are we at the point now where a model that is not even two months old feels antiquated?
That's where we're at. And I don't know if Anthropic can keep up. Like I said, they're very innovative to begin with. They're great researchers. Obviously, I think they are a world leader in terms of AI safety, in terms of ethics, right? All of those things. But in terms of like, okay, are they just going to be more of a research arm that kind of drops AI models? Or are they trying to actually...
Are they actually trying to be relevant? Are they actually trying to be one of the top large language model makers in the world? I don't know. I was personally very underwhelmed with Claude 3.7 Sonnet, their newest model. Even the thinking variation when you have to toggle it on. I know a lot of people, I was reading online, a lot of people are using...
are using it inside, like George here on YouTube says, you know, he says Claude seems lazy in Windsurf and Cursor, but it is not when you use it in an app. Yeah. So I know a lot of people, yes, Claude, up until, you know, a week ago, when Google said, oh wait, Claude, you are no longer relevant because we're dropping Gemini 2.5 Pro, which wipes
away the competitive advantages that Claude 3.5 and Claude 3.7 had, right? Google just said, yeah, we're going to knock you off this pedestal. You're not going to compete. Google straight up wiped them out, which is interesting, right? Because Google is, you know, an investor in Anthropic, but they're still technically competitors in some regards as well. Let me tell you what I mean. And here's my hot take.
I think there's a lot of Twitter talk and hipster hype when it comes to Claude 3.7 or Claude 3.5. But I care about business utility. Anthropic's lost its edge there. I care about benchmarks. I care about real human usage. Claude's not competing there anymore. And like I said, I think one of the biggest things that's happened, and why I would not want to be working at Anthropic right now: Gemini 2.5 Pro and 2.5 Flash.
I don't know if I'm being honest, unless Anthropic has been sitting on a world-changing model. I don't know how Anthropic is going to compete against Gemini 2.5 Pro and Gemini 2.5 Flash.
Good luck. I know, you know, a lot of people have said, oh, well, there's still, you know, Claude 3.7 Opus coming, right? You know, Anthropic has kind of these three tiers of models. They have their small, Haiku; their medium, Sonnet; and their big one, Opus. And they haven't updated Opus in a very long time. So everyone's like, oh, you know, Claude 3.7 Opus, or, you know, Claude 4.0. Well, I don't know.
I don't know, because there's also rumors, even though Gemini 2.5 Pro just went generally available like 10 days ago, there's already rumors that Google has a much better and more capable model that they're already testing on the LM Arena chatbot arena. I don't know how Anthropic is going to compete against Google. All right. Got receipts as always, y'all. I have receipts. Yes, SimilarWeb, Dennis, thank you for asking. That's where that data was from.
All right. Let me know, y'all. Livestream audience, am I wrong on this? But let's get quickly to the receipts. I'm not going to make you wait an hour for this one. I'm going to go through quickly, because the proof is in the pudding, y'all.
The writing's on the wall. All right, so let's look at Artificial Analysis, a great third-party, unbiased website, right, that does benchmarks. Because one of the things is, when companies put out their benchmarks, they cherry-pick. There are dozens of different benchmarks. So of course, you know, when these AI labs put out their models, they choose, okay, out of these 50 benchmarks, here's the eight that we're going to put on our website, because we look great on those, right?
That's why I always look at ELO scores, and we're going to talk about that in a minute here, from LM Arena, and look at third-party benchmarks as well. So, intelligence. This is from the Artificial Analysis Intelligence Index. Gemini 2.5 Pro is in the lead. Second, o3-mini-high from OpenAI. Then you have the two variations of DeepSeek, and then you have the new version, GPT-4.1, y'all.
I had to count. Claude 3.7 is number eight in terms of intelligence on this third-party benchmark. Let's keep going, because you're like, okay, what about humans? Humans probably prefer it. Okay, so ELO scores. Let's talk about that. That's head-to-head. You put in a prompt on LM Arena, on the chatbot arena. You get two outputs. You don't know whose they are. You say, this one's better. All right, there have been millions of votes. Guess what? Total ELO score: Claude is not a top 10 model.
That's when, like, I know it sounds crazy to say, but you have to ask the question, even if you ask it rhetorically: is Claude no longer a state-of-the-art model? I don't know, y'all. In so many benchmarks, in so many ELO categories now, in overall ELO, they're not a top 10 model. Gemma 3, which is a small language model from Google, has a higher ELO score
than Claude 3.7. Let me say that again. A small language model, not a large language model. Humans prefer its outputs, across millions of votes, compared to Claude 3.7. Google has, let's count it, one, two, three, four, five models, five different models, that humans prefer over Claude 3.7 Sonnet.
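For anyone curious how those millions of head-to-head votes turn into a single number, here's a minimal sketch of a standard Elo update. To be clear, this is the generic chess-style formula for illustration; LM Arena's actual leaderboard uses a related but more involved statistical fit, so treat this as a conceptual sketch only:

```python
def elo_update(rating_a: float, rating_b: float, a_wins: bool,
               k: float = 32.0) -> tuple[float, float]:
    """Apply one Elo update after a single blind head-to-head vote."""
    # Expected score for A given the current rating gap:
    # a 400-point lead means roughly 10-to-1 expected odds.
    expected_a = 1 / (1 + 10 ** ((rating_b - rating_a) / 400))
    score_a = 1.0 if a_wins else 0.0
    # Winner gains, loser loses, more when the result is an upset.
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

# Two models start equal; model A wins the blind vote.
a, b = elo_update(1000, 1000, a_wins=True)
print(a, b)  # 1016.0 984.0
```

The key property: beating a much higher-rated model moves your score a lot, while beating a much lower-rated one barely moves it, which is why these rankings stabilize over millions of votes.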
I don't know. So, I don't know, is my hot take a very hot take, when I said, hey, Anthropic has lost its place atop, right? Gemini 2.5 Pro, higher. Let's see, we have Gemini 2.0 Flash Thinking, higher. Gemini 2.0 Pro Experimental, higher. Gemini 2.0 Flash, higher. And then there's the small language model, Gemma 3. My gosh. All right, but you might be saying, all right, Jordan, well, people use Claude for certain reasons, right?
They use it for creative writing. Claude's great at that. They use it for coding and software development. Claude's great at that. That's an old narrative. Literally, that's an old narrative, right? Especially the creative writing thing. I think essentially, right, you know, a bunch of stuff went viral online like maybe a year and a half ago about how bad ChatGPT and Gemini were at writing content and Claude was just so much better. All right, well, let's look at those two things. Let's first look at
Creative writing. Okay. Oh, where's Anthropic? Oh, the bottom of the list. Again, not top 10 ELO in creative writing. That's what I'm saying. I think right now it's a lot of Twitter talk and hipster hype, right? Oh, it's cool to like Claude, right?
It's like, oh, you know, I see you wearing that name-brand ChatGPT. Oh, I see you with that mainstream Google Gemini. I'm over here prompting with that Claude, man. No. Why? Why? Not a top 10 model when it comes to creative writing, which everyone thought it was amazing at. It was, a year and a half ago. Don't lie to me, y'all. Millions of people have voted on this. Blindly.
Guess what else? Not top five in coding either. It's not. Claude 3.7 Sonnet with Thinking is not a top five model for coding. Guess what is? Guess what's at the top? OpenAI's o1,
their o1-preview, their o1-mini, Gemini 2.0 Flash. And I do believe, once Gemini 2.5 Pro is on here and gets enough votes, it'll be up there as well. But not a top five model in terms of coding. So what do you want? What do you want? I don't understand why people are still using Anthropic. Like I said, maybe you have one or two use cases that you're happy with it for, right? If I'm being honest, the only thing I use it for, like I said:
Claude used to be maybe 20% of my usage. I'm a heavy large language model user. Like I said, maybe it's 5% now. I'm only using it because there are certain things in Artifacts that Claude does better than Google's Canvas and OpenAI's Canvas. But it's always like I'm doing it all at the same time anyways. I'm running the same thing in all three of them. And sometimes I'm like, okay, yeah, Anthropic's is a little bit better here.
All right, so maybe you're like, oh, it's fast, it's affordable. It's not fast. It's not affordable. It's not. You know, when you look at speed, and this is from Artificial Analysis, Gemini 2.0 Flash and Gemini 2.5 Pro are the fastest models, followed by GPT-4o and o3-mini from OpenAI. Again, Claude is not in the top five when it comes to speed, all right, which is output tokens per second. So it's not fast.
And that is the non-thinking model, by the way. It's terrible on price. It's terrible on price, which is why I still don't understand why people are so DeepSeek drunk.
Like, DeepSeek is not cheap anymore, right? It's not. When it first came out, it was like, oh yeah, this is cheaper. Okay, well, Gemini 2.0 Flash is wiping the floor with everyone when it comes to price. Meta's new Llama 4 Scout, GPT-4o mini, right? There are just so many faster, better, cheaper models than Claude. So I don't...
It's definitely lost its edge, right? I think there's one more thing I wanted to pull up here. Okay, it's coming up in a slide here because this is also telling. So looking at the intelligence versus price. So it's not like you're getting a good bargain either if you're using Claude on the back end.
Right? You're not. You're not. So on the front end, humans aren't preferring it, and on the back end, you're not necessarily getting what you pay for. Again, this is intelligence versus price, so there's a little quadrant here. You want to be in the upper left, because that means a model is cheaper and smarter. Claude is on the right side, and Claude 3.7 Sonnet is actually in the bottom right. All right.
Not necessarily fast or affordable. And here we go. Everyone's like, oh, it's the best coding model. Guess what? It's not, per Artificial Analysis's Coding Index. This one is very interesting. Claude 3.7 Sonnet, the thinking model. Ready? The thinking model is in fifth place. Guess what's ahead of it?
The new model that was just released from OpenAI, GPT-4.1. But guess what, y'all? This is the mini version, the mini version of OpenAI's new model. And it's a non-thinking model, right? Normally, these thinking models, these reasoners, code much better, especially when you're working with very complex tasks and long context windows. So not only is GPT-4.1 not a thinking model while performing better on Artificial Analysis's Coding Index, it's the mini version. It is the mini version. So I don't know, y'all. If you're still using Claude 3.7 Sonnet, let me know why. Let me know why. I'm very curious. Like I said, I know a lot of people on the software engineering side, on the software
development side, they love it, right? Using it with Cursor, using it with Windsurf, using it inside all these different IDEs. I also don't understand why on that front. Now, with Gemini 2.5 Pro, with Gemini 2.0 Flash, and now these new models OpenAI just announced, I honestly don't understand how Anthropic, how Claude, has gone from that top-tier, state-of-the-art, world-leading model to kind of irrelevant.
So a lot of people are like, oh, well, you know, Claude just released a new plan, Jordan. You're really harping on them for these rate limits. You can just pay more and use it way more. Okay, well, why? If it's not a top-10 model? Yeah, Claude just came out with Claude Max, right? So you get higher limits if you're paying $100 a month or $200 a month. But let me just call this out, because people are like, okay, Jordan, this solves it. Well,
You don't get anything more powerful for that $100 or $200 a month. You don't get more features, right? When OpenAI, as an example, announced their $200 Pro plan, at the time that was the only way you could access Sora, and it's still the only way you can access o1 Pro. And you get unlimited everything. Unlimited. Claude Max? Not unlimited. You can still go on the front end and pay $100 or $200 a month, but you don't get new features, and you don't get new models exclusive to that Max plan. You just get slightly better limits. But here's a concerning one, y'all. This one's kind of concerning. Ready? This is from Anthropic's website talking about their new plan. Ready? Talking about the message limit on the new Max plan.
Your message limit will reset every five hours. We call these five-hour segments a session, and they start with your first message to Claude. Please note that if you exceed 50 sessions per month, we may limit your access to Claude.
Each session includes any messages sent within five hours from the first initiated chat. So we expect it to be fairly generous for our users. My gosh, I don't know. How tone deaf is this, y'all?
Come on. So let's just say in theory, let's say you're a very regimented person. All right. Like I am. So this is why I can't even use Claude on the current paid plan. But even if I pay $100 or $200 a month. So let's say I use Claude in the morning before my show to help plan it. All right. So let's say 6 a.m. And then I use it at noon, midday. All right. And then in the evening, you know, I use it again. So let's just say I just do this.
a couple of prompts a day. I do it at 6 a.m., I do it at noon, and I do it at 6 p.m. Six, noon, six, right? A couple of prompts a day, paying $100 or $200 a month. In that scenario, even if I'm only doing a couple of prompts, I might get cut off from my pricey $100- or $200-a-month plan. That's what they're saying: 50 sessions a month. So if I use Claude three times a day, spaced more than five hours apart, I could, in theory, hit the cap in about two and a half weeks and get shut off, and I might not be able to use their paid plan for nearly the last two weeks of the month. In theory, that's what it's saying here. How tone-deaf is that?
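Anthropic's session wording above is easy to turn into a quick back-of-the-envelope check. Here's a minimal sketch in Python, assuming a session is any five-hour window that starts with your first message and that the cap is 50 sessions per month; the `count_sessions` helper and the 6 a.m./noon/6 p.m. schedule are just illustrations of that scenario, not anything from Anthropic's actual code.

```python
from datetime import datetime, timedelta

def count_sessions(message_times, window_hours=5):
    """Count 'sessions' as described: a session starts with your first
    message and covers the next five hours; any message sent after that
    window closes starts a new session."""
    sessions = 0
    window_end = None
    for t in sorted(message_times):
        if window_end is None or t >= window_end:
            sessions += 1
            window_end = t + timedelta(hours=window_hours)
    return sessions

# The 6 a.m. / noon / 6 p.m. pattern from the episode, over a 30-day month.
times = [
    datetime(2025, 4, day, hour)
    for day in range(1, 31)
    for hour in (6, 12, 18)
]
print(count_sessions(times))  # 90 sessions -- well past a 50-session cap
```

At three well-spaced prompts a day, the 50th session lands around day 17 of a 30-day month, which is where the lockout would kick in under these assumptions.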
I don't understand. If I'm being honest, when I saw that, I was like, come on, Anthropic. How many billions of dollars have you gotten from Amazon? I lost track, $8 billion or something. This is why people aren't using your service. Humans don't prefer it. Benchmarks don't favor it. And for those people who are actually still finding utility in it, your power users, you're slapping them in the face. Get real. All right, hot take. Let's end it here.
Can Claude recover? I honestly don't think so. I don't think so. Again, this is just me reading the reports. You can't knock Anthropic for putting safety first. You can't. They put out world-leading research, and I do think when it comes to safe AI, they are a leader. But no one's paying you for your research. You're not competing to be the best frontier AI lab by having the best research or the best safety. This is a race. This is the Wild West, right? That's what it is. There are no rules when it comes to AI. Anthropic is playing, I'd say, the wrong game. They've alienated their power users. They've stopped innovating. And I think that has caused them
to now face an almost insurmountable challenge, right? Let's just say, as an example, Claude had their 4.0 model ready, and maybe they've had it ready for a while. When you see these new drops from OpenAI, their 4.1 models, the smaller versions, amazing when it comes to price per performance, and the same thing with Google Gemini 2.5, I don't think, if I'm being honest, that it's a three-team race anymore, where nine to 15 months ago I'd have said, yep, it's going to be a three-team race. Yes, you have to pay attention to open source. You have to pay attention to Chinese models. But most enterprise companies here in the U.S. aren't going to touch many open-source models for different reasons. And they're not going to touch Chinese models for obvious reasons: data security, data privacy, and not sending all your business IP straight to China, from a U.S. perspective. Anthropic was primed.
To compete in this three-team race, they were primed to be a leader. But now they're a second-tier company. They are. That might be harsh, but you wanted my honest take. And that's not just me. Is that my personal usage? Sure. Is that my personal experience? Yes. But I showed you the receipts. Users aren't using it, number one. It's not competing on benchmarks, number two. Humans don't prefer it, number three. So can Claude recover? I don't know.
I'd probably say no. All right, y'all. I hope this was helpful. You wanted some hot takes? I tried to bring it, tried to bring it a little bit. So, you know, we talked a little about whether Anthropic's Claude has lost its edge, what happened, and whether Google and OpenAI are too far ahead. Simple answer: yes, Anthropic's lost its edge, and yes, OpenAI and Google, at least today, are way too far ahead for Anthropic to catch.
I could be wrong, but the only way you're going to find out is by continuing to tune in. Maybe I'll be eating a big helping of, you know, humble pie, you know, in 2026, but we will see and find out. All right. Thank you for tuning in, y'all. If you haven't already, please go to youreverydayai.com. If this was helpful, please share this with your network, tag a friend, someone that needs to hear this. If you're listening on the podcast, appreciate your support as always. Reach out to me.
I always leave my email and my LinkedIn there in the show notes, so please reach out if you have thoughts on this. Let me know in the live stream comments as well. Then go to youreverydayai.com and sign up for the free daily newsletter. Thanks for tuning in. We'll see you back tomorrow and every day for more Everyday AI. Thanks, y'all.
And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.