People
Host
Host and founder of the well-known true crime podcast Crime Junkie.
Musk's attorney
Topics
Host: Anthropic has launched a paid premium subscription, Claude Max, to address user complaints about usage limits and bring in more revenue. The US government has eased its restrictions on NVIDIA's chip exports to China, possibly because of a meeting between NVIDIA's CEO and the US President and the commitments made there. Venture capital firm Andreessen Horowitz plans to raise a $20 billion AI megafund, potentially tapping international investors' demand to invest in American AI companies. OpenAI has countersued Musk, asking the court to block further action by Musk and to hold him accountable for the damage he has caused. Scott White: Anthropic launched the paid premium subscription to increase revenue and meet the needs of power users. Chris Miller: Even though the downgraded NVIDIA chips have reduced performance, they still outperform China's domestic chips, and China remains heavily dependent on NVIDIA chip imports. Musk's attorney: OpenAI's board did not seriously consider Musk's takeover offer.

Deep Dive

Chapters
Anthropic launched Claude Max, a premium subscription tier offering increased rate limits and priority access. This addresses user demand for higher usage capacity and follows a similar offering from OpenAI.
  • Anthropic introduced Claude Max, a premium subscription with higher rate limits.
  • Two tiers are available: $100 (5x rate limit) and $200 (20x rate limit).
  • The company is considering even higher-priced tiers based on user feedback.

Shownotes Transcript


Today on the AI Daily Brief, Anthropic introduced their power user tier. Before that in the headlines, Google Next is all about agents. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes. Welcome back to the AI Daily Brief headlines edition, all the daily AI news you need in around five minutes.

We kick off today with Anthropic getting themselves a power user tier. Now, you might remember when OpenAI introduced their $200 a month tier back in December, many believed that the demand just wouldn't be there. However, while we don't know exactly how popular that subscription is, it's generated enough usage that Sam Altman said the company was actually losing money on the deal.

Anthropic is calling their version of the premium subscription Claude Max. The product allows users to pay to get around Anthropic's notoriously troublesome rate limits, gives them priority responses during heavier traffic periods, and throws in early access to new features as a sweetener. If you've spent any time on AI Twitter, you'll know that people have been constantly asking Anthropic for the ability to pay more for more service.

There are actually two different levels of Claude Max. For $100, users get five times the rate limits of the $20 a month Pro tier. That math works out. But those users who really burn through tokens will want the $200 level, which allows for 20 times the rate limit.
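To make the "that math works out" point concrete, here's a quick back-of-the-envelope check in Python, using only the figures quoted in the episode:

```python
# Back-of-the-envelope check of the tier pricing quoted in the episode.
tiers = {
    "Pro": (20, 1),        # ($/month, rate-limit multiple of the Pro baseline)
    "Max 5x": (100, 5),
    "Max 20x": (200, 20),
}

for name, (price, multiple) in tiers.items():
    per_unit = price / multiple  # dollars per 1x of rate limit
    print(f"{name}: ${price}/mo at {multiple}x -> ${per_unit:.0f} per 1x")

# Pro:     $20/mo at 1x   -> $20 per 1x
# Max 5x:  $100/mo at 5x  -> $20 per 1x  (linear: the math "works out")
# Max 20x: $200/mo at 20x -> $10 per 1x  (an effective volume discount)
```

In other words, the $100 tier is priced linearly against Pro, while the $200 tier effectively halves the cost per unit of rate limit for the heaviest users.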

What's not on offer, unfortunately for some, is unlimited usage. So is this about boosting revenue or simply recognizing that power users are underserved? As they say in the business, why not both? Anthropic's product lead Scott White said that the company isn't ruling out adding even pricier subscriptions, saying we could see a $500 a month level. Ultimately, he said that the product roadmap is guided by user feedback. And the one loud, consistent piece of user feedback we've seen over the past year is that power users want to pay for more usage.

Next up, a little more on the tariff fallout. NVIDIA has secured a carve-out for their China-bound chips. Industry insiders had widely expected the administration to clamp down on exports of H20 chips, which are the downgraded GPUs designed to get around export controls. NPR is reporting that the additional restrictions won't go ahead after Jensen Huang attended a million-dollar-a-head dinner at Mar-a-Lago last week. Sources said that restrictions had been in the works for months and were ready to be implemented as soon as this week.

They said that the president changed his mind after Huang promised new data center investment in the U.S. Chris Miller, a Tufts University history professor and semiconductor expert, commented, Even though these chips are specifically modified to reduce their performance, thus making them legal to sell to China, they are better than many, perhaps most, of China's homegrown chips. China still can't produce the volume of chips it needs domestically, so it is critically reliant on imports of NVIDIA chips.

Then again, that view is a little up in the air after recent reports of efficient training runs on new Huawei chips. However, even if the Chinese AI industry is trying to wean themselves off of Nvidia, the market is still critical for the dominant chipmaker. 13% of Nvidia's official demand comes from China, and that figure could be much higher if you account for evasion of export controls through Southeast Asia.

All in all, it adds up to us continuing to not have a coherent picture of the administration's strategy when it comes to chip controls. Following inauguration, Trump pledged to wind back many of the restrictions on the AI industry. However, the enhanced export controls introduced in the final weeks of the Biden presidency are still in place. These regulations put limits on a huge portion of the world, including friendly countries like Israel and India. Then again, this does also seem like reinforcement of the idea that there's always a deal to be made when it comes to Trump.

Now, over the last couple of days, you may have heard me wax poetic about what I think the implications of some of this tariff stuff are likely to be for venture capital. Nominally, I think that it's going to be a harder fundraising environment, not only for startups, but also for VCs themselves. But countering that point appears to be Andreessen Horowitz, who are reportedly looking to raise a $20 billion AI megafund.

Sources said the firm is looking to capitalize on high international demand for investments in American companies. They added that international LPs view the fund as a way to more easily invest money in the US AI sector without the restrictions. So it sounds like this actually might be playing into and taking advantage of some of the tariffs.

Last year, A16Z raised $7.2 billion scattered across the themes of American dynamism, apps, games, infrastructure, and growth. This fund, then, is both significantly larger and more focused than previous efforts. The gigantic size brings up questions about whether venture capital can scale up in this rarefied air. SoftBank is perhaps the obvious comparison. They raised their $100 billion Vision Fund in 2017 with very mixed results. The second Vision Fund, raised in 2019, was a relatively more modest but still massive $56 billion.

The other comp that springs to mind is Sequoia, who currently manage over $56 billion in assets overall. Still, there hasn't been a venture strategy as capital-intensive as AI in the past. In fact, part of the reason that companies like OpenAI had to turn to big tech partners like Microsoft is that there simply wasn't enough dry powder in the venture capital coffers for them to get what they needed.

Reuters sources said that a significant portion of the fund would be set aside for follow-on investments in companies already in A16Z's portfolio. And with portcos like Mistral, Safe Superintelligence, and Databricks, there is a lot of money to be spent. There is certainly a lot of capital need.

Lastly today, a little bit of institutional psychodrama. OpenAI has countersued Elon Musk asking the court to bring an end to the billionaire's legal challenge. The court filing called for Musk to be prohibited from taking, quote, further unlawful and unfair action and held responsible for the damage he has already caused. It stated, OpenAI is resilient, but Musk's actions have taken a toll. Should his campaign persist, greater harm is threatened to OpenAI's ability to govern in service of its mission, to the relationships that are essential to furthering that mission, and to the public interest.

Musk's continued attacks on OpenAI, culminating most recently in a fake takeover bid designed to disrupt OpenAI's future, must cease. Musk's attorney immediately fired back. In a press statement, he said, Had OpenAI's board genuinely considered Musk's bid as they were obligated to do, they would have seen how serious it was. It's telling that having to pay fair market value for OpenAI's assets allegedly interferes with their business plans.

The case is currently at a slow point as the parties await a jury trial next spring. Musk's attempt to seek an injunction to stop OpenAI from converting to a for-profit was rejected in March. So technically there isn't anything stopping California Attorney General Rob Bonta from making a decision on the conversion. However, complaints continue to roll in, and the active litigation gives him a good excuse to delay.

Meanwhile, of course, OpenAI has a huge financial incentive to get this wrapped up quickly. The company's latest fundraising round featured $10 billion from SoftBank that is contingent on the conversion being completed by the end of the year.

That, friends, is going to do it for today's AI Daily Brief Headlines Edition. Next up, the main episode. Hey, listeners, want to supercharge your business with AI? In our fast-paced world, having a solid AI plan can make all the difference. Enabling organizations to create new value, grow, and stay ahead of the competition is what it's all about. KPMG is here to help you create an AI strategy that really works. Don't wait. Now's the time to get ahead. Just check out real stories from KPMG of how AI is driving success with its clients at kpmg.us slash AI. Again, that's www.kpmg.us slash AI.

Today's episode is brought to you by Vanta.

Vanta is a trust management platform that helps businesses automate security and compliance, enabling them to demonstrate strong security practices and scale. In today's business landscape, businesses can't just claim security, they have to prove it. Achieving compliance with a framework like SOC 2, ISO 27001, HIPAA, GDPR, etc., is how businesses can demonstrate strong security practices. And we see how much this matters every time we connect enterprises with agent services providers at Superintelligent. Many of these compliance frameworks are simply not negotiable for enterprises.

The problem is that navigating security and compliance is time-consuming and complicated. It can take months of work and use up valuable time and resources. Vanta makes it easier and faster by automating compliance across 35-plus frameworks. It gets you audit-ready in weeks instead of months and saves you up to 85% of associated costs. In fact, a recent IDC white paper found that Vanta customers achieve $535,000 per year in benefits, and the platform pays for itself in just three months.

The proof is in the numbers. More than 10,000 global companies trust Vanta, including Atlassian, Quora, and more. For a limited time, listeners get $1,000 off at vanta.com slash nlw. That's v-a-n-t-a dot com slash nlw for $1,000 off.

Today's episode is brought to you by Superintelligent and, more specifically, Super's Agent Readiness Audits. If you've been listening for a while, you have probably heard me talk about this, but basically the idea of the Agent Readiness Audit is that this is a system that we've created to help you benchmark and map opportunities in your business and in your organization where agents could specifically help you solve your problems and create new opportunities in a way that, again, is completely customized to you. When you do one of these audits, what you're going to do is a voice-based agent interview where we work with some number of your leadership and employees to map what's going on inside the organization and to figure out where you are in your agent journey.

That's going to produce an agent readiness score that comes with a deep set of explanations, strengths, weaknesses, key findings, and of course, a set of very specific recommendations that we then have the ability to help you go find the right partners to actually fulfill. So if you are looking for a way to jumpstart your agent strategy, send us an email at agent at bsuper.ai, and let's get you plugged into the agentic era. Welcome back to the AI Daily Brief.

In a massive shock to literally no one paying any attention to this space, Google Next is all about agents. Yes, friends, this week we are headed into the next round of big tech conference season. And unsurprisingly, Google's Cloud Next conference featured a huge lineup of AI announcements designed to, all in sum, I think, make the technology feel and be more useful.

The company's annual cloud conference was held in Las Vegas this week and was absolutely squarely focused on taking the next step in the AI race. We've got agentic infrastructure, new models, a new AI chip, and much, much more. A pair of announcements about agentic infrastructure could end up having the most impact of all.

Back a couple of weeks ago on March 30th, in the wake of OpenAI announcing that they were going to support MCP or the Model Context Protocol, Google CEO Sundar Pichai tweeted, To MCP or not to MCP? That's the question. Let me know in the comments.

1.8 million views and 1,000 comments later, we got the answer and it was a yes. DeepMind CEO Demis Hassabis tweeted, MCP is a good protocol and it's rapidly becoming an open standard for the AI agentic era. We're excited to announce that we'll be supporting it for our Gemini models and SDK. Look forward to developing it further with the MCP team and others in the industry. And to reiterate, this means that all three of the leading US labs are now supporting MCP.

Now, for those of you who don't know what the heck I'm talking about, we covered MCP in depth a couple weeks back. But in short, if you're just trying to understand the implications, it means a significant boost for interoperability and compatibility across the rapidly developing agentic infrastructure layer. The more people supporting and building on MCP, the more agent building becomes plug and play, and the faster everyone builds on the advances of everyone else. The second big agentic announcement was the unveiling of Google's agent development kit and a new interoperability standard called Agent-to-Agent.
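Before getting to Agent-to-Agent, it's worth making that MCP "plug and play" point concrete. Here is a minimal sketch of an MCP tool server using the official Python SDK's FastMCP helper; the server name and weather tool are hypothetical, and the API details are an assumption based on the SDK as of early 2025, not anything stated in the episode:

```python
# Minimal MCP tool server sketch using the official Python SDK
# ("pip install mcp"). Names and details here are illustrative
# assumptions, not from the episode.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather-demo")  # hypothetical server name

@mcp.tool()
def get_forecast(city: str) -> str:
    """Return a (stubbed) forecast for a city."""
    return f"Forecast for {city}: sunny, 22C"  # placeholder data, not a real API

if __name__ == "__main__":
    # Serves the tool over stdio; any MCP-aware client (Claude, and now
    # Gemini per the announcement above) can discover and call it.
    mcp.run()
```

The point of the standard is exactly this: the same small server works with any MCP client, which is what makes agent tooling plug and play.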

As the name suggests, the A2A standard seeks to harmonize the way agents communicate with each other rather than how they interact with tools. Rao Surapaneni, VP of Google's Cloud Business Application Platform, insisted that A2A is not competing with MCP, which of course is more about tool use.

Surapaneni said that Google isn't necessarily looking to compete with other consortiums working on their own solutions, saying, "We will look at how to align with all of the protocols. There will always be some protocol with a good idea, and we want to figure out how to bring all those good ideas in."

Now, the potential benefits of this sort of standardization are numerous. Standard ways for agents to coordinate could reduce the amount of complexity, for example, in multi-agent systems. It could mean that agents talk to their counterparts at other companies, making them more capable of getting work done without getting humans involved. So what does it all amount to? Well, MIT PhD Tobin South says...

My take on A2A from Google and friends is that they're trying to create a communications hierarchy with MCP as tool use and A2A as coordination and communication. Frankly, I prefer the client-server model of MCP, and I think we'll see A2A wrapped by an MCP server supporting the A2A schema. HubSpot founder and now agent.ai creator Dharmesh Shah writes, Shockingly, this doesn't change everything. It's very, very early, but here are my initial thoughts. I'm a big believer in multi-agent networks and agent-to-agent communication. It's good that there's now an open standard out there for it.

They cover some very key needs. Capability discovery, agents being able to send messages to each other, being able to work on tasks that are long-lived or async, weaving human UX into the agentic flow, etc. This is not a replacement for MCP. In fact, in their announcement post, Google included a helpful diagram that illustrates how A2A and MCP fit together.

Still, this feels a bit heavy to me. It's trying to do a lot. In a way, that's good because you get a bunch of capabilities out of the box like async tasks and user experience negotiation, but the trade-off is that heavier protocols are harder to implement, and as such, you don't get the quick adoption you see with lighter weight things, so I don't anticipate MCP-style adoption.

Reading between the lines, this feels like they're solving for a lot of mega-enterprises and big consulting firms looking to build multi-agent systems inside the corporation. It's less about connecting agents across orgs, but I could be wrong. It'll be interesting to see actual usable implementations of this outside the Fortune 1000 companies. Overall, this is good news, though. Moves us further down the multi-agent systems road.
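To ground the capability-discovery piece Shah mentions, here is a rough sketch of the kind of "agent card" the A2A draft describes, which an agent publishes so others can find it. The agent, its URL, and the exact field names are assumptions based on the draft spec, not details from the episode:

```python
# Rough sketch of an A2A "agent card": a small document an agent
# publishes (conventionally at /.well-known/agent.json) so other agents
# can discover its skills. All names here are hypothetical.
import json

agent_card = {
    "name": "invoice-agent",                      # hypothetical agent
    "url": "https://agents.example.com/invoice",  # endpoint for tasks
    "version": "1.0",
    "capabilities": {"streaming": True, "pushNotifications": False},
    "skills": [
        {
            "id": "extract-line-items",
            "description": "Pull line items out of an invoice PDF",
        }
    ],
}

# A client agent fetches this card, checks the skills against its task,
# and then opens a long-lived task with the remote agent over HTTP.
print(json.dumps(agent_card, indent=2))
```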

Other random little agentic notes: Gemini Code Assist is getting an agentic upgrade. This is nominally their Cursor competitor, and it can now deploy agents to complete complex programming tasks across multiple steps. This kind of agentic feature has been a game changer on other platforms, with programmers using it to automate repetitive tasks like code migration. Google has also released a security agent as part of their new unified security platform. The goal is to have an agent on the beat that can recognize and remediate threats before they become major problems.

The chief information security officer at Charles Schwab, Bashar Abouseido, said, Google is transforming security operations and enabling our vision to stay proactive in responding to cyber threats. The platform has empowered our team to focus on strategic initiatives and high-value work. We didn't get a massive new model, but what we did get was Gemini 2.5 Flash. Like its predecessor, this is a smaller model designed to deliver efficient performance with extremely low latency. The big change is a lot more customization.

Google wrote, you can tune the speed, accuracy, and cost balancing for your specific needs. This flexibility is key to optimizing Flash performance in high-volume, cost-sensitive applications. Gemini 2.5 Flash is a reasoning model at launch and will likely end up being the cheapest on the market. The model is designed to adjust the depth of reasoning based on the complexity of the prompt, which is similar, it seems, to the approach that OpenAI is pursuing.
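As a sketch of what that tunability looks like to a developer, the google-genai Python SDK exposes a per-request thinking budget on Gemini 2.5 Flash. Treat the parameter names here as an assumption drawn from the SDK's documentation, not from the episode:

```python
# Sketch of the "tune speed vs. depth" knob using the google-genai SDK
# ("pip install google-genai"). Parameter names are assumptions from
# the SDK docs.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Summarize the TPU vs. GPU tradeoff in two sentences.",
    config=types.GenerateContentConfig(
        # A budget of 0 turns reasoning off for cheap, low-latency calls;
        # a larger budget lets the model think longer on hard prompts.
        thinking_config=types.ThinkingConfig(thinking_budget=0),
    ),
)
print(response.text)
```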

Google is aiming to provide a model that hits the sweet spot between performance and cost, writing: It's the ideal engine for responsive virtual assistants and real-time summarization tools where efficiency at scale is key. Google also announced that they plan to bring Gemini models to on-premise deployments starting in Q3.

On the chip side, Google has announced the seventh generation of their tensor processing unit, which they're calling Ironwood. Now, unlike GPUs, TPUs are specifically designed for AI computing tasks. GPUs, meanwhile, are more generalized across all mathematical functions. In fact, in the early days, it was something of a coincidence that the chip architecture that powers 3D gaming was also highly performant for AI. The core bet with developing TPUs instead of GPUs was that specially designed architecture would be more efficient.

So far, that theory hasn't totally played out, with NVIDIA's GPU architecture still at the top of the pack. However, Google is hoping that Ironwood might finally validate that thesis. The company claims that their new processor can deliver 24 times the computing power of the world's fastest supercomputer when deployed at scale. Previous generations of the hardware were designed for both training and inference, but Ironwood is the first to be specially optimized for inference.

Amin Vahdat, Google's Vice President of Machine Learning, said, Ironwood is built to support this next phase of generative AI and its tremendous computational and communication requirements. This is what we call the age of inference, where AI agents will proactively retrieve and generate data to collaboratively deliver insights and answers, not just data. Now, there are a bunch of numbers and specs that come with this thing, but rather than try to explain an exaflop, the promise here is that Ironwood is around twice as fast as NVIDIA's H100s.

Still, they argue that the biggest difference is actually scale. A maximum-sized pod of Blackwell B200 chips is 576 chips before they require outside networking, while Ironwood claims to be capable of being deployed in a 9,216-chip pod. This is also a massive jump from Google's previous generation of TPUs, which were called Trillium. The company has achieved a four-fold increase in computing power compared to the 2024 model.
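For a sense of scale, the pod numbers quoted above work out like this (a quick arithmetic check using the episode's figures):

```python
# Quick check on the scale claims, using the figures quoted above.
ironwood_pod = 9216   # chips per Ironwood pod, as claimed
blackwell_pod = 576   # Blackwell B200 chips before external networking

print(ironwood_pod / blackwell_pod)  # 16.0 -> roughly 16x larger pods
```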

Ironwood is also more efficient, delivering twice the performance per watt compared to Trillium. And this TPU and the focus on efficient inference definitely suggests that Google is scaling up to service compute-hungry reasoning models and the agents driven by them.

Now, there is more going on here as well. In a big vote of confidence for Google's new silicon, Ilya Sutskever announced that his startup will use Google Cloud's TPUs. Samsung announced that Gemini would be added to their new home robot. Google's Enterprise Cloud platform now features a music generation model. Ultimately, while there wasn't one big huge thing, at least not a hit-you-over-the-head one, overall, it's a pretty remarkable shift from Black Nazis and glue on pizza to all of this in just about a year.

AI for Success writes, Google DeepMind is destined to win the AGI race and here's why. They have the data advantage, they own the TPUs, massive distribution channel, have all the best models right now, they have everything, and they dominate all four key areas: applications, foundation models, cloud, accelerator hardware. I'm not sure how this all shakes out, but if you're Google, you gotta be happy that that is the narrative among many these days. That's gonna do it for today's AI Daily Brief. Appreciate you listening or watching as always. And until next time, peace.