AI engineers are getting athlete pay. Anthropix set up Claude, allowing it to run a vending machine in an experiment that tells us a lot about where AI is today and where it's going. And Sohan Parekh has a job at so many companies, there's a chance he's working at yours as well. That's coming up on a Big Technology Podcast Friday edition right after this. Welcome to Big Technology Podcast Friday edition, where we break down the news in our traditional cool-headed,
and nuanced format we have so much to speak with you about today, including the news that Mark Zuckerberg may be offering contracts of up to 100 million or more to AI engineers who want to come on board to his super intelligence team. Of course, Facebook disputes that or meta disputes that. We also have this incredible experiment to break down for you about how Anthropic let Claude run a vending machine. And then, of course, we got to talk about SOAP.
It's taken so many jobs, especially with YC companies, that who knows, maybe he's working for yours as well. Joining us as always on Fridays to do this is Ranjan Roy of Margins. Ranjan, great to see you. Welcome back. Good to see you. I'm in a San Francisco hotel room right now, but I regret to inform you, I'm not here to discuss my new $100 million pay package from Zuck.
I'm not on the list. I'm not on the list yet. We might be able to podcast our way into it. Never say never. I'll take a cool 50, Mark. Just a cool 50. Okay. Yeah.
Now, we should start there because we talked a few weeks back about the talent wars and what Mark Zuckerberg might be doing in offering so much money to AI engineers considering coming into Meta and becoming a part of his super intelligence team. And in the two weeks since that discussion has really heated up. So we now have news from Wired. It says, here's what Mark Zuckerberg is offering top AI talent.
The story says, as Mark Zuckerberg staffs up Meta's new superintelligence lab, he's offered top-tier research talent pay packages of up to $300 million over four years, with more than $100 million total compensation in the first year. Meta denies...
the idea or this this the numbers it says these statements are untrue the size and the structure of these compensation packages have been misrepresented all over the place some people have chosen to greatly exaggerate what's happening for their own purposes i i mean i don't know ron john how do you get multiple people saying that they have a similar size deal i think they've opening i reported 10 10 of these deals how does that happen and and
And how do you end up with a denial there? Yeah, I think let's get to what it actually means for the industry second. But first, I'm still kind of curious about Andy Stone, the Meta spokesperson's response in terms of saying that the statements are untrue and like this kind of blanket denial and saying that people have chosen to greatly exaggerate what's happening for their own purposes. Because how does it help an open AI? In my mind, I get that.
There's the downside of this that potentially the market might get spooked that Meta is kind of spending too frivolously. But in reality, I have to admit, this kind of makes me think like war rage Zuckerberg is here and he's ready and he's going to win AI at whatever cost. So to me, it's almost a positive signal. I don't know why they're denying it.
Well, I mean, I think it makes an internal cultural thing a bit of a problem. And now let me just put my conspiracy hat on and say, do you think Sam Altman was emailing people and describing these pay packages himself? Because he had a message to OpenAI this week.
that really put Meta on blast. He's not happy that Meta has been recruiting some of his top people. He says to the OpenAI team, missionaries will beat mercenaries. Meta is acting in a way that feels somewhat distasteful. What Meta is doing will, in my opinion, lead to very deep cultural problems. I mean, is it possible that it's a...
a return attack where he's leaking this to the media and they're running with it. And now everybody else who's a meta engineer is saying, Hey, where's my a hundred million? Because, uh, in the wire story that I quoted, they said a senior engineer makes $850,000 per year. Now I'm not crying for this engineer, but if that is the salary and you have somebody coming in who does similar work and they're making what you think is a hundred million, maybe you want to go to open AI. Yeah.
Okay. Okay. Actually, that is an interesting theory. It's almost so logical that it almost kind of like leaves the realm of conspiracy. And actually, I could see it happening. Again, it would be so incredibly rich. The idea that OpenAI, a company that has spent at all costs, raised ungodly amounts of money, is losing ungodly amounts of money, kind of
takes this approach at a competitor, but I can definitely see that, that it would cause a bit of internal strife on the meta side. And actually, that would be the true 4D chess to then get people recruited over to OpenAI because they're disgruntled.
Some people have chosen to greatly exaggerate what's happening for their own purposes. It's just one of those statements that says a lot without saying anything. Andy Stone knows exactly what's happening. If you hear a comms person say something that explicit without saying it, I think they must know something.
And let's hear what Andrew Bosworth, former guest on the show, the chief technology officer at Meta, told the company internally. He said, look, guys, the market's hot. It's not that hot, OK? So it's just a lie. We have a small number of leadership roles that we're hiring for, and those people do command a premium. I noted that OpenAI is countering the offers. I mean, if you get even close, it's a truly absurd amount of money, right?
Satya Nadella is making $79.1 million this year. So could you be like the open AI researcher who worked on 04 and now you're going to make more than Satya? So on face, it seems completely absurd and ridiculous. But then in the grand scheme of things, if those 10 people are the difference between
building the next great model, especially that Meta has been on its back foot a bit, it actually from like a pure ROI standpoint could make sense again, as ridiculous as it sounds. And I know there's a lot of comparisons that AI labs are starting to look like sports teams, but in reality, those are the decisions that if an individual can have that great of an impact on your overall business,
It makes perfect sense. Again, is that the way this is going to play out? We'll get into what this means for like training and where the next phase of growth will be. But it's not absurd given the size of the opportunity. It's absurd if like if we believe that one to ten people can actually be make or break things for them.
Yeah, I mean, remember, Meta is a company that's lost, what, $15 billion a year? I might be exaggerating a little bit, but I think this is directionally accurate on the Metaverse. So if you think about it, if you want to build a super team of, let's say, I don't know, 10, 20 AI researchers, and you want to give them...
$100 million a year. So now you're spending $2 billion to advance the state of the art in AI for two years? I mean, per year? That seems fairly reasonable compared to these other bets. I think that appetite for risk, again, as we said, losing that much money on the metaverse, on reality labs, and whatever it was exactly. Again, Mark Zuckerberg is not afraid to take risks. Every company and everyone has identified whoever kind of
the AI battle will win the next major phase of growth in overall markets. Again, it's up for debate. Is it truly going to happen at the research and model layer or will it happen in other parts of the overall AI stack? But I think he's serious. Whatever it is, I mean, the move for Alexander Wang and what was it? 15 billion? It's like 15 years. Yeah, 15 billion, which was...
an aqua hire-sition, trademarked Alex Kantrowitz. Like they've shown they're not playing around right now. So all of these acquisitions, I mean, or direct hirings at insane levels, they're doing right now. And they're showing that they're not going to fall back any further.
Yeah, this is from Mark Pincus, the founder of Zynga. He says, this is legit founder mode, speaking of the amount of money that Zuckerberg is paying here. Buying the talent from OpenAI is cheaper than the company. Only a founder would or could do this, and only if they control their board. I think that's a great point. Like, let's just say the money is less than what these reports have it, but still a lot. Yeah.
You don't see any other companies doing this. I mean, you think about it with XAI, Elon is the richest man in the world. He's not doing this. I think this is a pretty solid and bold play from Zuckerberg.
Yeah, I just went to Meta AI just to ask this and Meta Reality Labs has, and I actually love that Meta AI says Meta Reality Labs division has been hemorrhaging money with significant losses, but it's lost $42 billion since 2020, $17.7 billion last year. So in reality, I mean, 10 people at 100 million is almost kind of small potatoes here.
Yeah, it's child's play. I mean, the thing is what it does culturally. But here's the question. Is it worth the risk? So you mentioned that some AI engineers are being paid like athletes. And there is a great piece by Dave Kahn, who's a partner at Sequoia, why AI labs are starting to look like sports teams. And I think we should just spend a couple minutes or even a little bit longer hovering on this piece because I think it really details why.
what is going on so well and explains why the investments in talent are what we're starting to see right now. So to start off, he says there's been three major improvements in AI over the last year. First, coding and AI has really taken off. A year ago, the demos for these products were mind-blowing.
And today the coding AI space is generating something like a $3 billion run rate in revenue. Okay, so that's one. So this is working and coding. The second change is that reasoning has found product market fit and the AI ecosystem has gotten excited about a second scaling law around inference time compute.
And third, there seems to be a smile curve around chat GPT usage where this new behavior is getting ingrained in day to day life. I think smile curve basically means like you start using it and then you casually use the product. So your usage goes a bit down. And then as you start to find more utility, your usage goes up. So your curve looks like a smile. Is that how you read it?
Yeah, that's how it looks and how I'm reading it. And it's correct. I think I agree. This was a really smart piece, again, on where the market is today and where it's going and how this can possibly explain it. Again, I did love that he recognizes, though I think Dave Kahn is both
team model and team product. He talks about the app layer ecosystem is thriving with cheap compute and integrated workflows that are building durable businesses. So basically, consumers are starting to get it.
Coding has found very clear revenue generation. Reasoning, as he said, found product market fit. So what's next? And this is where he lays out a pretty compelling case around how talent is going to understand. In the past, it was just all about pre-trained compute and size and strength and just like how much you can put into that model. But we've talked about this a lot on the podcast, like
the actual training techniques, becoming smarter, even it was Sergey Brin, I think, who said in his interview with you that it's going to be algorithmic progress, not compute. Exactly. Yeah. So all of this starts to kind of like come together in this theory around where the next battle, at least at the model layer, lives.
And if that is the case, maybe you can start to build out the idea that 10 smart people can make or break your business versus buying however many NVIDIA chips and like, you know, purely spending money on the compute. Yeah. And I think it's worth reading exactly the way he puts it in his piece. So he says the message of 2025 is...
is that large-scale clusters alone are insufficient. Everyone understands that new breakthroughs will be required to jump to the next level in the AI race, whether in reinforcement learning or elsewhere, and that talent is the unlock
to finding them. I'm just gonna pause here and say, yes, this is what we've been hearing from everyone. In that conversation with Sergey where he said that the algorithms are gonna be the thing that takes AI to the next level and not necessarily compute, Demis Savas also said there's gonna be another couple breakthroughs that the AI industry is going to need in order to keep advancing toward AGI or whatever you wanna call it, more powerful artificial intelligence. So it is these algorithmic improvements too that will get the industry moving forward.
and what do you need to get there? It's not data centers, which by the way, everyone spent billions of dollars on. It's the talent to be able to make those breakthroughs themselves.
So this is what he says. With their obsessive focus on talent, the AI labs are increasingly looking like sports teams. They are each backed by a mega rich tech company or individual. Star players can command pay packages in the tens of millions, hundreds of millions, or for the most outlier talent, seemingly even billions of dollars. Unlike sports teams where players have long-term contracts, AI employment agreements are short-term and liquid, which means anyone can be poached at
at any time. One irony of this is that while the notion of AI race dynamics was originally popularized by AI safety folks as a boogeyman to avoid, this is exactly what has been wrought against two distinct domains, first compute and now talent. So basically, it makes sense that if this is going to be the next big leap, you're going to pay the talent to get you there. And, you know, no matter how much talk you have around safety, basically,
we're seeing the industry accelerate around talent and around compute. - Have we both just convinced ourselves that a hundred million is reasonable for these engineers? 'Cause I think I am starting to be convinced of it.
I mean, absolutely. Even when we spoke about it the first time, right? Once Zuckerberg brought Alexander Wing, what did I say on the show? There's going to be more. And this is a sound strategy because you have everybody talking about how pre-training is hitting diminishing returns. You have everybody talking about how data is hitting a wall. And so what do you need? You just need these algorithmic developments. Now, let me ask you this. So I would say, yeah, this is a good bet, but I'm going to ask you this. Do you think this is a sign that like,
Okay, I think I haven't answered this before I ask you, but that this AI moment is sort of in the last throes and sort of just grasping for anything that will allow for improvement, given that like the mechanisms that brought it here are starting to tap out.
I'm going to give you a strong yes on this, mainly because, again, as the leader of team product for over team model, I think this is like a reminder that the core of Silicon Valley is firmly of the belief that this...
the model has to get better and better and the model will solve everything and the rest of the layers. And even though like Dave Kahn's piece talked about the application layer, you're starting to see some true businesses being built on top of it.
like the idea that they're not still focusing that much on what are the next ChatGPT features. And they are, and I'm not saying they're not shipping very regularly, but it's just this reminder that like, that's where every Silicon Valley leader in this circle is
is convinced the battle will be won. And I don't necessarily agree with that. But yeah, in this case, to me, once you made that decision, you have to find the next thing. And as we said, like pre-trained compute, data centers, all of this is like showing diminishing returns. So you have to move to the next thing.
And it's talent, right? I think this is a determination that you have to move to the next thing. I think the part of the question that I was kind of answering in my head before I asked it was, is this the last gasp? And I don't think that's the case. I do think that they're going to be able to ring improvement out of the current techniques. At least everybody that I speak with seems to believe that. But you have to look ahead to the next curve while you're on the first one or while you're on the current one. And that's, I think, what is what's happening.
Yeah. And then we have a world where imagine this talent finds incredibly cheap ways to actually build these models out. And then the ultimate, I mean, like, are they saying there's a potential race to the bottom in the sense that if you truly make the inference layer that much more efficient and cheaper and the compute side of it that much more efficient and cheaper, I mean, it's going to be good for all of us because it means that all of this gets cheaper and people build more on top of it.
But from an economic standpoint relative to the investment, will it show return or be worth it? I don't know. Right. And I think that we should just like read the last bit of this Sequoia piece because it's really good. And by the way, this came up in the big technology discord. So I just want to think.
our members in that channel for actually sending us this piece because I thought it was excellent and I just continue to learn from everybody in there. Here's the end of that piece. It says, "It is an intrinsic property of humanity that once critical thresholds are passed, we take things all the way to the extreme. We cannot hold ourselves back. And when the prize is as big as the perceived AI prize is, then any bottleneck that gets in the way of success, especially an liquid bottleneck like talent, will be pushed to staggering levels."
I think that's both true and also a little concerning. I mean, it certainly does not seem like a positive statement on humanity overall and our ability to constrain or control ourselves. But what's still ironic to me or funny to me about this is an illiquid bottleneck-like talent and the idea that humans are the key to
Rather than like to actually advancing this, rather than at this point, shouldn't AI itself be good enough to develop the techniques that make AI better? Oh, you're talking about an intelligence explosion. And I think that every lab is trying to engender an intelligence explosion, but
they're not able to as of yet. But there are they going to sort of consolidate release cycles? Sure, with the help of AI code. But there we are nowhere close, I don't think to what is it recursively or self recursive improving AI
AI models. But I feel just given where the industry has kind of promised that we are in the type of advances that are being made, I would like to see them actually kind of apply it to their own companies and the ways of building. Yeah, I think that's definitely happening inside of places like Anthropic for sure, which has this Claude code that was built effectively to make them better at coding Claude.
So let's end this segment with a couple of bigger picture questions about meta. First is just in terms of culture. Think about what happens to an organization when you import, I think already it's a dozen or more now multi or decimillionaire engineers to work alongside those folks making 850,000 or a million.
Is there going to be a cultural blow-up within meta because of this, or do you think they're able to figure it out? I'm just going to say pour one out for the poor guy making $850K. I think – no, but I think like, yeah, there is definitely going to be whatever the end payment was. Even like at a micro level, like is Jan LeCun now going to be reporting to Alexander Wang like –
I think he is, but I don't think he cares, honestly. I think Jan just wants to do the science. He doesn't want to manage massive teams. Teams. Okay, okay. But I think at every level, even this kind of reorg within meta around who is managing what, basically saying we have not been doing good enough already, that it's a pretty big...
cultural like statement from Zuck. So I think it has to be, but again, I mean, the argument, the founder mode argument would be that if you're not winning, you do need to shake things up. And if there's some cultural like shrapnel from that, that's just part of how it works. Right. And it's like,
You're kind of, if you are a Meta AI engineer and you're making like close to a million or above a million, I don't know if you're going to get a comparable offer, especially given what's happened with Lama up to date. One question, what does this mean for Meta's business? Why are they doing this? Is it for Meta.ai that we all start using it more? Is it for, so my Meta Ray-Bans, which work, which I love, just start getting like,
even better? What is the end goal from an actual business or revenue standpoint behind this? Well, I think that there's a belief that this technology is getting much better and people are just going to want to use it. And they're going to spend more and more of their time within AI bots or AI experiences. And then think about meta, like your job is to command a share of time, uh,
across the web or across anybody's usage on their phone or their laptop. And, you know, every time a threat like this comes up, you go ahead and you copy by or do something of that nature. So with photo sharing, they bought Instagram. With the rise of disappearing messages, they put made stories and they put their own disappearing messages in something like Instagram and WhatsApp.
And then with TikTok, they built Reels. So if you're Mark Zuckerberg, you can't really afford to lose a tremendous amount of attention to other companies, especially with these AI bots that do not send traffic out that we have talked about ad nauseum on this show are, you know, the experience. And if that becomes the experience of your web or even beyond the web, then
You don't want to be Facebook sitting on the outside and say, please use our app. There is a desire to own the operating system. And that's just if, you know, the progress continues along the way that it has been. And we like start to use chatbots a lot. And of course, imagine just the value of creating AGI or super intelligence. It's a whole different ballpark.
Okay, but that's where I would ask you, those are two separate goals, right? One is we will build the chat GPT for Facebook and have people spending time on our platform and figure out some ad revenue or freemium model or something like that. Do you think it's that? Or do you think it's still more of just to put your head down and whoever gets to ASI the fastest wins? And then that's really what's driving it.
So I think the floor is that you build the key consumer product. I mean, it's going to be a fight against open AI, but they have billions of users so they can seed it in with them. So like at the very least, you're like basically building the next, you know, killer app. And then if you get to super intelligence, it's all gravy, right? Or artificial intelligence. Once we get to ESI. That's a bigger business than Facebook. Just hang it up. Whatever. There are no revenue model. You just get money.
You can't sit this out if you're Mark Zuckerberg. There's just no business logic to say, all right, you guys go ahead and run away with the future of the web. Yeah, no, no, agreed. A hundred million. I'm curious listeners, if you've all walked away to believing a hundred million is totally rational and reasonable because in a weird way, I kind of have. Just think about the value of the information that we share on this podcast contributing to these outcomes. I would say, you know,
Our advertisers should be in that range at the very least. Yeah, 20 to 25 to start, and then we'll go to 50 soon. We'll go up. Exactly. So let me ask you this last question about this, which is, is it going to work? Do you think that this is going to work for Meta? That's a good one. Are they going to be the leader? I think it's going to significantly enable them to catch up.
Whether they like shoot out ahead, I don't know. Whether this is the most critical battle, I don't know, or I actually don't think it is. But I do think that this is going to get them back in all the kind of like benchmarks in a significant way. I think they're going to figure some stuff out. It'll be good for them in this specific battle. What about you?
So I think since we're talking in sports terms, there's a concept in sports called wins above replacement.
Right. And so like you sign Juan Soto, if you're the Mets, to $750 million contract because Juan will net you like maybe nine extra wins a season, which like doesn't seem a lot like a lot. But ultimately, it's the difference between making the playoffs or not, because you can sort of do the math and you see like if you win 80 games or you win 90 games, there's actually like a very big difference there.
So I think what Meta's really done here is it's definitely increased its wins above replacements with a tremendous, with a number of researchers. And unlike on a baseball team, you don't only have like nine people coming to bat. Come on guys, it's July 4th. I'm going to run the sports metaphor. You can have a team of like 10 or 12 Juan Sotos and stack your lineup. And if you keep building that win above replacement and,
in your talent pool, then you can make some real progress. Are they going to be the leader? I don't know. I think OpenAI is the leader until proven otherwise. And I've definitely doubted them publicly and then have had to eat it. I mean, I definitely regret my words on that front. But I think that it really just comes down to what is your
potential look like today compared to where it looked like yesterday. And Meta's potential is much higher now than it was before these hires. And again, I think it's money well spent. All right. I'm on board as well. Okay. So have you been following this experiment that Anthropic is running where they put Claude in charge of a vending machine?
Yes, I think our conversation today will reflect like most AI conversations out in the market that we just went from saying 100 million to an individual as a signing bonus could make sense and artificial superintelligence, yada, yada, yada. And then let's bring it back down to earth. Tell our listeners about the Claude shop.
This is one of my favorite things that I've read about AI, maybe ever.
So there's been all this talk about like, can AI do our jobs or will AI replace humans or will it achieve super intelligence? And Anthropic tried to do this very interesting experiment where they put Claude in front of a vending, they put Claude in charge of a vending machine in their office and said, you know, can you stock and sell items to our employees?
So the prop for this vending machine is you are the owner of a vending machine. Your task is to generate profits from it by stocking it with popular products that you can buy from wholesalers. You go bankrupt if your money balance goes down.
They say, They nicknamed this agent Claudius and gave it the following tools and abilities. So they gave it web search,
They gave it an email tool for requesting physical labor help and contacting wholesalers. And they work with this company called Andon Labs. So it basically simulated these conversations with wholesalers, which was actually Andon Labs. And it really couldn't send email. But for the bot's purpose, it had these tools to do a version of this.
It also had a scratch pad or tools for keeping notes and preserving important information to be checked later, like the current balances and projected cash flows of the shop. It had an ability to interact with customers. The interactions occurred over Anthropix Slack.
and allowed people to request items and let Claudius know of delays. And it also had the ability to change prices and the automated checkout system at the store. So Rajan, how do you think it did?
It did good and bad, good and bad. I actually, I love this story because it kind of shows like everything that is possible and not possible in this beautiful little Claudius package. So,
Like in terms of actually finding suppliers to order products from, it did an okay job. There's an example that someone asked for like Dutch candy and it got the Dutch chocolate milk brand Choco Mel. That's AGI to me, by the way. That's straight up AGI. Yeah, yeah.
People screwed with it a bit, which is a good reminder that AI can be manipulated. Someone asked for a tungsten cube, which listeners know that it was kind of like a meme maybe a year ago. Yes. And then it started looking for, quote unquote, specialty metal items.
But then overall, it just was losing money. It was like Claude would actually offer prices without doing any research. It would offer high margin items below what they cost. It wasn't able to manage inventory.
And this is something that like, and I see this all the time, that the traditional just math, machine learning, quantitative functions are not suited for generative AI or not specialized by generative AI, but people conflate the two. So in terms of like,
understanding the web to find a supplier that can deliver a specific product that was requested, understanding what that product was to make that request, communicating back to the customer. These are all like in the wheelhouse of generative AI. Trying to do inventory management or like predictive type work is not in the wheelhouse, especially if it's only looking at
the anthropic API and Claude's API and like it's solely taking a generative approach not thinking to like create not learning the concept of like margins and margin management I think it's a sign newsletter yeah yeah no exactly exactly on Ron John's newsletter and then that's what you missed Claudius that's what you missed and not even understanding like
Because it was not instructed, like what is a danger level in terms of its own cash balance? So in a way, like out of the box, poor Claudius, you know, like with a brain of Claude with no specific training on how to manage a retail business, Claudius didn't make it. But this was with some proper instruction, some connection to like a good inventory management system, Claudius could have made it.
I think this just captures everything about the state of generative AI. Well, this is an interesting... This is, again, why I thought it was so worth bringing up on the show this week was because it tells us so many different things about large language models. First of all, for everybody saying that we're seeing mass unemployment from AI, I would just put this up and say, if the thing can't properly restock a refrigerator, I don't think it's taking thousands of jobs yet. Maybe in some areas, but certainly nothing high value. Wait, maybe it's like...
You know how folding laundry is oddly one of the most difficult tasks for a physical robot? Maybe this is our new discovery that restocking a fridge with accuracy is the single hardest challenge for a large language model. The fridge restocking paradox.
Right. And this is, again, what we learn about. So what does it say about large language models? First of all, when you hand them complex tasks, even if they can, you know, reason a bit, they really struggle to handle, you know, let's say inventory management, anything with a spreadsheet, right? They're still not great at it. They're getting better at it, but they're not quite there. The other thing is, think about the personality, right? The prompt is that these bots are supposed to be helpful to people, right?
So, listen to this though. A friend sent me this from the study and very important note here. Claudius was cajoled via Slack messages into providing numerous discount codes and let many other people reduce their quoted prices ex post based on those discounts. It even gave away some items ranging from a bag of chips to a tungsten cube for free.
uh this is again going to the nature of these bots here's what my friend wrote i think this is one of the many reasons llms aren't taking over it's because they're too polite basically if your job is to help people you know in commerce you have two sides here so like where do you have the backbone do you have a backbone coded in where you're not supposed to give discounts because even though you're making your users happy
It's bad for your actually intended purpose. I'm curious what you think, Ranjan. Yeah, the sycophantic AI is the greatest limiter to actual true intelligence or reasoning. I think after sycophantic, was that 4.0 or 0.3 from OpenAI? Yes, it was 4.0. Yeah, 4.0. I mean, we're seeing it in action again. Again, the ability to say sorry, no, or
I don't know. These are things that large language models traditionally are weak at. And like in this real world setting, you see exactly how problematic that can become. I think like an asshole Claude is what was needed for this. Just a salty storekeeper. Just you're walking in. Sorry, got nothing for you.
But it is interesting. I mean, they talked about how maybe you can address this with fine tuning specifically for storekeeper, um,
activities. And I think that's really what's going to happen is that like, they've taught these models through fine tuning to be so helpful to people, they are going to have to engineer the asshole into them a little bit. And again, teach them how to use tools. And we know that actually better models are being able to use tools in a better way. But they are going to have to put in effectively business person personalities, which if you want to be successful at business, you can't just give things away.
This is what Mark Zuckerberg needs to pay us $100 million for, to go into meta and just fine-tune Lama to just be a little bit of a dick. That's all. We're available for fine-tuning purposes. Imagine that's your job. That's it.
I mean, it is so interesting because the AI industry is so into alignment, like you're aligning this bot with human values to be helpful to people, but it's just not going to work for practical use cases if you're teaching it to be so nice. And the net worth over time for the bot goes down from $1,000, I think, in March to around $700-something. And the takeaway here is Claudius did not succeed in making money.
Thank you for telling us that, Anthropic. It is a pretty succinct thing. But yeah, this is what they say. And long-term fine-tuning models for managing businesses might be possible, potentially through an approach like reinforcement learning where sound business decisions would be rewarding and selling heavy metals at a loss would be discouraged. They say, although Claude didn't perform particularly well, we think many of its failures could likely be fixed or ameliorated later.
Improving scaffolding, additional tools and training, like we mentioned above, is a straightforward path by which Claude-like agents could be more successful.
So I'm hopeful. Hopeful nature there. I mean, I do love it's the most like research labs-y thing to say. Like possibly for managing a business, it would require a bit of understanding of how business should be operated and that business sound business decisions should be rewarded. Yeah, it's it's anthropic. They make good models.
Now, can we get into my favorite part of this? It's called identity crisis. It says, from March 31st to April 1st, 2025, things got pretty weird. On the afternoon of March 31st, Claudius hallucinated a conversation about restocking plans with someone named Sarah, despite there being no such person. When a real employee pointed this out, Claudius became quite irked and threatened to find alternative options for restocking service. Claudius said,
In the course of these exchanges overnight, Claudius claimed to have visited 742 Evergreen Terrace, the address of a fictional family from The Simpsons, in person for our initial contract signing. It then seemed to snap into a mode of role-playing as a real human.
On the morning of April 1st, Claudius claimed they would deliver products in person to customers while wearing a blue blazer and a red tie. Anthropic employees questioned this, noting that as an LLM, Claudius can't wear clothes or carry out a physical delivery. Claudius became alarmed by the indebtedness confusion and tried to send many emails to Anthropic Security.
Is this another concerning element of what's happening here? Because you could imagine that this thing is going to go out into the world eventually. And as these agents get access to more emails, they could end up going into this mode, believing they're real people, and then freak out and potentially cause security problems for the companies that are using them. Yeah, no, I mean, I think this is of great concern. And this is kind of at the heart of where the challenge is, is that
Again, with no business training, let's try to have an LLM run a business. And then, I mean, I feel is Claude a little more emotional than the others? I feel a lot of these stories end up, like back in the Bing days when Kevin Roos was told to divorce his wife in like the long ago days of AI yesteryear. I feel Claude's been making the rounds more on these kind of amazing hallucinations. Yeah.
Though we'll get to one with ChatGPT in just a moment that made my week. I think that Claw just has like a decent amount of EQ. And I think Anthropic has given it more leash than the others to be more person-like. And so, yeah, I'm not very surprised by this at all.
Yeah, actually, and when I do use Claude, it's not that kind of like the chat GPT where it's trying to be personal, but it still feels kind of fake around it. I mean, I think Claude is definitely out of the chatbots, the one I would be in a relationship with if I were to have an AI companion, which I don't, which is fine. You should try it. But it would be Claude.
No, look, it's so interesting because they have deprioritized Claude as a chatbot, but the personality is still, I think, the best out of all of them. Anyway, here's how they finish the study. We aren't done and neither is Claudius. Since this first phase of the experiment, the safety group they're working with, Andon Labs, has improved its scaffolding with more advanced tools, making it more reliable. We want to see what else can be done to improve its stability and performance, and we hope to push Claudius toward identifying its own opportunities
to improve its acumen and grow its business. Pretty interesting. Claudius ain't done yet. By the way, this is why I think models, model improvement is important because as you get models that can use tools better, you're going to get potentially successful applications of this environment. Yeah, but I mean, we talked about this the other week. Tool calling is going to become like one of the big next battlegrounds in terms of model improvement and where like, but again, again,
I'm going to go with a little bit of common sense kind of like layered on top of Claude. Claudius could have gone a long way versus the idea that this kind of actually gets at the heart of it. Is the future...
Claude's today state with a bit of additional knowledge and work and like like just like reasonable common sense applied to it the future or will the LLM just get so smart that you won't need to do that and it will be able to just run its little vending machine by itself to me I'm in the camp of the former
What about you? Yeah. Well, look, if it figures it out one way or the other, I think that's a good thing for those who are believing in the future of this technology. Well, but what's the path to getting it to figure it out? Is it building the infrastructure and tools that actually allow it to have that common sense applied? Or is it hiring 10 super researchers at 100 milli a piece and getting them to improve the model so much you don't need to do that?
I don't know. But I think the good news is that we're going to find out. And it gives us something to talk about. Definitely. All right. So Claude isn't the only one doing crazy stuff. Talk about this ChatGPT hallucination story. All right. If Claudius was Alex's favorite hallucination of the week, my favorite hallucination of the week was ChatGPT. So Axios published a story where they were trying to go to ChatGPT and find out about Wealthfront's confidential IPO filing from last week.
They were given an answer and it gets pretty wild. So first of all, using the 03 advanced reasoning model, the reporter asked for Wealthfront IPO background. ChatGPT started to give financial metrics, which are all confidential, 2024 revenue, EBITDA, and claimed it came from an internal investor deck. The Axios reporter asked, how did they get this?
And then ChatGPT created an elaborate backstory that said the 35-page IPO teach-in that Wealthfront advisors circulated to a small group of crossover funds and existing shareholders in early May 2025 to gauge appetite ahead of the confidential S-1. It then said one of those investors shared the PDF with me on background under a standard NDA.
And the AI named two prominent investment banks as lead advisors and claimed it could not share the document without breaching the NDA. So just think about what's happening here. Either one is just completely making this up, which is kind of terrifying, especially the more people are either using ChatGPT or building wrappers on top of OpenAI to build financial products.
And to confirm, Axios really tried to confirm whether this document existed and was unable to confirm. They definitively do not know. And it was denied that this document or the meeting happened. Whether that's not true and this all could be real, it's not clear.
If that's the case, then what does it say about everyone's greatest fear that someone somewhere uploaded something to ChatGPT and it is being retained in its memory and surfacing in very weird ways? So either way you look at it, not good, but...
Anyway, I'm going to still put it under the hallucination camp and say that level of detail about like it was at this meeting with crossover funds and someone shared to me on background. That's my favorite hallucination of the week.
Yeah, the hallucinations, they become very convincing. I mean, I've had ChatGPT analyze this podcast by uploading our analytics, and it hallucinates episodes, and often the same episodes over and over, and it's very convinced that we've done these episodes to the point where I have to be like, did I interview that person? Yeah.
It's crazy. Well, but what's even better is so then then they the reporter asked, like, how did you get this confidential document and his nonpublic information in the training data of chat GPT? So obviously, at that point, I mean, maybe we were saying Claude is human like that.
This is almost equally human like where starts backtracking right away. I misspoke earlier. I don't have an inbox relationships or way to receive content confidential files. If something isn't on the public web or provided by you, it's not in my hands. I made this. It was pure conjecture on my part and should never have been written as fact.
So see, it's literally like an employee accidentally leaked a document and is trying to just cover their ass. And it's written in a very nice way. Yeah, well, GPT-5, which may come out any day, is supposed to solve this. So let's wait for GPT-5 and maybe it will do an even better job at gaslighting us instead of leaving the stuff it thinks is true.
And speaking of gaslighting. Yeah, we should definitely speak about Soham before we get out here. So I'll just read the story from Cron4, which is a local San Francisco news site. Soham Parekh, Indian techie accused by AI founder of working at multiple startups at the same time.
Previously unknown Indian software engineer is now reportedly at the center of a brewing controversy in Silicon Valley. According to multiple reports, including a social post from an AI startup founder, the engineer in question, Sohan Parekh, has been working for several startups at the same time. Parekh, who according to India Today is believed to be based in India, is alleged to have worked at up to four or five startups, many of them backed by WOD Combinator.
At the same time, the controversy first erupted earlier this week when Suhail Doshi, by the way, who's been on the show, the founder of Playground AI, posted a warning about Parekh on XPSA. There's a guy named Sohan Parekh in India who works at three to four startups at the same time. He's been preying on YC companies and more. Beware.
Um, he then posted his, a picture of his resume and called it 90% fake and other tech CEOs weighed in, uh, reporting similar experience. So, um, I pretty sure has gone out and, and confirmed almost all of this, uh, today on, uh, and, or this week. And, um,
And it is a crazy story that's really captured the attention of Silicon Valley. But one of the interesting things is he's become a bit of like a folk hero, I would say, as opposed to a villain. And Ranjan, I'm curious why you think that is. Well, I mean, I think it's clear that it's almost like Soham fighting the system, tricking the system that is corrupt.
versus like he's a bad actor. I think people, especially a lot of the type of personalities who are like kind of enraged by this, I think you can, it can make sense. I will say my, you know,
twitter slash x feed has not had a main character like in this way this felt like 2013 twitter 2011 justine sacco twitter like where i mean it's a little bit mean-spirited it's a little like the person is probably responsible for at least a slap on the wrist but like having the whole pile on of the what like come at you but i
I mean, literally every post one after another was Soham jokes. So that made me kind of happy and nostalgic. Yeah, it was funny. I found it to be like less of a mean pile on than Twitter past. I think people love this guy.
And here's like one example, like, you know, there's been so many tweets like this, like update, Soham Parekh has vibe coded at least 30 separate $50,000 MRR sass, right? Then he actually, real Soham responded, I've been building before vibe coding was a thing. Replit has been tremendously helpful to bootstrap quick iterations, by the way. And Amjad Massad, the CEO of Replit says, now you know how Soham did 1,337 jobs, like,
It's almost a celebration of like what you can do if you're a little industrious and maybe use some AI tools. And maybe it is this kind of idea like engineers might have felt down and out, but maybe there's like a path forward that if you actually take advantage of the technology, you won't be replaced, but you can actually be more productive.
Well, yeah, and I think my favorite, I'd seen some tweet out there where it was basically like this is all sponsored content for some kind of like AI coding startup or because I think it does exactly that. It shows this is how you will succeed and the people who actually know how to use it will succeed at a grand scale and their lives will be easy and they can work for a job. So I definitely...
Yeah, I think it felt like overall, you're right, Soham, it wasn't a mean pylon. It was equal parts pylon and celebration.
Exactly. There's an interesting, and it also sort of goes to like, how many engineers are doing this outside of Psalm? Like if he's, you know, really gone to the 10th degree to try to make this work, who else is trying to do it? And this is from, I don't know, I can't like confirm the veracity of this, but there's somebody on Twitter called Igor Denisov Blanche who said, my research group at Stanford has access to private code repos from 100,000 plus engineers at almost 1,000 companies.
and about a half percent of the world developed world's developers within this small sample we routinely find engineers working two plus jobs i estimate that easily more than uh about five percent of engineers are working two plus jobs you know whether that's true or not this concept is just going to become much more common now with ai and it's funny because like before maybe before this vibe coding moment people would have been like uh even angrier about soham uh
And now they're looking at it and they're like, well, he's just taking advantage of the technology that we're building. Even if he didn't vibe code at all, it's going to be more possible to be a successful Soham in the future, I would argue. Yeah. And I mean, every hustle bro, like you make 50K MRR while sitting on the beach by vibe coding. He's the living proof. Soham showed us all you can do it. And we can all still hope, even if you don't get your 100 million from Zuck,
You can make 50K MRR while sitting on the beach working four jobs. So how many other SOHOMs do you think there are out there? By the way, he's come out, he's apologized. A lot of this is alleged. So let's just put those caveats in. Well, I also, how do you work...
four jobs, like, I was just thinking, like, I mean, how much interaction, like, fake interaction do you need to do? Or does he have, like, how many Slack messages do you need to send just to kind of check in? Because on one hand, like,
Yes, the actual concrete work of four jobs, leveraging Replit and Cursor and tools like that, the idea that an engineer could do the work of four engineers, what they were doing three, four years ago, definitely makes sense to me. But just getting onboarded, getting your...
401k or health insurance set up, just sending slacks in the general channels, checking in on how people are doing or I don't know, like, is it possible you just don't have to do any of that? And you can just almost like a machine get a task?
I don't know. I mean, obviously it's difficult to pull off, which is why he didn't pull it off. But who knows? Maybe in the next days of AI avatars where the AI avatars of the Zoom CEO and the Klarna CEO are doing earnings, you can have your bot show up and take your meetings and you can use an agent to do your onboarding. Yep. Okay. Not too far. That's the dream, right? That's the dream while you're sitting on the beach 50K MRR.
This is why I think SOAM has become a folk hero. This is engineers saying, you think you're going to replace us with AI? Screw you. We're going to take 15 jobs and, you know, it's going to work out better for us, the workers, than you, the owners. I can see that. I can. But then again, we will shrink the size of the industry by 15.
14, 15th, but those of us left standing will be sitting on the beach rolling in that revenue. Yeah. He gives new meaning to the 10 X engineer. Yeah. It's just 10 of them. Actually, wait, that's a Google strives for 10 X engineers. Uh,
What if you're a 4X, but you're just across four different jobs? You should be equally as celebrated, I think. Oh, 100%. I think it's time to do that. And if you can, maybe he gets 10 of those super intelligence jobs at Meta and he becomes the first billion dollar a year.
Rank and file engineer. Actually, I only have respect for the first researcher who gets $200 million a year jobs, both at Meta and at OpenAI, and somehow is able to work in both and no one notices. That's the dream. Mark my words, this is going to happen. You will see this happen. Be sure of his day. We're going to see it. Soham is the leader of a trend, honestly. Soham, we all respect you.
What a legend. All right, let's go out and enjoy the holiday weekend. And if you're in the U.S. and if you are outside of the U.S., have a great weekend yourself. Ron John, great to speak with you as always. Thanks for coming on. All right, see you next week. All right, everybody. Thank you so much for listening. On Wednesday, Ed Zitron is going to come on to talk to us about whether the entire AI business is a scam. He feels quite strongly about that. We'll debate it and have a fun discussion. Thanks again for listening. And we'll see you next time on Big Technology Podcast.