Doug O'Loughlin argues that the era of aggregation theory is behind us due to the rise of AI, which reintroduces marginal costs to software businesses. This shift challenges the zero marginal cost model that underpinned hyperscalers' business models, making technology more expensive and compute-intensive.
OpenAI's o3 model represents a shift in AI architecture, where the model spends more compute time to deliver better answers. This approach contrasts with earlier models that relied solely on scaling up the model size and data. The o3 model demonstrates that inference-time scaling can significantly improve performance, marking a new direction in AI development.
Inference-time scaling in AI is analogous to Moore's Law in semiconductors, where progress is achieved by focusing on different vectors of improvement over time. Just as Moore's Law evolved through advancements in lithography, metallurgy, and transistor design, AI scaling is now shifting from model size to optimizing compute usage during inference, enabling better performance without solely relying on larger models.
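To make the inference-time scaling idea concrete, here is a minimal, hypothetical sketch of one simple way to spend more compute per query: sample several candidate answers from the same fixed model and keep the one a scorer rates highest (best-of-N). OpenAI has not published o1/o3's actual mechanism, and `generate_answer` and `score_answer` below are placeholder functions rather than a real API; this is only an illustrative proxy for the idea that more inference compute can buy better answers.

```python
# Hypothetical sketch of inference-time scaling via best-of-N sampling.
# generate_answer() and score_answer() are placeholders standing in for a fixed
# base model and a verifier/reward model; they are not real library calls.
import random


def generate_answer(prompt: str) -> str:
    """Stand-in for one sampled completion from the same base model."""
    return f"candidate (seed={random.random():.3f}) for: {prompt}"


def score_answer(prompt: str, answer: str) -> float:
    """Stand-in for a verifier/reward model; higher means a better answer."""
    return random.random()


def answer_with_inference_compute(prompt: str, num_samples: int) -> str:
    """Same model, more inference compute: sample N candidates, keep the best.

    Each extra sample adds marginal cost per query, which is the point the
    episode makes about AI breaking the zero-marginal-cost assumption.
    """
    candidates = [generate_answer(prompt) for _ in range(num_samples)]
    return max(candidates, key=lambda ans: score_answer(prompt, ans))


if __name__ == "__main__":
    # Doubling num_samples roughly doubles the compute (and cost) per question.
    print(answer_with_inference_compute("Why did Moore's Law keep going?", num_samples=8))
```

In practice, o1-style models reportedly scale compute by producing longer chains of thought rather than by naive resampling, but the cost dynamic is the same: better answers cost more at inference time.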
Hyperscalers face challenges as AI reintroduces marginal costs to their previously zero marginal cost business models. The increased compute requirements for AI workloads make technology more expensive, forcing hyperscalers to adapt their infrastructure and business strategies to remain competitive in a more compute-intensive future.
Ben Thompson finds the shift away from aggregation theory exciting because it re-energizes the tech landscape. He views the dominance of aggregation theory as having become stagnant, with antitrust issues being the primary focus. The rise of AI and other disruptive technologies offers new growth drivers and opportunities for innovation, moving beyond the limitations of the aggregation era.
Ben Thompson draws an analogy between the shift in AI and technology and the post-World War II media consensus. Just as society moved from a narrow, centralized set of facts to a more fragmented and diverse media landscape, AI is driving a similar transformation in technology. This shift challenges existing paradigms and requires new ways of thinking about trade-offs and decision-making.
Hello and welcome to a free preview of Sharp Tech.
And Jeremy will kick it off. He says, Hey, Ben, it looks like your friend Doug O'Loughlin wished you a happy new year by declaring, quote, the era of aggregation theory is behind us because of test-time compute and the arrival of marginal costs to software businesses. But is it? The way Chollet describes the o1 and o3 models, they sound more like brilliant product design than a breakthrough model.
Are chain of thought and deep learning guided program search the seeds of a brand new architecture? Or are they genius hacks to force a model that doesn't quote unquote reason to correct itself until it delivers something good? It looks to me like a bunch of independent LLMs in a trench coat being puppeteered by a separate model.
And just to anchor the majority of the audience that probably has not read the piece from Doug O'Loughlin on Fabricated Knowledge, I'll read a portion of his thoughts on the 2025 AI and semiconductor outlook. He writes, "...the era of aggregation theory is behind us, and AI is again making technology expensive."
This relation of increased costs from increased consumption is anti-internet era thinking. And this will be the big problem that will be reckoned with this year. Hyperscalers' business models are mainly underpinned by the marginal cost being zero. So as long as you set up the infrastructure and fill an internet-scale product with users, you can make money.
This era will soon be over and the future will be much weirder and more compute-intensive. Looking back on the 2010s, we will probably consider them a naive time in the long arc of technology. One of our fundamental assumptions about this period is unraveling. So Jeremy in his email continued on with more takes on o1 and then says...
Anyway, I'm not the one whose career-defining theory was just unceremoniously dumped by a close friend. Do you agree with him? So, Ben, we got a couple emails asking about that particular article. What do you think of it? Number one, I am certainly not upset to the extent that is the case because, frankly, the whole aggregation theory angle was getting pretty boring. This is the entire reason why...
I've always been super intrigued by not just AI but also crypto. That is also a technology that is, to a certain degree, in defiance of the realities of aggregators. Aggregators are all about, as noted, zero marginal cost content, the endless duplication of content, all these sorts of principles, and stuff that stands against that is inherently compelling and potentially valuable in that world
for that precise reason. I mean, stepping back, by and large, nothing goes away. Everything sort of builds on everything else, right? And this is always a challenge when you talk about new paradigms: we humans tend towards zero-sum thinking with a lot of this sort of thing. It's like, oh, well, this era's done, the new one's here. When in reality, the new era sort of layers above. Now, from an analyst and pundit perspective, that's often a useful way to think, because what gets layered above is sort of the growth driver, while what's underneath is just sort of there, right? The go-to example is we're still using mainframes. They're still undergirding banks, and they're still undergirding back offices and ERP systems and all these sorts of things. And even there, there have been lots of interesting changes and disruptions that no one really pays attention to, because all the interesting stuff is happening at the high end. So I think the way I would reframe this is: is aggregation theory, as the dominant driver of growth in technology, and thus by extension the US, and thus by extension the world, ending? Yeah, I mean, I think that, at least as constructed, that's a reasonable way to put it. But
from my perspective, to start selfishly, this is a reason I personally feel excited and re-energized. Because there was a period, circa the middle of the pandemic, I guess between 2020 and 2022 when ChatGPT came out and really opened everyone's eyes to this, where the only thing to write about with aggregation theory was antitrust. And that sucks. I was going to say, you hit a point where it was like,
Lawsuits were the only potential variable in the middle of the landscape there. I mean, how much would actually change, though, given how deep the moat is for most every aggregator after the last 15 years? Well, so just let me address a couple more sort of points in this article.
And in this question. So I don't want to just set aside the architecture of o1 and o3. But I also do want to set aside the architecture, because I think to dismiss it as a pile of hacks, or as just a new algorithmic change instead of a model to rule them all... that is not an insult. That's how progress happens. And I think actually, speaking of Doug, he and Dylan, or maybe it was just Dylan, had an article about scaling before that made a point about inference-time scaling, which I wrote about earlier this year when o1 came out.
There's sort of a theoretical argument that came up around this, which is: well, no, that's cheating. We were told the scaling was on the model. You just make the models bigger and they get better; you can't put the scaling somewhere else. And what I really liked is Dylan made an analogy to Moore's Law, where the way it progressed was that Moore's Law itself was linear (on a log scale), but the axes of improvement continually changed.
So for a lot of people listening, all the knowledge and awareness of semiconductor manufacturing has been about lithography.
EUV, ASML. And that was the main vector of improvement in Moore's Law, making chips smaller and more efficient, over the last 10 to 15 years. Before that, lithography wasn't that big of a deal, although it was a really big deal in the 80s and 90s when the Japanese really took over, and there was a stretch where we went to immersion lithography, and that was a big deal. But there were other aspects that actually really mattered.
There was, you know, the metallurgy that goes into it, the different materials, or transistor design. When we went to 3D transistors, it was called the FinFET transition: we used to have planar transistors, which were more 2D, and then suddenly you're stacking. If you think about it, these chips used to be all these flat gates, and then they basically became like skyscrapers, going up and down in addition to side to side. If you can just envision that, what an incredible shift that was, right? Yeah.
But if you zoom out and look over the period of processor change from the 60s till now, it looks like a fairly continuous curve that has flattened a bit, and it does continue to flatten somewhat, but it's a continuous, smooth improvement. That improvement, though, was driven by focusing on different things at different times. So different vectors depending on the decade. That's right. You hit a wall in one era, so you start asking: what else can we improve? We're talking about a process that is thousands of steps. All those steps can be transformed and changed, and you can do new things. And that applies, I think, generally. And in this case, the scaling question is the reason why o1 was a big deal when it came out. And honestly, o3 is super impressive and the results are amazing, but I personally wasn't as blown away by it.
But that's because I was blown away by o1. Because what o1 showed is: yes, before, with all the LLM models, all the scaling was on the front. You make bigger and bigger models with more and more data and apply more and more processing to it, and you get smarter and smarter models. Suddenly with o1, it's like, okay, wait, this is in broad strokes the same base model as GPT-4. What's different is that when you're asking it for an answer, it's spending more time, and critically, the more time it spends, i.e. the more compute it uses,
the better the answer is. That's what Doug is talking about here. And the big thing here, and this is sort of the big theme that I'm thinking about, and, you know, probably what's going to be my opening article of the year, so I don't want to give too much away, but I think there's a counterintuitive reality here, which is we're moving from a time of... There's almost a broader media analogy to a certain extent, where,
you know, you talk about the post-World War II consensus: everyone watched TV, there was the nightly news, and everyone accepted the same set of facts. There were debates, and there was right and left, Democrats and Republicans or whatever it might be, but actually it all operated within pretty narrow bounds. And this applies to the stuff we've talked about when it comes to things like trade and globalization and questions along those lines. And
One of the things that's happening, and this sort of came up during COVID, is that I think a critical mistake made during COVID was a deference to experts.
And the problem is experts are expert in their specific domain, but political decisions sort of by definition encompass multiple domains. It's about making trade-offs: a trade-off between limiting the spread versus kids needing to go to school versus businesses needing to stay in business versus hospitals being overwhelmed. I think I just picked two on each side
that argue for one direction or the other. And there was a certain sort of abdication of responsibility amongst the political class by deferring to experts. But the deferring to experts was itself a political choice because you were foregoing the responsibility of making a trade-off. Because experts don't have a priority stack, right?
per se. Or at least they're likely to lean heavily on their area of expertise in terms of what they're prioritizing. That's exactly right. That's exactly right. And so guess what? The public health officials think we should do the maximum possible for public health. They're not thinking about the economics. They're not thinking about kids' educational outcomes. They're not thinking about, like,
what are the implications of people getting turned off to all vaccines instead of just this vaccine? Right. And it works the same way with lawyers: if you go to a lawyer and say, should I publish this? I might get sued. The lawyer every single time will say, do not publish that.
This is a lesson I learned at Microsoft, because when I was at Microsoft in the 2010, 2011, 2012 era, we were still under consent decrees, or some of them were just rolling off, from the Justice Department. And of course, you got the whole talk about how there are certain words you don't use, especially in email. But one of the key things that my manager told me, he's like, look, a big mistake people make
here in particular is that they overly defer to, I think it was called LSA, or legal services or whatever. They listened to the lawyers too much. He's like, the lawyer's job is to inform you of the risks. It's your job to make a decision, and sometimes you're going to do what the lawyer doesn't want you to, because you weighed the risk. You understood the issue, but you were thinking about the bigger picture. It's not their job to think about the bigger picture. So the lawyer example is a great example. You see lots of folks paralyzed, and lots of companies get all tied up in this. And I think we've witnessed this to a certain extent, you know, are you with us? So this happened with Google, and it could be, uh,
lawyer stuff, it could be the political fervor aspects. Yeah. Yeah. Like, as a manager, your job is to make decisions, to make trade-offs. As a politician, your job is to make decisions, to make trade-offs. And then you get held accountable for that. As a manager, if it worked out, you get promoted; if it doesn't, you're out the door. If you're a politician, ideally you get reelected; if you don't, you're out the door. That's the way it's supposed to work.
But there is a large extent to which I think society generally got away from that, and I think that was probably the case with tech. And that's something I think Doug is getting at here. The decisions, the architectural decisions, sort of what you do, were easy in a certain sense, because what is in theory a big decision, like how much do I spend on this, was not a decision at all: you spend the maximum amount possible.
All right, and that is the end of the free preview. If you'd like to hear more from Ben and me, there are links to subscribe in the show notes, or you can also go to sharptech.fm. Either option will get you access to a personalized feed that has all the shows we do every week, plus lots more great content from Stratechery and the Stratechery Plus bundle. Check it out, and if you've got feedback, please email us at email@sharptech.fm.