
EP 506: How Distributed Computing is Unlocking Affordable AI at Scale

2025/4/17

Everyday AI Podcast – An AI and ChatGPT Podcast

People
Jordan Wilson
An experienced digital strategist and host of the Everyday AI podcast, focused on helping everyday people advance their careers with AI.
Tom Curry
Topics
Jordan Wilson: The rise of generative AI and large language models has made compute increasingly important, and progress in open source models has pushed more companies to pay attention to it. Distributed computing is the key to unlocking affordable AI at scale.

Tom Curry: Distribute AI taps idle compute resources to offer consumers and businesses more affordable AI and to build a more open, accessible AI ecosystem. It is a two-sided solution: businesses can contribute spare compute and also use the AI models running on the platform. Current chip technology is approaching the peak of what it can deliver and cannot keep up with the growing compute demands of AI models; even the largest tech companies struggle to meet that demand, straining resources and putting pressure on the power grid. AI models are simultaneously getting smaller and more efficient and larger and more complex, which creates challenges for the industry. Smaller, more efficient models (such as DeepSeek), combined with real-time data and reasoning, will drive AI progress. The gap between open source and closed models is shrinking, which will accelerate edge computing and reshape GPU demand: edge compute will take over more everyday tasks while large models are reserved for more complex use cases, and privacy concerns may push people to run AI models on the edge. If AI models become commoditized, compute becomes the key competitive factor in offering AI services. The rise of open source models challenges closed AI companies, but they still have distinct advantages and use cases, such as handling sensitive data (for example, health data) and government contracts. Business leaders should stay open and flexible when adopting AI and avoid over-reliance on a single vendor or model, because the landscape changes rapidly.

Chapters
The conversation begins by highlighting the rising importance of compute in AI, particularly with the increasing prevalence of generative AI and large language models. The discussion emphasizes how the need for compute is now a top priority for many businesses due to the potential offered by advanced AI models.
  • Increased importance of compute in AI, especially with generative AI and large language models.
  • Compute is now a key priority for many businesses due to the new possibilities offered by AI models.

Transcript


This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. When ChatGPT first came out, no one was talking about compute, right? But over the last few years, as generative AI and large language models have become more prevalent,

The concept of GPUs and compute has become almost like, you know, dinnertime conversation, at least if you're crowding around the dinner table with a bunch of dorks like myself, right? But I think even more so the last few months, as we've seen open source models really close the gap with proprietary, closed models and

I think this concept of compute is even more important because now all of a sudden you have a lot of, you know, probably millions of companies throughout the world, medium-sized companies that maybe weren't concerned or, you know, weren't really paying attention to having their own compute maybe two years ago. Now, all of a sudden it might be a big priority because of the new possibilities that

very capable large language models, and even smaller open source models, are giving to so many people. So that's one of the things we're going to be talking about today, and also how distributed computing is unlocking affordable AI at scale.

All right, I'm excited for this conversation. Hope you are too. What's going on, y'all? My name is Jordan Wilson, and this is Everyday AI. So this is your daily live stream podcast and free daily newsletter, helping us all not just keep up with what's happening in the world of AI, but how we can use it to get ahead to grow our companies and our careers.

If that's exactly what you're doing, you're exactly in the right place. It starts here. This is where we learn from industry experts. We catch up with trends. But then the way you leverage this all is by going on our website. So go to youreverydayai.com. So there you'll sign up for our free daily newsletter. We will be recapping the main points of today's conversation as well as keeping you up to date with all of the other important AI news that matters for you to be the smartest person in AI at your company. All right. So

Enough chit chat, y'all. I'm excited for today's conversation. If you came in here to hear the AI news, technically we've got a pre-recorded one debuting live, so we are going to have that AI news in the newsletter, so make sure you go check that out. All right, cool. I'm excited to chat a little bit about computing and how it's changing and making AI affordable at scale. So please help me welcome to the show, we have

Tom Curry, the CEO and co-founder of Distribute AI. Tom, thank you so much for joining the Everyday AI Show. Thanks for having me. Appreciate it. Yeah, cool. So before we get into this conversation, which, hey, for you compute dorks, this is right up your alley. But for everyone else, Tom, tell us, what does Distribute AI do?

Yeah, so we're a distributed AI app layer. What that really means is we're basically going around and capturing spare compute. It could be your computer. It could be anyone's computer around the world. And we're basically leveraging that to create more affordable options for consumers, businesses, things like that, mid-level businesses. And really, the goal is actually to create kind of a more open and accessible AI ecosystem. We want a lot more people to be able to contribute and be able to leverage kind of the resources that we aggregate. It's a pretty cool product.

Cool. So, you know, give us an example. So, you know, kind of even in my hypothetical I just talked about, let's say there's a medium-sized business, right? And maybe they haven't been big in the data game. Maybe they don't have their own servers and, you know, they're trying to figure it out. So what is kind of that problem that you all solve?

Yeah, so it's a two-sided solution. It's a great example, right? You go to a business and they have, say, a bunch of computers sitting around in their offices. At night, they can connect into our network very quickly. We have a very quick one-click program to install. They can run that at night and provide compute to the network. And then when they wake up the next day and they want to leverage some of the AI models that we run, they can quickly tap into our APIs and basically get access to all those models that we run on the network. So kind of two-sided, right? You can provide on one side and you can also use it on the other side.
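For a concrete picture of the "use it" side Tom describes, here is a minimal sketch of calling a hosted model through an OpenAI-style chat completions request. The endpoint URL, model name, and API key below are hypothetical placeholders for illustration, not Distribute AI's actual API.

```python
# Hypothetical sketch: querying a model hosted on a distributed compute
# network through an OpenAI-compatible chat endpoint. The URL, model id,
# and API key are placeholders, not Distribute AI's real interface.
import requests

API_URL = "https://api.example-compute-network.com/v1/chat/completions"  # placeholder
API_KEY = "YOUR_API_KEY"  # placeholder

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "open-model-7b",  # whichever open model the network hosts
        "messages": [
            {"role": "user", "content": "Draft a one-paragraph product update."}
        ],
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```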

Very cool. All right. So let's get caught up a little bit with, you know, current day, because like I talked about, I don't think compute and GPUs were at the top of most people's minds, especially when the GPT technology came out in 2020,

let alone in late 2022 when ChatGPT was released. So why is compute now just like one of the leading, I mean, we're talking about national security. We're talking about $100 billion infrastructure projects. Like why is compute now this huge term when it comes to just the US economy at large?

Yeah, totally. So, I mean, five years ago, if you go back, right, gaming was the biggest use case for GPUs. Nowadays, it's all AI, right? That's why there's huge demand for it. These models are getting bigger in some cases. They're also getting smaller. Chain of thought uses a ton of different tokens. So, although the models are smaller, they still use a ton of resources. The reality is, is that silicon, as it stands today,

One of our team members actually works on chips a little bit. We're basically reaching the peak capacity of what we can do with chips, right? We're definitely stretching thin the current technology that we have for chips. So although the models keep getting better, bigger, larger, with more compute demand, the reality is that the technology is just not able to keep up. We're about 10 years out, give or take, from actually having basically a new technology for chips.

Sure. And, you know, as we talk about current demand today, right? You always see all these jokes online, people are like, we'll work for compute, right? And the big tech companies, you know, OpenAI, right? Like whenever they roll out a new feature, a lot of times they're like, hey, our GPUs are melting, we're going to have to pause new user signups. So why is it that even the biggest

tech companies can't keep up with this demand? Yeah, I mean, it's a crazy system where Anthropic has the same issue, right? Where Claude tokens are still kind of limited to this degree. We're getting to the point where you're basically stretching the power grid thin, you're stretching every resource that we have in the world to run these different models. At the end of the day, you know, OpenAI, I think they use primarily NVIDIA for their data centers. But once again,

NVIDIA has demand all over the world for these chips. So they can't allocate all of their resources only to OpenAI. So OpenAI has a certain server capacity that they rent and use. But the reality is, there's just too much demand. You're talking about millions and millions of requests. And the requests, for example, like image generation, these aren't like one-second requests.

You're talking about 10, 20 seconds to actually return these. And video models are even worse. You're talking about minutes potentially, even on H100s or H200s. So the reality is, like I said, our compute, our power grid cannot possibly keep up with demand. And we don't have enough of the latest gen chips. So, you know, one thing, and you kind of mentioned it, I think at the same time, we're seeing models become

exponentially smaller and more powerful, right? Like as an example, OpenAI's GPT-4o mini, yet then you have these monster models like GPT-4.5, right? Which is reportedly like five to 10 times larger than GPT-4, which was, I think, like a 2 trillion parameter model. So walk us through the whole concept

of models both getting technically smaller and more efficient, yet models also at the same time getting bigger. And then how does that impact the industry as a whole? Because it seems like it's hard to keep up with.

Yeah, on one hand, it kind of reminds me of cell phones back in the day, right? Where we would progressively get them smaller, and then eventually a new feature would make them bigger, and then they'd kind of get smaller again. The reality is that a year ago, with larger models, we were basically just throwing a million different data points into these models, which

made the models much larger, and they were relatively good. But the reality is that no one wants to run a 70 billion, 700 billion parameter model, right? So we've gotten them smaller. Now they're kind of working with the intricacies of how we're actually running these models. So chain of thought basically enables you to give a better prompt, right? It basically takes a human prompt, turns it into something the system can read better, and then gives you a better output. And it also might run through a bunch of tokens to give you a better output.

So chain of thought is a really cool way to basically reduce the model size. But the reality is that although we're cutting the model size so we can put it on a smaller chip, you're still using a million tokens, which doesn't really actually help our computations. It's kind of backwards how it works.
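To put rough numbers on that point, here is a back-of-the-envelope sketch using the common rule of thumb of roughly 2 FLOPs per parameter per generated token. The parameter counts and token counts are illustrative assumptions, not measurements of any particular model.

```python
# Back-of-the-envelope arithmetic (illustrative numbers, not benchmarks):
# a rough rule of thumb is ~2 FLOPs per parameter per generated token,
# so a small reasoning model that "thinks" at length can burn as much
# compute as a much larger model answering directly.

def generation_flops(params: float, tokens: int) -> float:
    """Approximate forward-pass FLOPs: ~2 * parameters * tokens."""
    return 2 * params * tokens

small_reasoner = generation_flops(params=7e9, tokens=10_000)  # 7B model, long chain of thought
large_direct = generation_flops(params=70e9, tokens=500)      # 70B model, short direct answer

print(f"7B model, 10k reasoning tokens: {small_reasoner:.1e} FLOPs")
print(f"70B model, 500 answer tokens:   {large_direct:.1e} FLOPs")
# 1.4e+14 vs 7.0e+13: the smaller model ends up doing roughly twice the work
```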

Yeah, it is interesting, right? So yeah, even now we have these newer hybrid models in Claude 3.7 Sonnet, in Gemini 2.5 Pro. And you use them and they seem relatively fast. And if you don't know any better, you might say, okay, this seems efficient. But then if you look at the chain of thought, or if you click show thinking, you're like, my gosh, it just spit out 10,000 words to tell me the capital of Illinois or something like that. Right. So, you know, as models get smaller, this is something I'm always interested in: will these hybrid models or reasoning models eventually become more efficient, or is it always going to be that, on one side, models get smaller, but they're getting smarter and so they're going to have to just think more regardless?

Yeah, that's a good question. I think they will get to the point where they're highly efficient. I mean, realistically, the gains we've made with even DeepSeek are just incredible, right? Even their 7 billion parameter model, which is relatively small, you can run it on most consumer-grade chips, and it's extremely good. The prompting is great.

It obviously has a pretty good knowledge base. And once you really combine that with the ability to surf the internet and actually get more answers and use more data, that's where I think we'll get to. I wouldn't call it AGI, but we're very close to that, where basically you're adding in real-time data with the ability to kind of reason a lot more. So I do think we'll get there. I think the progress that we made, although it seems like it's been forever since kind of the first models came out, the progress was insane and extremely quick. Yeah, I'm confident.

Yeah. And, you know, speaking of DeepSeek, I know it's been all the rage to talk about DeepSeek over the last couple of months. But I mean, I think you also have to call out Google, right, with their Gemma 3 model, which I believe is a 27 billion parameter model, you know, greatly outperformed DeepSeek V3, which is, I think, 600

plus billion parameters, at least when it comes to ELO scores. And it's not even close, right? So what does this say about the future, right? I know I kind of named two open models there, but even among the open models, right? Everyone's like, oh, DeepSeek is changing the industry. Well, I'm like, yo, look at Gemma 3 from Google. It is

5% the size and way more powerful when it comes to human preference, right? So what does this even mean for the future of edge computing? And how does edge computing impact compute needs or GPU demand? Yeah, well, when we started this business, the reality was that although we wanted to convince ourselves that open source models were good, open source models were relatively bad. OpenAI was extremely dominant at that time.

It was like you couldn't even believe that anyone would ever catch up to OpenAI. Nowadays, we're probably running at like a one to two month lag before open source models reach parity with private source models, which is really interesting. And when you tie that in with the idea of kind of data privacy and things like that, I think there is a huge argument for basically edge compute taking over a lot of the smaller daily tasks and then reserving some of the more

private models and things like that, and the larger models, for things that might be a little bit deeper, like research and things like that. But a lot of things that you do on a daily basis that AI can actually improve, I think you can run purely on edge compute, and basically have your house and your couple of computers, maybe your laptop or iPad, basically turning into this little tiny data center that allows you to run whatever model you want to run at that time. And we're really not that far away. The reality is you could do that today, right? We could probably be able to do that.

The only problem is getting there: teaching people to basically use that and set it up, right? It takes time for people to learn how to, oh, install your own model and start running things. So it's more the UX of it than anything.
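As a rough illustration of the "install your own model" step Tom mentions, here is a minimal sketch using Hugging Face's transformers pipeline to run a small open-weights model locally. The model id is an assumption; any similarly small open model your hardware can hold would work.

```python
# Minimal sketch: running a small open-weights model on local hardware.
# The model id is illustrative; substitute any small open model your
# machine can hold in memory.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-3-1b-it",  # assumed model id
    device_map="auto",             # GPU if available, otherwise CPU
)

output = generator(
    "Summarize today's meeting notes in three bullet points.",
    max_new_tokens=200,
)
print(output[0]["generated_text"])
```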

Yeah. You know, I always think, with these models becoming smaller and more capable, will those things run on the edge in the future? I even saw the NVIDIA DGX Spark, formerly called Digits.

You know, I did the math on that, and I'm like, that would have cost, five years ago, I think like $70,000, and it wasn't even capable of doing this anyways, right? Like, is the average smartphone in five years going to be able to run a state-of-the-art large language model? And if so, how does that change the whole cloud computing conversation?

It will be really interesting. I think you're 100% right. And I think five years might even be a stretch. I think what it will come down to, like I said, is privacy. If people are really worried about their privacy, then I think people will push for edge compute, and you'll be able to run your own model that only has access to your own data on your phone, device, whatever it is, right?

If people don't care about that as much, it might take a little bit longer just because people won't build that. But I really do think there are some teams that are building in that angle where essentially you're going to have your little database of information about yourself and your life and your wife and whatever else. And essentially you'll be able to run all that stuff without ever touching any centralized model for obvious reasons, privacy reasons, things like that.

We already give so much data to big tech, right? I think we're good on giving any more and sharing any more intimate details about our lives. It'll be a good thing if we can do that. Yeah. And, you know, even as we start looking at this race, which, if you looked at it two years ago, I don't know if anyone, even the staunchest open source believers, would believe that we're at the point that we are now, but

You know, between whatever we're going to see from Meta in their next Llama model, we've already talked about DeepSeek and Gemma as well. And OpenAI also has recently said that they're going to be releasing an open model. Supposedly, supposedly. Yeah, yeah, yeah. We'll see what happens. We don't buy any of that.

Yeah, I remember the GPT-2 open fiasco, right? But regardless, I mean, what happens when and if open models are more powerful than closed and proprietary models? So number one, what happens from kind of a GPU and compute perspective? But then how does that change the business leader's mindset as well?

Yeah, so at that point, once things become commoditized, right, and the models are essentially all on the same level, give or take a little bit of variation between them, the reality is that compute becomes the last denominator of basically being able to offer those models at the cheapest cost, right? So at that point, it basically comes down to a race to the bottom in terms of who can get the cheapest compute and offer it to people with the best selection of models. And UX and UI all come into that, right? Marketing, things like that.

Assuming that that does happen, the question then comes down to what happens to all these private services, right?

My personal view on it is that there is probably a world where essentially OpenAI and Anthropic eventually burn so much money, and they lose money every day already, that they never get to the point they're looking to get to. And essentially they have to either change business models or run out of money, right? I think that's probably a little bit of a contentious point. But the reality is that right now we're running models that are

very close to as good as what they have. And it's like, at what point is the marginal gain not worth it, right? When H100s become a lot cheaper, we'll be able to run some of the biggest models very quickly and easily, and the access will just be so good that it might not matter.

The problem is that I personally have always believed in private source AI. I do believe that there are great use cases for it. And the reality is, whether you love Sam Altman or hate Sam Altman, he's pushed things forward a lot. He's been really productive for the entire environment. So you don't want them to go bankrupt. They might just have to figure out a way to appeal to consumers or businesses in a different way, as opposed to just general models, which is what they do, right?

I think in a great way, they talked about Siri and things like that, they'll probably figure out ways of tying into the real world. Yeah. So speaking of affordable AI, and you just brought up companies like OpenAI and Anthropic, right? Their burning of cash is well documented. But I mean, at a certain point, if large language models become commoditized because of open source models,

is it just more the application layer that becomes the thing, these companies' real differentiator, right? Because aside from, you know, OpenAI's $200 a month Pro subscription, which they also said they're losing money on, like aside from that,

how else are these big companies that so many people rely on going to continue to exist five, 10 years from now, after their $40 billion of funding might run out, if they're not profitable? We've been saying this about Uber for how many years now, though. To be fair, these companies can exist a long time without being profitable.

But I think the reality is that the one thing that centralized providers like OpenAI offer is that they're able to work with a lot of data that would be very sensitive, primarily like health data and things like that. So I'm sure there are a lot of very good business use cases that they can provide to very large enterprises. And I don't really know what those are outside of health and things like that, data that's very private,

you know, government contracts and things like that. Those models are super useful for that. But it will be tough. I mean, I feel like we're almost there already, to be honest with you. Like I said, I don't think we're that far away from the point where people are like, why don't I just cancel OpenAI and go use one of these other models? You know, all these different models that are out there. There are so many good ones at this point.

But it might be more integrations. It might be more, like I said, UI and UX. It might be the fact that, at the end of the day, we use iPhones and Androids every day, and maybe they just put a true monopoly on being able to use them, you know? So we'll see. Yeah. It's interesting. So, you know, we've covered

a lot in today's conversation, Tom, with this concept of distributed computing and how the race between open source AI and closed AI is really changing the compute landscape and just the AI landscape as a whole. But as we wrap up today's show, what's the one most important or best piece of advice that you have

for business leaders when it comes to making decisions, right? About how they are using AI at scale.

Yeah, that's a great question. I think the best advice, the thing that we've learned the most from our own business that I can provide, is that the landscape changes so fast. The last thing you want to do is lock yourself into one specific provider or model. Don't allocate too many resources and bet the house on one specific setup, because the next week something comes out and totally breaks everything before it, right? So make sure you're open, make sure you're flexible on what you're using and how you're using it,

and be ready for someone to come out and completely break the mold and change the direction of everything. It's such a fast-paced environment. It's really hard to keep up. And, you know, I think we're just kind of still scratching the surface on where AI will actually integrate into it.

All right. Exciting conversation that I think a lot of people are going to find valuable. So, Tom, thank you so much for sharing your time and coming on the Everyday AI Show. We appreciate it. Thank you so much for having us. We really appreciate it. All right. And hey, as a reminder, y'all, if you missed something in there, you know, a lot of big terms we're tossing around and getting a little geeky on the GPU side, don't worry, we're going to be recapping it all

in our free daily newsletter. So if you want to know more about what we just talked about, make sure you go to youreverydayai.com, sign up for the free daily newsletter. Thank you for tuning in. We hope to see you back tomorrow and every day for more Everyday AI. Thanks, y'all. Thank you.

And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.