
894: In Case You Missed It in May 2025

2025/6/6

Super Data Science: ML & AI Podcast with Jon Krohn

People
Jeroen Janssens
John Roese
Martin Brunthaler
Mary Spio
Thijs Nieuwdorp
Topics
John Roese: As Dell's Global CTO and Chief AI Officer, I see enterprise AI as having two key parts. The first is leveraging proprietary data, a major source of enterprise differentiation. The second is digitizing the unique skills inside the organization. A RAG-based chatbot is a tool that unlocks and leverages our proprietary data, letting us ask and answer any question through a generative interface. An AI agent goes a step further: rather than just unlocking data, it digitizes a skill, enabling work to be distributed autonomously. I envision AI agents that run autonomously, without human intervention, given only an objective. Of course, today's AI agents cannot fully replace humans; they can only perform tasks in specific domains. But they represent the future direction of enterprise AI. Combining proprietary data with unique skills will profoundly change most enterprises and bring unprecedented opportunities.


This is episode number 894, our In Case You Missed It in May episode. Welcome back to the Super Data Science Podcast. I am your host, Jon Krohn. This is an In Case You Missed It episode that highlights the best parts of conversations we had on the show over the past month. In the first of my four highlights from May, I speak to John Roese, Global CTO and Chief AI Officer at the computing giant Dell. What a guest.

John and I had a detailed conversation about multi-agent teams, quantum computing, and the future of work in episode 887. In this clip, I ask him to define two terms: AI agent and RAG-based chatbot.

Let's talk about the natural next step that has emerged after generative AI, which is agentic systems. Because as generative AI has become powerful enough, as LLMs have become reliable enough, we've started to be able to rely on them more and more on their own. Do you have, John, your own definition of what an agent is?

Yeah, I'm going to give you a bigger-picture view and then I'll define an agent. Applying AI to the enterprise actually has two different parts to it, of which only one we've done so far; agents are the second one. The reason for that is the source of differentiation of an enterprise. A lot of us in the industry have said this over the last couple of years, even though people weren't necessarily paying attention: there are two parts that make an enterprise an enterprise, the real core sources of differentiation.

The first is your proprietary data. You know things other people don't know. That's actually very powerful. That's why you don't share your proprietary data with people. My customer list is very valuable. My source code is very valuable. And those are a sustainable source of differentiation. Even if the people change, the brand changes, the world changes, having proprietary data is very, very important.

The second source of differentiation is the unique skills in your organization: you have people who can do things better than other people. At Dell, we have the best thermal and cooling people in the world, the best client developers in the world, the best storage software developers in the world. And that translates into better products, interesting innovation, patents. So those are the two sources of differentiation,

and the journey we're on is to apply AI to the enterprise; those are the two things that matter. It's interesting because for the first couple of years of gen AI, we actually went after the first one. A chatbot, a RAG system, all of these things are just tools that allow us to unlock and create value from our proprietary data. What is a RAG-based chatbot?

It is a tool that takes proprietary data and makes it generative. You could take all of your service information and if I gave it all to you in raw format, it would be of no value. If I embed it into a vector database and present it to you through a generative interface, you can ask and answer any question on anything I know anywhere.
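To make that pattern concrete, here is a minimal sketch of the retrieval-augmented generation loop John describes: embed proprietary documents, retrieve the closest matches to a question, and answer through a generative interface. The `embed_fn` and `llm_fn` callables are stand-ins for whatever embedding model and LLM you use, not any specific vendor API.

```python
# Minimal sketch of the RAG pattern described above: embed proprietary
# documents into a vector index, retrieve the most relevant ones for a
# question, and let a generative model answer grounded in them.
# embed_fn and llm_fn are illustrative stand-ins, not a vendor API.
import numpy as np

def build_index(docs, embed_fn):
    """Embed every document once; this is the 'vector database' step."""
    return np.array([embed_fn(d) for d in docs])

def retrieve(question, index, docs, embed_fn, k=3):
    """Return the k documents whose embeddings are closest to the question."""
    q = embed_fn(question)
    sims = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in np.argsort(-sims)[:k]]

def answer(question, index, docs, embed_fn, llm_fn):
    """Present the retrieved data through a generative interface."""
    context = "\n".join(retrieve(question, index, docs, embed_fn))
    return llm_fn(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
```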

That is incredibly powerful, and we have been doing that now for about a year at scale in the industry, and it's transforming everything. We're getting huge value out of this. In fact, almost all of our projects that are in production are just that. They're a generative capability to unlock our proprietary data in novel ways that just changes the curve in terms of productivity. That's great. Agents are not that. Agents go after the second one. They are about the digitization of a skill.

They're about saying, I'm not just interested in unlocking the data. I'm interested in distributing the work.

I actually want an AI that doesn't even require me to do a task, that can actually operate autonomously. It can operate without human intervention. In fact, I'm not even going to tell it how to do the job. I'm just going to give it an objective and let it go. And I'm doing this aligned to the skills that I need it to do. So, for instance, when we think about agents in the enterprise, there are two views of this in the

current thinking. One view out there is that agents will be replacements for multi-dimensional humans that can do everything. That's AGI and ASI; we're a long way away from that. The reality of agents is that they are actually the digitization of more narrow skills. I use the self-driving car example: I do not have a self-driving car today that can drive anywhere, in any situation, and navigate it successfully. What we do have is self-driving cars, in San Francisco and other places, where if you geofence it, if you narrow the scope,

it works. We see this in the trains at airports. There's no driver on them because each one has one job: it moves from terminal to terminal without human intervention. Well, that's what's going on with agents. The first generation of agents are saying, could I take a task, a skill?

And could I move it into AI, not as a tool that a person uses, but as a manifestation of that skill autonomously that I can just tell it to do something. I can give it an objective, and it's smart enough to figure out how to reason through that objective. It has access to a set of data, and it can deliver an outcome equivalent or better than what a human would have done for that particular skill. Yeah.

And yeah, there might actually be humans doing those specific jobs that might not do them anymore because agents can absorb them. But what you don't have is a fully well-rounded entity that is the equivalent of a full human being that can do lots of different things. Like think about in your life, how many different things can you do? Well, today, the manifestation of agents can probably pick off a few of those. But what they can't do is pick off all of them and create a completely...

equivalent of your whole well-rounded human being, including your ethics, your morality. That's a really hard problem. That's AGI and ASI, a different journey. So, bottom line, you take these two technologies. First-generation gen AI is what we call reactive AI: a human is in the loop, the human asks the AI to do something, and it gives an immediate response. But ultimately, the human is the doer of the work, and these are tools around the human.

And then you move over to this kind of second generation of agentic AI, which are complementary. And now you have a situation where the human is on the loop. They are the supervisor. And all they are doing is creating objectives and delegating work. And now the AI independently is able to take that task, figure it out, run with it, and even run with it in perpetuity. That it may never go back to the human being because it's been delegated below the machine line. The reason it's so important to distinguish these is that, one, they aren't even the same technology.

With reactive AI, the center of the universe is a large language model with some data around it; it's a very static data set. An agentic environment has large language models, but they're used for part of the equation. They act as somewhat of its brain, but it has a body. It has a knowledge graph where it creates its own representation of data, where it represents what it's learned, its memories, and its evolution of skills. It has interfaces around it that allow it to reach out into the real world, something called tool use and function serving, where it can actually

go and activate a tool and interact with the world and perceive things. Very different technical architecture and quite frankly, appropriately so because it's solving a different problem. Now, fast forward into the future of an enterprise, well, yeah, still got proprietary data and still got unique skills, except now I have a path to digitize both of them. And that's the thing that's going to profoundly change most enterprises.
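As a hypothetical illustration of that anatomy, an LLM "brain", its own memory, and tool interfaces, a bare-bones agent loop might look like the following. The `llm_plan` callable and the entries of `tools` are assumed stand-ins, not any particular framework's API.

```python
# Bare-bones sketch of the agent anatomy described above: an LLM "brain"
# that reasons over the agent's memory, plus tools it can call to act on
# the world. llm_plan is a stand-in that returns either a tool call or a
# final result; it is not a specific framework's API.
def run_agent(objective, tools, llm_plan, max_steps=10):
    memory = []  # the agent's own representation of what it has learned
    for _ in range(max_steps):
        decision = llm_plan(objective, memory)   # the "brain" plans the next step
        if decision["action"] == "finish":
            return decision["result"]            # objective achieved
        tool = tools[decision["action"]]         # tool use / function calling
        observation = tool(**decision["args"])   # reach out into the real world
        memory.append((decision["action"], observation))
    return memory  # step budget exhausted; surface what was learned
```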

John touches on an important point about the future of AI being in the interconnectivity between tools such as AI agents and RAG-based chatbots, and I recommend listening to the entire episode to hear more about how John's team applies such integrations at Dell. Based on the social media response, listeners absolutely loved that episode. All right, my next clip is from episode 885 with Jeroen Janssens and Thijs Nieuwdorp.

In maybe one of the liveliest conversations I've ever had on the show, we talk about the co-authors' latest O'Reilly book on Polars and how their work with the major Dutch utility Alliander helped them to write Python Polars: The Definitive Guide. Earlier in the episode, you mentioned a real-world implementation of Polars, and maybe, as you said, the first-ever production instance of Polars. Am I right in understanding that's Alliander? I'm probably butchering the pronunciation of that.

Yeah, Alliander. It's a power grid provider in the Netherlands. They provide the infrastructure for both electricity and gas in a third to a half of the Netherlands, I believe, so it's the largest utility company in the Netherlands. I can't even say Netherlands. That's how bad I am at Dutch pronunciation. Nederlands. That's actually easier, isn't it? Yeah.

For us it is. Oh, that's what you're talking about. I was wondering: where are these Netherlands? That ain't no country I ever heard of. And yeah, so tell us about that project and what it was like. And actually, it'd be interesting to know when there was overlap between working on the book and working on that project. Did working on a Polars book help with a real-world implementation? Anyway, that's kind of an interesting side question. Yeah. Yeah. So the origin story here is that

Thijs and I, we were both very excited about Polars. We were writing a book about it. And then all of a sudden, it became clear that at Alliander,

we needed to speed up the pipeline, right? We needed to lower cost. We needed to process much more data. And in the current state, that just wasn't possible. It was a combination of not only Python and pandas but also our code, so it was very inefficient. To give you an idea, we were running this on a single AWS instance that had over 700 gigs of RAM.

700 gigs of RAM. And so, yeah, we can provide you a link with more backstory on this, with some actual numbers. But we were very excited and we were like, hey, let's try this out. Let's do this. At first, the team was very hesitant, right? We were there, two people, or three, actually, we had another colleague, three people

promoting Polars, which was being developed at Xomnia. So they were very skeptical, understandably. So what we did in order to convince them was to take on a very small piece of code, some low-hanging fruit, reimplement the pandas code in Polars, benchmark it, and then just show the numbers.

And then they were immediately convinced: right, this is indeed way faster, and it uses way less memory. Let's try this out. Let's take on this huge code base piece by piece by translating, not one-to-one, because you can't do that. You really have to reason about the inputs and the outputs and then do it in an idiomatic way, right? You cannot just translate pandas to Polars.
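As a hedged illustration of what such a translation can look like (the file and column names here are invented for the sketch), here is the same small aggregation in pandas and in idiomatic, lazy Polars:

```python
# The same small aggregation in pandas and in idiomatic Polars.
# "loads.csv", "station", and "load_kw" are invented for this sketch.
import pandas as pd
import polars as pl

# pandas: eager, the whole file is read into memory first
pdf = pd.read_csv("loads.csv")
pandas_out = (pdf[pdf["load_kw"] > 0]
              .groupby("station")["load_kw"].mean()
              .reset_index())

# Polars: a lazy query, so the engine can push the filter into the scan
# and materialize only what the aggregation needs
polars_out = (pl.scan_csv("loads.csv")
              .filter(pl.col("load_kw") > 0)
              .group_by("station")
              .agg(pl.col("load_kw").mean())
              .collect())
```

Benchmarking a small, low-hanging-fruit snippet like this, as the team describes, is one low-risk way to show the speed and memory numbers side by side.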

And, you know, I think it took us, well, what, six months, a year?

I don't even remember. But eventually, I left that client. And at some point there was a moment like, okay, we can now get rid of R and pandas as dependencies of this project. And it's been running smoothly ever since. Yeah, definitely. Yeah, I think ultimately the size of jobs at the beginning was about 500 gigabytes for just that task,

doing like one calculation, and we shrunk it down, both as a consequence of implementing Polars and also, as we were going, by rehashing some of the code structure that we were using in the project. We brought it all the way down from 500 to 40 gigabytes, which makes the calculations a lot more doable now.

So the second part of your question was, okay, how did these influence each other, the book writing and putting it into production? And this was, yeah, it was a perfect match, because when you actually need to put it into production, when you have a real problem to solve, that's also when you start to notice the limits, right? Or maybe inconsistencies or missing functionality, right?

For example, there was this random sampling with weights. That's something that you can do in pandas: you just give it another column that indicates the weights for the sampling. That's something that, maybe even up until this point, Polars doesn't have. Luckily, that was for an ad hoc analysis that we had to do. But at that point, it becomes clear what Polars can and cannot do.
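For reference, this is the pandas feature being described. Whether current Polars has an equivalent is worth checking against its docs; the NumPy workaround below is just one hedged possibility, not their approach.

```python
# Weighted row sampling in pandas: a second column supplies the weights.
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3, 4], "w": [0.1, 0.1, 0.1, 0.7]})
sample = df.sample(n=2, weights="w", random_state=0)  # built into pandas

# One possible workaround where no built-in exists: draw row indices with
# NumPy using the normalized weights, then take those rows by position.
rng = np.random.default_rng(0)
idx = rng.choice(len(df), size=2, replace=False,
                 p=(df["w"] / df["w"].sum()).to_numpy())
workaround_sample = df.iloc[idx]
```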

Also, when you write, you start to look at things from a little bit of a higher level. So sometimes we noticed inconsistencies in naming or missing methods. Like, hey, why is there no inline operator for the XOR operation? That's something that nobody ever thinks about. But when you need to put in a table in your book,

and you need to fill in all the pieces, that's when you start noticing these kinds of things. So we were able to

also, you know, submit some issues, maybe even a few pull requests, to Polars itself along the way. From writing nonfiction, I turn to CEEK, spelled C-E-E-K, a new platform for education with VR capabilities. In episode 889, I talked to CEEK's founder, the space engineer Mary Spio, about the potential for CEEK to revitalize the way we learn and make even specialist education accessible worldwide. Another thing that's really cool about CEEK

is how it could potentially, I mean, platforms like CEEK or VR in general, how it could help with education. So in the US, for example, there's a shortage of 400,000 kindergarten-to-grade-12 teachers, so primary and secondary school teachers. And at post-secondary institutions, there will also be shortages coming, because those workforces are unusually old, and so college and university faculty are going to be retiring.

And so education is failing to attract and retain faculty and has equity failings that cascade into long-term disadvantages for students.

And you've described previously how one professor can teach 100,000 students via CEEK. So where do you see VR and CEEK potentially having a long-term impact on improving educational outcomes? Right. Where we see it is in multiple ways, right? Because right now, like you said, the shortage, I saw a stat somewhere that was like,

you know, Europe needs this much, the U.S. and North America need this much. It was like 4.6 million total. And then it said Africa needs a miracle, because when you look at the rest of the world, the shortage is so dire. And so with a platform like CEEK, we allow a single individual to be able to teach English,

you know, at scale, so they can present their course virtually, and then people are also able to experience it. The reason why we're getting interest from the likes of the eVTOL companies of the world is because this is a brand-new industry, right? So, for example, you have the mass displacement

of a lot of the current jobs as we know them. And they also have to train people for all these new autonomous vehicles, all these new industries that are coming out as a result of automation. And you need, for example, 100,000

pilots, eVTOL pilots, within the next few years, which means you have to train a million. You cannot put a million people in these very expensive aircraft. It's also very dangerous. You know, it's a danger to the person. It's a risk to the aircraft. But

on CEEK, you can have a million people training at the same time with VR headsets. So that person is able to really scale themselves and have all these people training around the world. And that's basically what we're building today. And this isn't just a sample scenario. We are working with a leading

electric aircraft company, and they're exploding, right? They have a massive backlog, because right now, when you look at the fuel-based airplanes or helicopters that are being used for logistics and delivery and things like that,

they cost about $4,000 an hour in fuel. The electric aircraft costs about $300 an hour. So even beyond being good for the environment, it's also good for business, which is why they have these massive backlogs and this need to train people at scale. And these are things that you just can't do physically, which is why our platform is now in demand. And then the other aspect is the fact that in VR,

the brain hasn't developed the ability to differentiate between what you do in VR and what you do in real life. For the first time, we're creating memories, right? So it's almost as if you're actually flying the aircraft. It's almost as if you were actually, you know, moving the

equipment and doing all these things. So you're building memory, which means you're building experience. So you can now show up on day one able to train in that helicopter, because you've gotten that thousand hours, or however many hours you need, before you can step inside the real deal.

And the same thing applies, you know, whether it's elementary education or primary education. For CPR, we built a program for the children's hospital covering adult, infant, and child CPR. The interesting thing is, this was for new mothers, because actually, before I did the CPR program, I didn't even know there was a difference between infant, child, and adult CPR. And a lot of new moms don't either, you know. And so,

by putting on the headsets, they were able to learn and train, and they felt more confident than when watching a video, because they were actually holding the baby and doing all the different actions. And, you know, the clinicians and the EMTs that we worked with also felt better equipped.

The reason Baptist Health was looking at the nursing residency is because there's such a huge gap in nursing. The average age of a nurse today is 50 years old. That's how big the gap is, because a lot of people are staying a year or two and then leaving. And the reason there is such high turnover is not competence but confidence. You know, a lot of nurses by nature are very,

very caring. So a lot of them are afraid, because they don't want to hurt someone. They don't want to make a mistake on a real person. So now they can make the mistake, they can practice, they can do all of these things in VR and feel more confident to be able to do it in real life. And then there are other areas that you just

need to do in VR versus in person, like intubation, right? Where they're learning how to insert a tube into somebody's throat, and a lot of times they will perforate the throat. Today, what some hospitals do to train is they hire low-income people and pay them,

and then they can test, you know, intubation on them. Oh my goodness. Perforate their organs for $50. I mean, not me. I don't want to do that. Yeah. And unfortunately, you have, you know, the homeless, the elderly. Some people also do the testing on the elderly with Alzheimer's,

and, you know, to train the nurses how to do the intubation. Yeah. Oh, my goodness. That's shocking. VR definitely seems like a more humane way to be learning that. Yeah. And you can do that without the risk of perforating anybody's organs. Yeah. Being able to make mistakes and learn from them is such a core part of education. Learning from past mistakes is also an unavoidable part of running a business, which brings me to my final clip from May, with Martin Brunthaler, co-founder and CTO of the marketing data platform Adverity.

A lot of our listeners are either hands-on data science practitioners, like machine learning engineers, AI engineers, data scientists themselves, or people who are interested in building products or companies

that leverage generative AI. What are the kinds of lessons that you've learned in implementing a product like data conversations at Adverity? What do you need to do? What are all the things you need to line up in advance of bringing in a large language model and having conversations work effectively with data? You talked

a moment ago about the issues that you typically see without this kind of conversation in place, where people have a dashboard and it's not exactly the information you needed, and it's too fixed in its outputs. And so then people end up going and digging under the covers into the raw source data to try to really find answers, which adds strain onto the data analyst team. So I get all of the advantages of being able to have a conversation with your data.

But what are the things that you at Adverity, and that our listeners, if they want to make a similar kind of transition, need to get right in order for that conversational aspect to work out? So I think one really critical piece is the quality of the data underneath. So, for each source, and there are many aspects of data quality, if you will, from an academic perspective you can list those out, but from a more practical perspective,

You need a complete data set that is also very well aligned with all the various sources that you have. So harmonization plays a role in this as well. And we built up actually a data quality component in our platform that helps you monitor all those issues that you can have in your data. There's specific monitors for data quality in marketing. There's a concept called naming conventions, for example, for campaign names.

that we can monitor and act on in an intelligent manner. But there are also simple things: if you onboard a generic source from a database or from a REST API, all the data types need to be aligned, you know, date formats need to be aligned. You want all your data to be harmonized in UTC, for example. You need to clean up some stuff. This is also why there are usually some transformations going on, either splitting up or combining various sources and all those things.
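As a small, hedged sketch of that harmonization step (assuming pandas, with two invented sources whose timestamps arrive in different formats and zones):

```python
# Normalize timestamps from heterogeneous sources to UTC so downstream
# analysis sees one consistent time axis. Sources and formats are invented.
import pandas as pd

def harmonize_to_utc(df, ts_col, source_tz):
    """Parse timestamps, localize naive ones to the source zone, convert to UTC."""
    ts = pd.to_datetime(df[ts_col])
    if ts.dt.tz is None:                      # naive timestamps: attach the zone
        ts = ts.dt.tz_localize(source_tz)
    df[ts_col] = ts.dt.tz_convert("UTC")      # one canonical time zone everywhere
    return df

source_a = harmonize_to_utc(
    pd.DataFrame({"ts": ["2025-05-01 09:00"]}), "ts", "Europe/Vienna")
source_b = harmonize_to_utc(
    pd.DataFrame({"ts": ["05/01/2025 03:00"]}), "ts", "US/Eastern")
```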

But I think it's very critical to get the quality right. You need to be alerted if something's going wrong; you want to prevent, not to say dirty, but problematic data sets from hitting your production environment. And I think we can help in this discipline quite a bit. You can help in the discipline by having these kinds of data quality reportings

built into the platform. Yeah, but also through the multi-layer approach to this. So we always keep a raw data set that can then be used as a starting point to reiterate on transformations, for example. So you can always go back to the previous state and improve your transformations. There's also, obviously, an AI system today helping you to compose those transformations, and this is specifically very useful for those types of generic sources.

It's kind of a simplified data-wrangling exercise, if you will. And then once you're satisfied with that, there's a component that helps you monitor the quality as it flows through the system. There's anomaly detection and all the things that you want to monitor. Right, right, right. Yeah, so built-in anomaly detection would be key to this working out.
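One simple way such built-in anomaly detection is often done (a hedged sketch, not necessarily Adverity's method) is to flag metric values that stray too far from a rolling baseline:

```python
# Flag values more than z rolling standard deviations from the rolling
# mean. Window and threshold are invented defaults, not product settings.
import pandas as pd

def flag_anomalies(series: pd.Series, window: int = 30, z: float = 3.0) -> pd.Series:
    baseline = series.rolling(window, min_periods=5).mean()
    spread = series.rolling(window, min_periods=5).std()
    return (series - baseline).abs() > z * spread  # True where anomalous
```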

How about when you think about the huge breadth of capabilities that you could potentially get from a conversational interface: when you're designing a conversational product,

how do you figure out, okay, this is the range of things that we're going to support or not support? And then how do you select the right large language model for that breadth of features that you decide to support? Yeah, let's start there. I have more follow-on questions from that, but I feel like that's a good starting point. Yeah, I think it's useful.

And maybe one thing to add to the previous question: in terms of quality, like I already said, the data dictionary, descriptions, and understanding of lineage are very critical as well. And this goes also into the design of our conversations interface and how people can interact with it. We iterate very quickly. So we're going through

I'd say a pretty fast-paced development cycle with this, adding features every week. And we have a dedicated team taking care of benchmarking and analyzing the quality of responses. So we're using...

frameworks to monitor that. And the data science team runs a continuous test: we have a kind of predefined set of responses that we expect for our questions, and we can monitor on those and improve and test models as we go. And to be fair, at the moment we've committed to one model, but the plan is also to use different models for different aspects of our capability.

So for example, we could use a different model to compile our SQL query, a different model to do the pre-flight qualification of a question, a different model to do the actual conversation. So yeah, that's also possible. Nice, nice. So I imagine something like, you know, obviously...

The questions that I asked you were kind of tricky, because I'm trying to get at what are the things that people need to be doing in order to build these kinds of conversational interfaces like you did, but obviously there are proprietary things involved. Yeah, I think there's no trade secret in building this, if you will. A lot of...

and the types of APIs they offer are similar in regards to their capabilities. And you see all models kind of reaching the same capability. And, you know, basically the leaderboards change, every other month you'd have another leader, but everyone's catching up to the same state of quality, if you will.

I think where it boils down to is how you put the components together to create a compelling and exciting use case on top of that. And I think in terms of how this works from a technical perspective, it's pretty straightforward. You can qualify a user input into a type of question, select a model that you want to run with,

basically feed it with a system prompt and additional information about the data model, which is very critical for getting the answer right, use this to create a SQL query, verify it's actually a valid query that can be executed,

fire the query, use the data to run some basic analysis, and create a decent, nice answer for the user. And for us, the use case then circles a lot around the table that we generate from that kind of response. Because our approach to this is, first of all, in terms of democratization, we are targeting two sides of the business, one of which is IT and the other one is the business user.
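Stepping back, the pipeline Martin just outlined (qualify the question, generate SQL with schema context, pre-flight the query, execute, then summarize) might be sketched like this, where `llm` is a generic completion callable, an assumption rather than any specific vendor API:

```python
# Hedged sketch of the qualify -> generate SQL -> verify -> fire -> answer
# pipeline described above. llm is a stand-in completion function.
import sqlite3

def answer_data_question(question, schema_ddl, conn, llm):
    kind = llm(f"Classify this question as 'data' or 'chat': {question}")
    if "data" not in kind:
        return llm(question)                    # plain conversational path
    sql = llm(f"Schema:\n{schema_ddl}\nWrite one SQLite query answering: {question}")
    try:
        conn.execute(f"EXPLAIN {sql}")          # pre-flight: does it even parse?
    except sqlite3.Error as err:
        return f"Could not build a valid query: {err}"
    rows = conn.execute(sql).fetchall()         # fire the query
    return llm(f"Question: {question}\nRows: {rows}\nWrite a concise answer.")
```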

And both have a requirement to access data. So rather than going through a full chain of various teams, you know, it used to be that you had to create a ticket to get access to a data set. The data set would then be prepared within two weeks and put onto, I don't know, a Snowflake table or whatever. Today it's a Snowflake table; it used to be something entirely different.

And with this, you can actually run the query, create a table in near real time available for your further analysis. And that's kind of exciting for us. All right, that's it for today's In Case You Missed It episode. To be sure not to miss any of our exciting upcoming episodes, subscribe to this podcast. But most importantly, I hope you'll just keep on listening. Until next time, keep on rocking it out there. And I'm looking forward to enjoying another round of the Super Data Science Podcast with you very soon.