Topics
Daniel Whitenack: I think OpenAI's release of their deep research product is interesting, especially given how quickly Hugging Face moved to reproduce similar functionality in open-source code. That dynamic suggests it's very hard to build a moat at the general application level. You might be able to build a moat in a specific domain or vertical, or by leveraging proprietary data, but not at the general application layer. My focus is on whether we're building systems that enhance human agency rather than replace it. Do the systems we build lead us to trust human institutions more, or to fear and distrust them more? Do the systems we build actually push us further into individual isolation, or toward community?

Chris Benson: I'm not sure about OpenAI's future, and I don't care all that much. Between Elon Musk's effort to acquire OpenAI and the company's shift from nonprofit to for-profit, there's no clear side for me to take. The accelerating pace at which open-source options and application-level features appear makes me curious about the future business models of companies like OpenAI. I think integrating AI into the enterprise stack, for example through Microsoft's bundling effect, could be one path to survival. But the real business value lies in handling sensitive data, and that's something vertical AI players and tooling and infrastructure players will need to solve.


Chapters
This chapter discusses the potential trajectory of OpenAI, considering Elon Musk's attempted acquisition and the company's evolving business model. The conversation touches upon OpenAI's release of deep research products and the rapid response of open-source alternatives.
  • Elon Musk's bid to acquire OpenAI
  • OpenAI's release of deep research product
  • Rapid response from Hugging Face reproducing OpenAI's functionality with open-source code

Shownotes Transcript


Welcome to Practical AI, the podcast that makes artificial intelligence practical, productive, and accessible to all. If you like this show, you will love The Changelog. It's news on Mondays, deep technical interviews on Wednesdays, and on Fridays, an awesome talk show for your weekend enjoyment. Find us by searching for The Changelog wherever you get your podcasts.

Thanks to our partners at Fly.io. Launch your AI apps in five minutes or less. Learn how at Fly.io. Welcome to another fully connected episode of the Practical AI Podcast. In these fully connected episodes, Chris and I keep you updated with everything that's happening in the AI world, if we can. Yeah.

There's a lot. And we try to give you some learning resources to level up your machine learning and AI game. I'm Daniel Whitenack. I'm CEO at Prediction Guard. And I am joined, as always, by my co-host, Chris Benson, who is a principal AI research engineer at Lockheed Martin. How are you doing, Chris? I'm doing great. I just don't know what we're going to talk about, because nothing ever happens in AI. There's nothing happening. There's never anything going on. Elon has done nothing. Yeah.

Oh my gosh. Elon is throwing spit wads at people again. Let's say, you know, he's been suing OpenAI, and now he's put his bid out in the last few days

for OpenAI. Yeah, the article that I saw was, "We are not for sale," ChatGPT boss says. I know good old Sam. Sam said we're not for sale because, you know, the two of them really love each other. Oh yeah, definitely. Elon Musk and Sam Altman, they are best friends. Best friends. That's how we'll report it here. That's how we're reporting it here, because we always look for

the upside in the AI world here. Yeah, who knows the motivations behind billionaires. It's an interesting thing to watch. It certainly spices up conversations in the workday and is a nice point of discussion

with friends, you know? So yeah, that's about how I'm taking it. Yes, I would agree with that. The spats between billionaires just don't quite make it onto my list of concerns, not being a billionaire myself. Yeah. What do you think is kind of the trajectory with players like OpenAI?

Well, the other thing that happened in the U.S. fits into the other question I was going to ask, although I got sidetracked, because I remember there was a Super Bowl, and there was a Super Bowl commercial that OpenAI had, which, you know, was a cool commercial. I didn't know what it was going to be at first, because it's just the artistic dots forming scenes. And then I gradually realized that this is

those little dots on the OpenAI ChatGPT app that, you know, expand. Cool ad. And someone commented to me,

they spent like $14 million or whatever a Super Bowl ad costs. I forget how much it was. But I was like, that's really nothing compared to what OpenAI is losing generally on hosting models and infrastructure. So yeah, that circles back to my other question, which was,

Yeah, what are your thoughts? I mean, if Elon doesn't buy OpenAI, what's the future? I honestly don't know. And I've got to be honest with you, I'm not sure that I care a whole lot. I was thinking about that as we were leading into this, you know, I mean...

There's not a protagonist here from my standpoint. There's not a side that I'm for or against so much. You know, you have Elon, with all the adventure around Elon Musk, and I say that word kind of tongue in cheek,

and then OpenAI. And, you know, there is a kernel of truth to what Elon says when he talks about it going from being the nonprofit with the grand vision that it started out with in the early days, and then it has increasingly gone commercial and become for-profit, you know? So it's another big giant AI company, like the others.

I'm watching it with half an eye, like everybody else in the world, but I'm not sure. I just don't know. And I'm not terribly sure I care. Yeah. Is there somebody out there in our audience that is deeply concerned about this? I would love to hear from somebody who is not Elon or Sam Altman. Tell me why this is a big deal. Yeah.

Yeah, maybe we'll leave it at that. It's a good point. I am interested in some of the dynamics like OpenAI released their deep research product. So if you kind of look at the trajectory of what they're releasing, what they're doing, there's this deep research product which is really geared towards this

You know, multi-step online information research type of task. Yeah. So, you know, going and looking at trends across various sites with various data, reasoning over certain information, consolidating that, you know,

contributing to some sort of research project. And I find it interesting that OpenAI introduced this. One of the dynamics I love watching is OpenAI releases the application level products, so like deep research, and then

So I see the blog post by Hugging Face. It was like the day after. So they say yesterday OpenAI released deep research. So this is a blog post that I'll link in the show notes.

from Hugging Face. And basically, they just decided to make sure that they could reproduce the functionality with open-source code, maybe some recently released models like DeepSeek models or others, in 24 hours. And then they wrote the blog post and released it. I don't know how long of a 24 hours it was, but you see that dynamic happening. So you see that with deep research, and then you have

the open deep research thing, and you see kind of the operator stuff where it's operating your screen, your browser window. Now, earlier today, I was running Hugging Face's smolagents. They have a web agent, which is essentially that: it spins up a browser window and does certain tasks for you in the browser window. You can type a prompt like,

Hey, find the most recent episode of Practical AI, summarize the topic, and then find seven

other articles of a related topic, list them out in markdown format, and output that. Something like that, where it requires this sort of agent operating over the internet. Super slick, super fun. I would definitely recommend, if people want to try that sort of thing, try the smolagents web agent. But yeah, you see this kind of trend where at the application level, some of this is just, you know,

It seems like you can't develop a moat generally there. Now, you might be able to develop a kind of moat as a company in a specific domain or a vertical, or with certain knowledge or proprietary data. Right. But it's very hard at that kind of general application level, I would say.

I think, you know, I keep wondering... OpenAI had a substantial lead, and for a while it was taking quite a period of time for open source options and application-level things to come about. And we've seen that time interval shrink tremendously here.

So, you know, ironically, at the same time, Elon makes his $97 billion effort to buy OpenAI. But you can't help but wonder a little bit about what the future business model looks like. To your point there, you never have time to create a moat if you're one of the main players now.

You know, there are certainly business models for other players to come into their industry, as you just mentioned, and create a capability, because that's their thing, and it's not something that the big boys are going to go after. But as we've seen this interval between the commercial players and open source shrink to almost nothing,

what do you think that means for the business models going forward for the Googles and the OpenAIs and the Anthropics of the world? I mean, I think part of it is maybe this sort of integration in the kind of enterprise stack. And what I mean by that is the kind of bundle effect that you get from something like offerings from Microsoft.

So, you know, absolutely no one in the world wants to use Teams, because it's absolutely terrible. And I will go on record as saying that. Sorry for those that work on it. I have to use it. I have no choice. I guess, you know, you have a podcast, you have an opinion, but that's my opinion. But, you know, I'm also not going to pay hundreds of thousands of dollars to Slack if I can just flip on Teams

in my Microsoft tenant and they already have all of my data and all this stuff. So the fact that they're tying in Copilot and those licenses around Copilot in an ecosystem that's already so embedded in the enterprise world, there is a very strong bundle effect there.

And yeah, it's very real. Right. And it doesn't mean that it's necessarily the best solution, but it is a solution, depending on what you're looking for. Right. At that kind of generic copilot level. Right.

In a case where you need single tenant, meaning, in theory, the terms of service say my data is not being used in certain ways, that covers the generic case. But again, the real business value that a company has, the way I see it, is you've got these generic cases where someone is going to

want to find a Word document or paste in an email, and then you've got the core business value, right? So a pharma company that has their most sensitive tiers of data that are, you know, the lifeblood of their company, or a

healthcare company or a finance company that has certain classifications or regulatory burdens around certain tiers of data. It's a whole other thing to think about integrating those tiers of data into a generic system like that, because while a generic copilot system works for those kinds of

less sensitive tiers of data, there's still something that needs to be solved at those other layers, which is where I think, you know, vertical AI players, but also, you know, tooling and infrastructure players can still make, you know, a lot of progress. Do you think that the bundling that you're describing that's, you know, occurring between the, you know,

vertical capabilities, where they're producing these, and OpenAI going and doing deep research, or Google integrating Gemini into the Google suite, which they've been doing and trying to drive a premium from users for. Is that bundling going to be critical to them going forward? Or do you think that the OpenAIs of the world... and we've seen this historically with Google, maybe not always in an AI context,

driving into specialties where they open up a new vertical underneath the umbrella. Is OpenAI going to have to do that to survive, since it's going to have open source chomping at its heels, coming down the general path? Yeah. I don't know. It could be by vertical. It could be... I mean, you look at Palantir, for example,

stock price soaring. Most regular people aren't using a Palantir copilot in their day-to-day, but they have a certain market, particularly around the DoD or defense or other areas. They have really put a lot into serving that well with a

less generic, but still fairly generic across different use cases set of functionalities. And that has served them well, at least from an outsider's perspective, if I'm looking at that. So it may be a specialization in terms of tools or vertical. It might also just be a segment of the market that you choose to focus on and is kind of the bread and butter. It's interesting because you've got all of these

really end-user, direct-to-consumer traffic on OpenAI and these things now, where a lot of what we had talked about before with data science and AI and machine learning was really enterprise-focused, not direct-to-consumer. So allow me to throw one other layer onto this conversation as we circle back around to AGI ideas,

with, you know, kind of artificial general intelligence being bandied about. Sam Altman was just saying that he was expecting GPT-5 to be smarter than he is. And so as we look at that, you know, I think GPT-3 is smarter than I am.

I mean, me too. I agree with you. But with that, the AGI chase continuing at this point, and, you know, we've heard, with DeepSeek and all these others, talk about business models and bundling and exploring new verticals. How do you think the AGI race fits into that? Yeah, maybe that's the piece that

isn't really entering into my mind much, in the same way that you don't think about Elon so much, which is probably good. Yeah, I think it's an interesting question, and there are implications. You could talk about it as AGI or not, I don't know. But the questions that come into my mind are more the downstream questions

of some of these things. Are we building systems that enhance human agency rather than replace it? Are we building systems that allow us to trust more in human institutions, or to fear and distrust them more? Are we building systems that actually drive us more into isolation as individuals, or into community together? I think those are...

Those are interesting kind of directions that are on my mind as I think about the more general side of this.

Well, there's no shortage of AI tools out there, but I'm loving Notion and I'm loving Notion AI. I use Notion every day. I love Notion. It helps me organize so much for myself and for others. I can make my own operating systems, my own processes and flows and things like that to just...

Make it easy to do checklists, flows, etc. that are very complex, and share those with my team and others externally from our organization. And Notion AI on top of it is just

It's so cool. I can search all of my stuff in Notion, all of my docs, all of my things, all of my workflows, my projects, my workspaces. It's really astounding what they've done with Notion AI. And if you're new to Notion, Notion is your one place to connect your teams, your tools, your knowledge, so that you're all empowered to do your most meaningful work. And unlike

other specialized tools or legacy suites that have you bouncing from six different apps. Notion seamlessly integrates. It's infinitely flexible and it's also very beautiful and easy to use. Mobile, desktop, web, shareable. It

It's just all there. And the fully integrated Notion AI helps me, and will help you, to work faster, write better, think bigger, and do tasks that normally take you hours in minutes or even seconds. You can save time by writing faster, by letting Notion AI handle that first draft and give you some ideas to jumpstart a brainstorm, or to turn your messy notes (I know my notes are sometimes messy) into something polished.

You can even automate tedious tasks like summarizing meeting notes or finding your next steps to do. Notion AI does all this and more, and it frees you up to do the deep work you want to do. The work that really matters, the work that is really profitable for you and your company.

And of course, Notion is used by over half of Fortune 500 companies, and teams that use Notion send less email, cancel more meetings, save time searching for their work, and reduce spending on tools, which kind of helps everyone be on the same page. Try Notion today for free when you go to notion.com slash practical AI.

That's all lowercase letters, notion.com slash practical AI to try the powerful, easy to use Notion AI today. And when you use our link, of course, you are supporting this show. And we love that. Notion.com slash practical AI. Well, Chris, we talked a little bit about...

tools and agents. Well, agents generally: the web agents, the deep research things. And we've kind of talked about tool calling and the connection to agents at certain points on the show. But I don't think we've really dug into the detail in a way that maybe will make things clear for people. I still see a lot of confusion around this.

Even, you know, in my day to day as I'm talking to customers,

The question of, well, how do I make an LLM talk to this system? Right. Or how do I, you know, that deep research tool, how do I make an LLM go and do a thing? Right. That's often how the question comes. And what I think I realize when I'm hearing those questions is there's kind of a fundamental misunderstanding of what the LLM does and

and how it's tied into a framework, which you might call tool calling, you might call agentic, the

The names kind of get mushed around a lot these days, unfortunately. They do. I was thinking that as you were saying all that. That's literally what was in my head, in terms of the misuse of different names of these technologies and what's doing what. Yeah. Yeah, exactly. So in my mind... so this is...

I'm feeling very opinionated today. I don't know why. Go for it. Excellent. In my mind, how I kind of draw the lines here, there's, you know, of course, models, large language models. They predict probable text. They generate text or images or whatever you want them to generate. Then there's other systems kind of over on the other side. So you could think of, you know,

your email or your bank account or an external system like an Airbnb where I might want to make a reservation or my company's database, which contains transactional data or another system that I use like HubSpot or all of these types of things. There's all of these other things. And to ask a question, well, how could I...

how could an LLM go and create a new deal for me in HubSpot? It just hurts me when you phrase it like that. It causes pain in my head. But that's how people phrase it, to be clear. These questions are the questions that come up every day. The question is often phrased, how do I make the LLM create a new deal for me in HubSpot? So,

Right in that phrasing. To your point, I don't know, what makes you cringe about that? It's just... that's a fingernails-on-the-chalkboard kind of moment for me.

You know, to answer that question, in the six and a half years that we've been doing this show, and we have evolved through a number of technologies, you know, that at each point in time were the hot thing. And inevitably people focus in on just that for a while. But right now we're at a point where generative and LLM the last few years have been the hot thing.

And we forget that they don't necessarily do everything out there. People will say "LLM," but in fact, they only do one thing. That's exactly right. And not only that, but there might be an AI architecture that could do the thing that they want to talk about, but it's not necessarily the thing that they're talking about,

and they're mislabeled. It's not the model. It's not the model. And so that's the fingernails on the chalkboard. We've kind of talked about this over the last year: the tunnel vision of the generative AI era, in terms of everyone focusing on that. But

it's to the point that there are other technologies in the mix, and there is a technology that will do the thing they want to do. They're just not picking the right one in the way that they're verbalizing it. So, yeah. Yeah. So let's maybe break this down into components. Let's say there's the LLM. We'll just talk about text now; certainly there's multimodal and all that stuff, but just think about text. There's the LLM, which all it does is complete probable text.

So I could, you know, ask it to autocomplete. I could ask it to write something for me. I could ask it to generate something for me. That's what it does. Right. Then there's the system. Let's say we'll take the HubSpot example, since I use HubSpot. For those that aren't familiar, it's a popular CRM solution for those that maybe don't want to mess with Salesforce and all of that world. So with HubSpot,

I can create a deal associated with maybe a sales lead I have, right? That is its own software system that's hosted by HubSpot, right? And I actually don't know this, but I assume HubSpot has an API, a REST API, meaning you could programmatically interact with HubSpot. This is how apps on HubSpot work, right? An app on HubSpot is regular, good old-fashioned code that maybe

allows you to add these fields to these records, or retrieve this data, or report on this data. That's just good old-fashioned code. It uses the API. So this is a separate system. And so there's really no connection, there can be no connection directly, between the LLM, which generates text, and this other system out there that's a CRM that does certain things. There's no connection between the two.

Except in the middle of that, there can be this process, which I would generally categorize as tool calling or function calling. Let's say that you wrote a good old-fashioned software function

that creates a deal in HubSpot via the REST API of HubSpot. That has nothing to do with AI. It's just a software function where you tell me the email of the person, the name, the company, I'm going to go in and create the deal in HubSpot via the API. So there's a function. You give me these arguments. I'm going to create the deal in HubSpot.

Okay, still no connection to the LLM. But if I then ask the LLM: hey, I have this customer information, email, name, etc. Generate the arguments for me to call this function, which takes these specific arguments.

then the LLM could generate the necessary arguments to call that function. And if you create a link between the function and the output of the LLM, so the LLM is still not really doing anything other than generating text, but in your code, you literally take the output of the LLM

and you put it into the input of that function. Now, you could put something on the front end into the LLM and have the result be a flow of data out of the LLM into the function

and then into the HubSpot API. So that's sort of how this tool calling, function calling thing works. Which makes perfect sense. And that's standard software development. The only thing that is different there is the fact that the function parameters that you're using have been generated by the LLM, which is a generative model. Perfect. That's what it does. And there are some special things related to this.
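That plumbing can be sketched in a few lines. This is a minimal illustration, not HubSpot's actual API: `create_hubspot_deal` and the hard-coded model response in `call_llm` are hypothetical stand-ins, so you can see that the glue between the LLM's text output and the function call is ordinary code.

```python
import json

# Hypothetical stand-in for a function that would hit a CRM's REST API.
def create_hubspot_deal(email: str, name: str, company: str) -> dict:
    # Real code would POST to the CRM; here we just echo the deal back.
    return {"status": "created", "deal": {"email": email, "name": name, "company": company}}

# Stand-in for an LLM call: a real implementation would send the prompt to a
# model server and return generated text. We hard-code a plausible response.
def call_llm(prompt: str) -> str:
    return '{"email": "pat@example.com", "name": "Pat Doe", "company": "Example Corp"}'

prompt = (
    "Here is customer info: Pat Doe <pat@example.com> at Example Corp.\n"
    "Generate JSON arguments for create_hubspot_deal(email, name, company)."
)

# The LLM only generates text; *our code* parses that text and calls the function.
args = json.loads(call_llm(prompt))
result = create_hubspot_deal(**args)
print(result["status"])  # -> created
```

The point of the sketch is the last two lines: the model never "calls" anything itself; your code takes its output and puts it into the function's input.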

If you look back in time at LLMs, first we had kind of really good autocomplete models, because that was the meta task for people training language models. Then people figured out, oh, I kind of want to use these as general instruction-following models. And so they developed specific prompt formats and prompt datasets

to fine-tune LLMs specifically for instruction following, right? So: here's your system message, here's the message I'm providing you, give me the assistant response. And they trained it on a bunch of general instruction-following things. Well, they've done the same thing now, because they've realized, oh, a lot of people want to do this tool or function calling mechanism.

So certain people, including OpenAI in a closed sense, but others in an open sense, like Nous Research, who we had on the show... they have a dataset called Hermes.

This includes a set of prompts that are related to function calling specifically. So they've given a huge number of examples of function calling prompts to a model that they would train, like a Llama model. And now you have Hermes Llama 3.1 70B. It's been fine-tuned to follow that Hermes-style prompt format for function calling.

Which means it kind of has an advantage, if you like, or certain models that have been trained with these examples have an advantage specifically for that function calling task. Right. So there is an AI element in the sense that some models are better at this than others because of the way that they've been trained. And there's certain prompt formats that are special, right?

And you'll get better performance if you use those prompt formats, or if you use a model server like vLLM that supports or has the inbuilt translation to those prompt formats, etc. So there is an AI element to it, but it's only in the sense that you're preparing the model for this type of use case, rather than there being some inbuilt connection of the model to something external. Yeah.
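To make the prompt-format idea concrete: in these fine-tunes, the tool's JSON schema goes into the system prompt, and the model is trained to reply with a tagged block that your code then parses. The template below is an approximation in the general Hermes style, not the exact format any particular model was trained on, and the model output is simulated.

```python
import json
import re

# A JSON schema describing the tool, in the style these fine-tunes are trained on.
tool_schema = {
    "name": "create_hubspot_deal",
    "description": "Create a deal in the CRM",
    "parameters": {
        "type": "object",
        "properties": {"email": {"type": "string"}, "name": {"type": "string"}},
        "required": ["email", "name"],
    },
}

# Rough approximation of a Hermes-style system prompt (real templates differ in details).
system_prompt = (
    "You have access to the following tools:\n"
    f"<tools>{json.dumps(tool_schema)}</tools>\n"
    "To call a tool, respond with <tool_call>{JSON}</tool_call>."
)

# A model fine-tuned on this format would emit something like this (simulated here):
model_output = (
    '<tool_call>{"name": "create_hubspot_deal", '
    '"arguments": {"email": "pat@example.com", "name": "Pat Doe"}}</tool_call>'
)

# Again, your code, not the model, extracts and would execute the call.
match = re.search(r"<tool_call>(.*?)</tool_call>", model_output, re.DOTALL)
call = json.loads(match.group(1))
print(call["name"], call["arguments"]["email"])
```

The fine-tuning buys you a model that reliably emits this parseable structure instead of free-form prose; everything after the parse is plain software.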

So I'm curious, can you tie in the tool calling into what might be considered a full agentic implementation? What's the leap there, if any? Yeah, interesting question because people use the term agent very loosely. So some people would say what I just described is,

Even just that chain of processing. So I put something in the front end of the LLM, deal is created in HubSpot. That might be considered an agent, my HubSpot deal creation agent. I would say that's really just a tool calling example of how to use an LLM. In my mind, what separates out the agentic side of things is

is where you have some sort of orchestration performed by the LLM. So what I mean by that is you have a set of tools. So let's say I have access to Airbnb's API and Kayak's API and United Airlines API or whatever other travel things I need to do, maybe my Gmail for various things.

And I say, hey, I need to book a car next week for my trip to wherever. Right. That input could then be processed through the LLM not to call a single tool, but first as an objective to determine what tools to call and in what sequence with what dependencies. Right.

Try to do a first step of that, and then reevaluate, and then do the next step, until you reach the objective. Right. So first, in order to book my thing, I need to know when my flight is. So I go to my Gmail and I look for the confirmation. Right. Second, I use that date in the Kayak API to look for choices. And then I evaluate those choices, and then I use it to book the reservation. So there's a series of steps that might call different tools

or systems. You know, it could be data sources, unstructured or structured data sources, like a database or a RAG system. And so that thing that I talked about, like that HubSpot deal creation tool, right?

might be one of those tools in an agentic system where an agent could choose to use it at certain points. And I'm being, I'm anthropomorphizing here. It's not choosing anything, right? But it's useful to talk about it sometimes in that way. So forgive me. It's choosing to use that tool in one case and maybe other tools and other sequences in other cases. In my mind, that's what really distinguishes the agentic side from just the tool calling side. ♪
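In code, that distinction is a loop: at each step, the model's output selects the next tool from a registry and the results feed back in. Here is a toy version of the travel example, where the tools and the scripted `plan_next` "planner" are invented stand-ins; in a real agent, the planner decision would be generated by the LLM rather than hard-coded.

```python
# Toy tools for the travel example; each is ordinary code the agent can invoke.
def search_email(query: str) -> str:
    return "flight confirmation: 2025-03-04"        # pretend Gmail lookup

def search_cars(date: str) -> str:
    return f"cars available on {date}: compact $41"  # pretend Kayak lookup

def book_car(offer: str) -> str:
    return f"booked: {offer}"

TOOLS = {"search_email": search_email, "search_cars": search_cars, "book_car": book_car}

# Stand-in for the LLM planner: given the results so far, it names the next tool
# and its argument, or returns None when the objective is reached.
def plan_next(history):
    if not history:
        return ("search_email", "flight confirmation")
    if len(history) == 1:
        date = history[-1].split(": ")[1]            # pull the date from the email result
        return ("search_cars", date)
    if len(history) == 2:
        return ("book_car", history[-1])
    return None                                      # objective reached

history = []
while (step := plan_next(history)) is not None:
    tool_name, arg = step
    history.append(TOOLS[tool_name](arg))            # execute the chosen tool

print(history[-1])
```

The tool-calling piece from earlier is just one iteration of this loop; the agentic piece is the orchestration that decides which tool, in which order, with results feeding the next decision.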

Well, Chris, it's fun to talk about some of the agents stuff. Normally we wait till the end of the episode to share some learning resources. But since we've been talking about tool calling and agents, I just wanted to mention this new course by Hugging Face. So they now have an agents course, which I think was just released recently

and is coming out live on YouTube, if I understand correctly. And so in the course, they talk about studying AI agents in theory, design, and practice, using established libraries like smolagents, LangChain, and LlamaIndex,

sharing your agents, evaluating your agents, and then at the end you earn a nice certificate. So plug for the Hugging Face Agents course if those of you out there are intrigued by some of the tool calling and agent stuff. It's

Seems like a good one. Yeah. As we record this, they're actually doing it in about an hour and 20 minutes from right now. It'll be past by the time you're listening. If you're listening, you missed it. Sorry. You're going to have to do the replay. Yeah. But you can do the replay. Yeah. And it's interesting. You know, one of the packages there that they mentioned is called smolagents, which is really great. I love using that package. It's a lot of fun.

And, you know, I've even used it in a couple of really interesting internal use cases at Prediction Guard. So do me a favor here, so long as there are no secret sauce moments there for Prediction Guard:

Can you plant a couple of seeds on things that you've done, you know, that people could explore in terms of what you found useful and, hey, I did this thing and just kind of let people get a sense of how you're looking at it and what things they might be able to do so that they can ideate on their own.

Yeah, yeah, definitely. So I'll speak somewhat generically here, so I don't reveal certain customer things. But one of the cases that we actually experience fairly often with customers is they want to build a chatbot that has access to

special knowledge in one way or another. So on the one hand, if you have a

bunch of unstructured text, right? That's a typical case where you would use a RAG workflow: you would put that into a vector database and retrieve it on the fly. That's a RAG chatbot. On the other side, there are text-to-SQL methods, for example, or API calling methods that could allow you to interact with your database, right? So there are those methods. Sometimes,

though, you have a source of data... and there have been a couple times for us where it's maybe a web app that doesn't have a really convenient API, but has a really complicated and annoying user interface. And the company has this web app that has a bunch of knowledge in it, right? But there's really no good way to extract all of that content from the web app. It

has an annoying interface, so no one wants to use it, right? And so something like the smolagents web agent, a system like that... what the web agent does is it executes a series of tool calls that leverage Helium under the hood, which is a package that allows you to automate interactions with a browser. And so if it's a web app,

it can basically spin up the application in the browser and then interact with certain elements: search for a certain thing, or find a certain component or an object, summarize that output, and output it from the web agent. So one of the interesting cases where we're thinking about that is these cases where

A company has invested a lot of money in some system or application, maybe a legacy system that they have to keep using, right? But no one really wants to engage with it because the UI is bad, and it also doesn't have a really nice API or any other way to access the data in there. So actually using an agent as a kind of extra user, one you can control programmatically to interact with the application, is a really intriguing prospect for tying in that knowledge and extracting things from the app.

The other one that comes up a lot for us is in regulated, security- and privacy-conscious contexts. That's kind of what we do at PredictionGuard: deploying secure infrastructure for AI inside people's companies.

Often, once people have a private, secure system, they'll want to tie their transactional databases into their queries. That's often a text-to-SQL type of operation: you're querying a database by generating a SQL query, and that can be error-prone. You can generate SQL queries that don't execute, queries that are potentially problematic, or ones that are very computationally expensive. So you can tie other agentic elements into that, where the agent tries to answer the question iteratively with different SQL queries until it reaches an objective. That's an agentic way to go about text-to-SQL.
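A minimal sketch of that iterative loop, using Python's built-in sqlite3. The table and queries are hypothetical, and the hard-coded candidate list stands in for an LLM proposing a corrected query after seeing the previous error message.

```python
import sqlite3

# Hypothetical schema standing in for a customer's transactional database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 32.5)])

def try_queries(conn, candidates):
    """Run candidate queries in order until one executes cleanly.
    A real agent would have the model generate each next candidate from
    the previous error message, rather than walking a fixed list."""
    last_error = "no candidates"
    for sql in candidates:
        try:
            return sql, conn.execute(sql).fetchall()
        except sqlite3.Error as err:
            last_error = str(err)  # this is what gets fed back to the model
    raise RuntimeError(f"all candidates failed: {last_error}")

candidates = [
    "SELECT SUM(amount) FROM orders",  # wrong column name: fails, loop retries
    "SELECT SUM(total) FROM orders",   # the "corrected" follow-up attempt
]
sql, rows = try_queries(conn, candidates)
```

The same loop structure also accommodates the extra guardrails mentioned here: before executing a candidate, you could run it through a query optimizer or a cost check and reject expensive or problematic queries, feeding that rejection back to the model just like an execution error.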

Or you could tie in other tools, like SQL query optimizers, to help in that process as well. So on the enterprise and business side, those are a couple of things that have come up for us. No, that sounds interesting.

I'm just kind of curious what your thinking is: how does this change the human side of the workflow, as you've seen it? Recognizing that these are some small use cases, this is the beginning of the agentic wave as we go forward. And especially prompted by the kinds of things we're seeing in the news these days, about evaluation of government departments and that general notion of reassessment, for better or for worse: how do you think that's going to be taken into commercial spaces in terms of deploying these agents? Will it change jobs significantly, do you think, or will it just be additive? I'm kind of curious what your lay of the landscape is.

Yeah, I mean...

I think there will be a shifting of jobs. Some of the things we've talked about in those examples are actually good examples of expanded human agency, because a lot of times people don't do, or can't do, certain tasks they'd like to do as part of their job, whether because of the limitations of really complicated UIs, or because doing this and then this and then that would take a ton of time and they've got to jump to a meeting. So I think a lot of those cases are expanded human agency: it's amplifying the effect of that worker and helping them feel like they have superpowers, because they really didn't want to log into that application and use it one more time, right? Yeah.

So I think there's an element of that. Now, you could make the argument that maybe they'd hired three people under them because of those inefficiencies to do some of those tasks, which in some ways is a shame, because if those people are really just cranking through data extraction from horrible APIs or horrible user interfaces all day, I think generally that's not a very dignified way to work. Maybe there are people who enjoy that, and I'm realizing I'm making generalities here. There's the reality of people's work: not everyone gets to do the work they might desire, or the work that would give them the most dignity, and I want to recognize that. So I think there will be a negative impact for some, but I'm hopeful there's also this positive impact. Even for people in less skilled professions, if there's a more natural-language way to access skilled knowledge, plus these amplifying effects of AI, it could open up new types of opportunities within the market as well.

I would hope so. I suspect that, just as in every other aspect of life, we'll see people enhancing human agency along the lines of the use cases you're talking about, and we'll probably see people who would rather take alternative paths as well. I think it will be a mixture of the whole thing.

Yeah. As we close out here, since we're already talking about new trends, one thing I wanted to note is that in January, Deloitte put out their State of Generative AI in the Enterprise: Quarter 4 report, which I've been going through. So for business leaders, managers, or other people who want a sense of the things being tracked across different industries in the enterprise world,

There's a great report there. I see, for example, that they're tracking barriers to developing and deploying gen AI, like worries about complying with regulations and difficulty managing risks. They're also tracking certain use cases, the volume of experiments and POCs (proofs of concept), benefit sought versus benefit achieved, which is an interesting one, and where gen AI initiatives are most active within certain job functions, all of these sorts of things and many more. So if you're interested in those insights, which I do think are worth tracking, that's a great learning resource that we'll link in the show notes, and hopefully people can find and peruse it if they're interested. Definitely.

Well, Chris, it's been a great time. I felt like I functioned well in my tooling as a podcast agent. You did good. So good that, who knows, Elon Musk may be coming after PredictionGuard any day now. Or maybe what I'm saying is just being generated by NotebookLM. That could be true. Okay, well, good conversation today. All right. Thanks, Chris. Have a good one. You too.

All right, that is our show for this week. If you haven't checked out our ChangeLog newsletter, head to changelog.com slash news. There you'll find 29 reasons, yes, 29 reasons why you should subscribe.

I'll tell you reason number 17, you might actually start looking forward to Mondays. Sounds like somebody's got a case of the Mondays. 28 more reasons are waiting for you at changelog.com slash news. Thanks again to our partners at Fly.io, to Breakmaster Cylinder for the beats, and to you for listening. That is all for now, but we'll talk to you again next time.