This is episode number 865 with Cal Al-Dhubaib, Head of AI and Data Science at Further. Today's episode is brought to you by ODSC, the Open Data Science Conference.
Welcome to the Super Data Science Podcast, the most listened to podcast in the data science industry. Each week, we bring you fun and inspiring people and ideas exploring the cutting edge of machine learning, AI, and related technologies that are transforming our world for the better. I'm your host, Jon Krohn. Thanks for joining me today. And now, let's make the complex simple.
Welcome back to the Super Data Science Podcast. I'm delighted to have my longtime friend, Cal Al-Dhubaib, a tremendously gifted communicator and data science entrepreneur, as our guest on the show today.
Cal is head of AI and data science at Further, a data and AI company based in Atlanta that has hundreds of employees. Previously, he was founder and CEO of Pandata, an Ohio-based AI and machine learning consultancy that he grew for over eight years until it was acquired by Further a year ago. He delivers terrific talks, don't miss him if you have the chance, and he holds a degree in data science from Case Western Reserve University in Cleveland.
Today's episode should appeal to any listener, particularly anyone that would like to drive revenue and profitability from data science or AI projects. In this episode, Cal details why his first startup was unsuccessful, but how the experience allowed him to discover an untapped market and build Pandata, a thriving data science consultancy.
He talks about his unconventional strategy of requiring clients to make a sizable upfront commitment that initially scared away clients but ultimately attracted the best ones. He talks about the way core values inspired by his "tin can to Mars" thought experiment shaped his hiring and company culture,
and how making data science boring, helping his clients trust AI systems, and delivering a clear return on investment became his formula for success. All right, you ready for this invaluable episode? Let's go. Cal, welcome to the Super Data Science Podcast. We've been friends for a long time, so it's cool to have you now on the show. How are you doing, man? Where are you calling in from? I'm doing well. I'm at home in Cleveland, Ohio today. It's so great to be here.
For people watching our YouTube version, you have a beautifully decorated office. You did that yourself? I did. I did. This was...
Just before the pandemic, I found myself working from home once a week, and this was going to be my reward to myself. I get to work from home one day a week, and so I wanted a beautiful space to be inspired. And then this became my very fun prison cell during COVID. And I started arranging it with all the things that brought me joy. And so that's the explanation.
of what's behind me right now. Do you remember how we met? Do you remember the time that we met? It was at Open Data Science. I believe it was the very first time we hung out at ODSC West. Was it like 2021? ODSC West, that's right. Yeah. It was probably something like 2021. I think it was the first post-pandemic conference that I'd maybe been to, period.
And I was in an Uber van with you. And you were so positive. And I was like, who is this guy? Like, what is he on? And like, what's his deal? I'm just excited to be there. I know. And then I discovered over time that that is how you are all of the time. That is Cal. That's on brand. Which is really cool.
You're maybe the most positive, happy person in data science. That can be like the title of the episode. I would love to have that be my brand. So I'm sure that having that kind of attitude is helpful as an entrepreneur in data science. So you have a fascinating story. I'd love you to fill our audience in on how you created a data science consultancy from scratch yourself, having...
You know, the confidence to do that. It's a really, I'm sure there's tons of listeners out there who have thought, you know, I wonder what it would be like if I left the comfort of my employer and tried doing data science on my own. So yeah, fill us in on how that went, how it all came about.
So there's lots of fun twists and turns that we can dive into later, but the story in a nutshell is I was actually studying computational neuroscience at Case Western, and I was dealing with a lot of the application of mathematics and data, and I had an internship with a health system where I was looking at public health data, and that was my very first entry into the world of data science.
I was encouraged after demonstrating some of my undergraduate research at a research showcase to consider commercializing. I had no idea that one could say, I want to start a business. So the Entrepreneurship Center, they showed me how to file an LLC. I take this big, giant leap into this knowing nothing. Raised some money, got into an accelerator. And this thing falls flat in about a year and a half.
And I can talk about all the lessons I learned about not selling data science the right way. But an interesting outcome of this is we had some research pilots with hospital systems. And I was trying to get them to sign these agreements to say they'll eventually buy this software. And in return, I was doing free research for them.
And so when that first startup collapsed, some of these researchers were upset. They're like, we were depending on this work. And it was funny to me because I asked the question, I'm like, wait, you would pay for this?
And so it turns out all along, the thing I was giving away for free was actually this growing, booming demand. And so to give you a snapshot in time, this is the end of 2015, the start of 2016. That's the birth of Pandata. In Northeast Ohio, where I'm based, there were fewer than 150 data scientists at that point in time. And about a third of them worked for IBM, the Cleveland Clinic, or Progressive Insurance, household names for many people.
And then these massive enterprises based here have like one or two.
So I started Pandata to address this talent gap. And over time, Pandata grew bootstrapped painfully by focusing on what was hard to cultivate. So as data analytics platforms and business intelligence became more commoditized, it became machine learning. As machine learning started to get commoditized, it became machine learning in heavily regulated environments. And that's ultimately how we found our niche.
And after growing that business and surviving the pandemic for about eight years, completely bootstrapped, we were acquired almost a year ago this month by a company called Further. And that's where I'm now their head of AI and data science. Love to tell you about all that, but that's my story in a nutshell. Yeah, let's dig into some of these things in more detail. That was a really great summary. It's amazing that you delivered that so concisely. It's like,
Eight years of your life distilled into just a few minutes, which is nice. There's something to that. So you kind of talked about this whole, like, why am I so positive all the time? I'm actually a very grumpy person when it comes to certain things. I'm like, no, we're not going to analyze the data that way. But I found early on with Pandata,
that when I was selling data science, most people didn't know what data science was. It was like a salt shaker. It's like, hey, can we sprinkle some data science on it? And maybe some money is going to come out on the other side. And it was this novelty. And most people didn't understand what this thing was. And a lot of data scientists at the time were leaning into being smart, being very technical, working on the algorithms that were the next greatest thing. They actually liked it when other people didn't understand what they did.
And on the other hand, I really wanted to find ways for people to get it because I found that when they get it, they were a lot more likely to work with us, stick with us longer. That ultimately became one of the core values that I built Pandata around and the type of people that I hired. And so my motto over the years evolved into make it boring.
And if I can use cats and dogs and puppies and weird little memes to explain a very complicated mathematical concept, you bet I will. And people love it. It's fun. It can be easy. It can be funny. And people go, wow, that was easy.
So that's become a big part of how I do things. I like that you, in the same breath, described it as making it boring while simultaneously having cat memes. It seems like you're making it more interesting, but I know what you mean. You're making it less technical, less... The data scientist isn't going to feel like, wow, this is... Yeah.
They're not going to be seeing all the equations that make them excited. It's making it boring for data scientists and making it exciting for everyone else. Exactly. Exactly. Actually, I had the same reaction from a class I gave recently. And they're like, that wasn't boring at all. I'm like, but you now understand how boring AI is. And they go, yep. And I'm like, that's the point. Nice.
So let's talk about the kinds of things that made Pandata so successful. We already have this make-it-boring idea of making it boring for data scientists and easy for your clients to understand the data science that you're delivering. What are the other keys to scaling a successful data science consultancy? So something that I didn't quite nail in my first startup that really stuck with me is this notion of product market fit.
And anyone who's in the space of entrepreneurship will hear this term bandied about. And for those of you who haven't been in the field of entrepreneurship, what that means is you found a pain point that someone is willing to spend something on solving. And there's enough of those people at enough scale. You know how to reach them. And you can consistently deliver that thing that they're willing to pay for. And...
Clients vote with their money. And I found early on, because I bootstrapped, that meant I didn't raise any capital. The only source of growth I had was when a customer is willing to pay for it. And so it's one thing when somebody says, hey, that's a great idea. It's another thing when they're willing to sign a big check for you to solve that problem. And then they come back to you to solve that same problem or similar problem again and again and again.
So product market fit and listening to what people were willing to spend on was a really big part of Pandata. My first year, all I had to do was say, hey, we can do data science things. And I was able to land a few contracts here or there, but it was a rotating window. I'd work with one enterprise and then they'd go away. Another enterprise would come. And that's a very common story for consulting companies. There were maybe one or two clients that stuck around or kept coming back to us.
And I remember having a conversation with my stakeholder there. I finally worked up the guts and I said, not that I want you to question the situation at all, but why are you coming back? And I was like really trying to do some market research and understand. And it turns out that they really liked that we were approachable, right? That was one of our core values is hold back the jargon, always speak plainly.
And then there were a couple of formulaic things that we accidentally ended up doing. We have this process called discovery and design that now is a mandatory requirement. Anybody that hires us to do any work, I say you have to do this up front or I won't work with you. With those clients, we accidentally did it. And that's where we spent just
30 days, six weeks, diving into a problem, trying to figure out: where are the skeletons? Is this solvable? How can we approach this? What are the unknown unknowns? That is a really big part of solving problems that have not been solved before with pattern matching algorithms, just to simplify it. And
So I tried to recreate that magic. So there were these attributes that we had that became our core values. We had five core values that I can talk about later. And then there are these processes. And one of these processes was discovery and design. Now, the funny thing is I decided, all right, I'm now no longer going to work with any client that doesn't want to do this. And we're going to charge an arbitrary amount of money.
That engagement size is now 50,000. At that time, it was a measly 12,000. And I was really a first-time entrepreneur, nervous about throwing that about. But I'd say, hey, you know what? Unless you're willing to spend this, I don't even want to work with you. And it helped me weed out two things. One, clients that weren't serious: if they weren't willing to pay that, they definitely weren't willing to pay for the rest of the engagement.
And two, if they didn't philosophically agree with the importance of that step, then I knew that they were likely to be a client that was consistently disappointed by the results because they didn't quite get the data science process. So I went from spending a lot of time talking to a lot of people that seemed interested at first in data science. And then I got no, no, no, no. My pipeline started to dry out.
And this is one of three times that Pandata's bank account reached like less than, you know, a month's worth of expenses. And I was like, this was the end. This was maybe the dumbest idea. And within that same period of time, I landed three of the biggest clients I had ever engaged, two of which remained clients until Pandata's exit. So over a period of about six years.
And that process became a part of how we were able to scale so much larger than most small solopreneur consulting shops. Right. So the key was having this 30-day discovery and design initial engagement at the beginning of trying to consult with somebody. And you'd say, you know, there's going to be this $50,000 price point to do that initial 30-day engagement. Yeah.
And so that initially seemed to put you in peril where your pipeline dried up, everyone was saying no, but then it did ultimately lead to discovering solid long-term clients that were with you for six plus years. Cool. Well, and so I would use this tactic and now I use this tactic to scare off non-serious people. And it actually allows me to save them time. It allows me to save time. And then I find the companies and the groups that say,
"Heck yeah, that sounds amazing. I love how you think about this." And there's a lot of fish in the sea and it's all about this matchmaking process. And one of the counterintuitive lessons I learned was the art of saying no, or ruling others out by saying no to them. And it really allows you to spend more time on the bigger things, the higher value things. And this is a common tactic I see a lot of my wildly successful friends use.
Excited to announce, my friends, that the 10th annual ODSC East, the Open Data Science Conference East, the one conference you don't want to miss in 2025, is returning to Boston from May 13th to 15th. And I'll be there leading a hands-on workshop on agentic AI.
Right, right, right, right. That is tricky. It's very hard.
Yeah. To say no to smaller or more challenging projects, because you remember those times where you got to only a month of expenses left in your bank account. You're like, well, I guess I better say yes to everything. But then ultimately it slows you down. You have the death by a thousand cuts of just all of these low value touch points.
Well, it's funny when we were going through due diligence on this acquisition, there were about three points on the balance sheet in the financials that they had virtually circled. And they're like, we want to talk about this, this and this. We don't like that. I said, I didn't like those either. Those are really bad moments for me too. Right, right, right, right, right. All right. So let's talk about what did you call them? Your five pillars, your five... Core values. Core values. So...
Along the way, I became a member of this group called the Entrepreneurs Organization. It's a global network of 16,000 members worldwide. And to qualify, you have to have a business that generates over a million dollars in annual revenue, where you're a founder, majority shareholder, and...
They have an accelerator program for businesses between a quarter million and a million. And so before I really knew what I was doing, I got into this accelerator program and it helped equip me with a lot of mindsets around how to grow a scalable business, how to think about a business as an operating system.
And one of the concepts is core values, not as something that just sits on a wall or things that we say, hey, we do things with trust and integrity, which is great. We want everybody to do things with trust and integrity. That's a gimme. Core values are really those unique attributes that describe the type of character that individuals, when they exhibit within your organization, help you be even more successful. And they're usually patterned off of the strengths of the founders.
And so the way in which you discover these core values is to think about it like this: if you're about to go on a tin can to Mars, who are the five people you would choose to take with you? And what is it about those five people that you really want with you on that mission? And then on the flip side, who are the five people you'd never want to have in that tin can? What about them makes you not want to bring them along?
And it's, you know, there's really this process of what about these attributes. And so I found for Pandata that there are these five values and we kept refining them over the years, but it was be approachable, win together, cultivate trust, pursue growth, and tame uncertainty.
And so these are five traits that are somewhat related to data science, but also somewhat related to the way in which we approach data science. So.
The most important to me that was the hardest to find in data scientists was this idea of be approachable. In fact, in our interviewing process, we had interview questions aligned to each of the core values. And we'd specifically look for individuals who naturally gravitated to explaining things without leaning on jargon. They looked for ways to help the other person understand.
Taming uncertainty was also another really big data science indicator. Individuals who get poorly framed problems, not a lot of assumptions, but naturally gravitate towards, okay, I'm going to unpack this. I'm going to list my assumptions out. If I don't know what they are, I'm going to know how to get them.
Cultivate trust was another really big one. And because there's this pattern of a lot of unintended consequences that have unfolded in the world of machine learning and AI over the years, I looked for individuals who didn't need to be reminded to press pause and ask difficult questions. I would look for individuals who, even if they didn't know how to do it, had the capacity to do it, to engage in a difficult conversation.
And there is a period of time where it wasn't widely known that machine learning algorithms discriminated, for example, against people of color.
And that's a very difficult conversation to have if you're not even willing to use the words to describe it. This is a person of color. This is a, yeah. I haven't figured all this joke out yet or how this would work, but I like how one of your values is tame uncertainty. And that makes me think about, in data science, the bias-variance trade-off. And then now you're all of a sudden talking about bias. Is that where you're going with this? Yeah.
Yeah. So like this is all, yeah. So like this, this all comes full circle. So it's like this package of traits when you put them all together, helped me find individuals that would thrive in my environment and consequently would do the things that would make clients hire us again and again and again. So these core values combined with our processes were the secret sauce behind Pandata. Very nice. Yeah. So the five values, uh, being approachable, winning together, uh,
Having, being able to tame uncertainty, pursue growth and cultivate trust. So those five core values combined with, yeah, these, these chunky initial engagements that get people to commit, you can do discovery and design solutions to the problems that you discover, get the skeletons out of their closets between those kind of two facets, the core values and
And that big initial engagement, you were able to have huge success at Pandata. Thanks. Well, it's crazy, right? A lot of people think like core values are these soft things or fluffy things. But I hope that this gives an example to those maybe considering going into entrepreneurship of how powerful it can be to have these simplifying things that are really unique.
And it doesn't have to be a single word. It can be a phrase. It can be a motto. But it's these concepts you rally around. And if you play with it, it really turns into a powerful process. You use it in interviewing, in evaluating promotions and raises. You can talk about it in context of, hey, you really exercised ABC core value on this one client engagement, or you really didn't. Let's talk about that.
Yeah. And I like how you had interview questions related to them. I also like how you came up with your core values. We're going to have to, everybody needs to sing a little bit of David Bowie, "Here am I sitting in a tin can,
far above the world," and think about who you're in that tin can with and who you want to be in that tin can with. I mean, when I was talking to you and I saw your face light up, I'm like, I know you know your five, and I know you definitely know the five you don't want on there. I'd want to spend some time thinking about it. I'll spend some time maybe after this podcast episode thinking about that. It's kind of hard while I'm in the middle of the episode trying to focus on what you're saying and keep this interview on track. But
Yeah, I definitely, I had some things that came to mind right away. There was, there was kind of, I, I immediately thought of one or two people that I definitely want to be with and one or two that I definitely wouldn't. Amazing. Yeah.
So yeah, I'll expand on that a bit after the episode. Personally, probably won't share. I'd love to hear what those values are. I'll tag them on LinkedIn. Great. I love it. Five people I definitely want to be in a tin can with, five people I definitely don't. Tag them on LinkedIn. That'll be fun. Tag all 10. Exactly. Exactly.
All right. So let's move on a little bit. Something that Pandata ended up specializing in was highly regulated environments. So doing data analytics, building machine learning, AI systems within highly regulated environments. Tell us about how Pandata ended up getting into these highly regulated environments, what those environments were, and the particular challenges that you face in those kinds of environments.
So we ended up working with hospitals, life sciences companies, higher education institutions, energy and utilities, some mild defense work, and then financial services. Mild defense. Mild defense. So like not actually like, you know, all the cool, exciting stuff, but like defense companies looking at operational efficiency and little use cases like that. But yeah,
but you still have like these setups where you're dealing with code that maybe lives in an environment that's totally internet air gapped. And, uh, you've got a VPN into super secure systems. It was, it was crazy, uh, getting to see all the logistics. In fact, on my team, uh, the joke was that if you didn't have at least three laptops, you weren't busy enough with client work. Uh, so we often got like super secure devices shipped out to, uh, our team members and, uh,
It was really cool. So that's what I mean when I talk about heavily regulated environments. Now, how we got into this was actually probably a touch of not stupidity, but naivete at the start of this journey. As it turns out, you need big insurance bucks if you're going to serve this type of a market. And the companies that would typically go after these engagements were looking at minimum spend, multi-seven figures.
And we came in and we'd say, well, the small folks don't know how to do this and they don't want to engage the big folks. I think there's an opportunity here to build something scalable. And we, you know, I had my early background in healthcare research and working with hospitals and that was kind of the early phases of Pandata. It helped build this track record of saying, Hey, we've worked with very sensitive data before. And, um,
As we built up that portfolio, it became a lot easier to navigate that work. And as we did that, we had to be a lot more cautious going through data protection training, data privacy training. And it's not to say that all of these different data privacy and data protection laws are the same, but you start to see patterns. You start to see generally acceptable patterns about data.
How you think about privacy, how you hit stop and ask some questions, think about unintended consequences, the ways in which variables can reveal information unintentionally, et cetera, et cetera. And you also think about your obligations, like secure passwords. And we ended up building processes that helped differentiate us for these midsize projects that the big guys really didn't care about. It felt great. Every so often we'd steal a project from McKinsey or Accenture, and
That just made my day. But as we started to build up that portfolio, we started to build up the expertise in how do we handle these tricky situations? How do we talk clients through it? And that ultimately became a big part of the reason why Further wanted to acquire Pandata.
Very nice. Yeah, it's a great story there. And I can see how this is helpful for people who are thinking about their entrepreneurial journey and where could they carve out a particular niche? Where could they find product market fit? And that was a great idea there where you're okay, there's tons of data science projects out there that are six figure contracts that McKinsey and Accenture don't want to do.
And that might be a pretty good engagement for somebody just getting started on some data science consulting. Well, it's interesting because those very same projects are the ones that their companies actually want to de-risk. Like even if they have a data science team, they actually want somebody else outside to put a layer of protection between their team's work.
And when this product is delivered. And so over time, we built up insurance, we built up practices, we built up these processes that were so airtight, like we were way more secure than any of our clients were. And we worked with some major enterprises. And it allowed us to navigate those spaces, charge premium, and those same clients would come back and work with us again and again and again. Very cool. Where did the Pandata name come from? I just suddenly...
I'm curious about that. So funny story. I was racking my brain. I had just failed my first startup and I was doing these data science tutorials and Pandas was on my mind. And I was like, you know, this is a data science company. Let's go with Pandas.
Pandata. And then I used a very early form of an AI logo generator. So the Pandata logo was actually generated by an algorithm that used simple A/B testing. It's like this or this, this or this. And it eventually came up with a font, a logo. The brand was pretty much AI generated back in 2016.
Wow, that's cool. It does make sense that you would have kind of pandas in there for people who aren't aware, super popular, open source. If you're listening to this podcast, I hope you know what pandas are because it's a part of the building blocks and history of data science. The lovable bears. So when you're coming into these kinds of organizations that you're working with, higher education, defense,
There's probably a lot of people in those organizations that aren't data literate or AI literate. How do you tackle that? So literacy is, I think, one of the biggest barriers to AI adoption. And I love to share this with a story that is just one of my favorite examples of how AI or lack of AI literacy can backfire.
We were working with a health tech company and they partner with insurance carriers. And what they're trying to do is they're going through claims. They have about 50 million or so patient lives represented in their database. And they're trying to identify who might qualify for certain government assistance based off of recent diagnoses, medications. And it's not as clear cut as it sounds. There's guidelines that Social Security puts out.
And they had a rules-based approach and they would go through and try to identify who are the 10,000 or so members that we want to reach out to this month.
and help them through that process. And then when they get reimbursed, our client would get a small fee for helping with that increased coverage. So it'd be a direct mail, hey, we think you qualify, we can help you, we'll help you with the application process. So machine learning is actually a really useful approach here. We partnered with them early on, and we built a model, it was an ensemble model. And
We showed that our model could actually help identify 30% more patients than they were identifying with their traditional approach. This was exciting. 30% more revenue made a lot of sense. Let's pilot this. We rolled it out. And if their baseline was reaching 100, you know, if they reached out to 100 people, they'd get about 10 people that would go all the way through this process.
In our pilot, out of 100, only two made it through this process. It was worse than the human-led approach. And we almost got fired. These were really, really, really bad results. And I was racking my brain. I'm like, no, no, we validated this. The stats were great. We double ran the numbers. It turns out that the team responsible for sending out
or selecting the mailing pool has year-end bonuses that depend on the quality of their selection. And our list was also marked "pilot." They didn't trust it. They didn't trust these new individuals, this new mix of individuals we were bringing in. And
We started to understand why. We had a slightly different diagnosis mix. We were like finding patients that ultimately would have gotten the subsidy, but they had different combinations of characteristics that this group just wasn't used to looking at. So it looked weird and they didn't want to risk their year-end bonuses. It said AI pilot all over it and totally failed.
So we went back, we worked with this team, we sat down with them, we showed them this is how we train a model. This is how we know it works. We put up some of those cases that they thought were weird.
And we started to work through them. Why would you not trust this? What about this do you need? So we ended up ultimately doing this educational tour, built their trust in the process. And then two, we built a process where we could use explainability. Anytime a patient was predicted, we could show the factors that were contributing most to that prediction. And that also built more trust in the process. We relaunched the model with no changes to the model.
And we were able to help them grow their reach by 18%. And so the only thing we really changed is how humans interacted with the model. That was one of the very first times I started to see the importance of cultivating AI literacy.
And there's a lot of great courses out there that I think satisfy this. I love to do little workshops with cats and dogs, but we've got folks like Andrew Ng, who does AI for Everyone, and now Generative AI for Everyone. Our dear friend, Cassie Kozyrkov, who does amazing work with decision intelligence and does it in such a fun and playful way. So I try to bring in materials and inspiration from folks like that
to help build that intuition of, well, how does AI go wrong? When can we trust it? And I find that when you empower organizations with that skill, they're able to do much more with the tools they already have.
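[Editor's sketch] The explainability step Cal describes, showing the factors that contribute most to each prediction, can be illustrated with a small example. This is only a sketch under stated assumptions, not the method his team used: it assumes a simple logistic regression, and every feature name, weight, and baseline value is hypothetical. For a linear model, one common way to attribute a prediction is to multiply each weight by how far the patient's value sits from the population average.

```python
import math

# Hypothetical, already-trained logistic regression.
# Feature names, weights, and baselines are illustrative assumptions only.
WEIGHTS = {
    "age_over_65": 0.8,
    "chronic_condition_count": 0.5,
    "prior_enrollment": -1.2,
    "household_income_k": -0.03,
}
INTERCEPT = -0.4

# Hypothetical average feature values in the training population.
BASELINE = {
    "age_over_65": 0.4,
    "chronic_condition_count": 1.5,
    "prior_enrollment": 0.3,
    "household_income_k": 45.0,
}


def predict_probability(patient):
    """Return the model's predicted probability for one patient."""
    z = INTERCEPT + sum(WEIGHTS[name] * patient[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def top_factors(patient, k=3):
    """Rank features by how much they move this patient's score
    away from the score of an average patient (in log-odds units)."""
    contributions = {
        name: WEIGHTS[name] * (patient[name] - BASELINE[name])
        for name in WEIGHTS
    }
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


# A hypothetical patient whose mix of characteristics might "look weird"
# to a reviewer used to a different profile.
example_patient = {
    "age_over_65": 1,
    "chronic_condition_count": 4,
    "prior_enrollment": 0,
    "household_income_k": 30.0,
}

print(f"Predicted probability: {predict_probability(example_patient):.2f}")
for name, contribution in top_factors(example_patient):
    direction = "raises" if contribution > 0 else "lowers"
    print(f"{name} {direction} the score by {abs(contribution):.2f}")
```

Putting a short list like this next to each selected patient gives a reviewer something concrete to check against their own judgment, which is the kind of trust-building the story describes. For non-linear models a dedicated attribution method would be needed instead.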
AI is transforming how we do business. However, we need AI solutions that are not only ambitious, but practical and adaptable too. That's where Domo's AI and Data Products Platform comes in. With Domo, you and your team can channel AI and data into innovative uses that deliver measurable impact.
While many companies focus on narrow applications or single-model solutions, Domo's all-in-one platform is more robust with trustworthy AI results, secure AI agents that connect, prepare, and automate your workflows, helping you and your team gain insights, receive alerts, and act with ease through guided apps tailored to your role. And the platform provides flexibility to choose which AI models to use.
Domo goes beyond productivity. It transforms your processes, helps you make smarter, faster decisions, and drive real growth. The world's best companies rely on Domo to make smarter decisions. See how you can unlock your data's full potential with Domo. To learn more, head to ai.domo.com. That's ai.domo.com. Nicely said.
And really great story there to bring to life. Actually, throughout this episode, it's been great how you bring in specific case studies that make it easy to understand the principles that you're describing. Another aspect of consulting that I suspect you might be able to provide a good analogy for, I guess we'll find out now the pressure's on, is that when you're doing consulting,
you have to be able to demonstrate that you're delivering a return on the client's investment. Some ROI. Oh my gosh. Yeah. How have you achieved that in your engagements? So this might sound super obvious, but one of the biggest missteps is not having the right value hypothesis at the onset of a project. And I
talk about this a lot in the context of AI because we get enamored with the model itself. Like the patient example I was just talking about. We get enamored with, can it accurately predict who's going to qualify, versus really stopping to think about what decisions are we influencing?
And what does the success of that decision look like? In another example where we worked with cancer readmissions, we were working with the health system and we were trying to help them build models for solid tumor cancer patients, who nationally have a readmit rate of about 25%. It's really bad. So if you can use machine learning to identify who's most at risk, you can maybe come up with a more effective intervention.
In our very first iteration, we were actually modestly successful. It's a very tricky population to work with because of the complexity of the patients. But the providers would say, okay, now what? This patient's at risk. What do you want me to do?
It's actually cognitively overwhelming and not helpful when you have this alert and you're not actually offering a solution. So we actually had to go back to interpretability again and think about what would be useful to know. Is this patient at risk because they're on a certain type of medication or they have a certain comorbidity or...
some social attributes that we knew about the patient. Like they come from a zip code that has low reliable access to transportation. So this patient actually might be readmitted, not because they're super sick, but because they might use the ER in an inappropriate way. So let's get them some social assistance. So it's really interesting to think about, like when we talk about models, we get very excited as data scientists about the accuracy of the models and we don't
stop to think what levers, what decisions are we influencing? What's the value of a successful intervention? And then realistically, like how often is that decision going to be influenced by whatever work we're doing? So we now work with clients up front to try to map that out. We try to challenge it. We try to stress test it.
In this readmission example, we actually would say, okay, we can identify 80% of these patients, and we think realistically that there's only going to be a 20% intervention rate. And so the value is based off of that. And we stop, check with the client, hey, if this is the only value that we get, is it still worth X investment to try to figure out this process?
And then you go through the project, and as you learn more, as you validate your models, you keep going back to that early hypothesis. Very nice. Well delivered. Again, and with an example, five stars. Great work, Cal. I hope I can keep delivering. I don't know.
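[Editor's sketch] The value hypothesis from the readmission example can be written down as simple arithmetic. The 25% readmit rate, the 80% identification rate, and the 20% intervention rate come from the conversation; everything else here (patient volume, how often an intervention actually prevents a readmission, cost per readmission, and project cost) is a hypothetical placeholder. Reading "intervention rate" as the share of flagged patients who actually receive an intervention is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class ValueHypothesis:
    patients_per_year: int        # hypothetical volume
    readmit_rate: float           # from the episode: about 25%
    identified_share: float       # from the episode: 80% of these patients identified
    intervention_rate: float      # from the episode: 20% realistically acted on
    intervention_success: float   # hypothetical: share of interventions that prevent a readmission
    cost_per_readmission: float   # hypothetical cost in dollars

    def readmissions_avoided(self) -> float:
        """Expected readmissions prevented per year under these assumptions."""
        return (
            self.patients_per_year
            * self.readmit_rate
            * self.identified_share
            * self.intervention_rate
            * self.intervention_success
        )

    def annual_value(self) -> float:
        """Expected yearly savings from the avoided readmissions."""
        return self.readmissions_avoided() * self.cost_per_readmission

    def worth_it(self, investment: float) -> bool:
        """The stop-and-check question: does the value still cover the investment?"""
        return self.annual_value() >= investment


hypothesis = ValueHypothesis(
    patients_per_year=2000,
    readmit_rate=0.25,
    identified_share=0.80,
    intervention_rate=0.20,
    intervention_success=0.5,
    cost_per_readmission=15_000.0,
)

proposed_investment = 250_000.0  # hypothetical "X investment"
print(f"Readmissions avoided per year: {hypothesis.readmissions_avoided():.0f}")
print(f"Estimated annual value: ${hypothesis.annual_value():,.0f}")
print(f"Worth the investment: {hypothesis.worth_it(proposed_investment)}")
```

Keeping each rate as a separate, named input makes it easy to go back to the hypothesis as the project validates or revises each number.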
All right. So in all of these projects that you've done over all these years, the eight years that you had at Pandata and then now continuing to do consulting work at Further, you've hired a lot of data scientists over the years. And you've alluded a little bit earlier in the episode to the kinds of things you're looking for. You have interview questions around your core values, for example. But what are the key skills or attributes that you look for in data science hires today?
For example, it seems like AI engineering skills are a really big deal recently. Oh my gosh. And hard to find. The most challenging and frustrating thing about data science today, really just over the last 10 years, is that what being a data scientist meant
10 years ago, five years ago, three years ago is drastically different than today. So some of the technical skills that we're looking for today are individuals who are really strong with evaluation and quality control, specifically when it comes to multimodal data. So text-based data, image-based data, the ability to formulate a statistical design and say, okay, we're working with
this unstructured problem, and we need to come up with a way to measure whether our approach is working and do it in a way where we're not creating statistical weirdness, where we're basically predicting something we've artificially put into the data.
And so we really look for individuals who have that skill. We're also seeing the rise of machine learning engineers, specifically with this AI engineering flavor. Right now, we're actually looking to hire folks who have more experience with deploying language models and
running ML Ops or whatever we want to call it today, AI Ops around these language models and how we diagnose when they're being prompted, when they're drifting, when they're behaving weirdly. But those are some of the technical skills we look for.
As far as the soft skills, I'd say that the core values that were really successful at Pandata for me are going to be evergreen. They're the traits that I consistently look for in individuals, regardless of the importance of the technical skill today, tomorrow, five years from now. We're always going to want people who explain things plainly, who naturally are driven to learn the next thing, who can deal with ill-formed problems. These are things that, if
you really want to stand out as a data scientist, are worth cultivating. Nice. Thanks for those tips on what you're looking for from data scientists that you hire, both in terms of hard and soft skills. I'd love to learn more about kind of the culmination
of a lot of the things that we talked about today. So we talked about core values, hiring great people, obviously just now delivering a return on investment to clients. Ultimately that allowed you to be acquired by Further, a much larger consultancy. And so what was that process like? How did you start to think, wow, you know, I've built this company over eight years. I might like to sell it. And then you find a sell. I mean, how does all that work?
So over the years, different folks have reached out and they'd say, oh, I think we want a consultancy or we like what you're doing. We don't have a consultancy, but we need a data science team. And it's a lot easier to just buy a data science team. So I flirted with the idea. The goal of building a business is that ultimately you do want to exit. You want to find a home. You want to find a place where your business can then create more value consistently as a part of something bigger.
So I'd always been open to it. And a funny story is Mike Gustafson, who's the CEO of Further, or president of Further, had seen Pandata's story in the Case alumni. So I went to Case Western Reserve University. And he started following and thinking about the future of Further, which at the time was called Search Discovery. And they wanted to really move into cloud infrastructure, cloud
AI and advanced machine learning projects. They had done great work in the analytics space and performance marketing space with these heavily regulated environments. And so it made sense for them to want to get into data science, machine learning. He reached out to me on LinkedIn and I was in the middle of conference season. I totally missed
his message. And so some of his employees who I was connected to, they have an office here in Cleveland. And so I'd met them at local meetups. They said, you really should talk to Mike. I answer his message. We set a time to meet. I go to his office. We do these pleasantries, fellow alumni. And then within five minutes, he says, I think we want to buy Pandata.
And I said, oh, okay. And as much as I wanted to say, heck yeah, this is the thing that I wanted the most. I had to stop and I asked a few questions. And one of them was, tell me about your core values. And it wasn't again, because I'm going to this whole like kumbaya, warm, fuzzy place. It was actually, I wanted to understand if I were to sell what I have,
to this organization, what is the compatibility of my client base that would now be moving over? Would they stay? Would they have people taking care of them that are similar to the people taking care of them today? Would my team thrive? Would they be able to move over and seamlessly move into projects? And even though the core values are named different things, there was a lot of organizational compatibility. We had a lot of the same kind of problems that we were trying to solve, same attitudes, similar pay scales.
And so when I saw that I could take my engine and put it into this machine and it would create even more value, that's a win-win. And that's one of the biggest lessons learned I talk to other entrepreneurs about: find that home that maximizes the value for them and you. And not every organization is going to look like that. So in the case of Further, it made a lot of sense.
We went through diligence at a record pace. It went from, this is cool. Let's explore this some more. Conversation got a little bit deeper in December. And then between January and March of last year, we went through so much paperwork and the deal came together in about three months. Wow. Really great story there. I love this idea of finding the right home where there's a win-win for sure for both organizations, the one being acquired as well as the acquirer. Very
Absolutely. And, you know, I hear a lot of horror stories of similar exits and then it doesn't work out. And it's because that upfront configuration matchmaking really didn't happen. And it's very hard. We talked about it's hard to say no to good things. And so I see a lot of entrepreneurs who end up pursuing it because it sounds great in theory, but then it falls flat. And so in this instance, and I can't really end this episode without giving a plug to Further and saying we love our new home.
We're getting to work on exciting things and a lot of that vision and excitement that we had then has come forward today because of that matchmaking process.
Fantastic. That's great, Cal. Congrats to you, the Pandata team and the Further team. Sounds like a great marriage there. One kind of final question that I have for you before I get to my questions that I always ask at the end of every guest. So my last question for you specifically is over all these years, you know, from the startup that you raised capital for right out of Case Western to
founding Pandata to the acquisition by Further, what were your biggest lessons learned? Like what were your biggest failures? So, it's interesting. When you first start a business, you're tempted to keep the idea a secret. And we have a lot of pride over presumed intellectual capital or presumed something that's valuable.
And over the years, I've found that some of the most successful people are the most giving of their ideas, the most giving of their time, the most humble and willing to admit when they're wrong. And I think I learned it the hard way, but a lot of what defines how I approach talking with people, engaging with people,
is really shaped by that. And so one of my biggest lessons learned is just be okay with being wrong and be okay with asking for help. And that's a really hard thing to do, especially when you're a data scientist and you're trained to be smart. You're trained and taught that your smartness is your differentiator. And then you're trying to become an entrepreneur where you're trained and taught that you have to scale or go bust. And the contrary advice is be okay with being vulnerable.
Do you ever feel isolated, surrounded by people who don't share your enthusiasm for data science and technology? Do you wish to connect with more like-minded individuals? Well, look no further. Super Data Science Community is the perfect place to connect, interact, and exchange ideas with over 600 professionals in data science, machine learning, and AI. In addition to networking, you can get direct support for your career through the mentoring program where experienced members help beginners navigate.
Whether you're looking to learn, collaborate, or advance your career, our community is here to help you succeed. Join Kirill, Hadelin, and myself and hundreds of other members who connect daily. Start your free 14-day trial today at superdatascience.com and become a part of the community. Nice. That's a great tip, Cal.
I like that a lot. Really nice point to kind of end this interview on. Yeah. At least the unique questions that I have for you. I do, as I alluded to a moment ago, have questions that I end every episode with. So we need a book recommendation from you, Cal. I got one. So
I spend a lot of time on communicating. I keep trying to improve my ability to communicate. And a recent book that I just finished is Supercommunicators. It's actually right here behind me. And it's by Charles Duhigg. And it's a classic in the space, but it really talks about the art of individuals who are able to
Maybe they don't seem like the most charismatic person in the room, but they have this natural ability through conversation to get everyone to feel at ease around them and more willing to
reach consensus. And so it goes into the science of how people do that, with stories and examples of it in practice. And as somebody who constantly tries to improve communication, I got so much out of it. So 10 for 10 recommend for anybody wanting to get more effective at just having
better conversations with people. Yeah. And if people want a taste of what the Supercommunicators book might be like before buying it, you can check out episode 805 of this podcast, when Charles Duhigg came on and we talked just about Supercommunicators. I love that. Nice.
Awesome. And then very last question before I let you go is how can people follow you after this episode? Great question. So I am the most active on LinkedIn. And if any of these points resonate with you, if you're thinking of starting a data science oriented business, or you're super excited about responsible AI and heavily regulated environments, reach out to me. I'd love to hear.
Awesome, Cal. Thank you so much for that generous offer to our Super Data Science listeners. Thank you so much for being on the show again. So amazing talking with you. I mean, yeah, we've been friends for years and so I always enjoy chatting with you. And, you know, I'm not surprised that you give an amazing interview, but this was exceptional. Really well done. Thank you. I was honored to finally be a part of the show, John. Thanks again.
What a great episode that was with Cal Aldubabe. In it, Cal covered how he built Pandata by focusing on mid-sized data science projects in regulated industries like healthcare, defense, and education that were too small for major consulting firms but required specialized expertise like his firm had.
He talked about how his company's success was built on two pillars. First, a mandatory 30-day discovery and design phase costing about 50 grand to ensure client commitment and project clarity. And two, his core values, be approachable, win together, cultivate trust, pursue growth, and tame uncertainty.
He also talked about how making complex data science concepts accessible and boring through simple explanations and analogies proved more effective than emphasizing technical sophistication. He talked about how AI literacy among stakeholders proved crucial for project success, in one case, improving a model's adoption from 2% to 18% simply by helping users understand and trust the system.
And he talked about how Pandata's eventual acquisition by Further succeeded because of strong alignment in organizational values, client base, and vision for creating value at a larger scale. As always, you can get all the show notes, including the transcript for this episode, the video recording, any materials mentioned on the show, the URLs for Cal's social media profiles, as well as my own at superdatascience.com/865.
And if you'd like to connect with me in real life, as opposed to online, I'll be giving the opening keynote at the RVA Tech Data Plus AI Summit in Richmond, Virginia on March 19th. Tickets are super reasonable and there's a ton of great speakers. So this could be a solid conference to check out, especially if you live in the Richmond area. It'd be awesome to meet you there.
Thanks, of course, to everyone on the Super Data Science Podcast team, our podcast manager, Sonja Brajovic, media editor, Mario Pombo, our partnerships manager, Natalie Ziajski, researcher, Serg Masís, writer, Dr. Zara Karschay, and of course, our founder, Kirill Eremenko. Thanks to all of them for producing another invaluable episode for us today. For enabling that super team to create this free podcast for you, we are deeply grateful to our sponsors. You can support the show by checking out our sponsors' links,
which are in the show notes. And if you'd like to sponsor an episode of the show, you can get the details on how to do that by heading to jonkrohn.com/podcast. Otherwise, share the show with people who would enjoy it. Review the show on your favorite podcasting app or on YouTube. I recently got a question about this and on platforms like Apple Podcasts,
you can only review the whole podcast as opposed to individual episodes. So that's a limitation. I hear Spotify is going to allow you to have more sophisticated commenting soon. We'll see. But anyway, yeah, so review the episode or the entire podcast on whatever podcasting app you use or YouTube. Subscribe, obviously, if you're not a subscriber. Feel free to edit our videos into shorts to your heart's content.
But most importantly, we just hope you'll keep on tuning in. I'm so grateful to have you listening. And I hope I can continue to make episodes you love for years and years to come. Until next time, keep on rocking it out there. And I'm looking forward to enjoying another round of the Super Data Science Podcast with you very soon.