Unlocking Data Strategy: Data Literacy for Better Business

2022/11/15
Smart Talks with IBM

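Nicholas Renotte: Data literacy and data strategy are critical to business success. From learning Excel macros at age 11 to working at the Reserve Bank of Australia, data skills have run through his entire career. He stresses interpreting data accurately, avoiding misleading presentations of data, and the importance of data in business decisions. He also shares his experience creating data science tutorials on YouTube, which aim to lower the barrier to learning data science. He argues that a successful enterprise data strategy requires preparing, organizing, analyzing, and iterating on data; machine learning is the last step, and data preparation is the key. The main pain points companies face when implementing a data strategy are collecting and organizing data, and discovering and searching it. IBM's Cloud Pak for Data helps companies collect, organize, analyze, and use data, and supports data cataloging and metadata management. Data can optimize workflows and improve efficiency, and automation tools can reduce repetitive work. Creativity means keeping an open mind and being willing to try new ways of solving problems. Creativity in data storytelling shows up in using data for social good, such as improving assistive technology and language translation models. Future developments in quantum computing will greatly change how machine learning models are created and trained.

Ronald Young Jr.: As the interview host, Ronald Young Jr. mainly steers the conversation, asks questions, and responds to and builds on Nicholas Renotte's points.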

Chapters
Nicholas shares his early experiences with data and coding, starting from age 11 when he began working with spreadsheets and macros, influenced by his father's interest in stock trading.

Hello, hello. Welcome to Smart Talks with IBM, a podcast from Pushkin Industries, iHeartRadio, and IBM. I'm Malcolm Gladwell. This season, we're talking to new creators, the developers, data scientists, CTOs, and other visionaries who are creatively applying technology in business to drive change. Channeling their knowledge and expertise, they're developing more creative and effective solutions, no matter the industry.

Our guest today is Nicholas Renotte, Senior Data Science and AI Technical Specialist at IBM. Nicholas's job is to help companies formulate a data strategy that streamlines the way they do business and prepares them to use sophisticated AI technologies. But beyond his day-to-day, Nick is also a content creator on YouTube, where his channel has over 100,000 subscribers.

His videos explain computer science concepts in a way beginners can understand, and he often demonstrates how to use machine learning and data science to solve novel problems. On today's show, how Nicholas learned data science from the bottom up, the fundamentals of data management, and how an innovative data strategy can help businesses create novel solutions.

Nick spoke with Ronald Young Jr., host of the Pushkin podcast Solvable. Along with being a frequent contributor to NPR, Ronald also hosts and produces the podcasts Time Well Spent and Leaving the Theater. Okay, let's get to the interview.

So tell me a little bit about how you got into data and when you found out the power that it really harnesses. Do you have a story about what first piqued your interest in data?

My first interaction with data and with coding was actually when I was around about 11 years old. So this was really just getting started with just looking at spreadsheets. So my dad would come home and after working a nine to five job, he actually started working with investing in stocks and doing value based trading that way.

I'll always remember I walked up to his desk one time and he said, Nick, if there's one thing that you should learn, I'm seeing all these people work on these things called macros in spreadsheets. And these people are like wizards inside of my business.

I know that you're still in high school, but I really think you should learn this stuff.

And I started dabbling in some Excel spreadsheets and started just recording macros and tweaking stuff. And that's where it all started. But from there, it's always been a recurring vein throughout my career that I've done some sort of wizardry with data, whether it be coding or business intelligence or data viz. It's always had a bit of a strain throughout whatever I've done, whether it be startups or YouTube or what I'm doing now at IBM.

Your dad was right. Let me just say that, because as someone who's trying to put together a spreadsheet just to manage my personal finances, trying to look up the formula to actually bring a value from one worksheet to another is enough of a struggle for me. So I'm glad to know that.

It's wizardry, right?

It really is. It absolutely is. So knowing that this was how you started, getting into spreadsheets, looking at stocks and all of that, can you talk to me about how you found out the importance of data literacy, how you began to value understanding what the numbers meant and what power that could have?

I got a cadetship at one of the big four accounting firms and started out as an auditor there, which is pretty much 100% data focused. So I saw that these numbers ultimately fed into a significantly bigger picture, which was a formal annual report. And numbers being wrong in an annual report can move markets, right? Those numbers need to be absolutely bang on. But I think that is sort of where it started. Where it really culminated was when I started doing some work at the Reserve Bank of Australia. And those numbers don't just impact the metrics for a particular organization. They impact the entire country's metrics. Getting those numbers wrong on a particular chart or getting them right on a particular chart can move entire organizations or can shift an entire country.

It's kind of crazy, the value that doing things correctly with data has. So when you're presenting a metric, you have to ensure that you are portraying the appropriate message. It's not just about the raw number, because correlation does not necessarily imply causation. So understanding what it is that you're saying is so, so important. And it is so, so much more powerful now that we've got so much more data available at our fingertips. It's really easy to go and grab a bunch of metrics and go, hey, I'm going to grab this data from over here, grab that data from over here, mesh it together. Hey, look, these two lines follow the same trend. They must be related.

Do you find yourself ever looking at data points and saying, I don't understand this chart. Where did they pull this from? Do you find yourself doing that a lot in your regular life?

Oh yeah, there's some great charts out there as well that you always see, and they plot the number of Nicolas Cage movies against the GDP of Bolivia or something, and it's like, well, they're going in the same direction. They must have some relationship. But people can really quickly look at a picture and go and make an assumption about what that is saying without actually interpreting, hey, are these on the same scales? What time period is being displayed? What am I actually looking at here? And I find myself doing this more and more often when I just see a chart. I'm like, hold on, let's just not make any assumptions. What is this chart actually trying to say? What is it actually trying to portray?

Because you can lie with statistics if you know what you're doing. They're so powerful and people can gloss over them so quickly. We've got attention spans that are so much shorter these days. It can be very, very easy to take away the wrong message.
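
To make the point about two lines following the same trend concrete, here is a minimal Python sketch. The two series are made up for illustration (they are not real film counts or GDP figures), and comparing year-over-year changes is just one common sanity check, not a method described in the interview.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def differences(xs):
    """Year-over-year changes: removes the shared upward trend."""
    return [b - a for a, b in zip(xs, xs[1:])]


# Hypothetical yearly series that both happen to drift upward.
films_per_year = [2, 3, 3, 4, 5, 5, 6, 7]
gdp_index = [100, 104, 109, 113, 118, 124, 129, 135]

# Correlation of the raw levels: dominated by the common trend.
r_levels = pearson(films_per_year, gdp_index)

# Correlation of the changes: asks whether the two move together year to year.
r_changes = pearson(differences(films_per_year), differences(gdp_index))

print(f"levels:  r = {r_levels:.2f}")
print(f"changes: r = {r_changes:.2f}")
```

If the correlation is high on the raw levels but disappears or flips sign on the changes, the shared trend is probably doing most of the work, which is a reason to pause before reading a relationship into the chart.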

So you also produce content across various platforms, including YouTube and your personal blog. As a content creator, how did you get started in that field? And what type of content are you creating?

Yeah, that's a crazy story, right? So I always wanted to get into tech and said, hey, I'd really, really like to work for IBM. I saw what they were doing with Watson and I'm like, why aren't people talking about this more? And I had no affiliation with IBM at the time. And I'm like, this is so cool. There used to be this service available on the cloud platform called Personality Insights. And you could plug in a little bit of text, and from that piece of text, it would analyze that particular person's personality based on the big five personality traits. And there actually used to be this demo app where you could hook it up to a Twitter account. So I could pass through Oprah's Twitter account or LeBron's Twitter account, and it would actually analyze their profiles. And I was like, this is so, so cool.

Wow.

It was nuts.

And I was like, but a lot of people don't know how to use this. So that was quite possibly one of the first tutorials that I made on YouTube. And I actually used a bunch of the videos that I made after that to finally land a job at IBM. I actually spammed a bunch of links in my resume and my cover letter. I was like, hey, I'm already working with this stuff, I could do it. And the person that hired me, she actually said that that was such an amazing way to portray what you love about what you do, and that it was such an influencing factor in actually getting the job.

Yeah, I did it because, one, the tech was so cool and I thought it was so interesting and so powerful. And yeah, eventually it helped me land that job.

So you do a lot of tutorials where you're breaking down complex topics for a wider audience. Why is that important for you to do?

Yeah, I think one of the amazing things about knowledge is it's one of the things that you can give away and never lose, right?

And I think one of the trickiest things about the whole data science and machine learning field is that it can be pretty tricky to get started. And sometimes we get hung up with learning from the bottom up, right? And there's nothing wrong with learning fundamentals and learning foundations and really getting stuck in. But in order to stick with something, you have to find it interesting. So if you can see the end result and then work your way back and work out how that's worked, then it is so much more attractive, because you get that instant gratification and go, hey, I've just built this machine learning app that is able to decode sign language. It's so cool. Now I'm going to go and work out the tech behind it. Admittedly, not everyone goes and works out the tech behind it, but what I'm trying to do is make it so that more people can get involved and get started with it.

Lately, I've been doing these things called Code That challenges, and they're kind of crazy, right? But I love doing them. So I have to build entire machine learning or data science applications within 15 minutes, without looking at any reference code, Stack Overflow, or documentation. So it is literally just like a trial by fire. I'll have my phone, I'll set a timer, and I'm like, all right, guys, we're on. And it is literally just coding non-stop and me explaining on the go, but it allows people to see my thought process as I'm developing it. That's obviously super fun, right? Because it's highly engaging and it shows people that, hey, you can get started in this relatively quickly.

Nicholas is the kind of person whose passion for data science is so great, it spills over from his professional life onto his YouTube channel. But when he's not making videos, he's using that same expertise to help his clients make their businesses work better. At IBM, Nicholas works with businesses to formulate a data strategy, preparing them to get the most out of technology like machine learning or deep learning. He explained to Ronald why thinking critically about the data it generates can help a company run more efficiently.

So there's a quote that you've used in your presentations. 99% say their firms are trying to become insights driven, but only one third report succeeding. What is the role of creativity in the successful one third? And how are you at IBM helping to increase that number?

I remember going to a talk by our previous CEO, and she said that there's a large number of organizations that are just experimenting with random acts of digital. So they're just testing out some of these new technologies, they're seeing kind of what's possible. But the ones that are truly being successful are the ones that are getting their data ready and their data strategy in play. They're the ones that are starting to collect their data, they're starting to get it ready and organized, they're starting to take a look at it and starting to iterate and prototype, and in a structured manner they're starting to roll this stuff out. The journey to get something as sophisticated as machine learning into production is a lot more difficult than I think people realize, because you're now building a box that has its own rules. You haven't defined those rules yourself. So how do you explain that when something goes right? But how do you explain when something goes wrong? And having governance around that is absolutely critical, which is really where the data strategy does come into play.

So let's get into more business-focused data strategies. Why is it so important to have a data strategy in place to fuel AI modeling? And how does data literacy play a role in getting value from these models?

We've got algorithms left, right, and center these days. But I think the thing that people forget is that you can't use any of these algorithms unless you've got data. So ensuring that you have a structure in place to, one, collect your data, two, organize it, three, analyze it, and then four, infuse some machine learning or deep learning into it is absolutely critical. Because if you don't collect it, you can't do anything with it. If you don't organize it, you can't discover what you've actually got, what the quality looks like. If you don't analyze it, you don't know whether or not you can trust it. And then the infuse is always like the icing on the cake, right? So the machine learning, the deep learning, all the cool buzzwords that people throw around. That is like the last step. And it is always the coolest step, but you can't ever get to that last cool step unless you've gone through the hard work that's come before.
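
The four stages described above (collect, organize, analyze, infuse) can be sketched as a small Python pipeline. Everything here is hypothetical: the record fields, the function names, and the toy threshold "model" are assumptions made for illustration, and none of it reflects any IBM product or API. The one idea it tries to show is that the modeling step refuses to run until the earlier stages have produced data worth trusting.

```python
from statistics import mean


def collect(sources):
    """Stage 1: gather raw records from every source into one list."""
    return [record for source in sources for record in source]


def organize(records):
    """Stage 2: drop incomplete rows and exact duplicates."""
    seen, clean = set(), []
    for r in records:
        key = (r.get("unit_id"), r.get("temperature"), r.get("defective"))
        if None in key or key in seen:
            continue
        seen.add(key)
        clean.append(r)
    return clean


def analyze(records):
    """Stage 3: summarize the data and decide whether it can be modeled."""
    rows = len(records)
    defects = sum(1 for r in records if r["defective"])
    # Usable only if both outcomes (defective and not) are present.
    return {"rows": rows, "defects": defects, "usable": 0 < defects < rows}


def infuse(records):
    """Stage 4: a toy 'model', a threshold halfway between the class means."""
    good = mean(r["temperature"] for r in records if not r["defective"])
    bad = mean(r["temperature"] for r in records if r["defective"])
    threshold = (good + bad) / 2
    defects_run_hot = bad > good

    def predict(temperature):
        return (temperature > threshold) == defects_run_hot

    return predict


def run_pipeline(sources):
    """Run the stages in order; stop before modeling if the data is not ready."""
    report = analyze(organize(collect(sources)))
    if not report["usable"]:
        raise ValueError(f"data not ready for modeling: {report}")
    return infuse(organize(collect(sources))), report


# Made-up readings from two hypothetical production lines.
line_a = [
    {"unit_id": 1, "temperature": 70.0, "defective": False},
    {"unit_id": 2, "temperature": 71.5, "defective": False},
    {"unit_id": 3, "temperature": 88.0, "defective": True},
]
line_b = [
    {"unit_id": 3, "temperature": 88.0, "defective": True},   # duplicate row
    {"unit_id": 4, "temperature": None, "defective": False},  # missing reading
    {"unit_id": 5, "temperature": 90.5, "defective": True},
]

predict, report = run_pipeline([line_a, line_b])
print(report)
print(predict(92.0), predict(69.0))
```

Calling `run_pipeline([])` raises a `ValueError` instead of returning a model, which mirrors the point that nothing can be infused if nothing was collected.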

Let's expand a little bit on the pain points for companies when they're developing or implementing a data strategy. What do those pain points look like?

Honestly, the biggest pain points that I see, actually the top two that I see organizations coming back to over and over again, are collecting and organizing their data. So let's say, for example, you've got a manufacturing type organization, and what they want to do is improve the production quality on a particular manufacturing line. So ideally, if they see that they've got defective products on the manufacturing line, they want to get rid of those sooner rather than later, because they don't want to be shipping them out to the customer and going through the whole warranty and claims process. That just costs a ton of money. So they're like, well, it'd be great to use some computer vision or some deep learning to detect when we've got defects on the product line, and then we can grab those and rip them out.

Somebody along the line is like, "Great, let's go and do it." The first stumbling block that you're going to trip up at is, "Hold on, do you have any images of defective products from, for example, cameras that are looking at that production line?" If you haven't gone and collected images of that or video of that, there is no way in hell that you can actually go and build that system to improve your organizational productivity.

So knowing well in advance what data you're likely to need is absolutely critical. It is the first step in the data science lifecycle. So collecting, understanding and exploring your data is the absolute first step.

The second one is a little bit more interesting. So let's say, for example, you sort of want to get in on the craze that is data science or machine learning, and you bring on a data science team. The next biggest stumbling block that I find a lot of organizations trip up on is discovering their data. They've got a ton of data, but nobody knows what they've got. So being able to find, search, discover, rate, review, and rank that information is paramount, because you'll have people come in and go, okay, so a line manager has approached me and said that we want to take a look at our top performing customers and we want to build a retention strategy, so we're not losing customers anymore. Well, your data scientist is then going to go, well, do we have data of customers that have left previously? If you can't easily search and find out what you've got, that makes it pretty hard to go and build those models. So collecting, organizing, and discovering are really absolutely critical, but they can be a little bit tricky to handle in a large number of organizations.

What kind of supporting technology and new solutions do we need to meet growing data management issues?

It really comes down to a few things. So ensuring that you can, one, collect the types of data that you're looking at. So I think when people think of data, they're always thinking of, hey, it's just going to be a bunch of spreadsheets. It might just be stuff that we can throw into a database. But there is so much more out there, right? There's video. How do we store that? How do we hold that? There are images. There's natural text, like we were just talking about. Ensuring that you've got appropriate processes in place to be able to store, hold, and catalog that, I think, is absolutely critical. We talked a little bit about data cataloging and the need to be able to search and discover that data. That is absolutely paramount. Once you've got it collected, how do you find it?

What is IBM's unique approach to facilitating access to data within companies?

So, one of the biggest things and one of my favorite things that I get to work with is a particular tool set, right? And this tool set is called Cloud Pak for Data. So without getting too pitchy, the absolutely amazing thing about this is that those stages that I was talking about, right, collect, organize, analyze, and infuse, it actually helps facilitate each one of those stages. So you can actually collect, store, and hold your data in a secure and governed place. You've got data cataloging capabilities, which allow you to search. Like, one of my favorite things is that you might have a data set, right? So I might be a data scientist, and then we might have another data scientist on the team. I can have a data set inside of there and I can actually rank it and add comments and go, hey, just be wary of this column. We've got certain features that you need to be mindful of. And that provides additional metadata to understand what my data actually looks like and things that I should be mindful of.

So I'm Joe employee. How can data be helpful to me?

Great question. So, I mean, data is impacting everyone, right? Whether you like it or not. And more often than not, what you're going to find is that you can improve whatever it is that you do by looking at that data. Let's take an organization out of it.

If you use sleep trackers, you can begin to see when you're getting good quality sleep versus when you're getting bad quality sleep. If you start to collect additional data points like, hey, am I drinking enough water during the day? Am I doing certain things like looking at my phone just before I go to bed? Are these things influencing my sleep? And is that causing a negative impact on my quality of life?

So that's taking a broader view of it. But when you step into a team or a business view, data can make your life a billion times easier. If you know that there's a particular issue in a system earlier on in a data pipeline before something crosses your desk, you might go and say, "Hey, look, if we just change how we collect these pieces of information, if we just transformed what we actually did with it, this is going to streamline my entire workflow and help me out."

But not only that, I work a little bit with the automation team, and they're really big on robotic process automation. Let's say you're doing something each and every single day. You're copying a file from here to there. You're grabbing some information from a website. You're throwing it into a form and you have to do that 20 times a day. There are tools that can automate that entire process for you. And they're smart. They're not just looking at where you're clicking on the page. They're looking at what applications you're opening. They're looking at what fields you're pulling data out of. You can automate those entire workflows. That means that you don't have to do that repetitive kind of boring work that you don't really want to do. You can palm that off to the robot and do the stuff that you actually really want to get involved in.

As Nicholas said, the way a company leverages its data has an impact on every level of the business. Data informs how we do our jobs day to day and how we plan for the future. Having an open mindset about data makes it easier for a business to come up with creative solutions. In the next part of their conversation, Ronald asked Nicholas how data science and creativity come together.

So let's talk a little bit more about creativity. We talked a little bit about your YouTube channel and how you use that to help people get started with data science. What does creativity mean to you? And do you see your work as creative?

I definitely see my work as creative, and I think creativity is truly thinking outside of the box and looking at just different ways of doing things. I think the biggest thing that I try to embody is having an open mindset and really never being willing to shut something down or not look at a particular solution or option, because you really never know where a particular solution might come from. If you look at where some of the advancements in the medical field are coming from, it's because they've been open to new ideas, new materials, new ingredients, new recipes, new technologies. Having an open mindset really helps improve that ability to solve complex problems. And I think, for me, creativity is really just having that open mindset.

Tell me a little bit about how you approach novel problems. What do you do when you get stuck?

I think the most important thing, I really like when I push myself to do something that I've personally never done before. And a lot of the time that yields new solutions to problems that might be really difficult to solve. It doesn't necessarily need to be using this particular set of techniques. It's, what else can we do to solve this problem? And sometimes it'll be staring you in the face and you'll just have no idea until you go, hey, I'm going to throw everything out of the box and just give it a crack and see what is possible. But sometimes it does require that little bit of grit to push yourself to see just what is possible. And I think that's when I've come up with some of my favorite things that I've ever done. So that's something that I'm trying to adopt in my daily life. I'm reading a lot more about stoicism and philosophy, and I'm seeing that you kind of really just got to push through sometimes to see what's on the other side.

We talked a little bit earlier about how folks can take bits of data and kind of tell their own story with it, especially if they know the story that they're trying to tell. But let's talk about using that for good. How does creativity play a role in data storytelling?

I think there's just so much good that you can do with data that if you have that in your core ethos, then the world's your oyster, right? I always come back to my favorite project that I've ever done, and that was using computer vision to try to decode sign language. It is by no means a state-of-the-art model, but I figured, hold on, why has nobody ever approached this or at least shared how they've tried to do it? And I kind of just had to get real creative in trying to build that. I literally spent weeks just trying to install stuff and trying to get it running on my computer before I even got anywhere near building that particular model. And it's super hardcore in terms of trying to get it set up. But there's so many opportunities for good, whether that's improving accessibility to certain technologies or improving the quality of life for people that could benefit from us using data a little bit better.

There's a large body of work with a bunch of different data scientists where they're actually building language translation models for languages which aren't hyper popular or aren't as widely spread as we might see in our day-to-day lives. If you look at India, there are a ton of dialects. If you look at even where my parents are from, Mauritius, there's a whole completely separate dialect where, if you'd never heard it before, you'd be like, is this just slang French? But no, it's its whole separate language. That obviously improves the ability for people to tap into data and do a little bit of good. But there's so much. I mean, people are using medical image data to improve medical segmentation and improve diagnoses. There's just so much amazing work that's happening in that space. There is obviously the temptation to use data for bad, but I'd like to think that the large majority of the community are really trying to use it for good.

You started talking about this a little bit just now, but what are some future trends, challenges, and topics or projects you're excited about? Anything in particular?

Looking real further forward, what I'm super excited about, and I still don't know how it's necessarily going to impact me, whether or not that's going to change my experience as a developer, is that we've got quantum computers coming, right? There's a ton of work that's happening in that space. It's going to radically shift how large a machine learning model we're able to create, how fast we're able to train them. I'm just excited to see what happens in that space. I'm not a quantum physicist by any means, but I'm still excited to see what I'll be able to do with them in the future.

I love that. As y'all continue to develop this technology, you're excited to play with it after it's built, which I'm totally on board with.

Yeah, I'll play with it. I don't want to have to build it.

Nicholas Renotte, thank you so much for talking with me today.

It's been an absolute pleasure. Thank you so much for your insightful questions. It's been awesome, Ronald.

Nick made a point that I think is important to remember. When it comes to technology's ability to improve our businesses or make our jobs easier or even do social good, a thoughtful data strategy is always the first stepping stone.

Without good data, using machine learning or artificial intelligence to create innovative solutions becomes much, much harder. Our technology gets more sophisticated every day. But that doesn't mean we should lose sight of the fundamentals. If we want to get the most out of smarter technologies, better business decisions, more optimized technology, fresh and unexpected insights, we're going to need smarter data strategy.

On the next episode of Smart Talks with IBM: the power of Salesforce to transform the customer experience. We talk with Phil Weinmeister, head of product for Salesforce Americas at IBM Consulting, about transforming digital experiences with the power of Salesforce and IBM. Smart Talks with IBM is produced by Matt Romano, David Jha, Royston Besserve, and Edith Russollo with Jacob Goldstein.

We're edited by Sophie Crane. Our engineers are Jason Gambrell, Sarah Bruguere, and Ben Tolliday. Theme song by Gramascope. Special thanks to Carly Migliore, Andy Kelly, Kathy Callahan, and the 8 Bar and IBM teams, as well as the Pushkin Marketing Team. Smart Talks with IBM is a production of Pushkin Industries and iHeartMedia.

To find more Pushkin podcasts, listen on the iHeartRadio app, Apple Podcasts, or wherever you listen to podcasts. I'm Malcolm Gladwell. This is a paid advertisement from IBM.