Summary
Machine learning has the potential to transform industries and revolutionize business capabilities, but only if the models are reliable and robust. Because of the fundamentally probabilistic nature of machine learning techniques, it can be challenging to test and validate the generated models. The team at Deepchecks understands the widespread need to easily and repeatably check and verify the outputs of machine learning models, and the complexity involved in making that a reality. In this episode Shir Chorev and Philip Tannor explain how they are addressing the problem with their open source deepchecks library and how you can start using it today to build trust in your machine learning applications (a minimal usage sketch appears after the interview questions below).
Announcements
Interview
Introduction
How did you get involved in machine learning?
Can you describe what Deepchecks is and the story behind it?
Who is the target audience for the project?
What are the biggest challenges that these users face in bringing ML models from concept to production and how does DeepChecks address those problems?
In the absence of DeepChecks, how are practitioners solving the problems of model validation and comparison across iterations?
What are some of the other tools in this ecosystem and what are the differentiating features of DeepChecks?
What are some examples of the kinds of tests that are useful for understanding the "correctness" of models?
What are the methods by which ML engineers/data scientists/domain experts can define what "correctness" means in a given model or subject area?
In software engineering the categories of tests are tiered as unit -> integration -> end-to-end. What are the relevant categories of tests that need to be built for validating the behavior of machine learning models?
How do model monitoring utilities overlap with the kinds of tests that you are building with deepchecks?
Can you describe how the DeepChecks package is implemented?
How have the design and goals of the project changed or evolved from when you started working on it?
What are the assumptions that you have built up from your own experiences that have been challenged by your early users and design partners?
Can you describe the workflow for an individual or team using DeepChecks as part of their model training and deployment lifecycle?
Test engineering is a deep discipline in its own right. How have you approached the user experience and API design to reduce the overhead for ML practitioners to adopt good practices?
What are the interfaces available for creating reusable tests and composing test suites together?
What are the additional services/capabilities that you are providing in your commercial offering?
How are you managing the governance and sustainability of the OSS project and balancing that against the needs/priorities of the business?
What are the most interesting, innovative, or unexpected ways that you have seen DeepChecks used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on DeepChecks?
When is DeepChecks the wrong choice?
What do you have planned for the future of DeepChecks?
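As referenced in the summary, here is a minimal sketch of what a first run of the open source library can look like. This is not an example from the episode: it assumes a recent deepchecks release where the tabular API lives under deepchecks.tabular (earlier releases imported Dataset and the suites from the package root), and it uses a scikit-learn toy dataset purely as a stand-in for real training data.

import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

from deepchecks.tabular import Dataset
from deepchecks.tabular.suites import full_suite

# Train a small model on a toy dataset as a stand-in for a real pipeline.
X, y = load_iris(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Wrap each split in a Dataset so the checks know which column is the label.
train_ds = Dataset(pd.concat([X_train, y_train], axis=1), label="target")
test_ds = Dataset(pd.concat([X_test, y_test], axis=1), label="target")

# Run the prebuilt suite of integrity, drift, and model-evaluation checks
# and save the interactive HTML report it produces.
result = full_suite().run(train_dataset=train_ds, test_dataset=test_ds, model=model)
result.save_as_html("deepchecks_report.html")

The same run() interface also works with the narrower prebuilt suites, and individual checks can be grouped into custom suites, which is the kind of composition the question about reusable tests refers to.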
Contact Info
Shir
shir22 on GitHub
Philip
@philiptannor on Twitter
Parting Question
Closing Announcements
Links
SHAP
The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra / CC BY-SA 3.0