Data Skeptic

The Data Skeptic Podcast features interviews and discussion of topics related to data science, stati

Episodes

Total: 579

2016 Holiday Special

2016/12/23

Today's episode is a reading of Isaac Asimov's Franchise. As mentioned on the show, this is just a

[MINI] Entropy

2016/12/16

Classically, entropy is a measure of disorder in a system. From a statistical perspective, it is mor

MS Connect Conference

2016/12/9

Cloud services are now ubiquitous in data science and more broadly in technology as well. This week,

Causal Impact

2016/12/2

Today's episode is all about Causal Impact, a technique for estimating the impact of a particular ev

[MINI] The Bootstrap

2016/11/25

The Bootstrap is a method of resampling a dataset to possibly refine its accuracy and produce usefu

[MINI] Gini Coefficients

2016/11/18

The Gini Coefficient (as it relates to decision trees) is one approach to determining the optimal de

Unstructured Data for Finance

2016/11/11

Financial analysis techniques for studying numeric, well-structured data are very mature. While usin

[MINI] AdaBoost

2016/11/4

AdaBoost is a canonical example of the class of AnyBoost algorithms that create ensembles of weak le

Stealing Models from the Cloud

2016/10/28

Platform as a service is a growing trend in data science where services like fraud analysis and face

[MINI] Calculating Feature Importance

2016/10/21

For machine learning models created with the random forest algorithm, there is no obvious diagnostic

NYC Bike Share Rebalancing

2016/10/14

As cities provide bike sharing services, they must also plan for how to redistribute bicycles as the

[MINI] Random Forest

2016/10/7

Random forest is a popular ensemble learning algorithm which leverages bagging both for sampling and

Election Predictions

2016/9/30

Jo Hardin joins us this week to discuss the ASA's Election Prediction Contest. This is a competition

[MINI] F1 Score

2016/9/23

The F1 score is a model diagnostic that combines precision and recall to provide a singular evaluati

Urban Congestion

2016/9/16

Urban congestion affects every person living in a city of any reasonable size. Lewis Lehe joins us i

[MINI] Heteroskedasticity

2016/9/9

Heteroskedasticity is a term used to describe a relationship between two variables which has unequal

Music21

2016/9/2

Our guest today is Michael Cuthbert, an associate professor of music at MIT and principal investigat

[MINI] Paxos

2016/8/26

Paxos is a protocol for arriving at a consensus in a distributed computing system which accounts for un

Trusting Machine Learning Models with LIME

2016/8/19

Machine learning models are often criticized for being black boxes. If a human cannot determine why

[MINI] ANOVA

2016/8/12

Analysis of variance is a method used to evaluate differences between two or more groups. It wo