Detecting and Handling Missing Values and Outliers
This chapter discusses techniques for detecting and handling missing values and outliers, two common problems that can significantly undermine the integrity and accuracy of our data products. We will explore methods for identifying and managing these irregularities, from statistical approaches to machine learning models. Using practical examples and real-world datasets, we will present strategies for addressing these issues so that our analyses remain robust, reliable, and capable of producing meaningful insights.
This chapter covers the following topics:
- Detecting and handling missing data
- Detecting univariate and multivariate outliers
- Handling univariate and multivariate outliers
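To make the two problems concrete before going further, here is a minimal sketch using only the Python standard library. The `readings` list is hypothetical sample data, `None` is assumed to mark a missing value, and the 1.5 × IQR rule is just one common convention for flagging univariate outliers, not necessarily the method used later in the chapter.

```python
import statistics

# Hypothetical sample: None marks a missing reading, 95.0 looks suspicious.
readings = [10.0, 12.0, None, 11.0, 13.0, 95.0, None, 12.0, 11.0]

# Detect missing values: record the positions where no value was captured.
missing_positions = [i for i, value in enumerate(readings) if value is None]

# Keep only the observed values for the outlier check.
observed = [value for value in readings if value is not None]

# Flag univariate outliers with the 1.5 * IQR rule (an assumed convention).
q1, _, q3 = statistics.quantiles(observed, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [value for value in observed if value < lower or value > upper]

print("Missing at positions:", missing_positions)
print("Outliers:", outliers)
```

Detection is kept separate from handling here: the sketch only reports where the problems are, leaving the choice of whether to drop, impute, or cap the affected values as a later decision.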