11. Data Preparation
Overview
By the end of this chapter you will be able to filter DataFrames with specific conditions; remove duplicate or irrelevant records or columns; convert variables into different data types; replace values in a column and handle missing values and outlier observations.
This chapter will introduce you to the main techniques you can use to handle data issues in order to achieve high quality for your dataset prior to modeling it.