Understanding Feature Engineering and Preparing Data for Modeling
Wow – look how far you’ve come! Congratulations on making it to Chapter 9, which prepares you for the machine learning concepts introduced in the next chapter!
In this chapter, we will delve into the critical phase of pre-modeling. Here, you’ll combine your knowledge of Python, data wrangling, and statistics.
While many data science texts emphasize the latest machine learning models, data preparation is the true foundation of successful prediction. This chapter is a vital bridge between collecting data and applying advanced machine learning techniques, and it rests on the data science principle of “garbage in, garbage out”: poor input data will yield unreliable results no matter how advanced the model is.
Pre-modeling data preparation is about ensuring our data is accurate, consistent, and relevant. Mastering this stage means understanding issues such as outliers, feature engineering...