Summary
In this chapter, we started our journey around the data world on board the Titanic. We began with a preliminary statistical analysis of each feature, then continued with univariate analysis and feature engineering to create derived or aggregated features. We extracted multiple features from text and built complex graphs that visualize several features at once and reveal their predictive value. We then learned how to give our analysis a uniform visual identity by using a custom color map across the notebook.
For some of the features, most notably those derived from names, we performed a deep-dive exploration to learn about the fate of large families on the Titanic and about name distribution according to the embarking port. Some of the analysis and visualization tools are easily reusable and, in the next chapter, we will see how to extract them into utility scripts that can be used in other notebooks as well.
In the next chapter...