More images are even better
We saw how plotting the distribution of each feature can give us very interesting insights into the data. To make our observations easier, we grouped each feature by train and test data and, for train data only, by Survived / Not Survived. We then experimented with feature engineering to get more useful, more relevant features. While observing variables separately can help us get an initial picture of the data distribution, grouping values and looking at more than one feature at a time can reveal correlations and give more insight into how different features interact. In the following, we will use various graphics to explore more such correlations between features, while also exploring the visualization options. For now, we keep our initial choice of combining the matplotlib and seaborn graphical libraries.

Figure 3.15 shows the number of passengers per Age interval, grouped by Passenger Class. We can see from this image that in 3rd class...
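
A grouped count plot of this kind can be sketched with seaborn roughly as follows. This is a minimal illustration and not necessarily the code behind the figure: the column names Age and Pclass, the derived Age Interval column, the bin edges, the helper name plot_count_by_class, and the tiny sample data are all assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Tiny made-up sample, NOT the real dataset; column names are assumptions.
df = pd.DataFrame({
    "Age": [2, 15, 22, 24, 30, 35, 41, 48, 55, 63, 19, 27],
    "Pclass": [3, 3, 3, 2, 1, 3, 1, 2, 1, 1, 3, 2],
})

# Hypothetical age intervals; the bin edges are chosen for illustration only.
bins = [0, 16, 32, 48, 64, 120]
labels = ["0-16", "17-32", "33-48", "49-64", "65+"]
df["Age Interval"] = pd.cut(df["Age"], bins=bins, labels=labels)


def plot_count_by_class(data, feature, hue="Pclass"):
    """Draw one bar per (feature value, hue value) pair with the row count."""
    fig, ax = plt.subplots(figsize=(8, 4))
    sns.countplot(data=data, x=feature, hue=hue, ax=ax)
    ax.set_title(f"Number of passengers / {feature}, grouped by {hue}")
    ax.set_ylabel("Number of passengers")
    return ax


ax = plot_count_by_class(df, "Age Interval")
```

In a notebook, the backend line can be dropped so that the figure renders inline. Passing the grouping column as hue is what places the classes side by side within each age interval, which makes differences between classes easier to compare than in separate plots.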