Machine learning-based insights
Unlike the previous analysis methods, the methods discussed in this subsection rely on more complex mathematical models and ML algorithms. Given the scope of this book, we will not go into the theoretical details of these models, but it’s still worth seeing some of them in action by applying them to our dataset.
First, let’s consider the feature correlation matrix for our dataset. As the name suggests, it is a matrix (a 2D table) that contains the correlation between each pair of numerical attributes (or features) in our dataset. A correlation between two features is a real number between -1 and 1: its sign indicates the direction of the relationship, and its absolute value indicates the strength. The closer the value is to 1 or -1, the more strongly correlated the two features are, while a value near 0 indicates little or no linear correlation.
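To make the sign and magnitude of a correlation concrete, here is a minimal sketch that computes the Pearson correlation coefficient by hand with NumPy. The helper name pearson_r and the toy lists are hypothetical and are not part of our dataset. Pearson is assumed here because it is the default method used by pandas.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Center each feature around its mean
    x_centered = x - x.mean()
    y_centered = y - y.mean()
    # Sum of co-deviations divided by the product of the deviation norms
    return (x_centered * y_centered).sum() / np.sqrt(
        (x_centered ** 2).sum() * (y_centered ** 2).sum()
    )


# Hypothetical toy features, for illustration only
hours = [1, 2, 3, 4, 5]
score_up = [2, 4, 6, 8, 10]    # rises as hours rises
score_down = [10, 8, 6, 4, 2]  # falls as hours rises

r_up = pearson_r(hours, score_up)      # close to +1: strong positive correlation
r_down = pearson_r(hours, score_down)  # close to -1: strong negative correlation
print(r_up, r_down)
```

Both pairs are equally strongly correlated. Only the direction differs, which is why the absolute value, rather than the raw value, measures the strength.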
To obtain the feature correlation matrix from a pandas DataFrame, we call the corr() method, as shown here:
corr_matrix = combined_user_df.corr()
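To see what the output of corr() looks like, the following sketch applies it to a small stand-in DataFrame. The name toy_df and its columns are hypothetical and are not taken from our dataset. The numeric_only=True argument assumes pandas 1.5 or later. It restricts the computation to numerical columns, which matters when the DataFrame also holds text columns.

```python
import pandas as pd

# Hypothetical stand-in data, for illustration only
toy_df = pd.DataFrame(
    {
        "age": [23, 35, 41, 52, 60],
        "income": [30, 48, 55, 70, 82],
        "churn_score": [0.9, 0.7, 0.5, 0.3, 0.1],
        "city": ["A", "B", "A", "C", "B"],  # non-numerical, excluded below
    }
)

# Pairwise Pearson correlations between the numerical columns only
toy_corr = toy_df.corr(numeric_only=True)
print(toy_corr)

# Look up a single pair of features by label
age_vs_churn = toy_corr.loc["age", "churn_score"]
```

The result is itself a DataFrame, indexed by feature name on both axes. It is symmetric, and each diagonal entry is 1 because every feature is perfectly correlated with itself.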
We...