Identifying variable correlations
When performing data analysis or considering which variables you might want to use in a machine learning model, you will often want to look at variable correlation.
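Correlation describes how strongly two variables tend to change together, and whether that relationship is positive (both rise together) or negative (one rises as the other falls). A correlation on its own does not show that changing one variable causes the other to change.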
Let’s consider an example. If we were to look at all of our players’ birth years and all of their ages, we’d notice a correlation: the earlier the birth year is, the higher the player’s age will be.
While this particular relationship shouldn’t shock anyone, it is a good example of a pair of strongly correlated variables. We can say that age and birth year share a strong negative correlation, meaning that as one goes up, the other goes down.
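The age and birth year example can be checked numerically. The following is a minimal sketch in Python that assumes the Pearson correlation coefficient is the measure in use. The birth years, the reference year, and the `pearson` helper are hypothetical values and names made up for illustration, not data from our player dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical birth years; ages are derived from an assumed reference year
birth_years = [1985, 1990, 1994, 1998, 2001]
reference_year = 2024
ages = [reference_year - year for year in birth_years]

r = pearson(birth_years, ages)
print(round(r, 3))  # very close to -1: a strong negative correlation
```

Because the ages here are computed directly from the birth years, the result sits at the negative extreme. With real data, where a player's age also depends on whether their birthday has passed, the value would likely be strongly negative but not exactly at the extreme.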
Calculating variable correlations
Correlations can be represented numerically with values ranging from -1 to +1, with correlation...