EDA – correlation analysis
Correlation analysis measures the statistical relationship between two variables. The result shows how strongly, and in which direction, a change in one variable is associated with a change in the other. It is an important concept that is widely used in predictive analytics. Correlation analysis should also be completed before building a model and before drawing conclusions about the relationships between variables. Although correlation analysis helps us understand the association between two variables in a dataset, it cannot explain or measure the cause.
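As a minimal sketch of what a correlation coefficient expresses, the following snippet uses base R's cor() function on small, made-up vectors. The vectors x, y, and z are hypothetical and are not taken from the GitHub dataset; they only illustrate a perfect positive association and a strong negative one.

```r
# Hypothetical toy vectors (assumed for illustration, not from the GitHub data)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)   # rises exactly in step with x
z <- c(5, 3, 4, 1, 2)    # tends to fall as x rises

# cor() computes the Pearson coefficient by default
r_xy <- cor(x, y)
r_xz <- cor(x, z)

# A rank-based alternative that is less sensitive to outliers
rho_xz <- cor(x, z, method = "spearman")

print(c(r_xy = r_xy, r_xz = r_xz, rho_xz = rho_xz))
```

A coefficient near 1 or -1 indicates a strong linear association and a value near 0 indicates a weak one. In neither case does the coefficient say which variable, if either, drives the other.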
So far, we haven't explored the relationship between different parameters. In this section, we will focus on the bivariate and multivariate analysis of the GitHub dataset.
We will use the dataset that was created for plotting the heat map to perform the correlation analysis. The following code will get us the required dataset:
cordata<- ausersubset[c("id","full_name","size","watchers_count", "forks_count...
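Because a subset like this mixes identifier columns (id, full_name) with numeric measures, a correlation matrix is normally computed on the numeric columns only. The sketch below assumes a small stand-in data frame named demo with invented values; only the column names size, watchers_count, and forks_count come from the text, and the real cordata may contain further columns.

```r
# Stand-in for cordata: values are hypothetical, column names follow the text
demo <- data.frame(
  id             = 1:6,
  full_name      = c("u1/r1", "u2/r2", "u3/r3", "u4/r4", "u5/r5", "u6/r6"),
  size           = c(120, 450, 80, 900, 300, 60),
  watchers_count = c(10, 40, 5, 95, 30, 2),
  forks_count    = c(3, 12, 1, 40, 9, 0),
  stringsAsFactors = FALSE
)

# Keep only the numeric measures; identifiers carry no meaningful correlation
num_cols <- c("size", "watchers_count", "forks_count")

# Pairwise Pearson correlations, dropping rows with missing values
cormat <- cor(demo[, num_cols], use = "complete.obs")
print(round(cormat, 2))
```

The result is a symmetric matrix with ones on the diagonal, and each off-diagonal cell holds the correlation between one pair of columns.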