Descriptive statistical analysis helps you understand your data. Although base R provides functions for basic statistics, such as summary, we will use two packages that give more detailed output: DataExplorer and fBasics.
Follow these simple steps:
- Because the dataset has a large number of variables, we first create a list of the numeric variable names to use in our descriptive functions:
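# Record the class of every column in train
Class <- as.data.frame(sapply(train, class))
colnames(Class) <- "variable_class"
Class$variable_name <- colnames(train)
# Keep the names of columns whose class is exactly "numeric" (integer columns are not included)
numeric_vars <- Class[Class$variable_class == "numeric", "variable_name"]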
- This produces a list of 1,492 numeric variables. Pass it to the basicStats function in the fBasics package:
library(fBasics)
# basicStats returns one column per variable; transpose so that each row is a variable
descriptives_num <- as.data.frame(t(basicStats(train[, numeric_vars])))
head(descriptives_num)
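To see what the two steps above produce without loading the full dataset, the following is a minimal sketch on a toy data frame. The names train_toy, numeric_vars_toy, describe_one, and descriptives_toy are hypothetical, as are the values. The sketch uses base R only, so describe_one is a stand-in that computes a handful of statistics similar to those basicStats reports; it is not the fBasics implementation, and its column labels are illustrative.

```r
# Toy data frame standing in for `train` (hypothetical values).
train_toy <- data.frame(
  id      = c("a", "b", "c", "d"),
  income  = c(10.5, 20.0, NA, 40.5),
  ratio   = c(0.1, 0.4, 0.3, 0.2),
  n_loans = c(1L, 2L, 3L, 4L),
  stringsAsFactors = FALSE
)

# Step 1: names of the columns whose class is exactly "numeric".
# class(x)[1] guarantees one string per column, so vapply() is safe here.
col_class <- vapply(train_toy, function(x) class(x)[1], character(1))
numeric_vars_toy <- names(col_class)[col_class == "numeric"]
print(numeric_vars_toy)

# Step 2: a base R stand-in for basicStats (assumption: only a few statistics).
describe_one <- function(x) {
  c(nobs    = length(x),
    NAs     = sum(is.na(x)),
    Minimum = min(x, na.rm = TRUE),
    Maximum = max(x, na.rm = TRUE),
    Mean    = mean(x, na.rm = TRUE),
    Median  = median(x, na.rm = TRUE),
    Stdev   = sd(x, na.rm = TRUE))
}

# sapply() yields statistics in rows and variables in columns;
# t() flips that so each row describes one variable.
descriptives_toy <- as.data.frame(
  t(sapply(train_toy[, numeric_vars_toy, drop = FALSE], describe_one))
)
print(descriptives_toy)
```

Note that n_loans is skipped because its class is "integer", not "numeric". If integer columns should be described as well, selecting with is.numeric instead of comparing class names is one option.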