Scalar statistics
Julia's packages provide functions for computing a range of scalar statistics. Each of these functions describes the data in a different way, and we choose among them as required.
Standard deviations and variances
The mean and median we computed earlier (in the describe() function) are measures of central tendency. The mean is the sum of all the values divided by the number of values, and the median is the middle value of the sorted list.
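As a small illustration of these two measures, the following sketch computes them with the mean and median functions from Julia's Statistics standard library. The data vector is made up for this example and is not taken from the dataset used earlier.

```julia
using Statistics

# Hypothetical sample data, chosen only for illustration
data = [2, 4, 4, 5, 10]

# Mean: sum of the values divided by the number of values (25 / 5)
m = mean(data)

# Median: middle value of the sorted list [2, 4, 4, 5, 10]
md = median(data)

println("mean = ", m, ", median = ", md)
```

Here the mean (5.0) is pulled above the median (4.0) by the single large value 10.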
Central tendency is only one piece of information, and we would like to know more about the dataset. In particular, it is useful to know how the data points are spread across the dataset. The min and max functions alone are not enough for this, because the dataset can contain outliers: a single extreme value changes the minimum or maximum and gives a misleading picture of how spread out most of the data points are.
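The effect of an outlier on the minimum and maximum can be seen in a short sketch. The values below are invented for illustration. Note that in Julia, minimum and maximum are the functions that operate on a collection, which is what this example assumes the text means by min and max.

```julia
# Hypothetical measurements that lie close together
data = [10, 12, 11, 13, 12]

# The same measurements with one extreme value appended
with_outlier = vcat(data, 100)

# Range (maximum minus minimum) without and with the outlier
range_clean = maximum(data) - minimum(data)
range_outlier = maximum(with_outlier) - minimum(with_outlier)

println("range without outlier = ", range_clean)
println("range with outlier    = ", range_outlier)
```

One extra value raises the range from 3 to 90, although the other five points are no more spread out than before.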
Variance is a measure of the spread of the data points in a dataset. It is computed from the squared distance of each value from the mean, averaged over the dataset, so it tells us how far, on average, the values lie from the mean.
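The computation can be sketched step by step and compared with the var and std functions from the Statistics standard library. The data is hypothetical. The sketch shows two common conventions, dividing by n (population variance) and by n - 1 (sample variance). Julia's var uses the n - 1 form by default, and which of the two this text intends is not stated, so both are shown.

```julia
using Statistics

# Hypothetical sample data, chosen only for illustration
data = [2, 4, 4, 5, 10]
n = length(data)

# Squared distance of each value from the mean
m = mean(data)
squared_deviations = (data .- m) .^ 2

# Population variance divides by n; sample variance divides by n - 1
population_variance = sum(squared_deviations) / n
sample_variance = sum(squared_deviations) / (n - 1)

# Built-in equivalents: var defaults to the n - 1 form (corrected=true)
builtin_sample = var(data)
builtin_population = var(data; corrected=false)

# The standard deviation is the square root of the variance
sd = std(data)

println("population variance = ", population_variance)
println("sample variance     = ", sample_variance)
println("standard deviation  = ", sd)
```

Because the standard deviation is the square root of the variance, it is expressed in the same units as the data, which often makes it easier to interpret.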
The following is the formula for variance...