DataFrames and statistics
We were introduced to Julia’s implementation of DataFrames in the previous section, where we made use of a series of datasets first made available by the Comprehensive R Archive Network (CRAN), hence the epithet R-Datasets.
A full listing can be obtained from the R-Datasets page and also from the GitHub page of the package maintainer, Vincent Arel-Bundock.
The equivalent package in Python is pandas, for which there is also a Julia package (Pandas.jl), which is a wrapper around the Python one, available via the JuliaPy GitHub page.
When dealing with tabulated datasets, there are occasions when some of the values are missing. One of the features of statistical languages is that they can handle such situations.
Support for this has been changed in version 1.0 due to the introduction of the Missings.jl package (via the JuliaData group).
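To make the handling of missing values concrete, the following is a minimal sketch that assumes Julia 1.0 or later, where the missing value is available without loading any extra package. The readings vector and the other variable names are hypothetical and chosen only for illustration; only the Statistics standard library is used.

```julia
using Statistics  # standard library; provides mean

# A hypothetical column of readings in which one value was not recorded
readings = [1.5, missing, 3.5, 4.0]

# missing propagates: a sum that touches a missing value is itself missing
total = sum(readings)

# skipmissing iterates over the recorded values only
recorded_mean = mean(skipmissing(readings))

# coalesce, broadcast over the column, substitutes a default for each missing entry
filled = coalesce.(readings, 0.0)

# ismissing tests individual elements, so count gives the number of gaps
n_missing = count(ismissing, readings)
```

Whether to skip the missing entries or to replace them with a default depends on the analysis; skipping leaves the statistic based only on observed values, whereas substituting a value such as zero will bias it.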
DataFrames
The DataFrame is one of the cornerstones of Julia. Implementations go back...