Summary
In this chapter, we learned what data munging is and why it is necessary for data science. Julia facilitates data munging through the DataFrames.jl package, which offers features such as these:
NA
: A missing value in Julia is represented by a specific data type, NA.

DataArray
: DataArray, provided by the DataFrames.jl package, lets us store missing values in an array.

DataFrame
: DataFrame is a 2-D data structure, like a spreadsheet. It is very similar to the data frames of R or pandas, and provides many functionalities to represent and analyze data. DataFrames have many features well suited for data analysis and statistical modeling:

- A dataset can have different types of data in different columns.
- Values in the same row are related to one another across columns, and all columns have the same length.
- Columns can be labeled. Labels help us become familiar with the data and access columns without having to remember their numerical indices.
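The points above can be illustrated with a minimal sketch. It assumes a recent DataFrames.jl release (1.x) on Julia 1.x, where the built-in `missing` value and ordinary arrays take over the roles that NA and DataArray play in this chapter. The column names `id`, `name`, and `score` are hypothetical and chosen only for illustration.

```julia
using DataFrames   # assumption: a recent DataFrames.jl (1.x) on Julia 1.x
using Statistics   # standard library, provides mean

# An array that stores a missing value alongside ordinary numbers.
# `missing` stands in here for the NA described in the chapter.
scores = [1.5, missing, 3.0]

# Columns of different types, all of the same length, each with a label.
# The column names are hypothetical.
df = DataFrame(id = 1:3, name = ["a", "b", "c"], score = scores)

# Access a column by its label instead of its numerical index.
score_column = df.score

# The values in one row belong together across columns.
second_row = df[2, :]

# Skip the missing entry when computing a statistic.
average_score = mean(skipmissing(df.score))
```

On older versions of the package, the same ideas would be expressed with NA and DataArray as described above, so the exact calls may differ from this sketch.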
We learned...