Data cleaning, or tidying up the data, is the process of transforming raw data into a consistent form that is easier to analyze. The R programming language includes a comprehensive set of tools designed for cleaning data effectively. Here we clean a dataset in the following steps:
- Load the libraries needed for cleaning and tidying up the dataset:
> library(dplyr)
> library(tidyr)
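The `library()` calls above stop with an error if a package has not been installed. As a minimal sketch, assuming you would rather get a readable message than an error, the loading step can be wrapped in a small helper. The name `attach_if_available` is hypothetical and is not part of dplyr, tidyr, or base R:

```r
# Hypothetical helper (not from the original text): attach a package if it
# is installed, otherwise print a hint and return FALSE instead of failing.
attach_if_available <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is not installed; try install.packages(\"", pkg, "\")")
    return(FALSE)
  }
  # character.only = TRUE lets library() accept the name as a string
  suppressPackageStartupMessages(library(pkg, character.only = TRUE))
  TRUE
}

# Named logical vector: TRUE for each package that was attached
loaded <- vapply(c("dplyr", "tidyr"), attach_if_available, logical(1))
print(loaded)
```

If either entry is `FALSE`, install the missing package with `install.packages()` before continuing with the remaining steps.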
- Examine the summary of the dataset, which helps us decide which attributes to focus on:
> summary(longley)
 GNP.deflator         GNP          Unemployed     Armed.Forces     Population         Year         Employed
Min.   : 83.00   Min.   :234.3   Min.   :187.0   Min.   :145.6   Min.   :107.6   Min.   :1947   Min.   :60.17
1st Qu.: 94.53   1st Qu.:317.9   1st Qu.:234.8   1st Qu.:229.8   1st Qu.:111.8   1st Qu.:1951   1st Qu.:62.71
Median :100.60   Median...