Applying Your Knowledge to a Real-life Data Wrangling Task
Suppose you are asked this question: In India, did enrollment in primary, secondary, and tertiary education increase as per capita GDP improved over the past 15 years? The actual modeling and analysis will be done by a senior data scientist, who will use machine learning and data visualization. As a data wrangling expert, your job is to acquire and provide a clean dataset that contains educational enrollment and GDP data side by side.
Suppose you have a link to a United Nations dataset and can download the education data (for all the nations of the world). However, this dataset has some missing values and, moreover, it does not contain any GDP information. Someone has also given you a separate CSV file (downloaded from the World Bank site) that contains GDP data, but in a messy format.
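To make the target concrete, here is a minimal sketch of the kind of result we are after, assuming pandas is available. The two in-memory CSV strings are placeholders for the real downloads, and every column name and number in them is made up for illustration; the actual UN and World Bank files are laid out differently. The helper name build_clean_dataset is likewise hypothetical.

```python
import io

import pandas as pd

# Placeholder CSV text standing in for the two real downloads.
# All column names and values here are invented for illustration only.
education_csv = io.StringIO(
    "Country,Year,Enrollment\n"
    "India,2005,100.0\n"
    "India,2010,\n"          # a missing enrollment value
    "India,2015,120.0\n"
    "Brazil,2005,50.0\n"
)
gdp_csv = io.StringIO(
    "Country,Year,GDP_per_capita\n"
    "India,2005,700.0\n"
    "India,2010,1300.0\n"
    "India,2015,1600.0\n"
)


def build_clean_dataset(education_src, gdp_src, country="India"):
    """Return enrollment and GDP side by side for one country (hypothetical helper)."""
    edu = pd.read_csv(education_src)
    gdp = pd.read_csv(gdp_src)

    # Keep only the country of interest and drop rows with no enrollment figure.
    edu = edu[edu["Country"] == country].dropna(subset=["Enrollment"])
    gdp = gdp[gdp["Country"] == country]

    # An inner join keeps only the years present in both sources.
    merged = edu.merge(gdp, on=["Country", "Year"], how="inner")
    return merged.sort_values("Year").reset_index(drop=True)


clean = build_clean_dataset(education_csv, gdp_csv)
print(clean)
```

Dropping the rows with missing enrollment is only one option; depending on what the analyst needs, imputing those values may be preferable.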
In this activity, we will examine how to handle these two separate sources and clean the data to prepare...