4. A Deep Dive into Data Wrangling with Python
Activity 4.01: Working with the Adult Income Dataset (UCI)
Solution:
These are the steps to complete this activity:
- Load the necessary libraries:
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
- Read in the Adult Income Dataset (given as a .csv file) from the local directory and check the first five records:
    df = pd.read_csv("../datasets/adult_income_data.csv")
    df.head()
Note
The highlighted path must be changed based on the location of the file on your system.
The output is as follows:
- Create a script that will read a text file line by line and extract the first line, which is the header of the .csv file:
    names = []
    with open('../datasets/adult_income_names.txt','r') as f:
        for line in f:
            f.readline()
        ...
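Note that calling f.readline() inside a for line in f loop advances the same file position that the loop itself uses, so the loop body only sees every other line. The following is a minimal sketch of that behavior, not the activity's code: the helper name lines_seen_by_loop and the sample text are hypothetical, and io.StringIO stands in for the real file so that the snippet runs without the dataset.

```python
import io


def lines_seen_by_loop(text):
    """Return the lines that the loop body sees when every iteration
    also calls readline() on the same file object."""
    seen = []
    # io.StringIO is a stand-in for open(...); both are iterated the same way.
    with io.StringIO(text) as f:
        for line in f:
            # readline() consumes the line that follows the current one,
            # so the next loop iteration starts one line further on.
            f.readline()
            seen.append(line.strip())
    return seen


# Hypothetical sample input, not taken from the dataset.
print(lines_seen_by_loop("first\nsecond\nthird\nfourth\n"))  # ['first', 'third']
```

This pattern is convenient when the entries of interest alternate with lines that should be skipped. Whether that matches the layout of adult_income_names.txt is an assumption that you should check against your copy of the file.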