2. Exploratory Data Analysis and Visualization
Activity 2.01: Summary Statistics and Missing Values
The steps to complete this activity are as follows:
- Import the required libraries:
    import json
    import pandas as pd
    import numpy as np
    import missingno as msno
    from sklearn.impute import SimpleImputer
    import matplotlib.pyplot as plt
    import seaborn as sns
- Read the data. Use pandas' read_csv() method to read the CSV file into a pandas DataFrame:
    data = pd.read_csv('../Datasets/house_prices.csv')
- Use pandas' info() and describe() methods to view the summary statistics of the dataset:
    data.info()
    data.describe().T
The output of info() will be as follows:
Figure 2.50: The output of the info() method (abbreviated)
The output of describe() will be as follows:
Figure 2.51: The output of the describe() method (abbreviated)
- Find the total count and total percentage of missing values in each column of the DataFrame and display them for columns having at least...
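As a rough illustration of the missing-value step, the following sketch counts and computes the percentage of missing values per column on a small synthetic DataFrame. The column names, the values, the missing_summary helper, and its min_count parameter are all hypothetical and are not taken from the activity's dataset or solution. The cut-off used in the activity is not reproduced here, so min_count simply defaults to 1.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real dataset (made-up columns and values).
df = pd.DataFrame({
    "area": [65.0, np.nan, 68.0, np.nan],
    "alley": [np.nan, np.nan, np.nan, "Grvl"],
    "price": [208500, 181500, 223500, 140000],
})

def missing_summary(frame, min_count=1):
    """Return per-column missing-value counts and percentages.

    Only columns with at least `min_count` missing values are kept.
    `min_count` is a hypothetical parameter chosen for this sketch.
    """
    mask = frame.isnull()
    summary = pd.DataFrame({
        "count": mask.sum(),           # number of missing entries per column
        "percent": mask.mean() * 100,  # share of rows that are missing
    })
    summary = summary[summary["count"] >= min_count]
    return summary.sort_values("count", ascending=False)

summary = missing_summary(df)
print(summary)
```

Calling isnull().mean() gives the fraction of missing rows directly, which avoids dividing by the row count by hand. On the real data, the same helper could be called as missing_summary(data) with whatever threshold the activity specifies.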