2. Exploratory Data Analysis and Visualization
Activity 2.01: Summary Statistics and Missing Values
The steps to complete this activity are as follows:
- Import the required libraries:
    import json
    import pandas as pd
    import numpy as np
    import missingno as msno
    from sklearn.impute import SimpleImputer
    import matplotlib.pyplot as plt
    import seaborn as sns
- Read the data. Use pandas' read_csv() function to read the CSV file into a pandas DataFrame:
    data = pd.read_csv('../Datasets/house_prices.csv')
- Use pandas' .info() and .describe() methods to view the summary statistics of the dataset:
    data.info()
    data.describe().T
The output of info() will be as follows:
The output of describe() will be as follows:
- Find the total count and total percentage of missing values in each column of the DataFrame and display them for columns having at least...
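The missing-value step can be sketched roughly as follows. This is a minimal illustration, not the activity's own solution: the cutoff in the step is truncated in this text, so the sketch exposes it as a hypothetical parameter, min_pct, rather than guessing its value. The helper name missing_summary and the small toy DataFrame (with made-up values standing in for house_prices.csv) are likewise assumptions.

```python
import numpy as np
import pandas as pd


def missing_summary(df, min_pct=0.0):
    """Return the missing-value count and percentage for each column.

    min_pct is a hypothetical cutoff in percent (an assumption, since the
    source step does not show its threshold). Only columns with at least
    one missing value and a missing percentage >= min_pct are returned.
    """
    mask = df.isnull()
    total = mask.sum()            # number of missing values per column
    percent = mask.mean() * 100   # share of missing values per column, in %
    summary = pd.DataFrame({"Total": total, "Percent": percent})
    summary = summary[(summary["Total"] > 0) & (summary["Percent"] >= min_pct)]
    return summary.sort_values("Percent", ascending=False)


# Toy data (made-up values) standing in for the real dataset.
toy = pd.DataFrame({
    "LotArea": [8450, 9600, 11250, 9550],
    "Alley": [np.nan, np.nan, "Grvl", np.nan],
    "GarageYrBlt": [2003.0, np.nan, 2001.0, 1998.0],
})

print(missing_summary(toy))              # all columns with missing values
print(missing_summary(toy, min_pct=50))  # only columns missing >= 50%
```

On the real data, the same helper could be called as missing_summary(data, min_pct=...) once the intended threshold is known.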