10. Data Analytics with pandas and NumPy
Activity 24: Data Analysis to Find the Outliers in Pay versus the Salary Report in the UK Statistics Dataset
Solution
- Begin with a new Jupyter Notebook.
- Copy the UK Statistics dataset file into the folder where you will perform this activity.
- Import the necessary data visualization packages: pandas as pd, matplotlib.pyplot as plt, and seaborn as sns:
    import pandas as pd
    import matplotlib.pyplot as plt
    %matplotlib inline
    import seaborn as sns
    # Set up seaborn dark grid
    sns.set()
- Place the UKStatistics.csv file in the same folder as your Jupyter Notebook, then choose a variable to store the DataFrame that is read from it. In this case, it would be as follows:
    statistics_df = pd.read_csv('UKStatistics.csv')
- Now, to display the dataset, call .head() on the statistics_df variable. By default, .head() shows only the first five rows, not the entire dataset:
    statistics_df.head()
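The load-and-preview steps above can also be tried without the dataset file. The following is a minimal sketch, assuming a small made-up CSV held in memory; the column names (role, pay, salary) and all values are hypothetical and are not taken from UKStatistics.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for UKStatistics.csv: columns and values are made up
csv_text = """role,pay,salary
Analyst,30000,31000
Manager,52000,50000
Director,90000,88000
Clerk,21000,21500
Engineer,45000,47000
Officer,38000,38000
Advisor,61000,60500
"""

# read_csv accepts a file path or a file-like object such as StringIO
sample_df = pd.read_csv(io.StringIO(csv_text))

# head() returns the first five rows by default; pass n for a different count
preview = sample_df.head()
first_three = sample_df.head(3)

print(sample_df.shape)
print(preview)
```

Here sample_df holds seven rows, while preview holds only the first five, which is why .head() is a preview rather than a full listing. Back in the activity notebook, the same call is run on statistics_df.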
The output will be as follows: