There are more functions built-in for working with data frames than we have used so far. If we were to take one of the data frames from a prior example in this chapter, the Titanic dataset from an Excel file, we could use additional functions to help portray and work with the dataset.
As a repeat, we load the dataset using the script:
import pandas as pd df = pd.read_excel('http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic3.xls')
We can then inspect the data frame using the info function, which displays the characteristics of the data frame:
df.info()
Some of the interesting points are as follows:
- 1309 entries
- 14 columns
- Not many fields with valid data in the body column—most were lost
- Does give a good overview of the types of data involved
We can also use the describe function, which gives us a statistical...