We saw in the last section that missing data is often indicated by NaN. More precisely, the indicator depends on the datatype of the column: missing timestamps are indicated by the pandas object NaT, while missing data of any other non-numeric datatype is indicated by None.
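As a minimal sketch of these three indicators, the following example builds one series per datatype, each with a single missing entry. The variable names and the sample values are hypothetical and chosen only for illustration. The object datatype is requested explicitly for the third series, on the assumption that this is the non-numeric case the text refers to.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: each series has one missing entry at position 1
numbers = pd.Series([1.0, np.nan, 3.0])                       # float column
times = pd.Series(pd.to_datetime(['2021-01-01', None,
                                  '2021-01-03']))              # timestamp column
labels = pd.Series(['x', None, 'z'], dtype=object)             # generic object column

# Print the missing entry of each series
print(numbers.iloc[1])   # nan
print(times.iloc[1])     # NaT
print(labels.iloc[1])    # None
```

Note that the missing timestamp was entered as None and converted to NaT by pandas, while the object column keeps None unchanged.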
The dataframe method isnull returns a Boolean dataframe with the entry True at all places with missing data.
We will study various methods for treating missing data before returning to the solar cell data example.
Let's demonstrate these methods on a small dataframe:
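from numpy import array, nan
import pandas as pd
# small example frame with two missing entries (the alias NaN was removed in NumPy 2.0)
frame = pd.DataFrame(array([[1., -5., 3., nan],
                            [3., 4., nan, 17.],
                            [6., 8., 11., 7.]]),
                     columns=['a', 'b', 'c', 'd'])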
This dataframe is displayed as:
     a    b     c     d
0  1.0 -5.0   3.0   NaN
1  3.0  4.0   NaN  17.0
2  6.0  8.0  11.0   7.0
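The method isnull mentioned at the beginning of this section can be tried out on this small dataframe. The following is a sketch only: the frame is rebuilt so that the block runs on its own, and the names mask, missing_per_column and rows_with_gaps are introduced here for illustration. The two summary steps at the end are additions that show one possible use of the Boolean dataframe.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame so that this block is self-contained
frame = pd.DataFrame(np.array([[1., -5., 3., np.nan],
                               [3., 4., np.nan, 17.],
                               [6., 8., 11., 7.]]),
                     columns=['a', 'b', 'c', 'd'])

# Boolean dataframe of the same shape: True where data is missing
mask = frame.isnull()
print(mask)

# Illustrative follow-up steps (not from the text above):
# count the missing entries in each column
missing_per_column = mask.sum()
# select the rows that contain at least one missing entry
rows_with_gaps = frame[mask.any(axis=1)]
```

Here mask has the entry True only at row 0, column 'd', and at row 1, column 'c', so rows_with_gaps contains the rows with index 0 and 1.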
Dataframes with missing data can be handled in different...