Building pandas Series and DataFrames
A Series is a one-dimensional labeled array that can hold any data type, including integers, floats, strings, and objects. The axis labels of a Series are collectively referred to as the index, which allows for easy data manipulation and access. A key feature of the pandas Series is its ability to handle missing data, represented as a NumPy nan (Not a Number).
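As a minimal sketch of these ideas, the snippet below builds a small Series with string labels and one missing value. The variable names, labels, and numbers are made up for illustration and are not from any dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three readings, one of them missing.
temps = pd.Series([21.5, np.nan, 19.0], index=["mon", "tue", "wed"])

print(temps.index.tolist())   # the axis labels (the index)
print(temps["wed"])           # access a value by its label
print(temps.isna().tolist())  # True where the value is missing

# A Series can also hold mixed Python objects; pandas then uses the object dtype.
mixed = pd.Series([1, "two", 3.0])
print(mixed.dtype)
```

Because nan is a float, the temps Series keeps a floating-point dtype even though one entry is missing.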
Important
NumPy’s nan is a special floating-point value that is commonly used as a marker for missing data in numerical datasets. Because nan is a float, it can take part in numerical computations and be included in floating-point arrays without changing their data type, which helps keep data types consistent in numeric datasets. Unlike other values, nan does not compare equal to anything, not even to itself, which is why we need to use functions such as numpy.isnan() to check for nan.
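The behavior described in this note can be checked directly. The following is a small illustrative sketch with made-up values; it shows that nan leaves a float array's dtype unchanged, that equality comparison cannot detect it, and that numpy.isnan() can.

```python
import numpy as np

# Hypothetical values with one missing entry.
values = np.array([1.0, np.nan, 3.0])
print(values.dtype)          # the array stays floating-point

# nan never compares equal, not even to itself.
print(np.nan == np.nan)      # False

# Use numpy.isnan() to locate missing entries instead of ==.
mask = np.isnan(values)
print(mask.tolist())         # [False, True, False]

# nan propagates through ordinary arithmetic ...
total = values.sum()
# ... while nan-aware functions such as numpy.nansum() skip it.
clean_total = np.nansum(values)
print(total, clean_total)
```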
Furthermore, the Series object provides a host of methods for operations such as statistical...