Introduction
In the previous chapter, we learned how to use the pandas, numpy, and matplotlib libraries while handling various datatypes. In this chapter, we will learn about several advanced operations involving pandas DataFrames and numpy arrays. We will work with several powerful DataFrame operations, including subsetting, filtering, grouping, checking uniqueness, and dealing with missing data. These techniques are useful in almost any data analysis task: whenever we want to look at only a portion of the data, we must subset, filter, or group it. The pandas library also contains functionality to create descriptive statistics of a dataset, and these methods allow us to start shaping our perception of the data. Ideally, a dataset would be complete, but in reality there is often missing or corrupt data. This can happen for a variety of reasons that we can't control, such as user error and sensor malfunction. Pandas has built-in functionalities...
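As a rough preview of the operations named in this introduction, the following minimal sketch applies each of them to a small DataFrame. The data and the column names (sensor, temp) are made up for illustration and are not taken from the chapter's datasets. Filling missing values with the column mean is only one possible strategy, chosen here as an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical sensor readings, invented for illustration only.
df = pd.DataFrame({
    "sensor": ["a", "a", "b", "b", "c"],
    "temp": [20.5, np.nan, 22.0, 23.0, 19.0],
})

# Subsetting: keep only the columns of interest.
subset = df[["temp"]]

# Filtering: keep rows that satisfy a condition (NaN compares as False).
warm = df[df["temp"] > 20]

# Grouping: mean temperature per sensor (missing values are skipped).
mean_by_sensor = df.groupby("sensor")["temp"].mean()

# Checking uniqueness: distinct sensor labels, in order of appearance.
unique_sensors = df["sensor"].unique()

# Descriptive statistics: count, mean, std, min, quartiles, max.
summary = df["temp"].describe()

# Missing data: count the gaps, then fill them with the column mean
# (an assumed strategy; dropping rows with dropna() is another option).
n_missing = df["temp"].isna().sum()
filled = df["temp"].fillna(df["temp"].mean())

print(warm)
print(mean_by_sensor)
print(summary)
```

Each result is a new object, so the original df is left unchanged and can be reused for further exploration.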