Summary
In this chapter, we took a deep dive into the pandas library to learn advanced data wrangling techniques. We started with advanced subsetting and filtering on DataFrames, then covered boolean indexing for conditionally selecting a subset of data. We also covered how to set and reset the index of a DataFrame, including setting it when the DataFrame is initialized.
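As a quick reminder of these ideas, here is a minimal sketch. The column names and values are made up for illustration and do not come from the chapter's datasets.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cy"],
        "age": [34, 19, 52],
        "city": ["Oslo", "Rome", "Oslo"],
    }
)

# Boolean indexing: keep only the rows where both conditions hold
adults_in_oslo = df[(df["age"] > 30) & (df["city"] == "Oslo")]

# Use an existing column as the index, then restore the default integer index
by_name = df.set_index("name")
restored = by_name.reset_index()
```

Each condition is wrapped in parentheses because `&` binds more tightly than the comparison operators.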
Next, we learned about the groupby method, which has a close counterpart in traditional relational database systems. Then, we turned to an important skill for data wrangling: checking for and handling missing data. We showed how pandas helps handle missing data using various imputation techniques, and we discussed methods for dropping missing values. We then showed methods and usage examples for concatenating and merging DataFrame objects. We also saw the join method and how it compares to a similar operation in SQL.
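The following sketch recaps these operations in one place. The `sales` and `regions` tables are hypothetical, and mean imputation is just one of several possible imputation choices.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value
sales = pd.DataFrame(
    {"store": ["A", "A", "B", "B"], "units": [10, np.nan, 4, 6]}
)

# Check for missing data
missing_count = int(sales["units"].isnull().sum())

# Impute the missing value with the column mean, or drop the row instead
filled = sales.fillna({"units": sales["units"].mean()})
dropped = sales.dropna()

# Group rows by store and aggregate, similar to GROUP BY in SQL
totals = filled.groupby("store")["units"].sum()

# Merge on a shared column, similar to an SQL inner join
regions = pd.DataFrame({"store": ["A", "B"], "region": ["North", "South"]})
merged = pd.merge(filled, regions, on="store", how="inner")

# join aligns on the index instead (a left join by default)
joined = filled.set_index("store").join(regions.set_index("store"))
```

Note that `merge` matches rows on column values, whereas `join` matches on the index, which is why both frames are re-indexed by `store` before joining.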
Lastly, miscellaneous useful methods on DataFrames...