Summary
This chapter covered the key steps we need to take right after we convert our raw data into a pandas DataFrame. We explored techniques for examining the structure of our data, including the number of rows and columns and the data types. We also learned how to generate frequencies for categorical variables, and began to look at how the values of one variable change with the values of another. Finally, we saw how to examine the distribution of continuous variables, both with sample statistics such as the mean, minimum, and maximum, and by plotting. This sets us up for the next chapter, where we will use techniques to identify outliers in our data.
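As a compact recap, the steps above can be sketched in a few lines of pandas. This is only an illustrative sketch, not the chapter's own code: the DataFrame and its column names (region, temperature) are made up for the example, and plotting is left out so that the snippet depends on pandas alone.

```python
import pandas as pd

# Hypothetical toy data standing in for a freshly loaded dataset;
# the column names and values are assumptions for illustration only.
df = pd.DataFrame({
    "region": ["north", "south", "south", "east", "north", "south"],
    "temperature": [12.5, 18.0, 19.5, 15.0, 11.0, 20.5],
})

# Structure: number of rows and columns, and the data type of each column.
n_rows, n_cols = df.shape
dtypes = df.dtypes

# Frequencies for a categorical variable.
region_counts = df["region"].value_counts()

# How one variable changes with another: mean temperature per region.
mean_by_region = df.groupby("region")["temperature"].mean()

# Distribution of a continuous variable through sample statistics.
temp_stats = df["temperature"].agg(["mean", "min", "max"])

print(n_rows, n_cols)
print(dtypes)
print(region_counts)
print(mean_by_region)
print(temp_stats)
```

With matplotlib installed, a call such as df["temperature"].hist() would add the plotting step mentioned above.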