Chapter 2: Exploring and Cleaning Up Data with fastai
In the previous chapter, we got started with the fastai framework by setting up a coding environment for it, working through a concrete application example (MNIST), and investigating two frameworks that relate to fastai in different ways: PyTorch and Keras. In this chapter, we are going to dive deeper into an important aspect of fastai: ingesting, exploring, and cleaning up data. In particular, we are going to explore a selection of the datasets that are curated by fastai.
By the end of this chapter, you will be able to describe the complete set of curated datasets that fastai supports, use the facilities of fastai to examine these datasets, and clean up a dataset by dealing with its missing and non-numeric values.
Here are the recipes that will be covered in this chapter:
- Getting the complete set of oven-ready fastai datasets
- Examining tabular datasets with fastai
- Examining text datasets with fastai
- Examining image datasets with fastai
- Cleaning up raw datasets with fastai