Before we dive into data analysis, the data needs to be properly prepared and structured. Some datasets, such as structured computer logs, are ready to use from the start, but in most projects the bulk of the time goes into preparing the data. This process inevitably requires decisions that depend on the specifics of the task.
In this chapter, we will learn how to prepare the data with pandas, using the dataset we collected from Wikipedia in Chapter 7, Scraping Data from the Web with Beautiful Soup 4, as an example.
We will cover the following topics in this chapter:
- Quick start with pandas
- Working with real data
- Regular expressions
- Using custom functions with pandas dataframes
- Writing the file