Previous chapters dealt with data manipulation at a macroscopic level, without much emphasis on the values in each data entry. In other words, the content up to this point has focused on processing datasets as a whole.
In the next two chapters, I will discuss data wrangling at a more microscopic level, placing emphasis on the individual values in the dataset. This chapter is about working with text data: I will introduce regular expressions and discuss their use for recognizing patterns in strings. After a brief introduction to regular expressions, I will demonstrate a specific application in a project that extracts street names from a dataset containing addresses.
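As a small preview of what recognizing a pattern in a string can look like, here is a minimal sketch in Python using the standard `re` module. The sample addresses, the `STREET_PATTERN` expression, and the `extract_street` helper are all hypothetical illustrations, not the dataset or the approach developed later in the chapter. The pattern also assumes that every address starts with a house number and that the street name ends in one of three street-type words.

```python
import re

# Hypothetical sample addresses (not the chapter's dataset).
addresses = [
    "123 Main St",
    "4567 Elm Avenue",
    "89 Oak Blvd Apt 2",
    "no number here",
]

# Assumed pattern: a leading house number, whitespace, then the shortest
# run of text that ends in one of a few street-type words.
STREET_PATTERN = re.compile(r"^\d+\s+(.+?\s(?:St|Avenue|Blvd))\b")


def extract_street(address):
    """Return the street name found in an address, or None if no match."""
    match = STREET_PATTERN.search(address)
    return match.group(1) if match else None


streets = [extract_street(address) for address in addresses]
print(streets)
```

Real address data is rarely this regular, so a pattern like this would likely need more street types and more tolerance for inconsistent formatting.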
This chapter will include the following sections:
- Logistical overview ...