As mentioned previously, data preprocessing and data transformation are two of the most essential steps in data mining and other data science workflows. During the preprocessing stage, much of the data we handle arrives as strings: many publicly available datasets store names, dates, categories, and even numeric values as text. Hence, string manipulation techniques are an essential part of exploratory data analysis (EDA).
In this appendix, we are going to learn about the following topics:
- String manipulation
- Using pandas vectorized string functions
- Using regular expressions
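
As a quick preview of these three topics, the following is a minimal sketch rather than a definitive recipe. The sample names are hypothetical and invented purely for illustration; the sketch only assumes that pandas is installed. It cleans the same messy values first with plain Python string methods, then with pandas vectorized string functions, and finally uses a regular expression to pull out the last word of each name.

```python
import re

import pandas as pd

# Hypothetical sample data, invented for illustration only.
raw = ["  Alice SMITH ", "bob jones", "Carol  White"]

# 1. String manipulation with plain Python methods:
#    collapse runs of whitespace, trim the ends, and normalize the case.
cleaned = [" ".join(name.split()).title() for name in raw]

# 2. pandas vectorized string functions:
#    the same cleanup applied to a whole Series through the .str accessor.
names = pd.Series(raw)
cleaned_series = names.str.split().str.join(" ").str.title()

# 3. Regular expressions:
#    capture the final word of each cleaned name.
surnames = [re.search(r"(\w+)$", name).group(1) for name in cleaned]

print(cleaned)
print(cleaned_series.tolist())
print(surnames)
```

The list comprehension and the `.str` chain produce the same cleaned values; the vectorized form is usually the more convenient choice once the data already lives in a pandas `Series` or `DataFrame` column.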