Transforming data
Data transformation is the process of converting data from one format to another. Depending on the nature of the data, this can involve simple or complex manipulation.
In most cases, the data that you retrieve from a data source is not in a format that can be used as-is, and you might have to take some additional steps to clean it.
Examples of basic transformations include the following:
- Changing data types
- Filtering (rows and/or fields)
- Creating conditional columns
- Splitting columns
- Renaming/reformatting
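To make the basic transformations listed above concrete, here is a minimal sketch in plain Python. The sample rows, the field names (cust_name, order_total, region), and the threshold of 100 are all hypothetical and chosen only for illustration; a real tool would apply the same ideas through its own interface.

```python
# Hypothetical raw rows, as they might arrive from a text export:
# every value is a string, whatever it actually represents.
raw_rows = [
    {"cust_name": "Ada Lovelace", "order_total": "120.50", "region": "EU"},
    {"cust_name": "Alan Turing", "order_total": "35.00", "region": "EU"},
    {"cust_name": "Grace Hopper", "order_total": "410.00", "region": "US"},
]


def transform(rows):
    """Apply each of the basic transformations to a list of row dicts."""
    result = []
    for row in rows:
        # Filtering rows: keep only the EU orders.
        if row["region"] != "EU":
            continue

        # Changing data types: text -> float.
        total = float(row["order_total"])

        # Splitting columns: one name field -> first and last name.
        first, last = row["cust_name"].split(" ", 1)

        result.append({
            # Renaming: the split name parts get clearer field names.
            "first_name": first,
            "last_name": last,
            "order_total": total,
            # Conditional column: value derived from another column.
            "order_size": "large" if total >= 100 else "small",
            # Filtering fields: "region" is deliberately not copied over.
        })
    return result


clean_rows = transform(raw_rows)
for row in clean_rows:
    print(row)
```

Each comment in the loop maps to one item in the list, so the sketch can be read as a checklist rather than as a recommended implementation.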
Some examples of common cleaning tasks are as follows:
- Getting rid of trailing spaces at the end of a text field
- Reconciling multiple formats saved in a date field (such as Jan-19, Jan 2019, 01-19, and so on)
- Concatenating Title, First Name, Last Name, and so on to get the person's name
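The three cleaning tasks above can be sketched with the Python standard library. This is only an illustration under stated assumptions: the helper names (trim_trailing, parse_month, full_name) are hypothetical, the list of date formats covers just the three variants mentioned above, and month abbreviations are assumed to be in English (the default locale).

```python
from datetime import datetime

# Assumed formats, matching the examples Jan-19, Jan 2019, and 01-19.
DATE_FORMATS = ("%b-%y", "%b %Y", "%m-%y")


def trim_trailing(text):
    """Remove trailing whitespace from a text field."""
    return text.rstrip()


def parse_month(text):
    """Try each known format in turn; return 'YYYY-MM', or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%Y-%m")
        except ValueError:
            continue  # this format did not match, try the next one
    return None


def full_name(title, first_name, last_name):
    """Join the non-empty name parts with single spaces."""
    parts = (title, first_name, last_name)
    return " ".join(p.strip() for p in parts if p and p.strip())


print(repr(trim_trailing("Seattle   ")))
print([parse_month(d) for d in ("Jan-19", "Jan 2019", "01-19")])
print(full_name("Dr.", "Grace", "Hopper"))
```

Returning None for an unrecognized date, instead of guessing, keeps bad values visible so that they can be reviewed separately.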
Hence, the first step after retrieving data from a data store is to clean the data and convert it into a reusable format. In the following example, we will perform three data transformation...