Extracting numbers from text
Another common data preparation step is when we need to extract a number from text values. An excellent example is when we want to extract a flat number or a ZIP code from an address. Other examples include extracting the numeric part of a sales order number or cleaning full names of typos, such as when some names contain numbers. We will continue using the Chapter
5
, Common
Data Preparation
Steps
, Replaced
Values.pbix
sample file from the previous section. In our scenario, we want to add two new columns to the Customer table, as follows:
- Extract Flat Number as a new column from AddressLine1.
- Extract the rest of the address, Street Name, as a new column.
As the following image shows, the AddressLine1
column contains the flat number in different parts of the address; therefore, splitting by transitioning from digit to non-digit would not work:
Figure 5.49: Flat numbers appear in different places in AddressLine1
To...