Fuzzy Transformations—how SSIS understands fuzzy similarities
Suppose that the input data is a CSV file containing department names, and there is no guarantee that the names are spelt consistently. You may have names such as "Management" and "Managmnt" together in the same data file. In such cases an ordinary Lookup Transform cannot detect these similarities, because Lookup only matches terms that are completely identical. This is where we need to apply some Fuzzy operations.
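To make the difference concrete, here is a minimal Python sketch that contrasts an exact lookup with a similarity score. It is only an illustration of the idea: it uses Python's standard difflib module, which is not the algorithm SSIS uses internally, and the reference dictionary and its key values are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical reference data: department name -> department key.
reference = {"Management": 1, "Finance": 2}


def exact_lookup(name):
    """Mimic an exact-match lookup: returns None unless the name is identical."""
    return reference.get(name)


def similarity(a, b):
    """Return a score between 0.0 and 1.0; 1.0 means the strings are identical."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


if __name__ == "__main__":
    # The misspelt name is not found by the exact lookup...
    print(exact_lookup("Managmnt"))
    # ...but it still scores highly against the correctly spelt name.
    print(similarity("Management", "Managmnt"))
```

The exact lookup finds nothing for the misspelt name, while the similarity score stays high, which is the kind of signal a fuzzy operation works with.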
SSIS has two Fuzzy Transformations, which catch fuzzy similarities between terms and help us in master data management, as follows:
Fuzzy Grouping: This component will group data rows whose similarity to each other meets a predefined similarity threshold
Fuzzy Lookup: This component will look at a reference table to find matching keywords based on a predefined similarity threshold
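The two operations can be sketched in a few lines of Python to show what a similarity threshold does in each case. This is a simplified illustration under stated assumptions, not how SSIS implements its components: the scoring comes from Python's difflib module, the function names fuzzy_lookup and fuzzy_group are hypothetical, and the 0.8 threshold is an arbitrary example value.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a score between 0.0 and 1.0 (difflib ratio, not the SSIS score)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_lookup(term, reference, threshold=0.8):
    """Find the closest reference entry; return None as the match if it scores below threshold."""
    best = max(reference, key=lambda ref: similarity(term, ref))
    score = similarity(term, best)
    return (best if score >= threshold else None, score)


def fuzzy_group(rows, threshold=0.8):
    """Group rows by comparing each row with the first member of each existing group."""
    groups = []
    for row in rows:
        for group in groups:
            if similarity(row, group[0]) >= threshold:
                group.append(row)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([row])
    return groups


if __name__ == "__main__":
    departments = ["Management", "Finance", "Sales"]  # hypothetical reference table
    print(fuzzy_lookup("Managmnt", departments))
    print(fuzzy_group(["Management", "Managmnt", "Finance", "Finanse"]))
```

Raising the threshold makes both operations stricter, so fewer misspellings are matched or grouped; lowering it catches more variants at the risk of false matches.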
In this recipe, we take a look at these two Fuzzy Transformations and how they can help us in real-world scenarios.
Getting ready
Create a Department table with...