DQS Cleansing Transformation—Cleansing Data
Data cleansing is one of the most important parts of every data transfer scenario. In many scenarios the source data is neither well structured nor consistent. For example, Microsoft is not spelled the same way in all data sources: in one it is "Micsoft", in another "Micro soft", and in others "Microsoft". Data cleansing means restoring and maintaining the consistency of data.
SQL Server 2012 comes with a new service named Data Quality Services (DQS). DQS is one of the services that can be installed with SQL Server, and it listens for requests. You can create knowledge bases in DQS with a tool named DQS Client (Data Quality Client), and then use the SSIS DQS Cleansing component to check incoming data against a knowledge base and either standardize the values or report their status.
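To make the idea of checking data against a knowledge base more concrete, here is a minimal Python sketch. It is not the DQS API and does not reflect how DQS works internally: the `KNOWLEDGE_BASE` dictionary, the `cleanse` function, and the simplified status values are all hypothetical names chosen for illustration. It only shows the general pattern of mapping known variants to one standard value and reporting a status for each input.

```python
# Toy knowledge base (hypothetical, for illustration only; not the DQS API).
# Keys are normalized known variants, values are the standard spelling.
KNOWLEDGE_BASE = {
    "microsoft": "Microsoft",
    "micsoft": "Microsoft",
    "micro soft": "Microsoft",
}


def cleanse(value):
    """Return (output_value, status) for a single source value.

    Status is one of:
      "Correct"   - the value already matches the standard spelling
      "Corrected" - a known variant was replaced by the standard spelling
      "New"       - the value is not in the knowledge base, left unchanged
    """
    key = value.strip().lower()
    standard = KNOWLEDGE_BASE.get(key)
    if standard is None:
        return value, "New"
    if standard == value:
        return value, "Correct"
    return standard, "Corrected"


def cleanse_all(values):
    """Cleanse every value in an iterable and return a list of results."""
    return [cleanse(v) for v in values]


if __name__ == "__main__":
    for source in ["Micsoft", "Micro soft", "Microsoft", "Contoso"]:
        output, status = cleanse(source)
        print(f"{source!r} -> {output!r} ({status})")
```

In this sketch the three spellings from the earlier example all end up as "Microsoft", while a value that the knowledge base has never seen is passed through unchanged and flagged so that someone can review it.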
DQS itself is outside the scope of this book, but we will take a quick look at how to install and use DQS. Lastly, we will run a sample to apply DQS Cleansing on a data...