This chapter showed that T-SQL is not the only language or environment used for data transformation in data science. Each technology shown in this chapter deserves, and has, its own publications. The goal of this chapter was to show technologies beyond T-SQL and to offer inspiration for further study.
The first section of this chapter introduced several types of transformations often needed in the data science domain. We learned about categorization, standardization, and missing-value imputation, both as terms and as formulas.
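As a quick reminder of what these three transformations do, the following is a minimal sketch in plain Python. It is not the code used in the chapter: the function names, the sample `ages` list, and the bin thresholds are assumptions made purely for illustration, and mean imputation and z-score standardization are only one common choice for each transformation.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Replace missing values (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Z-score each value: (x - mean) / population standard deviation.

    Assumes the values are not all identical (standard deviation > 0).
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def categorize(value, low, high):
    """Map a numeric value to one of three labels using two thresholds."""
    if value < low:
        return "low"
    if value < high:
        return "medium"
    return "high"


# Hypothetical sample data with one missing value.
ages = [20, None, 40, 60]

filled = impute_mean(ages)          # the missing value becomes the mean, 40
z_scores = standardize(filled)      # centered on 0 with unit variance
labels = [categorize(a, low=30, high=50) for a in filled]
```

The order matters in this sketch: the missing value is imputed first so that standardization and categorization both operate on a complete list.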
The second section built on this knowledge to introduce SQL Server Integration Services. There, we created a simple package to show how development is done. While developing the Data Flow task, we learned about several transformations and how they are used.
The section on...