In Chapter 5, Data Transformation and Cleaning with T-SQL, we explored why data needs to be transformed for consolidation, accuracy checking, and cleansing. We then learned how to explore data from a statistical point of view in Chapter 6, Data Exploration and Statistics with T-SQL, and in Chapter 7, Data Visualization, we applied several helpful visualization techniques. Applying the techniques from all three of these chapters often reveals that the data needs to be transformed once again.
This chapter explains how to replace missing values and how to normalize or standardize data that will be used as input to machine learning models. T-SQL is not well suited to many of these tasks, so we will use other tools and languages to meet our requirements.
In this chapter, we will cover the following topics:
- ...