Summary
This chapter introduced the data preparation phase of data mining. It focused on fixing some of the problems identified in the data during the data understanding phase. You learned how to select the appropriate cases for analysis, and you saw how sorting cases can give you a better feel for the data. You also learned how to identify and remove duplicate cases. Finally, you were introduced to reclassifying categorical values to address various types of issues.
In the next chapter, we will expand our data preparation skills by learning to combine data files using both the Merge and Append nodes.