Chapter 6. Combining Data Files
The previous chapter introduced the Data Preparation phase, where we fixed some of the problems identified during the Data Understanding phase. You learned how to select the appropriate cases for analysis and how to sort cases to get a better feel for the data. You also learned how to identify and remove duplicate cases and how to reclassify categorical values to address various types of issues. In this chapter you will:
- Learn how to combine data files
- Learn how to remove unnecessary fields
As mentioned in the previous chapter, there is no set order for data preparation. There, we started preparing the data for modeling by addressing some of the problems or errors found in the dataset. At this point, we could certainly continue to work on the initial data file by creating new fields; however, often it...