Chapter 4: Preparing Data for DataRobot
This chapter covers the tasks involved in preparing data for modeling. While the tasks themselves are relatively straightforward, they can take a lot of time and can sometimes be frustrating. If you feel this way, you are not alone; it is quite normal. This is also where you will begin to notice that things differ from your experience in an academic setting. Data almost never arrives in a form that is suitable for modeling, and it is a mistake to assume that the data you receive is in good condition and of good quality.
Most real-world problems do not come with a ready-made dataset that you can immediately process and use to build models. More likely, you will need to stitch data together from multiple disparate sources. Depending on the data, DataRobot might perform some data preparation and cleansing tasks automatically, or you might have to do some of these yourself. This chapter covers concepts...