Important concepts in predictive modeling
We already looked at several concepts when we talked about the machine learning pipeline. In this section, we will look at the terms commonly used in predictive modeling, and also discuss model building and evaluation concepts in detail.
Preparing the data
The data preparation step, as discussed earlier, involves preparing the datasets necessary for feature selection and building the predictive models using the data. We frequently use the following terms in this context:
Datasets: They are typically a collection of data points or observations. Most datasets correspond to some form of structured data which involves a two-dimensional data structure, such as a data matrix or data table (in R this is usually represented using a data frame) containing various values. An example is our german_credit_dataset.csv file from Chapter 5, Credit Risk Detection and Prediction – Descriptive Analytics.
Data observations: They are the rows in a dataset...