Creating New Features
Adding new features to a dataset can help machine learning models learn patterns and important details in the data. For example, in finance, the disposable income, which is the total income minus the acquired debt for any one month, might be more relevant for credit risk than just the income or the acquired debt. Similarly, the total acquired debt of a person across financial products, such as a car loan, a mortgage, and credit cards, might be more important to estimate the credit risk than any debt considered individually. In these examples, we use domain knowledge to craft new variables, and these variables are created by adding or subtracting existing features.
In some cases, a variable may not have a linear or monotonic relationship with the target, but a polynomial combination might. For example, if our variable has a quadratic relationship with the target, , we can convert that into a linear relationship by squaring the original variable. We can also...