Introduction
In this chapter, we briefly demonstrate how to approach a binary classification problem using BoostedTreesClassifier
. We will apply the technique to solve a realistic business problem using a popular educational dataset: predicting which customers are likely to cancel their bookings. The data for this problem – and several other business problems – comes in tabular format, and typically contains a mixture of different feature types: numeric, categorical, dates, and so on. In the absence of sophisticated domain knowledge, gradient boosting methods are a good first choice for creating an interpretable solution that works out of the box. In the next section, the relevant modeling steps will be demonstrated with code: data preparation, structuring into functions, fitting a model through the tf.estimator
functionality, and interpretation of results.
How to do it...
We begin by loading the necessary packages:
import tensorflow as tf
import numpy as...