Binary classification refers to problems with exactly two distinct classes. As in the previous chapter, we generate a dataset using the convenience function make_classification() from the scikit-learn library:
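from sklearn import datasets as skds

X, y = skds.make_classification(n_samples=200,
                                n_features=2,
                                n_informative=2,
                                n_redundant=0,
                                n_repeated=0,
                                n_classes=2,
                                n_clusters_per_class=1)
# Reshape the labels into a column vector of shape (n_samples, 1)
if y.ndim == 1:
    y = y.reshape(-1, 1)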
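As a quick sanity check, the following minimal sketch inspects what this call returns and what the reshape does. The random_state=42 argument and the use of NumPy to count labels are assumptions added here for reproducibility and illustration; they are not part of the original listing.

```python
import numpy as np
from sklearn import datasets as skds

# random_state is an assumption added here so the run is repeatable
X, y = skds.make_classification(n_samples=200,
                                n_features=2,
                                n_informative=2,
                                n_redundant=0,
                                n_repeated=0,
                                n_classes=2,
                                n_clusters_per_class=1,
                                random_state=42)

print(X.shape)  # one row per sample, one column per feature
print(y.shape)  # labels come back as a flat, one-dimensional array

# Turn the flat label array into a column vector
if y.ndim == 1:
    y = y.reshape(-1, 1)
print(y.shape)

# Count how many samples fall into each of the two classes
labels, counts = np.unique(y, return_counts=True)
print(labels, counts)
```

The features arrive as a 200 x 2 array and the labels as a flat array of 200 entries, which the reshape converts into a 200 x 1 column. The two classes are roughly, though not necessarily exactly, balanced.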
The arguments to make_classification() are largely self-explanatory. We set n_classes to 2 because this is a binary problem; the other arguments are as follows:
- n_samples is the number of data points to generate. We set it to 200 to keep the dataset small.
- n_features is the number of features to be generated; we are using only two features so that we can keep it a simple problem to...