We will first import the Iris dataset, which is available in scikit-learn's sample dataset library (http://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html). The dataset has 150 rows and four features. We will build a pipeline, step by step, to train a model on this dataset. The problem statement is to predict the species of an Iris flower from its four features.
In this pipeline, we will use MinMaxScaler to scale the input data and logistic regression to predict the species of the Iris. The model will then be evaluated using the accuracy measure:
- The first step is to import the various libraries from scikit-learn that provide the methods we need to accomplish our task. We have learned about all of these in previous chapters. The only addition is the Pipeline class from sklearn.pipeline...
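
As a rough illustration of where these imports lead, here is a minimal sketch of the whole pipeline described above: MinMaxScaler, then logistic regression, scored by accuracy. The 70/30 split, the `random_state` value, the `max_iter` setting, and the step names `"scaler"` and `"classifier"` are assumptions made for this example and do not come from the text.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Load the Iris data: 150 rows, four features, three species.
iris = load_iris()
X, y = iris.data, iris.target

# Hold out part of the data for evaluation.
# (The 30% test size and random_state are assumptions for this sketch.)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Chain the scaler and the classifier into a single estimator.
# The step names are arbitrary labels chosen for this example.
pipeline = Pipeline([
    ("scaler", MinMaxScaler()),
    ("classifier", LogisticRegression(max_iter=1000)),
])

# fit() scales the training data and then trains the classifier on it;
# predict() applies the same fitted scaling to the test data first.
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.3f}")
```

Because the scaler sits inside the pipeline, it is fitted only on the training data, so no information from the test set leaks into the scaling step.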