Machine Learning for Data Mining

You're reading from Machine Learning for Data Mining: Improve your data mining capabilities with advanced predictive modeling

Product type Paperback
Published in Apr 2019
Publisher Packt
ISBN-13 9781838828974
Length 252 pages
Edition 1st Edition
Author (1):
Jesus Salcedo

Combining models

There are several ways in which models can be combined. We are going to look at each method in this section.

Combining by voting

Let's use an example to understand this method of combining models.

Consider that we have run three models and created a table like this:

We have the confidence for each model and its prediction. Let's see how we can combine these models.

If we take a look at the first row, we can see that each of these models is predicting that a person is going to leave. Hence, if we combine the predictions, we are still predicting that the person is going to leave. The confidence value, or the final confidence, is acquired by adding up the confidence values of all the models and dividing by the number of total models, three in our case.

If we look at the second row, two of the models predict that the person is going to leave and one predicts that the person is going to stay, so the combined prediction is that the person will leave. Here, we calculate the final confidence by adding up the confidence values of only the models that predicted Leave and dividing by the total number of models, which is three. Because the dissenting model contributes nothing to the sum, the final confidence value is lower in the second row.

This is combining models by voting, where the prediction made by the majority of the models becomes the combined prediction.
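The voting rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how any particular tool implements it: the function name `combine_by_voting` and the confidence values are made up for the example, and ties are not resolved.

```python
from collections import Counter


def combine_by_voting(predictions, confidences):
    """Combine one record's predictions by majority vote.

    Returns (combined_prediction, final_confidence). The final confidence
    is the summed confidence of the models that voted for the winning
    prediction, divided by the total number of models.
    """
    if not predictions or len(predictions) != len(confidences):
        raise ValueError("need one confidence per prediction")
    # The most frequent prediction wins (ties are not resolved in this sketch).
    winner, _ = Counter(predictions).most_common(1)[0]
    agreeing = sum(c for p, c in zip(predictions, confidences) if p == winner)
    return winner, agreeing / len(predictions)


# Hypothetical confidences, chosen only to illustrate the calculation.
all_agree = combine_by_voting(["Leave", "Leave", "Leave"], [0.75, 0.5, 0.25])
two_to_one = combine_by_voting(["Leave", "Leave", "Stay"], [0.75, 0.5, 0.75])
print(all_agree)
print(two_to_one)
```

In the second call, the confidence of the model that predicted Stay is left out of the sum but the divisor is still three, which is why the final confidence drops when the models disagree.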

Combining by highest confidence

This is another method of combining models. Consider the following table, for example:

In this example, we won't consider what the model is predicting; instead, we will just focus on high confidence values. If we look at the first row, each of the models has predicted Leave. But Model 1 has the highest confidence, and so the combined prediction is taken as Leave and the final confidence is the highest confidence acquired.

If we look at the second row, the model with highest confidence is Model 3 and it has predicted that the person is going to stay, and hence, the combined prediction becomes Stay and the final confidence becomes the highest confidence.
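As a rough sketch of the same idea in Python, the combined prediction is simply the prediction of whichever model reports the highest confidence. The function name `combine_by_highest_confidence` and the confidence values below are hypothetical and only serve to mirror the two rows discussed above.

```python
def combine_by_highest_confidence(predictions, confidences):
    """Return the prediction and confidence of the most confident model."""
    if not predictions or len(predictions) != len(confidences):
        raise ValueError("need one confidence per prediction")
    # Index of the model with the largest confidence value.
    best = max(range(len(confidences)), key=lambda i: confidences[i])
    return predictions[best], confidences[best]


# Hypothetical values: Model 1 is most confident in the first row,
# Model 3 in the second row.
row1 = combine_by_highest_confidence(["Leave", "Leave", "Leave"], [0.9, 0.7, 0.6])
row2 = combine_by_highest_confidence(["Leave", "Leave", "Stay"], [0.6, 0.7, 0.8])
print(row1)
print(row2)
```

Note that in the second row the single most confident model overrides the other two, which is the key difference from voting.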

Implementing combining models

Follow these steps to see how we can combine different models:

  1. Place Electronics_Data on the canvas.
  2. Connect the dataset to a Partition node from the Field Ops palette.
  3. Split the data into training and testing datasets, as we have done before.
  4. Connect the Partition node to the Neural Net model, set the random seed to 5000, and run the model.
  5. We will now build a support vector machine (SVM) model. Because we are going to combine models, go back to the Partition node and connect it to an SVM model from the Modeling palette.
  6. Configure the SVM model with the settings we used in Chapter 2, Getting Started with Machine Learning. Go to the Expert tab and set the mode to Expert. Set the Regularization parameter (C) to 5, the middle value, change the Kernel type to Polynomial, and change the Degree value to 2. We use these values because they gave us an accurate and consistent model on the same data when we first demonstrated this model in Chapter 2. Click on Run.
  7. Connect the generated SVM and Neural Net models to each other.
  8. Go to the Output palette and connect the generated SVM model to a Table.
  9. Run the table using the Run icon on top. You will see the following:

In this table, you can see the results from the Partition node, the predictions from the Neural Net model and their confidence, and the predictions from the SVM model and their confidence. You can close this window.

  10. We will now analyze the models by connecting the SVM-generated model to an Analysis node from the Output palette.
  11. Edit the Analysis node, check the Coincidence matrices option, and click on Run. You will see the following results:

We can see how well each of the models has performed. If you scroll down, you can see that the models agreed on their predictions 88% of the time in the training dataset, and about 87% of the time in the testing dataset. When the models agreed, they were actually correct a fair amount of the time. This suggests that it is worth evaluating whether combining these two models improves the results.
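Outside of the Analysis node, the same two figures (how often the models agree, and how often they are right when they agree) could be computed along these lines. This is a minimal sketch: the function name `agreement_summary` and the four sample records are invented for illustration and are not taken from the Electronics_Data results.

```python
def agreement_summary(actual, pred_a, pred_b):
    """Return (agreement_rate, accuracy_when_agreed) for two models."""
    if not actual or not (len(actual) == len(pred_a) == len(pred_b)):
        raise ValueError("all three lists must be non-empty and equally long")
    # Keep only the records on which both models made the same prediction.
    agreed = [(y, a) for y, a, b in zip(actual, pred_a, pred_b) if a == b]
    agreement_rate = len(agreed) / len(actual)
    if not agreed:
        return agreement_rate, None
    correct = sum(1 for y, a in agreed if y == a)
    return agreement_rate, correct / len(agreed)


# Hypothetical records, not real output from the stream.
actual = ["Churned", "Current", "Current", "Churned"]
nn_pred = ["Churned", "Current", "Churned", "Churned"]
svm_pred = ["Churned", "Current", "Current", "Current"]
rate, accuracy = agreement_summary(actual, nn_pred, svm_pred)
print(rate, accuracy)
```

The records on which the two models disagree are exactly the ones where a combination rule has to make a choice.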

We are now moving on to combine the models. We will first combine them within Modeler, and then see how we can combine models outside of Modeler.

Combining models in Modeler

To combine models within Modeler, follow these steps:

  1. Go to the SVM model and connect it to the Ensemble node from the Field Ops palette.
  2. Let's edit the Ensemble node. The node already knows that it is combining the results of two models, as it shows two models in the ensemble. From the drop-down on the right, choose Status as the Target field for Ensemble. If Filter out fields generated by ensemble models is checked, the fields generated by the previous models are filtered out; we want to keep them, so we will deselect it. Next, select the Ensemble method; this is a list of ways in which we can combine the models. Here, we will select Voting, as we have already seen this method. We will talk about propensity scores later on in this chapter. Then we have to select what happens when there is a tie; here, we will select Highest confidence, which we have also seen, and click on OK, as shown in the following screenshot:
  3. Let's see the results of our combination. For this, connect the Ensemble node to the Analysis node and click on the Run button on top. The following will be the results:

First, we have the results of the Neural Net model, followed by the results of the SVM model and then finally, we can see the results of the combined model.

We can see that the overall accuracy in the testing dataset is 82%, which is a slight improvement. Combining the two models improved the accuracy by 2%, which is great as a starting point. Let's see how we can combine models from outside of Modeler.
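The settings chosen in the Ensemble node (Voting, with ties broken by Highest confidence) can be approximated in Python as follows. This is a sketch of the rule as described in this section, not Modeler's internal implementation; the function name `ensemble_vote` and the confidence values are assumptions made for the example.

```python
from collections import Counter


def ensemble_vote(predictions, confidences):
    """Majority vote; a tied vote goes to the most confident tied model."""
    if not predictions or len(predictions) != len(confidences):
        raise ValueError("need one confidence per prediction")
    votes = Counter(predictions)
    top = max(votes.values())
    tied = {p for p, n in votes.items() if n == top}
    if len(tied) == 1:
        return tied.pop()
    # Tie: take the prediction of the most confident model among the tied labels.
    candidates = [(p, c) for p, c in zip(predictions, confidences) if p in tied]
    best_pred, _ = max(candidates, key=lambda pc: pc[1])
    return best_pred


# With only two models, any disagreement is a tie, so confidence decides.
print(ensemble_vote(["Churned", "Current"], [0.8, 0.6]))
# With three models, a clear majority wins regardless of confidence.
print(ensemble_vote(["Churned", "Churned", "Current"], [0.5, 0.5, 0.9]))
```

With just a Neural Net and an SVM model, voting never has a majority when the models disagree, so the tie-breaking rule is what actually determines the combined prediction in those cases.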

Combining models outside Modeler

This method can be used when you are using any data-mining software other than SPSS Modeler.

Let's see how to do that:

  1. Go to the Field Ops palette and connect the SVM-generated model to a Derive node.
  2. We will use the Derive node to create a new field. We will edit this node and name it Combined_Prediction.
  3. Derive this field as a Conditional. You will see an if-else condition.
  4. Let's tell Modeler that if the predictions from both models are equal, then the combined prediction will be that prediction itself. To do this, enter an expression in the If condition stating that the prediction from the Neural Net model, $N-Status, equals (=) the prediction of the SVM model, $S-Status. Then go to the Then condition, click on the expression builder, and select the prediction of the Neural Net model, $N-Status; alternatively, you can select the prediction from the SVM model.
  5. Write the following statement in the Else condition. You can select the variable names from the list:

This statement means that we will go with the highest confidence if the predictions of the two models do not match. If the confidence of the prediction from the Neural Net model is higher than that of the SVM model, then we will go with the prediction of the Neural Net model. Otherwise, if the confidence of the prediction of the SVM model is higher than that of the Neural Net model, then we will go with the prediction of the SVM model. If neither condition is satisfied, we will put a 0, and then we have to end with an endif statement. Click on OK.
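For readers working outside Modeler, the logic of the whole Conditional (the If, Then, and Else parts together) can be written as a small Python function. This is a translation of the rule as described in the text, not the Modeler expression itself; the function and parameter names are hypothetical, and the confidence values in the usage lines are made up.

```python
def combined_prediction(nn_pred, nn_conf, svm_pred, svm_conf):
    """Combine a Neural Net and an SVM prediction for one record."""
    # If both models agree, keep that prediction.
    if nn_pred == svm_pred:
        return nn_pred
    # Otherwise, go with the model that is more confident.
    if nn_conf > svm_conf:
        return nn_pred
    if svm_conf > nn_conf:
        return svm_pred
    # Neither condition holds (equal confidences): fall back to 0.
    return 0


# Hypothetical confidences for a record on which the models disagree.
print(combined_prediction("Churned", 0.9, "Current", 0.7))
print(combined_prediction("Churned", 0.6, "Current", 0.6))
```

The final `return 0` corresponds to the fallback value in the statement and is only reached when the models disagree with exactly equal confidence.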

  6. Connect the Combined_Prediction node to a Table node and take a look at the results, as shown in the following screenshot:

Here, in the 12th row, we can see that the neural network predicted a customer as Churned whereas the SVM predicted it as Current, but as the confidence of the Neural Net prediction was higher, the combined prediction was picked as Churned.

  7. You can analyze this model and see for yourself that the numbers are similar to the ones we obtained using the Ensemble node.

This is how we combined two models to improve accuracy and we saw how we can get the combined results from the two models. You can try this out with three or more models. You will be amazed at how well combining models can work. We will now see another advanced method to improve the model.
