
Machine Learning for Data Mining: Improve your data mining capabilities with advanced predictive modeling

By Jesus Salcedo
Book Apr 2019 252 pages 1st Edition
Introducing Machine Learning Predictive Models

A large percentage of data mining opportunities involve machine learning, and these opportunities often come with greater financial rewards. This chapter will give you the basic knowledge that you need to bring the power of machine learning into your data mining work. In this chapter, we're going to talk about the characteristics of machine learning models and also see some examples of these models.

The following are the topics that we will be covering in this chapter:

  • Characteristics of machine learning predictive models
  • Types of machine learning predictive models
  • Working with neural networks
  • A sample neural network model

Characteristics of machine learning predictive models

Knowing the characteristics of machine learning predictive models will help you understand the advantages and limitations in comparison to any statistical or decision tree models.

Let's get some insights on a few characteristics of predictive models in machine learning:

  • Optimized to learn complex patterns: Machine learning models are designed to learn complex patterns. Compared with statistical models or decision tree models, they excel when the data contains very complex patterns.
  • Account for interactions and nonlinear relationships: Machine learning predictive models can account for interactions in the data and nonlinear relationships to an even better degree than decision tree models.
  • Few assumptions: These models are powerful because they have very few assumptions. They can also be used with different types of data.
  • A black box model's interpretation is not straightforward: Machine learning predictive models are black box models, which is one of their drawbacks, because their interpretation is not straightforward. When a model builds many different equations and combines them, it becomes very difficult to see exactly how each variable interacts with the others and affects the output variable. So, predictive machine learning models are great when it comes to predictive accuracy, but they're not that good for understanding the mechanics behind a prediction.

If you want to predict something, these models usually do a good job and can be very accurate. But if you want to know why something is being predicted, for instance because you want to change something so that you no longer get a particular prediction, the reasons will be difficult to decipher.

Types of machine learning predictive models

The following are some of the different types of machine learning predictive models:

  • Neural networks
  • Support Vector Machines
  • Random forest
  • Naive Bayesian algorithms
  • Gradient boosting algorithms
  • K-nearest neighbors
  • Self-learning response model

We won't be covering all of them, but we'll focus on a very interesting model – the neural network. In the following sections, we will get an in-depth view of what neural networks are.

Working with neural networks

Neural networks were initially developed in an attempt to understand how the brain operates. They were originally used in the areas of neuroscience and linguistics.

In these fields, researchers noticed that something happened in the environment (input), the individual processed the information (in the brain), and then reacted in some way (output).

So, the idea behind neural networks or neural nets is that they will serve as a brain, which is like a black box. We then have to try to figure out what is going on so that the findings can be applied.

Advantages of neural networks

The following are the advantages of using a neural network:

  • Good for many types of problems: They work well with most of the complex problems that you might come across.
  • They generalize very well: Accurate generalization is a very important feature.
  • They are very common: Neural networks have become very common in today's world, and they are readily accepted and implemented for real-world problems.
  • A lot is known about them: Owing to the popularity that neural networks have gained, there is a lot of research being done and implemented successfully in different areas, so there is a lot of information available on neural networks.
  • Works well with non-clustered data: Neural networks can be used with non-clustered data in several situations, such as where the data itself is very complex, where there are many interactions, or where there are nonlinear relationships. They are powerful and robust solutions for such situations.

Disadvantages of neural networks

Good models come at the cost of a few disadvantages:

  • They take time to train: Neural networks take a long time to train. They are generally slower than a linear regression model or a decision tree model, because those models basically make one pass through the data, whereas a neural network goes through many iterations.
  • The best solution is not guaranteed: You're not guaranteed to find the best solution. This also means that, in addition to running a single neural network through many iterations, you'll also need to run it multiple times using different starting points so that you can try to get closer to the best solution.
  • Black boxes: As we discussed earlier, it is hard to decipher what gave a certain output and how.

Representing the errors

While building our neural network, our actual goal is to build the best possible solution, and not to get stuck with a sub-optimal one. We'll need to run a neural network multiple times.

Consider this error graph as an example:

This graph depicts the amount of error in different solutions. The Global Solution is the best possible solution, the one with the lowest error. A Sub-Optimal Solution is one where training gets stuck and no longer improves, even though it isn't the best solution.
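The idea of running from different starting points can be sketched on a toy problem. The error curve below is hypothetical (it is not the one from the graph, and a real network has many weights rather than one), and plain gradient descent stands in for network training. It only illustrates how some starting points end in a sub-optimal valley while others reach the global one.

```python
def error(w):
    # Hypothetical error curve with two valleys:
    # a global minimum near w = -1.30 and a sub-optimal one near w = 1.13.
    return w**4 - 3 * w**2 + w


def gradient(w):
    # Derivative of the error curve above.
    return 4 * w**3 - 6 * w + 1


def descend(w, learning_rate=0.01, steps=2000):
    # Repeatedly step downhill from the starting point w.
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w


# Run the same procedure from several starting points and keep the best result.
starts = [-2.0, -1.0, 0.0, 0.5, 1.0, 2.0]
solutions = [descend(w) for w in starts]
best = min(solutions, key=error)

for start, solution in zip(starts, solutions):
    print(f"start {start:5.1f} -> w = {solution:6.3f}, error = {error(solution):6.3f}")
print(f"best solution found: w = {best:.3f}")
```

Any single run can stop in the shallower valley. Comparing several runs is what makes it likelier that the global solution is among them.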

Types of neural network models

There are different types of neural networks available for us; in this section, we will gain insights into these.

Multi-layer perceptron

The most common type is called the multi-layer perceptron model. This neural network model consists of neurons represented by circles, as shown in the following diagram. These neurons are organized into layers:

Every multi-layer perceptron model will have at least three layers:

  • Input Layer: This layer consists of all the predictors in our data.
  • Output Layer: This will consist of the outcome variable, which is also known as the dependent variable or target variable.
  • Hidden Layer: This layer is where you maximize the power of a neural network. Non-linear relationships can also be created in this layer, and all the complex interactions are carried out here. You can have many such hidden layers.

You will also notice in the preceding diagram that every neuron in a layer is connected to every neuron in the next layer. This forms connections, and every connecting line will have a weight associated with it. These weights will form different equations in the model.

Why are weights important?

Weights are important for several reasons. First, because every neuron in one layer is connected to every neuron in the next layer, the layers themselves are connected. This also means that a neural network model, unlike many other models, doesn't drop any predictors. For example, if you start off with 20 predictors, all 20 will be kept. A second reason why weights are important is that they provide information on the impact or importance of each predictor to the prediction. As will be shown later, these weights start off random; however, through multiple iterations, they are modified so as to provide meaningful information.

An example representation of a multilayer perceptron model

Here, we will look at an example of a multilayer perceptron model. We will try to predict a potential buyer of a particular item based on an individual's age, income, and gender.

Consider the following, for example:

As you can see, our input predictors that form the Input Layer are age, income, and gender. The outcome variable that forms our Output Layer is Buy, which will determine whether someone bought a product or not. There is a hidden layer where the input predictors end up combining.
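A minimal sketch of how such a network turns the three predictors into a Buy score might look as follows. The chapter does not specify the hidden layer size, the activation function, or how the inputs are scaled, so the three hidden neurons, the sigmoid function, and the example person are all assumptions made for illustration. The weights are random because the network is untrained.

```python
import math
import random


def sigmoid(z):
    # Squashes any number into the range (0, 1); an assumed activation function.
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, w_hidden, w_out):
    # Each hidden neuron combines ALL inputs using its own set of weights.
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in w_hidden]
    # The output neuron combines all hidden neurons into a single Buy score.
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)))


predictors = ["age", "income", "gender"]
n_hidden = 3  # assumed size of the hidden layer

# Weights start off random, one per connecting line in the diagram.
rng = random.Random(1)
w_hidden = [[rng.uniform(-1, 1) for _ in predictors] for _ in range(n_hidden)]
w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]

# Hypothetical person, with age and income already scaled to the range 0..1.
person = [0.35, 0.60, 1.0]
p_buy = forward(person, w_hidden, w_out)

n_weights = len(predictors) * n_hidden + n_hidden
print(f"untrained Buy score: {p_buy:.3f} (from {n_weights} weights)")
```

With three predictors and three hidden neurons there are already 12 weights, and every predictor takes part in every hidden neuron's equation.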

To better understand what goes on behind the scenes of a neural network model, let's take a look at a linear regression model.

The linear regression model

Let's understand the linear regression model with the help of an example.

Consider the following:

In linear regression, every input predictor in the Input Layer is connected to the outcome field by a single connection weight, also known as the coefficient, and these coefficients are estimated by a single pass through the data. The number of coefficients will be equal to the number of predictors. This means that every predictor will have a coefficient associated with it.

Every input predictor is directly connected to the Target with a particular coefficient as its weight. So, we can easily see the impact of a one-unit change in the input predictor on the outcome variable or the Target. These kinds of connections make it easy to determine the effect of each predictor on the Target variable as well as on the equation.
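To make the single pass and the one-unit interpretation concrete, here is a small sketch using only one predictor for brevity (the model described above has several). The data values are made up, and `fit_line` and `predict` are hypothetical helper names, not functions from any library.

```python
def fit_line(xs, ys):
    # Accumulate the sums needed for ordinary least squares in ONE pass over the data.
    n = len(xs)
    sum_x = sum_y = sum_xx = sum_xy = 0.0
    for x, y in zip(xs, ys):
        sum_x += x
        sum_y += y
        sum_xx += x * x
        sum_xy += x * y
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x * sum_x)
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


def predict(x, intercept, slope):
    return intercept + slope * x


# Made-up data that follows target = 1 + 2 * predictor exactly.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
intercept, slope = fit_line(xs, ys)

# A one-unit change in the predictor always moves the target by the coefficient.
effect = predict(4.0, intercept, slope) - predict(3.0, intercept, slope)
print(f"coefficient = {slope:.2f}, effect of a one-unit change = {effect:.2f}")
```

The effect of the predictor is the same wherever it is measured, which is why a regression coefficient is easy to interpret.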

A sample neural network model

Let's use an example to understand neural networks in more detail:

Notice that every neuron in the Input Layer is connected to every neuron in the Hidden Layer. For example, Input 1 is connected to the first, second, and third neurons in the Hidden Layer. This implies that Input 1 alone has three different weights, and these weights will be a part of three different equations.

This is what happens in this example:

  • The Hidden Layer intervenes between the Input Layer and the Output Layer.
  • The Hidden Layer allows for more complex models with nonlinear relationships.
  • There are many equations, so the influence of a single predictor on the outcome variable occurs through a variety of paths.
  • The interpretation of weights won't be straightforward.
  • Weights correspond to the variable importance. They are initially random, and they are then adjusted over many iterations based on the feedback from each iteration. Only then do they take on their real meaning of being associated with variable importance.
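The point about many paths can be illustrated with a tiny network whose weights are fixed by hand. The weights, the sigmoid activation, and the linear output below are all assumptions chosen for the sketch. It measures the effect of a one-unit change in Input 1 at two different values of Input 2.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical, hand-picked weights: 2 inputs -> 3 hidden neurons -> 1 output.
W_HIDDEN = [[0.8, -0.5], [-0.3, 0.9], [0.5, 0.4]]
W_OUT = [1.0, -1.2, 0.7]


def predict(input_1, input_2):
    # Input 1 reaches the output through three paths, one per hidden neuron.
    hidden = [sigmoid(w1 * input_1 + w2 * input_2) for w1, w2 in W_HIDDEN]
    # The output is kept linear here (an assumption) to keep the arithmetic simple.
    return sum(w * h for w, h in zip(W_OUT, hidden))


# Effect of raising Input 1 from 0 to 1 while Input 2 is held at 0, then at 2.
effect_when_low = predict(1.0, 0.0) - predict(0.0, 0.0)
effect_when_high = predict(1.0, 2.0) - predict(0.0, 2.0)

print(f"effect of Input 1 when Input 2 = 0: {effect_when_low:.3f}")
print(f"effect of Input 1 when Input 2 = 2: {effect_when_high:.3f}")
```

The two effects differ, so no single weight summarizes the influence of Input 1 the way a regression coefficient would.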

So, let's go ahead and see how these weights are determined and how we can form a functional neural network.

Feed-forward backpropagation

Feed-forward backpropagation is a method through which we can estimate the weights and, ultimately, produce the outcome of a neural network.

According to this method, the following happens on each iteration, depending on the prediction:

  • If a prediction is correct, the weight associated with it is strengthened. Imagine the neural network saying, Hey, you know what, we used the weight of 0.75 for the first part of this equation for the first predictor and we got the correct prediction; that's probably a good starting point.
  • Suppose the prediction is incorrect; the error is fed back or back propagated into the model so that the weights or weight coefficients are modified, as shown here:

This backpropagation won't just take place in-between the Hidden Layers and the Target layer, but will also take place toward the Input Layer:

While these iterations are happening, we are actually making our neural network better and better with every error propagation. The connections now make a neural network capable of learning different patterns in the data.

So, unlike a linear regression or a decision tree model, a neural network tries to learn patterns in the data iteratively. If it's given enough time to learn those patterns, the neural network draws on that experience to predict better, which can improve accuracy considerably.
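The loop described in this section can be sketched in a few lines of plain Python. This is an illustrative toy, not the implementation used later in the book: the XOR-style data, the sigmoid activation, the squared-error measure, the learning rate, and the bias terms are all assumptions.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, n_hidden=3, learning_rate=0.5, epochs=5000, seed=0):
    rng = random.Random(seed)
    n_inputs = len(data[0][0])
    # Weights start off random; the extra weight in each row is a bias term.
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    history = []
    for _ in range(epochs):
        total_error = 0.0
        for inputs, target in data:
            # Feed forward: inputs -> hidden layer -> prediction.
            x = inputs + [1.0]
            hidden = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in w_hidden]
            h = hidden + [1.0]
            prediction = sigmoid(sum(w * v for w, v in zip(w_out, h)))
            error = prediction - target
            total_error += error * error
            # Backpropagate: push the error back toward the hidden and input layers.
            delta_out = error * prediction * (1 - prediction)
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1 - hidden[j])
                            for j in range(n_hidden)]
            for j, v in enumerate(h):
                w_out[j] -= learning_rate * delta_out * v
            for j in range(n_hidden):
                for i, v in enumerate(x):
                    w_hidden[j][i] -= learning_rate * delta_hidden[j] * v
        history.append(total_error / len(data))
    return history


# Toy nonlinear pattern: the target is 1 only when exactly one input is 1.
xor_data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
history = train(xor_data)
print(f"mean squared error: first pass {history[0]:.3f}, last pass {history[-1]:.3f}")
```

Each pass over the data nudges every weight slightly in the direction that reduces the error. As noted earlier, a run can still settle on a sub-optimal solution, in which case trying a different `seed` is the usual remedy.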

Model training ethics

When you are training a neural network model, never train the model on the whole dataset. We need to hold back some data for testing purposes. This will allow us to test whether the neural network is able to apply what it has learned from the training dataset to new data.

We want the neural network to generalize well to new data and capture the general patterns across different types of data, not just the little nuances that would make it sample-specific. In other words, we want the results to carry over to new data as well. After the model has been trained, new data can be predicted using the model's experience.
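Holding back data can be as simple as shuffling the rows and setting a fraction aside. The sketch below assumes a 70/30 split and uses a hypothetical helper, `train_test_split_rows`; the chapter does not prescribe a particular ratio.

```python
import random


def train_test_split_rows(rows, test_fraction=0.3, seed=42):
    # Shuffle a copy so that the original order of the data is left untouched.
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    # The first n_test shuffled rows are held back; the rest are used for training.
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # stand-in for 10 records of a dataset
train_rows, test_rows = train_test_split_rows(rows)
print(f"{len(train_rows)} rows for training, {len(test_rows)} rows held back for testing")
```

The model is fitted on `train_rows` only, and its accuracy on `test_rows` indicates how well it generalizes to data it has not seen.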

Summary

I hope you are now clear on machine learning predictive models and have understood the basic concepts. In this chapter, we have seen the characteristics of machine learning predictive models and have learned about some of the different types. These concepts are stepping stones to further chapters. We have also looked at an example of a basic neural network model. In the next chapter, we will implement a live neural network on a dataset and you will also be introduced to support vector machines and their implementation.


Key benefits

  • Learn how to apply machine learning techniques in the field of data science
  • Understand when to use different data mining techniques, how to set up different analyses, and how to interpret the results
  • A step-by-step approach to improving model development and performance

Description

Machine learning (ML) combined with data mining can give you amazing results in your data mining work by empowering you with several ways to look at data. This book will help you improve your data mining techniques by using smart modeling techniques. It will teach you how to implement ML algorithms and techniques in your data mining work, and it will enable you to pair the best algorithms with the right tools and processes. You will learn how to identify patterns and make predictions with minimal human intervention. You will build different types of ML models, such as neural networks, Support Vector Machines (SVMs), and decision trees. You will see how all of these models work and what kind of data they are suited for. You will learn how to combine the results of different models in order to improve accuracy. Topics such as removing noise and handling errors will give you an added edge in model building and optimization. By the end of this book, you will be able to build predictive models and extract information of interest from the dataset.

What you will learn

  • Hone your model-building skills and create the most accurate models
  • Understand how predictive machine learning models work
  • Prepare your data to acquire the best possible results
  • Combine models in order to suit the requirements of different types of data
  • Analyze single and multiple models and understand their combined results
  • Derive worthwhile insights from your data using histograms and graphs

Product Details

Publication date : Apr 30, 2019
Length 252 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781838828974


Table of Contents

7 Chapters
Preface
1. Introducing Machine Learning Predictive Models
2. Getting Started with Machine Learning
3. Understanding Models
4. Improving Individual Models
5. Advanced Ways of Improving Models
6. Other Books You May Enjoy

