Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases now! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Machine Learning for Data Mining
Machine Learning for Data Mining

Machine Learning for Data Mining: Improve your data mining capabilities with advanced predictive modeling

eBook
R$80 R$147.99
Paperback
R$183.99
Subscription
Free Trial
Renews at R$50p/m

What do you get with Print?

Product feature icon Instant access to your digital eBook copy whilst your Print order is Shipped
Product feature icon Paperback book shipped to your preferred address
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Product feature icon AI Assistant (beta) to help accelerate your learning
Table of content icon View table of contents Preview book icon Preview Book

Machine Learning for Data Mining

Introducing Machine Learning Predictive Models

A large percentage of data mining opportunities involve machine learning, and these opportunities often come with greater financial rewards. This chapter will give you the basic knowledge that you need to bring the power of machine learning into your data mining work. In this chapter, we're going to talk about the characteristics of machine learning models and also see some examples of these models.

The following are the topics that we will be covering in this chapter:

  • Characteristics of machine learning predictive models
  • Types of machine learning predictive models
  • Working with neural networks
  • A sample neural network model

Characteristics of machine learning predictive models

Knowing the characteristics of machine learning predictive models will help you understand the advantages and limitations in comparison to any statistical or decision tree models.

Let's get some insights on a few characteristics of predictive models in machine learning:

  • Optimized to learn complex patterns: Machine learning models are designed to be optimized to learn complex patterns. In comparison to statistical models or decision tree models, predictive models greatly excel, when you have very complex patterns in data.
  • Account for interactions and nonlinear relationships: Machine learning predictive models can account for interactions in the data and nonlinear relationships to an even better degree than decision tree models.
  • Few assumptions: These models are powerful because they have very few assumptions. They can also be used with different types of data.
  • A black box model's interpretation is not straightforward: Predictive models are black box models, this is one of the drawbacks of predictive machine learning models, because this implies that the interpretation is not straightforward. This means that, if we end up building many different equations and combine them, it becomes very difficult to see exactly how each one of these variables ended up interacting and impacting an output variable. So, the predictive machine learning models are great when it comes to predictive accuracy, but they're not that good for understanding the mechanics behind a prediction.

If you want to predict something, these models do a pretty good job and have amazing accuracy. But if you want to know why something is being predicted, and if you are looking forward to making some changes in the implementation so that you don't get a particular prediction, then it would be difficult to decipher.

Types of machine learning predictive models

The following are some of the different types of machine learning predictive models:

  • Neural networks
  • Support Vector Machines
  • Random forest
  • Naive Bayesian algorithms
  • Gradient boosting algorithms
  • K-nearest neighbors
  • Self-learning response model

We won't be covering all of them, but we'll focus on a very interesting model – the neural network. In the following sections, we will get an in-depth view of what neural networks are.

Working with neural networks

Neural networks were initially developed in an attempt to understand how the brain operates. They were originally used in the areas of neuroscience and linguistics.

In these fields, researchers noticed that something happened in the environment (input), the individual processed the information (in the brain), and then reacted in some way (output).

So, the idea behind neural networks or neural nets is that they will serve as a brain, which is like a black box. We then have to try to figure out what is going on so that the findings can be applied.

Advantages of neural networks

The following are the advantages of using a neural network:

  • Good for many types of problems: They work well with most of the complex problems that you might come across.
  • They generalize very well: Accurate generalization is a very important feature.
  • They are very common: Neural networks have become very common in today's world, and they are readily accepted and implemented for real-world problems.
  • A lot is known about them: Owing to the popularity that neural networks have gained, there is a lot of research being done and implemented successfully in different areas, so there is a lot of information available on neural networks.
  • Works well with non-clustered data: When you have non-clustered data, neural networks can be used in several situations, such as where the data itself is very complex, where you have many interactions, or where you have nonlinear relationships; neural networks are certainly very powerful and very robust solutions for such situations.

Disadvantages of neural networks

Good models come at the cost of a few disadvantages:

  • They take time to train: Neural networks do take a long time to train; they are generally slower than a linear regression model or a decision tree model, as these basically just do one pass on the data, while, with neural networks, you actually go through many, many iterations.
  • The best solution is not guaranteed: You're not guaranteed to find the best solution. This also means that, in addition to running a single neural network through many iterations, you'll also need to run it multiple times using different starting points so that you can try to get closer to the best solution.
  • Black boxes: As we discussed earlier, it is hard to decipher what gave a certain output and how.

Representing the errors

While building our neural network, our actual goal is to build the best possible solution, and not to get stuck with a sub-optimal one. We'll need to run a neural network multiple times.

Consider this error graph as an example:

This is a graph depicting the amount of errors in different solutions. The Global Solution is the best possible solution and is really optimal. A Sub-Optimal Solution is a solution that terminates, gets stuck, and no longer improves, but it isn't really the best solution.

Types of neural network models

There are different types of neural networks available for us; in this section, we will gain insights into these.

Multi-layer perceptron

The most common type is called the multi-layer perceptron model. This neural network model consists of neurons represented by circles, as shown in the following diagram. These neurons are organized into layers:

Every multi-layer perceptron model will have at least three layers:

  • Input Layer: This layer consists of all the predictors in our data.
  • Output Layer: This will consist of the outcome variable, which is also known as the dependent variable or target variable.
  • Hidden Layer: This layer is where you maximize the power of a neural network. Non-linear relationships can also be created in this layer, and all the complex interactions are carried out here. You can have many such hidden layers.

You will also notice in the preceding diagram that every neuron in a layer is connected to every neuron in the next layer. This forms connections, and every connecting line will have a weight associated with it. These weights will form different equations in the model.

Why are weights important?

Weights are important for several reasons. First because all neurons in one layer are connected to every neuron in the next layer, this means that the layers are connected. It also means that a neural network model, unlike many other models, doesn't drop any predictors. So for example, you may start off with 20 predictors, and these 20 predictors will be kept. A second reason why weights are important is that they provide information on the impact or importance of each predictor to the prediction. As will be shown later, these weights start off randomly, however through multiple iterations, the weights are modified so as to provide meaningful information.

An example representation of a multilayer perceptron model

Here, we will look at an example of a multilayer perceptron model. We will try to predict a potential buyer of a particular item based on an individual's age, income, and gender.

Consider the following, for example:

As you can see, our input predictors that form the Input Layer are age, income, and gender. The outcome variable that forms our Output Layer is Buy, which will determine whether someone bought a product or not. There is a hidden layer where the input predictors end up combining.

To better understand what goes on behind the scenes of a neural network model, lets take a look at a linear regression model.

The linear regression model

Let's understand the linear regression model with the help of an example.

Consider the following:

In linear regression, every input predictor in the Input Layer is connected to the outcome field by a single connection weight, also known as the coefficient, and these coefficients are estimated by a single pass through the data. The number of coefficients will be equal to the number of predictors. This means that every predictor will have a coefficient associated with it.

Every input predictor is directly connected to the Target with a particular coefficient as its weight. So, we can easily see the impact of a one unit change in the input predictor on the outcome variable or the Target. These kind of connections make it easy to determine the effect of each predictor on the Target variable as well as on the equation.

A sample neural network model

Let's use an example to understand neural networks in more detail:

Notice that every neuron in the Input Layer is connected to every neuron in the Hidden Layer, for example, Input 1 is connected to the first, second, and even the third neuron in the Hidden Layer. This implies that there will be three different weights, and these weights will be a part of three different equations.

This is what happens in this example:

  • The Hidden Layer intervenes between the Input Layer and the Output Layer.
  • The Hidden Layer allows for more complex models with nonlinear relationships.
  • There are many equations, so the influence of a single predictor on the outcome variable occurs through a variety of paths.
  • The interpretation of weights won't be straightforward.
  • Weights correspond to the variable importance; they will initially be random, and then they will go through a bunch of different iterations and will be changed based on the feedback of the iterations. They will then have their real meaning of being associated with variable importance.

So, let's go ahead and see how these weights are determined and how we can form a functional neural network.

Feed-forward backpropagation

Feed-forward backpropagation is a method through which we can predict things such as weights, and ultimately the outcome of a neural network.

According to this method, the following iterations occur on predictions:

  • If a prediction is correct, the weight associated with it is strengthened. Imagine the neural network saying, Hey, you know what, we used the weight of 0.75 for the first part of this equation for the first predictor and we got the correct prediction; that's probably a good starting point.
  • Suppose the prediction is incorrect; the error is fed back or back propagated into the model so that the weights or weight coefficients are modified, as shown here:

This backpropagation won't just take place in-between the Hidden Layers and the Target layer, but will also take place toward the Input Layer:

While these iterations are happening, we are actually making our neural network better and better with every error propagation. The connections now make a neural network capable of learning different patterns in the data.

So, unlike any linear regression or a decision tree model, a neural network tries to learn patterns in the data. If it's given enough time to learn those patterns, the neural network, combined with its experience, understands and predicts better, improving the rate of accuracy to a great extent.

Model training ethics

When you are training the neural network model, never train the model with the whole dataset. We need to hold back some data for testing purposes. This will allow us to test whether the neural network is able to apply what its learned from the training dataset to a new data.

We want the neural network to generalize well to new data and capture the generalities of different types of data, not just little nuances that would then make it sample-specific. Instead, we want the results to be translated to the new data as well. After the model has been trained, the new data can be predicted using the model's experience.

Summary

I hope you are now clear on machine learning predictive models and have understood the basic concepts. In this chapter, we have seen the characteristics of machine learning predictive models and have learned about some of the different types. These concepts are stepping stones to further chapters. We have also looked at an example of a basic neural network model. In the next chapter, we will implement a live neural network on a dataset and you will also be introduced to support vector machines and their implementation.

Left arrow icon Right arrow icon
Download code icon Download Code

Key benefits

  • Learn how to apply machine learning techniques in the field of data science
  • Understand when to use different data mining techniques, how to set up different analyses, and how to interpret the results
  • A step-by-step approach to improving model development and performance

Description

Machine learning (ML) combined with data mining can give you amazing results in your data mining work by empowering you with several ways to look at data. This book will help you improve your data mining techniques by using smart modeling techniques. This book will teach you how to implement ML algorithms and techniques in your data mining work. It will enable you to pair the best algorithms with the right tools and processes. You will learn how to identify patterns and make predictions with minimal human intervention. You will build different types of ML models, such as the neural network, the Support Vector Machines (SVMs), and the Decision tree. You will see how all of these models works and what kind of data in the dataset they are suited for. You will learn how to combine the results of different models in order to improve accuracy. Topics such as removing noise and handling errors will give you an added edge in model building and optimization. By the end of this book, you will be able to build predictive models and extract information of interest from the dataset

Who is this book for?

If you are a data scientist, data analyst, and data mining professional and are keen to achieve a 30% higher salary by adding machine learning to your skillset, then this is the ideal book for you. You will learn to apply machine learning techniques to various data mining challenges. No prior knowledge of machine learning is assumed.

What you will learn

  • Hone your model-building skills and create the most accurate models
  • Understand how predictive machine learning models work
  • Prepare your data to acquire the best possible results
  • Combine models in order to suit the requirements of different types of data
  • Analyze single and multiple models and understand their combined results
  • Derive worthwhile insights from your data using histograms and graphs
Estimated delivery fee Deliver to Brazil

Standard delivery 10 - 13 business days

R$63.95

Premium delivery 3 - 6 business days

R$203.95
(Includes tracking information)

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : Apr 30, 2019
Length: 252 pages
Edition : 1st
Language : English
ISBN-13 : 9781838828974
Category :
Languages :
Concepts :
Tools :

What do you get with Print?

Product feature icon Instant access to your digital eBook copy whilst your Print order is Shipped
Product feature icon Paperback book shipped to your preferred address
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Product feature icon AI Assistant (beta) to help accelerate your learning
Estimated delivery fee Deliver to Brazil

Standard delivery 10 - 13 business days

R$63.95

Premium delivery 3 - 6 business days

R$203.95
(Includes tracking information)

Product Details

Publication date : Apr 30, 2019
Length: 252 pages
Edition : 1st
Language : English
ISBN-13 : 9781838828974
Category :
Languages :
Concepts :
Tools :

Packt Subscriptions

See our plans and pricing
Modal Close icon
R$50 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
R$500 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just R$25 each
Feature tick icon Exclusive print discounts
R$800 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just R$25 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total R$ 675.97
Machine Learning for Data Mining
R$183.99
Hands-On Machine Learning with Microsoft Excel 2019
R$245.99
Machine Learning for Finance
R$245.99
Total R$ 675.97 Stars icon

Table of Contents

6 Chapters
Introducing Machine Learning Predictive Models Chevron down icon Chevron up icon
Getting Started with Machine Learning Chevron down icon Chevron up icon
Understanding Models Chevron down icon Chevron up icon
Improving Individual Models Chevron down icon Chevron up icon
Advanced Ways of Improving Models Chevron down icon Chevron up icon
Other Books You May Enjoy Chevron down icon Chevron up icon

Customer reviews

Rating distribution
Full star icon Full star icon Full star icon Full star icon Full star icon 5
(2 Ratings)
5 star 100%
4 star 0%
3 star 0%
2 star 0%
1 star 0%
Em Dec 10, 2020
Full star icon Full star icon Full star icon Full star icon Full star icon 5
A few years ago I bought the book, IBM SPSS Modeler Essentials, by the same author, and I found it to be extremely useful. This book introduces machine learning and then covers the ins and outs of several models. However it is chapters 4 and 5 that really take analyzing data to a whole other level.
Amazon Verified review Amazon
Amazon Customer Nov 02, 2019
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Great book
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

What is the delivery time and cost of print book? Chevron down icon Chevron up icon

Shipping Details

USA:

'

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K time would start printing from the next business day. So the estimated delivery times start from the next day as well. Orders received after 5 PM U.K time (in our internal systems) on a business day or anytime on the weekend will begin printing the second to next business day. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-bissau
  9. Iran
  10. Lebanon
  11. Libiya Arab Jamahriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela
What is custom duty/charge? Chevron down icon Chevron up icon

Customs duty are charges levied on goods when they cross international borders. It is a tax that is imposed on imported goods. These duties are charged by special authorities and bodies created by local governments and are meant to protect local industries, economies, and businesses.

Do I have to pay customs charges for the print book order? Chevron down icon Chevron up icon

The orders shipped to the countries that are listed under EU27 will not bear custom charges. They are paid by Packt as part of the order.

List of EU27 countries: www.gov.uk/eu-eea:

A custom duty or localized taxes may be applicable on the shipment and would be charged by the recipient country outside of the EU27 which should be paid by the customer and these duties are not included in the shipping charges been charged on the order.

How do I know my custom duty charges? Chevron down icon Chevron up icon

The amount of duty payable varies greatly depending on the imported goods, the country of origin and several other factors like the total invoice amount or dimensions like weight, and other such criteria applicable in your country.

For example:

  • If you live in Mexico, and the declared value of your ordered items is over $ 50, for you to receive a package, you will have to pay additional import tax of 19% which will be $ 9.50 to the courier service.
  • Whereas if you live in Turkey, and the declared value of your ordered items is over € 22, for you to receive a package, you will have to pay additional import tax of 18% which will be € 3.96 to the courier service.
How can I cancel my order? Chevron down icon Chevron up icon

Cancellation Policy for Published Printed Books:

You can cancel any order within 1 hour of placing the order. Simply contact customercare@packt.com with your order details or payment transaction id. If your order has already started the shipment process, we will do our best to stop it. However, if it is already on the way to you then when you receive it, you can contact us at customercare@packt.com using the returns and refund process.

Please understand that Packt Publishing cannot provide refunds or cancel any order except for the cases described in our Return Policy (i.e. Packt Publishing agrees to replace your printed book because it arrives damaged or material defect in book), Packt Publishing will not accept returns.

What is your returns and refunds policy? Chevron down icon Chevron up icon

Return Policy:

We want you to be happy with your purchase from Packtpub.com. We will not hassle you with returning print books to us. If the print book you receive from us is incorrect, damaged, doesn't work or is unacceptably late, please contact Customer Relations Team on customercare@packt.com with the order number and issue details as explained below:

  1. If you ordered (eBook, Video or Print Book) incorrectly or accidentally, please contact Customer Relations Team on customercare@packt.com within one hour of placing the order and we will replace/refund you the item cost.
  2. Sadly, if your eBook or Video file is faulty or a fault occurs during the eBook or Video being made available to you, i.e. during download then you should contact Customer Relations Team within 14 days of purchase on customercare@packt.com who will be able to resolve this issue for you.
  3. You will have a choice of replacement or refund of the problem items.(damaged, defective or incorrect)
  4. Once Customer Care Team confirms that you will be refunded, you should receive the refund within 10 to 12 working days.
  5. If you are only requesting a refund of one book from a multiple order, then we will refund you the appropriate single item.
  6. Where the items were shipped under a free shipping offer, there will be no shipping costs to refund.

On the off chance your printed book arrives damaged, with book material defect, contact our Customer Relation Team on customercare@packt.com within 14 days of receipt of the book with appropriate evidence of damage and we will work with you to secure a replacement copy, if necessary. Please note that each printed book you order from us is individually made by Packt's professional book-printing partner which is on a print-on-demand basis.

What tax is charged? Chevron down icon Chevron up icon

Currently, no tax is charged on the purchase of any print book (subject to change based on the laws and regulations). A localized VAT fee is charged only to our European and UK customers on eBooks, Video and subscriptions that they buy. GST is charged to Indian customers for eBooks and video purchases.

What payment methods can I use? Chevron down icon Chevron up icon

You can pay with the following card types:

  1. Visa Debit
  2. Visa Credit
  3. MasterCard
  4. PayPal
What is the delivery time and cost of print books? Chevron down icon Chevron up icon

Shipping Details

USA:

'

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K time would start printing from the next business day. So the estimated delivery times start from the next day as well. Orders received after 5 PM U.K time (in our internal systems) on a business day or anytime on the weekend will begin printing the second to next business day. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-bissau
  9. Iran
  10. Lebanon
  11. Libiya Arab Jamahriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela