Learning curve


The basic premise behind the learning curve is that the more time you spend doing something, the better you tend to get at it. Eventually, the time needed to perform the task keeps decreasing. This concept is known by different names, such as the improvement curve, the progress curve, and the startup function.

For example, when you start learning to drive a manual car, you undergo a learning cycle. Initially, you are extra careful about operating the brake, clutch, and gears. You have to keep reminding yourself when and how to operate these components.

But as the days go by and you continue practicing, your brain gets accustomed and trained to the entire process. With each passing day, your driving keeps getting smoother, and your brain reacts to situations without you consciously realizing it. This is called subconscious intelligence. You reach this stage with lots of practice, gradually transitioning from conscious intelligence to subconscious intelligence.

Machine learning

Let me define machine learning and its components so that you don't get bamboozled by the jargon that gets thrown around.

In the words of Tom Mitchell, "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." Another common definition says that machine learning is the field that gives computers the ability to learn without being explicitly programmed.

For example, if a computer is given cases such as [(father, mother), (uncle, aunt), (brother, sister)], it needs to find out (son, ?). That is, given son, what will be the associated item? To solve this problem, a computer program goes through the previous records and tries to understand and learn the association and pattern in these combinations as it hops from one record to another. This is called learning, and it takes place through algorithms. With more records, that is, more experience, the machine gets smarter and smarter.

Let's take a look at the different branches of machine learning, as indicated in the following diagram:

We will explain the preceding diagram as follows (a short code sketch follows this list):

  • Supervised learning: In this type of learning, both the input variables and the output variable are known to us. Here, we are supposed to establish a relationship between the input variables and the output, and the learning is based on that. There are two types of problems under it, as follows:
    • Regression problem: This has a continuous output. For example, in a housing price dataset, the price of a house needs to be predicted based on input variables such as area, region, city, number of rooms, and so on. The price to be predicted is a continuous variable.
    • Classification problem: This has a discrete output. For example, predicting whether or not an employee will leave an organization, based on salary, gender, the number of members in their family, and so on.
  • Unsupervised learning: In this type of scenario, there is no output variable. We are supposed to extract a pattern based on all of the variables given. For example, the segmentation of customers based on age, gender, income, and so on.
  • Reinforcement learning: This is an area of machine learning wherein suitable actions are taken to maximize reward. For example, when training a dog to catch a ball and return it, we reward the dog if it carries out this action; otherwise, we scold it, which amounts to a punishment.
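
To make the first two branches concrete, the following is a minimal sketch in Python using scikit-learn's built-in Iris data; the dataset and model choices here are illustrative assumptions rather than the book's, and reinforcement learning is left out since it requires an interactive environment rather than a fixed dataset:

# A minimal, illustrative sketch of the supervised and unsupervised
# branches using scikit-learn's built-in Iris data (assumed choices,
# not from the book)
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)

# Supervised learning (classification): discrete output, labels known
clf = LogisticRegression(max_iter=200).fit(X, y)
print("Predicted class:", clf.predict(X[:1]))

# Supervised learning (regression): continuous output
# (here, predicting petal width from the other three measurements)
reg = LinearRegression().fit(X[:, :3], X[:, 3])
print("Predicted petal width:", reg.predict(X[:1, :3]))

# Unsupervised learning (clustering): no output variable, only a pattern
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("Cluster assignments:", km.labels_[:5])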

Wright's model

In Wright's model, the learning curve function is defined as follows:

Y = aX^b

The variables are as follows:

  • Y: The cumulative average time per unit
  • X: The cumulative number of units produced
  • a: Time required to produce the first unit
  • b: Slope of the function when plotted on log-log graph paper (the log of the learning rate divided by the log of 2)
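
To see the model's behavior numerically, here is a minimal sketch; the 80% learning rate and the 100-hour first unit are illustrative assumptions, not values from the book:

# Wright's model: Y = a * X**b, with b = log(learning rate) / log(2).
# The 80% learning rate and 100-hour first unit are assumed for illustration.
import numpy as np

a = 100.0                                # time to produce the first unit (hours)
learning_rate = 0.80                     # each doubling cuts the average time by 20%
b = np.log(learning_rate) / np.log(2)    # slope on log-log paper (negative)

for X in [1, 2, 4, 8, 16]:
    Y = a * X ** b                       # cumulative average time per unit
    print(f"units={X:>2}  cumulative average time={Y:6.2f} hours")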

The following curve has a vertical axis (the y axis) representing learning with respect to a particular task and a horizontal axis corresponding to the time taken to learn. A learning curve with a steep beginning can be interpreted as a sign of rapid progress. The following diagram shows Wright's learning curve model:

However, the question that arises is: how is this connected to machine learning? We will discuss this in detail now.

Let's discuss a scenario that happens to be a supervised learning problem by going over the following steps:

  1. We take the data and partition it into a training set (on which we are making the system learn and come out as a model) and a validation set (on which we are testing how well the system has learned).
  2. The next step would be to take one instance (observation) of the training set and make use of it to estimate a model. The model error on the training set will be 0.
  3. Finally, we would find out the model error on the validation data.

Steps 2 and 3 are repeated with increasing numbers of instances (training sizes), such as 10, 50, and 100, while studying the training error and validation error and how they relate to the training size. This curve, or relationship, is called a learning curve in a machine learning scenario.
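
Before reaching for scikit-learn's built-in helper, the preceding procedure can be sketched by hand; the synthetic data, coefficients, and training sizes below are illustrative assumptions:

# A hand-rolled version of the preceding steps (the book uses
# sklearn's learning_curve below); the synthetic data is assumed
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)
X = rng.rand(1000, 3)
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.randn(1000)

# Step 1: partition into a training set and a validation set
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Steps 2 and 3, repeated for several training sizes
for size in [10, 50, 100, 500]:
    model = LinearRegression().fit(X_train[:size], y_train[:size])
    train_err = mean_squared_error(y_train[:size], model.predict(X_train[:size]))
    val_err = mean_squared_error(y_val, model.predict(X_val))
    print(f"size={size:>4}  train MSE={train_err:.4f}  validation MSE={val_err:.4f}")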


Let's work on a combined cycle power plant dataset. The features comprise hourly average ambient variables, that is, ambient temperature (AT), exhaust vacuum (V), ambient pressure (AP), and relative humidity (RH), which are used to predict the net hourly electrical energy output (PE) of the plant:

# importing all the libraries
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve
import matplotlib.pyplot as plt

# reading the data
data = pd.read_excel("Powerplant.xlsx")

# investigating the data
print(data.info())
data.head()

From this, we can see the data types and structure of the variables in the data:

The output can be seen as follows:

The second output gives you a good feel for the data.

The dataset has five variables: the four input features, ambient temperature (AT), exhaust vacuum (V), ambient pressure (AP), and relative humidity (RH), along with PE, which is the target variable.


Let's vary the training size of the data and study its impact on learning. A list called train_size is created with varying training sizes, as shown in the following code:

# As discussed, we are trying to vary the size of the training set
train_size = [1, 100, 500, 2000, 5000]
features = ['AT', 'V', 'AP', 'RH']
target = 'PE'

# estimating the training score & validation score
train_sizes, train_scores, validation_scores = learning_curve(
    estimator=LinearRegression(),
    X=data[features],
    y=data[target],
    train_sizes=train_size,
    cv=5,
    scoring='neg_mean_squared_error')
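
As a quick sanity check (an addition of mine, not part of the book's listing), learning_curve returns the absolute training sizes it used, plus one score per training size and cross-validation fold:

# learning_curve returns one score per (training size, CV fold) pair,
# so with five sizes and cv=5 both score arrays should be of shape (5, 5)
print(train_sizes)                 # absolute training sizes actually used
print(train_scores.shape)          # expected: (5, 5)
print(validation_scores.shape)     # expected: (5, 5)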

Let's generate the learning curve:

# Generating the learning curve; the scorer returned negative MSE,
# so we flip the sign to get positive MSE values
train_scores_mean = -train_scores.mean(axis=1)
validation_scores_mean = -validation_scores.mean(axis=1)

plt.style.use('seaborn')
plt.plot(train_sizes, train_scores_mean, label='Train_error')
plt.plot(train_sizes, validation_scores_mean, label='Validation_error')
plt.ylabel('MSE', fontsize=16)
plt.xlabel('Training set size', fontsize=16)
plt.title('Learning_Curves', fontsize=20, y=1)
plt.legend()
plt.show()

We get the following output:

From the preceding plot, we can see that when the training size is just 1, the training error is 0, but the validation error shoots beyond 400.

As we go on increasing the training set's size (from 1 to 100), the training error keeps rising. However, the validation error starts to plummet as the model performs better on the validation set. After the training size hits the 500 mark, the validation error and the training error begin to converge. So, what can be inferred from this? The performance of the model won't change, irrespective of how much further the training set grows. However, if you try to add more features, it might make a difference, as shown in the following diagram:

The preceding diagram shows that the validation and training curves have converged, so adding more training data will not help at all. However, in the following diagram, the curves haven't converged, so adding more training data will be a good idea:
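
As one hedged illustration of adding features, new columns can be derived from the existing ones; the second-degree polynomial expansion below is my choice for demonstration, not necessarily what the book's diagrams were based on, and it assumes the earlier imports and data are still in scope:

# One illustrative way to add features: second-degree polynomial
# combinations of the existing columns. Whether this helps depends
# on the data; treat it as a sketch, not a prescription.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

poly_model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
train_sizes, train_scores, validation_scores = learning_curve(
    estimator=poly_model,
    X=data[features],
    y=data[target],
    train_sizes=train_size,
    cv=5,
    scoring='neg_mean_squared_error')
print(-validation_scores.mean(axis=1))   # compare with the earlier validation MSEs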
