Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds
Arrow up icon
GO TO TOP
Supervised Machine Learning with Python

You're reading from   Supervised Machine Learning with Python Develop rich Python coding practices while exploring supervised machine learning

Arrow left icon
Product type Paperback
Published in May 2019
Publisher Packt
ISBN-13 9781838825669
Length 162 pages
Edition 1st Edition
Languages
Arrow right icon
Author (1):
Arrow left icon
Taylor Smith Taylor Smith
Author Profile Icon Taylor Smith
Taylor Smith
Arrow right icon
View More author details
Toc

Supervised learning

In this section, we will formally define what machine learning is and, specifically, what supervised machine learning is.

In the early days of AI, everything was a rules engine. The programmer wrote the function and the rules, and the computer simply followed them. Modern-day AI is more in line with machine learning, which teaches a computer to write its own functions. Some may contest that oversimplification of the concept, but, at its core, this is largely what machine learning is all about.

We're going to look at a quick example of what machine learning is and what it is not. Here, we're using scikit-learn's datasets, submodule to create two objects and variables, also known as covariance or features, which are along the column axis. y is a vector with the same number of values as there are rows in X. In this case, y is a class label. For the sake of an example, y here could be a binary label corresponding to a real-world occurrence, such as the malignancy of a tumor. X is then a matrix of attributes that describe y. One feature could be the diameter of the tumor, and another could indicate its density. The preceding explanation can be seen in the following code:

import numpy as np
from sklearn.datasets import make_classification

rs = np.random.RandomState(42)
X,y = make_classification(n_samples=10, random_state=rs)

A rules engine, by our definition, is simply business logic. It can be as simple or as complex as you need it to be, but the programmer makes the rules. In this function, we're going to evaluate our X matrix by returning 1, or true, where the sums over the rows are greater than 0. Even though there's some math involved here, there is still a rules engine, because we, the programmers, defined a rule. So, we could theoretically get into a gray area, where the rule itself was discovered via machine learning. But, for the sake of argument, let's take an example that the head surgeon arbitrarily picks 0 as our threshold, and anything above that is deemed as cancerous:

def make_life_alterning_decision(X):
"""Determine whether something big happens"""
row_sums = X.sum(axis=1)
return (row_sums > 0).astype(int)
make_life_alterning_decision(X)

The output of the preceding code snippet is as follows:

array([0, 1, 0, 0, 1, 1, 1, 0, 1, 0])

Now, as mentioned before, our rules engine can be as simple or as complex as we want it to be. Here, we're not only interested in row_sums, but we have several criteria to meet in order to deem something cancerous. The minimum value in the row must be less than -1.5, in addition to one or more of the following three criteria:

  • The row sum exceeds 0
  • The sum of the rows is evenly divisible by 0.5
  • The maximum value of the row is greater than 1.5

So, even though our math is a little more complex here, we're still just building a rules engine:

def make_more_complex_life_alterning_decision(X):
"""Make a more complicated decision about something big"""
row_sums = X.sum(axis=1)
return ((X.min(axis=1) < -1.5) &
((row_sums >= 0.) |
(row_sums % 0.5 == 0) |
(X.max(axis=1) > 1.5))).astype(int)

make_more_complex_life_alterning_decision(X)

The output of the preceding code is as follows:

array([0, 1, 1, 1, 1, 1, 0, 1, 1, 0])

Now, let's say that our surgeon understands and realizes they're not the math or programming whiz that they thought they were. So, they hire programmers to build them a machine learning model. The model itself is a function that discovers parameters that complement a decision function, which is essentially the function the machine itself learned. So, parameters are things we'll discuss in our next Chapter 2, Implementing Parametric Models, which are parametric models. So, what's happening behind the scenes when we invoke the fit method is that the model learns the characteristics and patterns of the data, and how the X matrix describes the y vector. Then, when we call the predict function, it applies its learned decision function to the input data to make an educated guess:

from sklearn.linear_model import LogisticRegression

def learn_life_lession(X, y):
"""Learn a lesson abd apply it in a future situation"""
model = LogisticRegression().fit(X, y)
return (lambda X: model.predict(X))
educated_decision = learn_life_lession(X, y)(X)
educated_decision

The output of the preceding code is as follows:

array([1, 1, 0, 0, 0, 1, 1, 0, 1, 0])

So, now we're at a point where we need to define specifically what supervised learning is. Supervised learning is precisely the example we just described previously. Given our matrix of examples, X, in a vector of corresponding labels, y, that learns a function which approximates the value of y or :

There are other forms of machine learning that are not supervised, known as unsupervised machine learning. These do not have labels and are more geared toward pattern recognition tasks. So, what makes something supervised is the presence of labeled data.

Going back to our previous example, when we invoke the fit method, we learn our new decision function and then, when we call predict, we're approximating the new y values. So, the output is this  we just looked at:

Supervised learning learns a function from labelled samples that approximates future y values. At this point, you should feel comfortable explaining the abstract concept—just the high-level idea of what supervised machine learning is.

You have been reading a chapter from
Supervised Machine Learning with Python
Published in: May 2019
Publisher: Packt
ISBN-13: 9781838825669
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image