Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
TensorFlow Machine Learning Projects

You're reading from   TensorFlow Machine Learning Projects Build 13 real-world projects with advanced numerical computations using the Python ecosystem

Arrow left icon
Product type Paperback
Published in Nov 2018
Publisher Packt
ISBN-13 9781789132212
Length 322 pages
Edition 1st Edition
Languages
Arrow right icon
Authors (2):
Arrow left icon
Ankit Jain Ankit Jain
Author Profile Icon Ankit Jain
Ankit Jain
Dr. Amita Kapoor Dr. Amita Kapoor
Author Profile Icon Dr. Amita Kapoor
Dr. Amita Kapoor
Arrow right icon
View More author details
Toc

Table of Contents (17) Chapters Close

Preface 1. Overview of TensorFlow and Machine Learning FREE CHAPTER 2. Using Machine Learning to Detect Exoplanets in Outer Space 3. Sentiment Analysis in Your Browser Using TensorFlow.js 4. Digit Classification Using TensorFlow Lite 5. Speech to Text and Topic Extraction Using NLP 6. Predicting Stock Prices using Gaussian Process Regression 7. Credit Card Fraud Detection using Autoencoders 8. Generating Uncertainty in Traffic Signs Classifier Using Bayesian Neural Networks 9. Generating Matching Shoe Bags from Shoe Images Using DiscoGANs 10. Classifying Clothing Images using Capsule Networks 11. Making Quality Product Recommendations Using TensorFlow 12. Object Detection at a Large Scale with TensorFlow 13. Generating Book Scripts Using LSTMs 14. Playing Pacman Using Deep Reinforcement Learning 15. What is Next? 16. Other Books You May Enjoy

Machine learning, classification, and logistic regression

Let's now learn about machine learning, classification, and logistic regression.

Machine learning

Machine learning refers to the application of algorithms to make computers learn from data. The models that are learned by computers are used to make predictions and forecasts. Machine learning has been successfully applied in a variety of areas, such as natural language processing, self-driving vehicles, image and speech recognition, chatbots, and computer vision.

Machine learning algorithms are broadly categorized into three types:

  • Supervised learning: In supervised learning, the machine learns the model from a training dataset that consists of features and labels. The supervised learning problems are generally of two types: regression and classification. Regression refers to predicting future values based on the model, while classification refers to predicting the categories of the input values.
  • Unsupervised learning: In unsupervised learning, the machine learns the model from a training dataset that consists of features only. One of the most common types of unsupervised learning is known as clustering. Clustering refers to dividing the input data into multiple groups, thus producing clusters or segments.
  • Reinforcement learning: In reinforcement learning, the agent starts with an initial model and then continuously learns the model based on the feedback from the environment. A reinforcement learning agent learns or updates the model by applying supervised or unsupervised learning techniques as part of the reinforcement learning algorithms.

These machine learning problems are abstracted to the following equation in one form or another:

Here, y represents the target and x represents the feature. If x is a collection of features, it is also called a feature vector and denoted with X. The model is the function f that maps features to targets. Once the computer learns f, it can use the new values of x to predict the values of y.

The preceding simple equation can be rewritten in the context of linear models for machine learning as follows:

Here, w is known as the weight and b is known as the bias. Thus, the machine learning problem now can be stated as a problem of finding w and b from the current values of X so that the equation can now be used to predict the values of y. 

Regression analysis or regression modeling refers to the methods and techniques used to estimate relationships among variables. The variables that are used as input for regression models are called independent variables, predictors, or features, and the output variables from regression models are called dependent variables or targets. Regression models are defined as follows:

Where Y is the target variable, X is a vector of features, and β is a vector of parameters (w,b in the preceding equation).

Classification

Classification is one of the classical problems in machine learning. Data under consideration could belong to one class or another, for example, if the images provided are data, they could be pictures of cats or dogs. Thus, the classes, in this case, are cats and dogs. Classification means identifying the label or class of the objects under consideration. Classification falls under the umbrella of supervised machine learning. In classification problems, a training dataset is provided that has features or inputs and their corresponding outputs or labels. Using this training dataset, a model is trained; in other words, the parameters of the model are computed. The trained model is then used on new data to find its correct labels.

Classification problems can be of two types: binary class or multiclass. Binary class means that the data is to be classified into two distinct and discrete labels; for example, the patient has cancer or the patient does not have cancer, and the images are of cats or dogs and so on. Multiclass means that the data is to be classified among multiple classes, for example, an email classification problem will divide emails into social media emails, work-related emails, personal emails, family-related emails, spam emails, shopping offer emails, and so on. Another example would be of pictures of digits; each picture could be labeled between 0 and 9, depending on what digit the picture represents. In this chapter, we will look at examples of both kinds of classification.

The most popular method for classification is logistic regression. Logistic regression is a probabilistic and linear classifier. The probability that the vector of input features belongs to a specific class can be described mathematically by the following equation: 

In the preceding equation, the following applies:

  • Y represents the output
  • i represents one of the classes
  • x represents the inputs
  • w represents the weights
  • b represents the biases
  • z represents the regression equation 
  • ϕ represents the smoothing function (or model, in our case)

The ϕ(z) function represents the probability that x belongs to class i when w and b are given. Thus, the model has to be trained to maximize the value of this probability.

Logistic regression for binary classification

For binary classification, the model function ϕ(z) is defined as the sigmoid function, which can be described as follows:

The sigmoid function transforms the y value to be between the range [0,1]. Thus, the value of y=ϕ(z) can be used to predict the class: if y > 0.5, then the object belongs to 1, otherwise the object belongs to 0.

The model training means to search for the parameters that minimize the loss function, which can either be the sum of squared errors or the sum of mean squared errors. For logistic regression, the likelihood is maximized as follows:

 

However, as it is easier to maximize the log-likelihood, we use the log-likelihood (l(w)) as the cost function. The loss function (J(w)) is written as -l(w), and can be minimized by using optimization algorithms such as gradient descent.

The loss function for binary logistic regression is written mathematically as follows:

Here, ϕ(z) is the sigmoid function.

Logistic regression for multiclass classification

When more than two classes are involved, logistic regression is known as multinomial logistic regression. In multinomial logistic regression, instead of sigmoid, use the softmax function, which can be described mathematically as follows:

The softmax function produces the probabilities for each class so that the probabilities vector adds up to 1. At the time of inference, the class with the highest softmax value becomes the output or predicted class. The loss function, as we discussed earlier, is the negative log-likelihood function, -l(w), that can be minimized by the optimizers, such as gradient descent.

The loss function for multinomial logistic regression is written formally as follows:

Here, ϕ(z) is the softmax function.

We will implement this loss function in the next section. In the following section, we will dig into our example for multiclass classification with logistic regression in TensorFlow.

You have been reading a chapter from
TensorFlow Machine Learning Projects
Published in: Nov 2018
Publisher: Packt
ISBN-13: 9781789132212
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €18.99/month. Cancel anytime