Machine Learning for Developers

You're reading from Machine Learning for Developers: Uplift your regular applications with the power of statistics, analytics, and machine learning

Product type Paperback
Published in Oct 2017
Publisher Packt
ISBN-13 9781786469878
Length 270 pages
Edition 1st Edition
Authors (2):
Md Mahmudul Hasan
Rodolfo Bonnin

Machine learning in the bigger picture

Machine learning as a discipline is not an isolated field—it is framed inside a wider domain, Artificial Intelligence (AI). But as you can guess, machine learning didn't appear from the void. As a discipline it has its predecessors, and it has been evolving in stages of increasing complexity in the following four clearly differentiated steps:

  1. The first stage of machine learning involved rule-based decisions and simple data-based algorithms. Such a model includes, as a prerequisite, all the possible ramifications and decision rules, which means that an expert in the field hardcodes every possible option into the model beforehand. This structure was implemented in the majority of applications developed since the first programming languages appeared in the 1950s. The main data type handled by this kind of algorithm is the Boolean, as it deals exclusively with yes-or-no decisions.

  2. During the second developmental stage, statistical reasoning, we started to let the probabilistic characteristics of the data have a say, in addition to the choices set up in advance. This better reflects the fuzzy nature of real-world problems, where outliers are common and where it is more important to take into account the nondeterministic tendencies of the data than to follow the rigid approach of fixed questions. This stage adds elements of Bayesian probability theory to the mix of mathematical tools. Methods in this category include curve fitting (usually linear or polynomial), which share the property of working with numerical data.
  3. The machine learning stage is the realm in which we are going to be working throughout this book, and it involves more complex tasks than the simplest Bayesian elements of the previous stage.
    The most outstanding feature of machine learning algorithms is that they can generalize models from data, and the models are capable of generating their own feature selectors, which aren't limited by a rigid target function, as they are generated and defined as the training process evolves. Another differentiator of this kind of model is that it can take a large variety of data types as input, such as speech, images, video, text, and other data that can be represented as vectors.
  4. AI is the last step in the scale of abstraction capabilities. In a way, it includes all previous algorithm types, but with one key difference: AI algorithms are able to apply the learned knowledge to solve tasks that were never considered during training. The types of data with which these algorithms work are even more generic than those supported by machine learning, and the algorithms should be able, by definition, to transfer problem-solving capabilities from one data type to another without a complete retraining of the model. In this way, we could develop an algorithm for object detection in black and white images, and the model could abstract that knowledge and apply it to color images.
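To make the contrast between the first stage and the later, data-driven stages more concrete, here is a minimal Python sketch. It is only an illustration and is not taken from the book: the temperature scenario, the function names (`rule_based_is_hot`, `fit_threshold`, `learned_is_hot`), and the sample values are all hypothetical. The first function hardcodes an expert's Boolean rule. The second estimates the decision threshold from labeled observations instead.

```python
# Stage 1 (rule-based): an expert hardcodes the decision; data never changes it.
def rule_based_is_hot(temperature_c: float) -> bool:
    return temperature_c > 30.0  # hypothetical hand-picked threshold


# Data-driven stages: the threshold is estimated from labeled observations.
def fit_threshold(samples: list[tuple[float, bool]]) -> float:
    """Place the threshold halfway between the means of the two classes."""
    hot = [t for t, is_hot in samples if is_hot]
    cold = [t for t, is_hot in samples if not is_hot]
    return (sum(hot) / len(hot) + sum(cold) / len(cold)) / 2


# Hypothetical labeled observations: (temperature, was it considered hot?)
observations = [(18.0, False), (21.0, False), (24.0, False),
                (27.0, True), (29.0, True), (31.0, True)]
learned_threshold = fit_threshold(observations)


def learned_is_hot(temperature_c: float) -> bool:
    return temperature_c > learned_threshold


print(rule_based_is_hot(28.0), learned_is_hot(28.0))
```

With these made-up observations, the two functions disagree on a 28-degree reading, because the learned threshold follows the data and not the expert's fixed choice.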

In the following diagram, we represent these four stages of development towards real AI applications:

Types of machine learning

Let's dissect the different types of machine learning projects, starting from the degree of prior knowledge available to the implementer. A project can be of the following types:

  • Supervised learning: In this type of learning, we are given a sample set of real data, accompanied by the result the model should give us for each sample. In statistical terms, we have the outcome of all the training set experiments.
  • Unsupervised learning: In this type of learning, we are given only the sample data from the problem domain. The task of grouping similar data and assigning categories must be carried out without any prior information from which the categories could be inferred.
  • Reinforcement learning: This type of learning doesn't have a labeled sample set and involves a different set of participating elements: an agent, an environment, and an optimum policy (or set of steps) to be learned, which maximizes a goal by using rewards or penalties (the result of each attempt).
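Since the reinforcement setting is the least intuitive of the three, a small sketch may help. The following Python example is an assumption-laden illustration, not the book's method: the two-action environment with fixed rewards (`REWARDS`), the epsilon-greedy strategy, and the name `run_agent` are all chosen here for demonstration. The agent has no labeled samples; it only tries actions, observes the reward of each attempt, and updates its estimates.

```python
import random

# Hypothetical environment: each action always returns a fixed reward.
REWARDS = [0.2, 1.0]


def run_agent(steps: int = 500, epsilon: float = 0.1, seed: int = 0) -> list[float]:
    """Epsilon-greedy agent: mostly exploit the best-known action, sometimes explore."""
    rng = random.Random(seed)
    values = [0.0] * len(REWARDS)  # agent's running estimate of each action's reward
    counts = [0] * len(REWARDS)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(REWARDS))  # explore: random action
        else:
            action = values.index(max(values))    # exploit: best estimate so far
        reward = REWARDS[action]                  # the environment responds
        counts[action] += 1
        # Incremental mean of the rewards observed for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values


values = run_agent()
best_action = values.index(max(values))
print(values, best_action)
```

The policy learned here is trivial (always pick the action with the higher estimated reward), but the loop shows the agent, the environment, and the reward signal described above.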

Take a look at the following diagram:

Main areas of Machine Learning

Grades of supervision

The learning process supports gradual steps in the realm of supervision:

  • Unsupervised Learning: This approach has no previous knowledge of the class or value of any sample, so it must infer it automatically.
  • Semi-Supervised Learning: This approach needs a seed of known samples, and the model infers the class or value of the remaining samples from that seed.
  • Supervised Learning: This approach normally includes a set of known samples, called the training set, another set used to validate the model's generalization, and a third one, called the test set, which is used after the training process to provide an independent group of samples outside of the training set and so guarantee the independence of testing.
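The three sets used in supervised learning can be sketched as a simple shuffle-and-slice operation. This is a minimal illustration under stated assumptions: the function name `split_dataset`, the 60/20/20 proportions, and the fixed seed are choices made here, not values prescribed by the book.

```python
import random


def split_dataset(samples, train_frac=0.6, val_frac=0.2, seed=42):
    """Shuffle the samples and cut them into training, validation, and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains is held out for testing
    return train, validation, test


train, validation, test = split_dataset(range(10))
print(len(train), len(validation), len(test))
```

Because the three slices never overlap, no test sample is ever seen during training, which is the independence the text refers to.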

The following diagram depicts the approaches mentioned:

Graphical depiction of the training techniques for Unsupervised, Semi-Supervised and Supervised Learning

Supervised learning strategies - regression versus classification

Supervised learning has the following two main types of problem to solve:

  • Regression problem: This type of problem accepts samples from the problem domain and, while training the model, minimizes the error between the model's output and the real answers. The trained model can then predict a numerical value when given a new unknown sample.
  • Classification problem: This type of problem uses samples from the domain to assign a label or group to new unknown samples.
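The difference between the two problem types can be seen in what each model returns. The sketch below is illustrative only: it uses an ordinary least-squares line as a stand-in for regression and a nearest-centroid rule as a stand-in for classification, and the helper names (`fit_line`, `nearest_centroid`) and data are hypothetical.

```python
def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def nearest_centroid(train, sample):
    """Classification: train maps label -> list of 1-D values; return the closest label."""
    centroids = {label: sum(vals) / len(vals) for label, vals in train.items()}
    return min(centroids, key=lambda label: abs(centroids[label] - sample))


# Regression returns a number for a new, unseen input.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted_value = slope * 5 + intercept

# Classification returns a label for a new, unseen input.
predicted_label = nearest_centroid({"small": [1, 2, 3], "large": [10, 11, 12]}, 9)

print(predicted_value, predicted_label)
```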

Unsupervised problem solving–clustering

The vast majority of unsupervised problem solving consists of grouping items by looking at similarities or at the values of features shared by the observed items, because there is no a priori information about the classes. This type of technique is called clustering.
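As one possible illustration of clustering, here is a bare-bones k-means loop in plain Python. It is a sketch under simplifying assumptions (naive initialization from the first `k` points, a fixed number of iterations, made-up 2-D data) and is not presented as the book's implementation.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def k_means(points, k, iterations=10):
    centroids = list(points[:k])  # naive initialization: the first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical unlabeled data with two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = k_means(points, k=2)
print(clusters)
```

No labels are supplied at any point; the two groups emerge only from the distances between the samples.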

Outside of these main problem types, there is a mix of both, called semi-supervised problem solving, in which we train on a labeled set of elements and also use inference to assign information to unlabeled data during training. To assign data to unknown entities, three main criteria are used: smoothness (points close to each other are of the same class), cluster (data tends to form clusters, a special case of smoothness), and manifold (data lies on a manifold of much lower dimensionality than the original domain).
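The smoothness criterion can be sketched with a very simple label-spreading loop: starting from a few labeled seeds, the unlabeled point closest to any labeled point repeatedly inherits that neighbor's label. This is a toy illustration on 1-D data; the function name `propagate_labels` and the sample values are hypothetical, and real semi-supervised methods are considerably more elaborate.

```python
def propagate_labels(labeled, unlabeled):
    """labeled: dict of 1-D point -> label; unlabeled: list of 1-D points."""
    labeled = dict(labeled)       # copy so the caller's seed is not modified
    remaining = list(unlabeled)
    while remaining:
        # Find the unlabeled point that is closest to any labeled point.
        point, neighbour = min(
            ((u, known) for u in remaining for known in labeled),
            key=lambda pair: abs(pair[0] - pair[1]),
        )
        labeled[point] = labeled[neighbour]  # smoothness: nearby points share a class
        remaining.remove(point)
    return labeled


# Two labeled seeds and a handful of unlabeled samples (made-up values).
seed = {0.0: "a", 10.0: "b"}
result = propagate_labels(seed, [1.0, 2.0, 3.0, 4.0, 9.0, 8.0, 7.0])
print(result)
```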
