Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Scala for Machine Learning

You're reading from   Scala for Machine Learning Leverage Scala and Machine Learning to construct and study systems that can learn from data

Arrow left icon
Product type Paperback
Published in Dec 2014
Publisher
ISBN-13 9781783558742
Length 624 pages
Edition 1st Edition
Languages
Arrow right icon
Author (1):
Arrow left icon
Patrick R. Nicolas Patrick R. Nicolas
Author Profile Icon Patrick R. Nicolas
Patrick R. Nicolas
Arrow right icon
View More author details
Toc

Table of Contents (15) Chapters Close

Preface 1. Getting Started FREE CHAPTER 2. Hello World! 3. Data Preprocessing 4. Unsupervised Learning 5. Naïve Bayes Classifiers 6. Regression and Regularization 7. Sequential Data Models 8. Kernel Models and Support Vector Machines 9. Artificial Neural Networks 10. Genetic Algorithms 11. Reinforcement Learning 12. Scalable Frameworks A. Basic Concepts Index

Taxonomy of machine learning algorithms

The purpose of machine learning is to teach computers to execute tasks without human intervention. An increasing number of applications such as genomics, social networking, advertising, or risk analysis generate a very large amount of data that can be analyzed or mined to extract knowledge or provide insight into a process, a customer, or an organization. Ultimately, machine learning algorithms consist of identifying and validating models to optimize a performance criterion using historical, present, and future data [1:4].

Data mining is the process of extracting or identifying patterns in a dataset.

Unsupervised learning

The goal of unsupervised learning is to discover patterns of regularities and irregularities in a set of observations. The process known as density estimation in statistics is broken down into two categories: discovery of data clusters and discovery of latent factors. The methodology consists of processing input data to understand patterns similar to the natural learning process in infants or animals. Unsupervised learning does not require labeled data, and therefore, is easy to implement and execute because no expertise is needed to validate an output. However, it is possible to label the output of a clustering algorithm and use it for future classification.

Clustering

The purpose of data clustering is to partition a collection of data into a number of clusters or data segments. Practically, a clustering algorithm is used to organize observations into clusters by minimizing the observations within a cluster and maximizing the observations between clusters. A clustering algorithm consists of the following steps:

  1. Creating a model by making an assumption on the input data.
  2. Selecting the objective function or goal of the clustering.
  3. Evaluating one or more algorithms to optimize the objective function.

Data clustering is also known as data segmentation or data partitioning.

Dimension reduction

Dimension reduction techniques aim at finding the smallest but most relevant set of features that models dataset reliability. There are many reasons for reducing the number of features or parameters in a model, from avoiding overfitting to reducing computation costs.

There are many ways to classify the different techniques used to extract knowledge from data using unsupervised learning. The following taxonomy breaks down these techniques according to their purpose, although the list is far for being exhaustive, as shown in the following diagram:

Dimension reduction

Supervised learning

The best analogy for supervised learning is function approximation or curve fitting. In its simplest form, supervised learning attempts to extract a relation or function f x → y from a training set {x, y}. Supervised learning is far more accurate and reliable than any other learning strategy. However, a domain expert may be required to label (tag) data as a training set for certain types of problems.

Supervised machine learning algorithms can be broken into two categories:

  • Generative models
  • Discriminative models

Generative models

In order to simplify the description of statistics formulas, we adopt the following simplification: the probability of an event X is the same as the probability of the discrete random variable X to have a value x, p(X) = p(X=x). The notation of joint probability (resp. conditional probability) becomes p(X, Y) = p(X=x, Y=y) (resp. p(X|Y)=p(X=x | Y=y).

Generative models attempt to fit a joint probability distribution, p(X,Y), of two events (or random variables), X and Y, representing two sets of observed and hidden (latent) variables x and y. Discriminative models learn the conditional probability p(Y|X) of an event or random variable Y of hidden variables y, given an event or random variable X of observed variables x. Generative models are commonly introduced through the Bayes' rule. The conditional probability of an event Y, given an event X, is computed as the product of the conditional probability of the event X, given the event Y, and the probability of the event X normalized by the probability of event Y [1:5].

Tip

Join probability (if X and Y are independent):

Generative models

Conditional probability:

Generative models

The Bayes' rule:

Generative models

The Bayes' rule is the foundation of the Naïve Bayes classifier, which is the topic of Chapter 5, Naïve Bayes Classifiers.

Discriminative models

Contrary to generative models, discriminative models compute the conditional probability p(Y|X) directly, using the same algorithm for training and classification.

Generative and discriminative models have their respective advantages and drawbacks. Novice data scientists learn to match the appropriate algorithm to each problem through experimentation. Here is a brief guideline describing which type of models makes sense according to the objective or criteria of the project:

Objective

Generative models

Discriminative models

Accuracy

Highly dependent on the training set.

Probability estimates tend to be more accurate.

Modeling requirements

There is a need to model both observed and hidden variables, which requires a significant amount of training.

The quality of the training set does not have to be as rigorous as for generative models.

Computation cost

This is usually low. For example, any graphical method derived from the Bayes' rule has low overhead.

Most algorithms rely on optimization of a convex that introduces significant performance overhead.

Constraints

These models assume some degree of independence among the model features.

Most discriminative algorithms accommodate dependencies between features.

We can further refine the taxonomy of supervised learning algorithms by segregating between sequential and random variables for generative models and breaking down discriminative methods as applied to continuous processes (regression) and discrete processes (classification):

Discriminative models

Reinforcement learning

Reinforcement learning is not as well understood as supervised and unsupervised learning outside the realms of robotics or game strategy. However, since the 90s, genetic-algorithms-based classifiers have become increasingly popular to solve problems that require collaboration with a domain expert. For some types of applications, reinforcement learning algorithms output a set of recommended actions for the adaptive system to execute. In its simplest form, these algorithms compute or estimate the best course of action. Most complex systems based on reinforcement learning establish and update policies that can be vetoed by an expert. The foremost challenge developers of reinforcement learning systems face is that the recommended action or policy may depend on partially observable states and how to deal with uncertainty.

Genetic algorithms are not usually considered part of the reinforcement learning toolbox. However, advanced models such as learning classifier systems use genetic algorithms to classify and reward the rules and policies.

As with the two previous learning strategies, reinforcement learning models can be categorized as Markovian or evolutionary:

Reinforcement learning

This is a brief overview of machine learning algorithms with a suggested taxonomy. There are almost as many ways to introduce machine learning as there are data and computer scientists. We encourage you to browse through the list of references at the end of the book and find the documentation appropriate to your level of interest and understanding.

You have been reading a chapter from
Scala for Machine Learning
Published in: Dec 2014
Publisher:
ISBN-13: 9781783558742
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €18.99/month. Cancel anytime