Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases now! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Artificial Intelligence By Example

You're reading from   Artificial Intelligence By Example Acquire advanced AI, machine learning, and deep learning design skills

Arrow left icon
Product type Paperback
Published in Feb 2020
Publisher Packt
ISBN-13 9781839211539
Length 578 pages
Edition 2nd Edition
Languages
Arrow right icon
Author (1):
Arrow left icon
Denis Rothman Denis Rothman
Author Profile Icon Denis Rothman
Denis Rothman
Arrow right icon
View More author details
Toc

Table of Contents (23) Chapters Close

Preface 1. Getting Started with Next-Generation Artificial Intelligence through Reinforcement Learning 2. Building a Reward Matrix – Designing Your Datasets FREE CHAPTER 3. Machine Intelligence – Evaluation Functions and Numerical Convergence 4. Optimizing Your Solutions with K-Means Clustering 5. How to Use Decision Trees to Enhance K-Means Clustering 6. Innovating AI with Google Translate 7. Optimizing Blockchains with Naive Bayes 8. Solving the XOR Problem with a Feedforward Neural Network 9. Abstract Image Classification with Convolutional Neural Networks (CNNs) 10. Conceptual Representation Learning 11. Combining Reinforcement Learning and Deep Learning 12. AI and the Internet of Things (IoT) 13. Visualizing Networks with TensorFlow 2.x and TensorBoard 14. Preparing the Input of Chatbots with Restricted Boltzmann Machines (RBMs) and Principal Component Analysis (PCA) 15. Setting Up a Cognitive NLP UI/CUI Chatbot 16. Improving the Emotional Intelligence Deficiencies of Chatbots 17. Genetic Algorithms in Hybrid Neural Networks 18. Neuromorphic Computing 19. Quantum Computing 20. Answers to the Questions 21. Other Books You May Enjoy
22. Index

Reinforcement learning concepts

AI is constantly evolving. The classical approach states that:

  • AI covers all domains
  • Machine learning is a subset of AI, with clustering, classification, regression, and reinforcement learning
  • Deep learning is a subset of machine learning that involves neural networks

However, these domains often overlap and it's difficult to fit neuromorphic computing, for example, with its sub-symbolic approach, into these categories (see Chapter 18, Neuromorphic Computing).

In this chapter, RL clearly fits into machine learning. Let's have a brief look into the scientific foundations of the MDP, the RL algorithm we are going to explore. The main concepts to keep in mind are the following:

  • Optimal transport: In 1781, Gaspard Monge defined transport optimizing from one location to another using the shortest and most cost-effective path; for example, mining coal and then using the most cost-effective path to a factory. This was subsequently generalized to any form of path from point A to point B.
  • Boltzmann equation and constant: In the late 19th century, Ludwig Boltzmann changed our vision of the world with his probabilistic distribution of particles beautifully summed up in his entropy formula:

    S = k * log W

    S represents the entropy (energy, disorder) of a system expressed. k is the Boltzmann constant, and W represents the number of microstates. We will explore Boltzmann's ideas further in Chapter 14, Preparing the Input of Chatbots with Restricted Boltzmann Machines (RBMs) and Principal Component Analysis (PCA).

  • Probabilistic distributions advanced further: Josiah Willard Gibbs took the probabilistic distributions of large numbers of particles a step further. At that point, probabilistic information theory was advancing quickly. At the turn of the 19th century, Andrey Markov applied probabilistic algorithms to language, among other areas. A modern era of information theory was born.
  • When Boltzmann and optimal transport meet: 2011 Fields Medal winner, Cédric Villani, brought Boltzmann's equation to yet another level. Villani then went on to unify optimal transport and Boltzmann. Cédric Villani proved something that was somewhat intuitively known to 19th century mathematicians but required proof.

Let's take all of the preceding concepts and materialize them in a real-world example that will explain why reinforcement learning using the MDP, for example, is so innovative.

Analyzing the following cup of tea will take you right into the next generation of AI:

Figure 1.1: Consider a cup of tea

You can look at this cup of tea in two different ways:

  1. Macrostates: You look at the cup and content. You can see the volume of tea in the cup and you could feel the temperature when holding the cup in your hand.
  2. Microstates: But can you tell how many molecules are in the tea, which ones are hot, warm, or cold, their velocity and directions? Impossible right?

Now, imagine, the tea contains 2,000,000,000+ Facebook accounts, or 100,000,000+ Amazon Prime users with millions of deliveries per year. At this level, we simply abandon the idea of controlling every item. We work on trends and probabilities.

Boltzmann provides a probabilistic approach to the evaluation of the features of our real world. Materializing Boltzmann in logistics through optimal transport means that the temperature could be the ranking of a product, the velocity can be linked to the distance to delivery, and the direction could be the itineraries we will study in this chapter.

Markov picked up the ripe fruits of microstate probabilistic descriptions and applied it to his MDP. Reinforcement learning takes the huge volume of elements (particles in a cup of tea, delivery locations, social network accounts) and defines the probable paths they take.

The turning point of human thought occurred when we simply could not analyze the state and path of the huge volumes facing our globalized world, which generates images, sounds, words, and numbers that exceed traditional software approaches.

With this in mind, we can start exploring the MDP.

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime