You're reading from Reinforcement Learning with TensorFlow A beginner's guide to designing self-learning systems with TensorFlow and OpenAI Gym

Product type Paperback

Published in Apr 2018

Publisher Packt

ISBN-13 9781788835725

Length 334 pages

Edition 1st Edition

Languages

Python

Tools

OpenAI Gym

Concepts

Reinforcement Learning

Author (1):

Sayon Dutta

Model-based learning and model-free learning

In Chapter 3, Markov Decision Process, we used states, actions, rewards, a transition model, and a discount factor to solve a Markov decision process (MDP). When all of these elements are known, a planning algorithm such as value iteration or policy iteration can compute a solution directly. When the transition model is not known in advance, the agent can still take this route by learning it: in model-based learning, an AI agent interacts with the environment and uses those interactions to approximate the environment's model, that is, the state transition model. Given that approximate model, the agent can then search for the optimal policy through value iteration or policy iteration.
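The model-based route can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the book: the three-state chain environment, the `step` function, and the helper names `estimate_model` and `value_iteration` are all hypothetical. The agent tries every action in every state once (enough here because the toy environment is deterministic), estimates the transition and reward model from those samples, and then plans on the estimated model with value iteration.

```python
import numpy as np

N_STATES, N_ACTIONS, GOAL = 3, 2, 2  # hypothetical chain: states 0, 1, 2


def step(state, action):
    """Hypothetical environment: action 0 moves left, action 1 moves right.

    State 2 is absorbing. Entering it from state 1 pays a reward of 1.
    """
    if state == GOAL:
        return GOAL, 0.0
    next_state = max(state - 1, 0) if action == 0 else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def estimate_model(transitions, n_states, n_actions):
    """Estimate P(s' | s, a) and R(s, a) from observed (s, a, r, s') tuples."""
    counts = np.zeros((n_states, n_actions, n_states))
    reward_sum = np.zeros((n_states, n_actions))
    for s, a, r, s_next in transitions:
        counts[s, a, s_next] += 1
        reward_sum[s, a] += r
    visits = np.maximum(counts.sum(axis=2), 1)  # avoid dividing by zero
    return counts / visits[:, :, None], reward_sum / visits


def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Plan on the estimated model: returns state values and a greedy policy."""
    V = np.zeros(P.shape[0])
    while True:
        Q = R + gamma * (P @ V)  # expected return of each (state, action) pair
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new


# Interact with the environment: one sample per (state, action) pair.
transitions = []
for s in range(N_STATES):
    for a in range(N_ACTIONS):
        s_next, r = step(s, a)
        transitions.append((s, a, r, s_next))

P_hat, R_hat = estimate_model(transitions, N_STATES, N_ACTIONS)
V, policy = value_iteration(P_hat, R_hat)
```

In a stochastic environment the agent would need many samples per state-action pair, and the quality of the resulting policy would depend on how well the estimated model matches the real one.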

But it's not necessary for our AI agent to learn an explicit model of the environment. It can derive an optimal policy directly from its interactions with the environment, without building a model. This type of learning is called model-free learning. Model-free learning involves predicting the value function...
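As one possible illustration of the model-free idea, the sketch below uses tabular Q-learning, chosen here as an example technique rather than taken from this passage. The three-state chain environment, the `step` function, and the hyperparameters are hypothetical. Note that the agent never stores transition probabilities: it updates its action-value estimates directly from each observed reward and next state.

```python
import numpy as np

N_STATES, N_ACTIONS, GOAL = 3, 2, 2  # hypothetical chain: states 0, 1, 2


def step(state, action):
    """Hypothetical environment: action 0 moves left, action 1 moves right.

    Reaching state 2 pays a reward of 1 and ends the episode.
    """
    next_state = max(state - 1, 0) if action == 0 else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from experience alone, with no transition model."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            # Explore uniformly at random; Q-learning is off-policy, so the
            # learned values still describe the greedy policy.
            action = int(rng.integers(N_ACTIONS))
            next_state, reward = step(state, action)
            # Move Q(s, a) toward the one-step bootstrapped target.
            td_target = reward + gamma * Q[next_state].max()
            Q[state, action] += alpha * (td_target - Q[state, action])
            state = next_state
    return Q


Q = q_learning()
greedy_policy = Q.argmax(axis=1)  # best known action in each state
```

The design choice to explore with purely random actions keeps the sketch short; a practical agent would usually balance exploration and exploitation, for example with an epsilon-greedy rule.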