Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Hands-On Intelligent Agents with OpenAI Gym

You're reading from   Hands-On Intelligent Agents with OpenAI Gym Your guide to developing AI agents using deep reinforcement learning

Arrow left icon
Product type Paperback
Published in Jul 2018
Publisher Packt
ISBN-13 9781788836579
Length 254 pages
Edition 1st Edition
Languages
Arrow right icon
Author (1):
Arrow left icon
Palanisamy Palanisamy
Author Profile Icon Palanisamy
Palanisamy
Arrow right icon
View More author details
Toc

Table of Contents (12) Chapters Close

Preface 1. Introduction to Intelligent Agents and Learning Environments 2. Reinforcement Learning and Deep Reinforcement Learning FREE CHAPTER 3. Getting Started with OpenAI Gym and Deep Reinforcement Learning 4. Exploring the Gym and its Features 5. Implementing your First Learning Agent - Solving the Mountain Car problem 6. Implementing an Intelligent Agent for Optimal Control using Deep Q-Learning 7. Creating Custom OpenAI Gym Environments - CARLA Driving Simulator 8. Implementing an Intelligent - Autonomous Car Driving Agent using Deep Actor-Critic Algorithm 9. Exploring the Learning Environment Landscape - Roboschool, Gym-Retro, StarCraft-II, DeepMindLab 10. Exploring the Learning Algorithm Landscape - DDPG (Actor-Critic), PPO (Policy-Gradient), Rainbow (Value-Based) 11. Other Books You May Enjoy

Understanding the features of OpenAI Gym

In this section, we will take a look at the key features that have made the OpenAI Gym toolkit very popular in the reinforcement learning community and led to it becoming widely adopted.

Simple environment interface

OpenAI Gym provides a simple and common Python interface to environments. Specifically, it takes an action as input and provides observation, reward, done and an optional info object, based on the action as the output at each step. If this does not make perfect sense to you yet, do not worry. We will go over the interface again in a more detailed manner to help you understand. This paragraph is just to give you an overview of the interface to make it clear how simple it is. This provides great flexibility for users as they can design and develop their agent algorithms based on any paradigm they like, and not be constrained to use any particular paradigm because of this simple and convenient interface.

Comparability and reproducibility

We intuitively feel that we should be able to compare the performance of an agent or an algorithm in a particular task to the performance of another agent or algorithm in the same task. For example, if an agent gets a score of 1,000 on average in the Atari game of Space Invaders, we should be able to tell that this agent is performing worse than an agent that scores 5000 on average in the Space Invaders game in the same amount of training time. But what happens if the scoring system for the game is slightly changed? Or if the environment interface was modified to include additional information about the game states that will provide an advantage to the second agent? This would make the score-to-score comparison unfair, right?

To handle such changes in the environment, OpenAI Gym uses strict versioning for environments. The toolkit guarantees that if there is any change to an environment, it will be accompanied by a different version number. Therefore, if the original version of the Atari Space Invaders game environment was named SpaceInvaders-v0 and there were some changes made to the environment to provide more information about the game states, then the environment's name would be changed to SpaceInvaders-v1. This simple versioning system makes sure we are always comparing performance measured on the exact same environment setup. This way, the results obtained are comparable and reproducible.

Ability to monitor progress

All the environments available as part of the Gym toolkit are equipped with a monitor. This monitor logs every time step of the simulation and every reset of the environment. What this means is that the environment automatically keeps track of how our agent is learning and adapting with every step. You can even configure the monitor to automatically record videos of the game while your agent is learning to play. How cool is that?

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €18.99/month. Cancel anytime