You're reading from Deep Reinforcement Learning Hands-On: Apply modern RL methods to practical problems of chatbots, robotics, discrete optimization, web automation, and more

Product type Paperback

Published in Jan 2020

Publisher Packt

ISBN-13 9781838826994

Length 826 pages

Edition 2nd Edition

Languages

Python

Tools

Deep Reinforcement Learning

Concepts

Chatbots

Author (1):

Maxim Lapan

The MAgent environment

Before we jump into our first MARL example, I will describe the environment that we will experiment with.

Installation

If you want to play with MARL, your choices are somewhat limited. All the environments that come with Gym support only one agent. There are some patches for Atari Pong that switch it into two-player mode, but they are not standard and are the exception rather than the rule.
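To make the single-agent limitation concrete, here is a minimal sketch of the difference in interface shape. Both classes are hypothetical toys written for illustration; they are not part of Gym or MAgent. The first mirrors the classic Gym convention of one action in and one (observation, reward, done, info) tuple out, while the second shows one possible way, assumed here, to key every value by agent name.

```python
from typing import Dict, Tuple


class SingleAgentCounter:
    """Hypothetical toy environment with a classic Gym-style interface."""

    def __init__(self, target: int = 3):
        self.target = target
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool, dict]:
        # One action in; one observation and one scalar reward out.
        self.state += action
        done = self.state >= self.target
        reward = 1.0 if done else 0.0
        return self.state, reward, done, {}


class MultiAgentCounter:
    """Hypothetical multi-agent variant: values are keyed by agent name."""

    def __init__(self, agents=("a", "b"), target: int = 3):
        self.agents = tuple(agents)
        self.target = target
        self.state = {name: 0 for name in self.agents}

    def reset(self) -> Dict[str, int]:
        self.state = {name: 0 for name in self.agents}
        return dict(self.state)

    def step(self, actions: Dict[str, int]):
        # Each agent supplies its own action and gets its own
        # observation, reward, and done flag.
        obs, rewards, dones = {}, {}, {}
        for name in self.agents:
            self.state[name] += actions[name]
            dones[name] = self.state[name] >= self.target
            rewards[name] = 1.0 if dones[name] else 0.0
            obs[name] = self.state[name]
        return obs, rewards, dones, {}
```

The single scalar reward and single done flag in the first class are what make a plain Gym-style interface awkward for several simultaneously acting agents; the dictionary-based variant is only one of several designs that could address this.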

DeepMind, together with Blizzard, has made a StarCraft II learning environment publicly available (https://github.com/deepmind/pysc2), and it makes for a very interesting and challenging environment for experimentation. However, for somebody taking their first steps in MARL, it might be too complex. In that regard, I found the MAgent environment from Geek.AI (https://github.com/geek-ai/MAgent) perfectly suitable: it is simple, fast, and has minimal dependencies, but it still allows you to simulate different multi-agent scenarios for experimentation. It doesn't provide a Gym-compatible API, but...