Reinforcement Learning with TensorFlow

You're reading from Reinforcement Learning with TensorFlow: A beginner's guide to designing self-learning systems with TensorFlow and OpenAI Gym

Product type Paperback
Published in Apr 2018
Publisher Packt
ISBN-13 9781788835725
Length 334 pages
Edition 1st Edition
Author: Sayon Dutta


The pioneers and breakthroughs in reinforcement learning

Before diving into the code, let's look at some of the pioneers, industry leaders, and research breakthroughs in the field of deep reinforcement learning.

David Silver

Dr. David Silver, with an h-index of 30, leads the reinforcement learning research team at Google DeepMind and is the lead researcher on AlphaGo. David co-founded Elixir Studios and then completed his PhD in reinforcement learning at the University of Alberta, where he co-introduced the algorithms used in the first master-level 9x9 Go programs. He then became a lecturer at University College London. He consulted for DeepMind before joining full-time in 2013. David led the AlphaGo project, which produced the first program to defeat a top professional player in the game of Go.

Pieter Abbeel

Pieter Abbeel is a professor at UC Berkeley and was a Research Scientist at OpenAI. Pieter completed his PhD in Computer Science under Andrew Ng. His current research focuses on robotics and machine learning, with a particular focus on deep reinforcement learning, deep imitation learning, deep unsupervised learning, meta-learning, learning-to-learn, and AI safety. Pieter also won the NIPS 2016 Best Paper Award.

Google DeepMind

Google DeepMind is a British artificial intelligence company founded in September 2010 and acquired by Google in 2014. It is an industry leader in deep reinforcement learning and in neural Turing machines. It made news in 2016 when its AlphaGo program defeated Lee Sedol, a 9-dan Go player. Google DeepMind has focused its applied work on two major sectors: energy and healthcare.

Here are some of its projects:

  • In July 2016, Google DeepMind and Moorfields Eye Hospital announced their collaboration to use eye scans to research early signs of diseases leading to blindness
  • In August 2016, Google DeepMind announced its collaboration with University College London Hospital to research and develop an algorithm to automatically differentiate between healthy and cancerous tissues in head and neck areas
  • Google DeepMind's AI reduced Google's data center cooling bill by 40%

The AlphaGo program

As mentioned in the Google DeepMind section, AlphaGo is a computer program that first defeated Lee Sedol and then Ke Jie, who at the time was the world No. 1 in Go. In 2017, an improved version, AlphaGo Zero, was released; it defeated AlphaGo 100 games to 0.

Libratus

Libratus is an artificial intelligence program designed to play poker by a team led by Professor Tuomas Sandholm at Carnegie Mellon University. Its name is Latin for balanced, and it is the successor to an earlier program, Claudico.

In January 2017, Libratus made history by defeating four of the world's best professional poker players in a marathon 20-day poker competition.

Although Libratus focuses on poker, its designers have said that it can learn any game that involves incomplete information and opponents who engage in deception. As a result, they have proposed applying the system to problems in cybersecurity, business negotiations, and medical planning.
