Implementing Q-Learning
Q-Learning is a model-free method for finding an optimal policy that maximizes the reward collected by an agent. During initial gameplay, the agent learns a Q-value for every (state, action) pair by trying out actions, known as the exploration strategy, as explained in the previous sections. Once the Q-values are learned, the optimal policy is to select the action with the largest Q-value in every state, known as the exploitation strategy. Because pure exploitation can get stuck in locally optimal solutions, we keep applying the exploration policy during learning, controlled by an exploration_rate parameter.
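To make the explore-or-exploit choice concrete, here is a minimal sketch of epsilon-greedy action selection, assuming the Q-Table is stored as a NumPy array with one row per state and one column per action; the function name select_action and the use of exploration_rate as the random-action probability are illustrative choices, not fixed by the text.

import numpy as np

def select_action(q_table, state, exploration_rate):
    # Explore: with probability exploration_rate, pick a random action
    if np.random.rand() < exploration_rate:
        return np.random.randint(q_table.shape[1])
    # Exploit: otherwise pick the action with the largest Q-value for this state
    return int(np.argmax(q_table[state]))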
The Q-Learning algorithm is as follows:
initialize Q(shape=[#s, #a]) to random values or zeroes
Repeat (for each episode)
    observe current state s
    Repeat
        select an action a (apply explore or exploit strategy)
        observe state s_next as a result of action a
        update the Q-Table using the Bellman equation
        set current state s = s_next
    until the episode ends or...
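The pseudocode above maps directly onto a short tabular implementation. The following is a minimal sketch, assuming a Gym-style environment object whose reset() returns an integer state and whose step(a) returns (s_next, reward, done, info); the function name q_learning and the hyperparameter values (learning_rate, discount_rate, exploration_rate, n_episodes) are illustrative assumptions rather than values prescribed by the text.

import numpy as np

def q_learning(env, n_states, n_actions, n_episodes=1000,
               learning_rate=0.8, discount_rate=0.95, exploration_rate=0.1):
    # initialize Q(shape=[#s, #a]) to zeroes
    q_table = np.zeros((n_states, n_actions))
    for episode in range(n_episodes):
        s = env.reset()                      # observe current state s
        done = False
        while not done:                      # repeat until the episode ends
            # select an action a (explore or exploit strategy)
            if np.random.rand() < exploration_rate:
                a = np.random.randint(n_actions)
            else:
                a = int(np.argmax(q_table[s]))
            # observe state s_next and the reward as a result of action a
            s_next, reward, done, _ = env.step(a)
            # update the Q-Table using the Bellman equation:
            # Q(s, a) <- Q(s, a) + lr * (reward + gamma * max_a' Q(s_next, a') - Q(s, a))
            q_table[s, a] += learning_rate * (
                reward + discount_rate * np.max(q_table[s_next]) - q_table[s, a])
            s = s_next                       # set current state s = s_next
    return q_table

In practice, exploration_rate is often decayed across episodes so that the agent explores heavily early on and shifts toward exploitation as the Q-values become reliable.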