You're reading from Python Deep Learning: Exploring deep learning techniques and neural network architectures with PyTorch, Keras, and TensorFlow

Product type Paperback

Published in Jan 2019

Publisher Packt

ISBN-13 9781789348460

Length 386 pages

Edition 2nd Edition

Languages

Python

Tools

Keras

Concepts

Deep Learning

Authors (5):

Gianmario Spacagna

Daniel Slater

Valentino Zocca

Peter Roelants

Ivan Vasilev

Table of Contents (12) Chapters

Preface

1. Machine Learning - an Introduction

2. Neural Networks

3. Deep Learning Fundamentals

4. Computer Vision with Convolutional Networks

5. Advanced Computer Vision

6. Generating Images with GANs and VAEs

7. Recurrent Neural Networks and Language Models

8. Reinforcement Learning Theory

9. Deep Reinforcement Learning for Games

10. Deep Learning in Autonomous Vehicles

11. Other Books You May Enjoy

RL as a Markov decision process

A Markov decision process (MDP) is a mathematical framework for modeling decision-making, and we can use it to describe the RL problem. We'll assume that we have full knowledge of the environment. An MDP provides a formal definition of the properties we introduced in the previous section (and adds some new ones):

S is the finite set of all possible environment states, and s_t is the state at time t.
A is the set of all possible actions, and a_t is the action at time t.
P is the dynamics of the environment (also known as the transition probability matrix). It defines the conditional probability of transitioning to a new state, s', given the existing state, s, and an action, a (for all states and actions):

P(s' | s, a) = Pr(s_{t+1} = s' | s_t = s, a_t = a)
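
As a rough illustration of the components listed above, the following Python sketch encodes a toy MDP. The state names, action names, probabilities, and helper functions (`transition_prob`, `is_valid_dynamics`) are all hypothetical and not taken from the book. The only point is that, for every state-action pair, the transition probabilities form a distribution over the next states.

```python
# Toy MDP; every name and number here is an illustrative assumption.
states = ["s0", "s1"]          # finite set of environment states
actions = ["left", "right"]    # set of actions

# Environment dynamics: P[(s, a)][s_next] is the probability of
# reaching s_next after taking action a in state s.
P = {
    ("s0", "left"):  {"s0": 0.9, "s1": 0.1},
    ("s0", "right"): {"s0": 0.2, "s1": 0.8},
    ("s1", "left"):  {"s0": 0.7, "s1": 0.3},
    ("s1", "right"): {"s0": 0.0, "s1": 1.0},
}


def transition_prob(s, a, s_next):
    """Return the probability of moving to s_next given state s and action a."""
    return P[(s, a)].get(s_next, 0.0)


def is_valid_dynamics(dynamics, all_states, all_actions, tol=1e-9):
    """Check that every (state, action) pair maps to a probability distribution."""
    for s in all_states:
        for a in all_actions:
            dist = dynamics[(s, a)]
            if any(p < 0.0 for p in dist.values()):
                return False
            if abs(sum(dist.values()) - 1.0) > tol:
                return False
    return True


print(transition_prob("s0", "right", "s1"))
print(is_valid_dynamics(P, states, actions))
```

A dictionary keyed by (state, action) is used here only for readability; with integer-indexed states and actions, the same information is commonly stored as a three-dimensional array.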

We have transition probabilities between the states because an MDP is stochastic (it includes randomness). These probabilities represent the...
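
To make the stochastic nature of the transitions concrete, here is a minimal sketch that repeatedly samples the next state for one fixed state-action pair. The distribution `dist` and the helpers `sample_next_state` and `empirical_frequencies` are assumptions made for illustration. The same state and action can lead to different next states, and over many samples the observed frequencies approach the transition probabilities.

```python
import random


def sample_next_state(dist, rng):
    """Draw one next state from a {next_state: probability} mapping."""
    next_states = list(dist)
    weights = [dist[s] for s in next_states]
    return rng.choices(next_states, weights=weights, k=1)[0]


def empirical_frequencies(dist, n_samples, seed=0):
    """Estimate transition probabilities by sampling n_samples transitions."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    counts = {s: 0 for s in dist}
    for _ in range(n_samples):
        counts[sample_next_state(dist, rng)] += 1
    return {s: c / n_samples for s, c in counts.items()}


# Hypothetical transition distribution for a single (state, action) pair.
dist = {"s0": 0.3, "s1": 0.7}
freqs = empirical_frequencies(dist, 10_000)
print(freqs)
```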

Authors (5)

Ivan Vasilev

Ivan Vasilev started working on the first open source Java deep learning library with GPU support in 2013. The library was acquired by a German company, with whom he continued its development. He has also worked as a machine learning engineer and researcher in medical image classification and segmentation with deep neural networks. Since 2017, he has focused on financial machine learning. He co-founded an algorithmic trading company, where he's the lead engineer. He holds an MSc in artificial intelligence from Sofia University St. Kliment Ohridski and has written two previous books on the same topic.

Peter Roelants

Peter Roelants holds a master's in computer science with a specialization in AI from KU Leuven. He works on applying deep learning to a variety of problems, such as spectral imaging, speech recognition, text understanding, and document information extraction. He currently works at Onfido as a team leader for the data extraction research team, focusing on data extraction from official documents.

Gianmario Spacagna

Gianmario Spacagna is a senior data scientist at Pirelli, processing sensors and telemetry data for internet of things (IoT) and connected-vehicle applications. He works closely with tire mechanics, engineers, and business units to analyze and formulate hybrid, physics-driven, and data-driven automotive models. His main expertise is in building ML systems and end-to-end solutions for data products. He holds a master's degree in telematics from the Polytechnic of Turin, as well as one in software engineering of distributed systems from KTH, Stockholm. Prior to Pirelli, he worked in retail and business banking (Barclays), cyber security (Cisco), predictive marketing (AgilOne), and did some occasional freelancing.

Valentino Zocca

Valentino Zocca holds a PhD from the University of Maryland, USA, and a Laurea in mathematics from the University of Rome, and spent a semester at the University of Warwick. He started out working on a high-tech project, an advanced stereo 3D Earth visualization software with head tracking, at Autometric, a company later bought by Boeing. There he developed many mathematical algorithms and predictive models, and he used Hadoop to automate several satellite-imagery visualization programs. He has worked as an independent consultant at the U.S. Census Bureau, in the USA and in Italy. Currently, Valentino lives in New York and works as an independent consultant to a large financial company.

Daniel Slater

Daniel Slater started programming at age 11, developing mods for the id Software game Quake. His obsession led him to become a developer in the gaming industry, working on the hit computer game series Championship Manager. He then moved into finance, working on risk and high-performance messaging systems. He is now a staff engineer working on big data at Skimlinks to understand online user behavior. He spends his spare time training AI to beat computer games. He speaks at tech conferences about deep learning and reinforcement learning, and his blog is named Daniel Slater's blog. His work in this field has been cited by Google.
