All of the RL algorithms we have discussed so far have tried to learn the state- or action-value functions. For example, in Q-learning we usually follow an ε-greedy policy, which has no parameters (OK, it has one parameter) and relies on the value function instead. In this section, we'll discuss something new: how to approximate the policy itself with the help of policy gradient methods. We'll follow an approach similar to the one in Chapter 8, Reinforcement Learning Theory, in the Value function approximation section.
There, we introduced a value approximation function, which is described by a set of parameters w (neural net weights). Here, we'll introduce a parameterized policy, π_θ, which is described by a set of parameters θ. As with value function approximation, θ could be the weights of a neural network.
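To make this concrete, the following is a minimal sketch (using PyTorch; the class name, layer sizes, and the toy state/action dimensions are illustrative assumptions, not code from this book) of a policy parameterized by neural network weights θ: the network maps a state to a probability distribution over actions, and we act by sampling from that distribution rather than by consulting a value function.

```python
import torch
import torch.nn as nn

class PolicyNetwork(nn.Module):
    """A parameterized policy π_θ(a|s); the network weights play the role of θ."""
    def __init__(self, state_size, action_size, hidden_size=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, action_size),
            nn.Softmax(dim=-1),  # output one probability per action
        )

    def forward(self, state):
        # Returns π_θ(a|s): a probability distribution over the actions
        return self.net(state)

# Hypothetical usage: sample an action from the policy's distribution
policy = PolicyNetwork(state_size=4, action_size=2)
state = torch.rand(4)
action_probs = policy(state)
action = torch.distributions.Categorical(action_probs).sample()
```

Note that, unlike an ε-greedy policy, this policy is stochastic by construction: the softmax output defines the action probabilities directly, and the gradient of the expected return with respect to θ is what policy gradient methods will estimate.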
Recall that we use the notation...