From Deep Learning with TensorFlow 2 and Keras: Regression, ConvNets, GANs, RNNs, NLP, and more with TensorFlow 2 and the Keras API

Product type: Paperback
Published: Dec 2019
Publisher: Packt
ISBN-13: 9781838823412
Length: 646 pages
Edition: 2nd Edition
Languages: Python
Tools: Keras
Concepts: Deep Learning
Authors (3): Dr. Amita Kapoor, Sujit Pal, Antonio Gulli

Hyperparameter tuning and AutoML

The experiments above offer some opportunities for fine-tuning a net. However, what works for this example will not necessarily work for others. A given net has many settings that can be optimized, such as the number of hidden neurons, BATCH_SIZE, the number of epochs, and more depending on the complexity of the net itself. These settings are called "hyperparameters" to distinguish them from the parameters of the network itself, that is, the values of the weights and biases, which are learned during training.

Hyperparameter tuning is the process of finding the combination of hyperparameters that minimizes the cost function. The key idea is that if we have n hyperparameters, we can imagine that they define a space with n dimensions, and the goal is to find the point in this space that corresponds to an optimal value of the cost function. One way to achieve this is to create a grid in this space and systematically check the value of the cost function at each grid vertex. In other words, the range of each hyperparameter is divided into buckets, and the different combinations of values are checked via a brute-force approach.
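The grid search just described can be sketched in a few lines of plain Python. This is a minimal illustration, not the book's code: the search space values, the `grid_search` helper, and the `cost` function are all hypothetical. In particular, `cost` is a toy stand-in for "train the net with these hyperparameters and return the validation loss", chosen so that the example runs instantly.

```python
import itertools

# Hypothetical search space: each hyperparameter is bucketed into a few values.
search_space = {
    "n_hidden": [64, 128, 256],
    "batch_size": [32, 64, 128],
    "epochs": [20, 50],
}


def cost(n_hidden, batch_size, epochs):
    """Toy stand-in for training a net and returning its validation loss.

    Assumption for illustration only: the loss is a simple bowl whose
    minimum sits at n_hidden=128, batch_size=64, epochs=50.
    """
    return (
        ((n_hidden - 128) / 128) ** 2
        + ((batch_size - 64) / 64) ** 2
        + ((epochs - 50) / 50) ** 2
    )


def grid_search(space, cost_fn):
    """Evaluate cost_fn at every grid vertex and return the best one found."""
    names = list(space)
    best_params, best_cost = None, float("inf")
    n_evaluated = 0
    # itertools.product enumerates every combination of bucketed values.
    for values in itertools.product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        current = cost_fn(**params)
        n_evaluated += 1
        if current < best_cost:
            best_params, best_cost = params, current
    return best_params, best_cost, n_evaluated


best_params, best_cost, n_evaluated = grid_search(search_space, cost)
print(best_params, best_cost, n_evaluated)
```

The number of evaluations is the product of the bucket counts (here 3 × 3 × 2 = 18), so it grows exponentially with the number of hyperparameters. When each evaluation means training a real network, that growth is what makes the brute-force approach expensive.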

If you think that this process of fine-tuning the hyperparameters is manual and expensive, then you are absolutely right! However, the last few years have seen significant results in AutoML, a set of research techniques that aim both to tune hyperparameters automatically and to search automatically for an optimal network architecture. We discuss this further in Chapter 14, An introduction to AutoML.