Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Deep Learning Quick Reference

You're reading from   Deep Learning Quick Reference Useful hacks for training and optimizing deep neural networks with TensorFlow and Keras

Arrow left icon
Product type Paperback
Published in Mar 2018
Publisher Packt
ISBN-13 9781788837996
Length 272 pages
Edition 1st Edition
Languages
Tools
Arrow right icon
Author (1):
Arrow left icon
Mike Bernico Mike Bernico
Author Profile Icon Mike Bernico
Mike Bernico
Arrow right icon
View More author details
Toc

Table of Contents (15) Chapters Close

Preface 1. The Building Blocks of Deep Learning FREE CHAPTER 2. Using Deep Learning to Solve Regression Problems 3. Monitoring Network Training Using TensorBoard 4. Using Deep Learning to Solve Binary Classification Problems 5. Using Keras to Solve Multiclass Classification Problems 6. Hyperparameter Optimization 7. Training a CNN from Scratch 8. Transfer Learning with Pretrained CNNs 9. Training an RNN from scratch 10. Training LSTMs with Word Embeddings from Scratch 11. Training Seq2Seq Models 12. Using Deep Reinforcement Learning 13. Generative Adversarial Networks 14. Other Books You May Enjoy

Which hyperparameters should we optimize?

Even if you were to follow my advice above and settle on a good enough architecture, you can and should still attempt to search for ideal hyperparameters within that architecture. Some of the hyperparameters we might want to search include the following:

  • Our choice of optimizer. Thus far, I've been using Adam, but an rmsprop optimizer or a well-tuned SGD may do better.
  • Each of these optimizers has a set of hyperparameters that we might tune, such as learning rate, momentum, and decay.
  • Network weight initialization.
  • Neuron activation.
  • Regularization parameters such as dropout probability or the regularization parameter used in l2 regularization.
  • Batch size.

As implied above, this is not an exhaustive list. There are most certainly more options you could try, including introducing variable numbers of neurons in each hidden layer,...

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image