Gradient descent optimizers

The optimizers discussed here are widely used to train deep learning (DL) models; which one is most suitable depends on the degree of non-convexity of the error (cost) function.

Momentum

The momentum method uses a moving average of gradients instead of the gradient at each time step, which reduces the back-and-forth oscillations (fluctuations of the cost function) caused by SGD and keeps the descent focused along the steepest path. Figure 4.5a shows SGD without momentum oscillating across the slopes, while Figure 4.5b shows SGD with momentum accumulating velocity in the relevant direction, damping the oscillations and moving closer to the optimum.
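
In symbols (a standard formulation, which may differ slightly from the book's notation), the velocity accumulates an exponentially decaying average of past gradients:

v_t = \gamma\, v_{t-1} + \eta\, \nabla_\theta J(\theta_{t-1}), \qquad \theta_t = \theta_{t-1} - v_t

where \gamma (typically about 0.9) controls how much of the past gradients persists and \eta is the learning rate.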

Figure 4.5a: SGD with no momentum

Figure 4.5b: SGD with momentum

The momentum term reduces updates for dimensions whose gradients change direction and reinforces updates for dimensions whose gradients consistently point the same way; as a result, faster convergence is achieved.
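
To make the update concrete, here is a minimal NumPy sketch of SGD with momentum on a simple quadratic bowl. The cost function, learning rate, and momentum coefficient are illustrative assumptions, not values from the book:

import numpy as np

# Illustrative cost J(theta) = 0.5 * theta.T @ A @ theta: an elongated
# quadratic bowl (my assumption), chosen because plain SGD oscillates on it.
A = np.array([[10.0, 0.0],
              [0.0, 1.0]])

def grad(theta):
    return A @ theta  # gradient of the quadratic cost

theta = np.array([2.0, 2.0])  # starting point
v = np.zeros_like(theta)      # accumulated velocity
eta, gamma = 0.05, 0.9        # learning rate and momentum coefficient (assumed values)

for _ in range(100):
    v = gamma * v + eta * grad(theta)  # moving average of gradients
    theta = theta - v                  # update along the accumulated direction

print(theta)  # moves close to the optimum at the origin

Because the velocity keeps pointing down the valley while the sideways components of successive gradients cancel, the iterate reaches the neighborhood of the optimum in far fewer steps than plain SGD at the same learning rate.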

Adagrad

The Adagrad optimizer is used when dealing with sparse data, as the algorithm performs small updates...
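
For reference, here is a minimal sketch of the standard Adagrad update rule (my illustrative setup, not the book's example): each parameter's step size is scaled down by its accumulated squared gradients, so frequently updated parameters receive progressively smaller steps while rarely updated (sparse) parameters keep larger ones.

import numpy as np

# Same illustrative quadratic cost as in the momentum sketch (assumed, not from the book)
A = np.array([[10.0, 0.0],
              [0.0, 1.0]])

def grad(theta):
    return A @ theta  # gradient of the quadratic cost

theta = np.array([2.0, 2.0])
G = np.zeros_like(theta)   # per-parameter sum of squared gradients
eta, eps = 0.5, 1e-8       # learning rate and stability constant (assumed values)

for _ in range(200):
    g = grad(theta)
    G += g ** 2                           # accumulate squared gradients per dimension
    theta -= eta / np.sqrt(G + eps) * g   # smaller steps for frequently updated dimensions

print(theta)  # moves toward the optimum at the origin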
