Optimizing a quadratic cost function and finding the minima using just math to gain insight
In this recipe, we will explore the fundamental concept behind mathematical optimization using simple derivatives, before introducing Gradient Descent (a first-order method) and L-BFGS, a Hessian-free quasi-Newton method.
We will examine a sample quadratic cost/error function and show how to find its minimum or maximum with just math.
We will use both the closed-form (vertex formula) and derivative (slope) methods to find the minimum, but we will defer to later recipes in this chapter to introduce numerical optimization techniques, such as Gradient Descent and its application to regression.
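As a quick reminder of why the two methods agree (shown here for a generic quadratic, not necessarily the exact function used later in this recipe):

```latex
f(x)  = ax^{2} + bx + c, \quad a > 0          % opens upward, so the vertex is the minimum
x^{*} = -\frac{b}{2a}                          % closed form (vertex formula)
f'(x) = 2ax + b = 0 \;\Rightarrow\; x^{*} = -\frac{b}{2a}   % derivative (slope) set to zero
```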
How to do it...
- Let's assume we have a quadratic cost function whose minimum we want to find.
- The cost function in statistical machine learning algorithms acts as a proxy for the level of difficulty, energy spent, or total error as we move around in our search space.
- The first thing we do is to graph the function and inspect it visually...
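The following is a minimal sketch of this visual inspection step in plain Scala, with no external libraries. The coefficients of the quadratic are assumed example values (the recipe's actual function may differ): we tabulate the cost over a small grid to eyeball where the minimum lies, then compare against the closed-form vertex.

```scala
// Minimal sketch: tabulate an assumed quadratic cost f(x) = 2x^2 - 8x + 9
// and compare the grid values against the closed-form vertex x* = -b / (2a).
object QuadraticInspection extends App {
  val (a, b, c) = (2.0, -8.0, 9.0)              // assumed example coefficients, a > 0
  def f(x: Double): Double = a * x * x + b * x + c

  // Evaluate the cost on a grid from x = -1.0 to 5.0 in steps of 0.5
  // and print it for visual inspection (or feed it to any plotting tool).
  val grid = (0 to 12).map(i => -1.0 + i * 0.5)
  grid.foreach(x => println(f"x = $x%5.2f  f(x) = ${f(x)}%8.3f"))

  // Closed-form (vertex formula) minimum for comparison.
  val xStar = -b / (2 * a)
  println(f"vertex formula: x* = $xStar%.2f, f(x*) = ${f(xStar)}%.3f")
}
```

The tabulated values dip toward x = 2.0 and rise on either side, which matches the vertex formula's answer; the same grid can be plotted to make the bowl shape of the cost function obvious.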