Some mathematical tools
Before introducing backpropagation, we need to review some mathematical tools from calculus. Don't worry too much; we'll briefly review a few areas, all of which are commonly covered in high school-level mathematics.
Derivatives and gradients everywhere
Derivatives are a powerful mathematical tool. We are going to use derivatives and gradients for optimizing our network. Let's look at the definition. The derivative of a function y = f(x) of a variable x is a measure of the rate at which the value y of the function changes with respect to the change of the variable x. If x and y are real numbers, and if the graph of f is plotted against x, the derivative is the "slope" of this graph at each point.
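To make the idea of a rate of change concrete, here is a minimal Python sketch (not part of the original text); the example function f(x) = x**2 and the step size h are illustrative assumptions, and the difference quotient approximates the slope of the graph at a single point:

def f(x):
    return x ** 2                       # example function; its exact derivative is 2x

def numerical_derivative(f, x0, h=1e-6):
    # Difference quotient: how much y changes per unit change of x around x0
    return (f(x0 + h) - f(x0)) / h

print(numerical_derivative(f, 3.0))     # close to 6.0, the slope of f at x = 3

Shrinking h brings the quotient closer and closer to the true derivative, which is exactly the limiting process the definition above describes.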
If the function is linear, y = f(x) = ax + b, the slope is a. This is a simple result of calculus that can be derived by considering that:

Δy = f(x + Δx) − f(x) = a(x + Δx) + b − (ax + b) = aΔx

Δy / Δx = a

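As a quick numerical check (a sketch added here, not part of the original text; the values of a, b and the sample points are arbitrary), the difference quotient of a linear function equals the constant a no matter where we evaluate it or how large a step Δx we take:

a, b = 2.0, 1.0

def f(x):
    return a * x + b                    # the linear function y = ax + b

for x, dx in [(0.0, 1.0), (5.0, 0.5), (-3.0, 2.0)]:
    dy = f(x + dx) - f(x)               # Δy = aΔx
    print(dy / dx)                      # prints 2.0 each time, i.e. the slope a
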
In Figure 1 we show the geometrical meaning of Δx and Δy, and the angle between the linear...