Activation functions
In Chapter 1, Neural Network Foundations with TensorFlow 2.0, we saw a few activation functions, including sigmoid, tanh, and ReLU. In the following sections we compute the derivatives of these activation functions.
Derivative of the sigmoid
Remember that the sigmoid is defined as (see Figure 6):

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
Figure 6: Sigmoid activation function
The derivative can be computed as follows:

$$\frac{d}{dz}\sigma(z) = \frac{d}{dz}\left(\frac{1}{1 + e^{-z}}\right) = \frac{e^{-z}}{(1 + e^{-z})^2} = \sigma(z)\,\frac{e^{-z}}{1 + e^{-z}} = \sigma(z)\big(1 - \sigma(z)\big)$$

Therefore the derivative of $\sigma(z)$ can be computed in a very simple form: $\sigma'(z) = \sigma(z)\big(1 - \sigma(z)\big)$.
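As a quick numerical check, the identity can be verified with TensorFlow 2.x (the library used throughout this book) by comparing the analytic form against automatic differentiation. This is a minimal sketch; the sample points in `z` are arbitrary values chosen only for illustration:

```python
import tensorflow as tf

# Arbitrary sample points at which to compare the two derivatives.
z = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])

with tf.GradientTape() as tape:
    tape.watch(z)          # z is a constant, so we must watch it explicitly
    s = tf.sigmoid(z)

autodiff_grad = tape.gradient(s, z)                    # d(sigma)/dz via autodiff
analytic_grad = tf.sigmoid(z) * (1.0 - tf.sigmoid(z))  # sigma(z) * (1 - sigma(z))

print(autodiff_grad.numpy())
print(analytic_grad.numpy())   # the two lines should match up to floating-point precision
```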
Derivative of tanh
Remember that the tanh function is defined as follows (see Figure 7):

$$\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$$
Figure 7: Tanh activation function
If you remember that $\frac{d}{dz}e^{z} = e^{z}$ and $\frac{d}{dz}e^{-z} = -e^{-z}$, then the derivative is computed as:

$$\frac{d}{dz}\tanh(z) = \frac{(e^{z} + e^{-z})^2 - (e^{z} - e^{-z})^2}{(e^{z} + e^{-z})^2} = 1 - \tanh^2(z)$$

Therefore the derivative of $\tanh(z)$ can be computed in a very simple form: $\tanh'(z) = 1 - \tanh^2(z)$.
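The same kind of check works for tanh: the gradient reported by TensorFlow's automatic differentiation should match $1 - \tanh^2(z)$. Again, this is only an illustrative sketch with arbitrary sample points:

```python
import tensorflow as tf

z = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])

with tf.GradientTape() as tape:
    tape.watch(z)
    t = tf.tanh(z)

autodiff_grad = tape.gradient(t, z)      # d(tanh)/dz via autodiff
analytic_grad = 1.0 - tf.tanh(z) ** 2    # 1 - tanh^2(z)

print(autodiff_grad.numpy())
print(analytic_grad.numpy())   # should agree up to floating-point precision
```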
Derivative of ReLU
The ReLU function is defined as f(x) = max(0, x) (see Figure 8). The derivative of ReLU is:

$$f'(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x < 0 \end{cases}$$
Note that ReLU is non-differentiable at zero. However, it is differentiable everywhere else, and in practice the value of the derivative at zero is arbitrarily set to 0 or 1.
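The piecewise derivative can also be observed directly from automatic differentiation. A small sketch with arbitrary sample points; note that what the framework returns at exactly x = 0 is an implementation convention rather than a true derivative:

```python
import tensorflow as tf

x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])

with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.nn.relu(x)    # ReLU: max(0, x)

grad = tape.gradient(y, x)
print(grad.numpy())   # 0.0 where x < 0, 1.0 where x > 0; the value at x == 0 is a framework convention
```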