Working with gates and activation functions
Now that we can link together operational gates, we want to run the computational graph output through an activation function. In this section, we will introduce common activation functions.
Getting ready
In this section, we will compare and contrast two different activation functions: sigmoid and rectified linear unit (ReLU). Recall that the two functions are given by the following equations:
$$\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}, \qquad \mathrm{ReLU}(x) = \max(0, x)$$
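Both functions are short enough to write out directly. The following is a minimal NumPy sketch rather than the recipe's own code; the function names `sigmoid` and `relu` and the sample input are chosen here purely for illustration.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid: squashes any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    """Rectified linear unit: keeps positive inputs, maps negative inputs to 0."""
    return np.maximum(0.0, x)

# Illustrative inputs (an assumption, not data from the recipe).
x = np.array([-2.0, 0.0, 2.0])
sig_out = sigmoid(x)
relu_out = relu(x)
print("sigmoid:", sig_out)
print("relu:   ", relu_out)
```

The sigmoid output is bounded on both sides, while ReLU is unbounded above and exactly zero for negative inputs; this difference is what the comparison in this section is about.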
In this example, we will create two one-layer neural networks with the same structure, except that one will pass its output through the sigmoid activation and the other through the ReLU activation. The loss function will be the L2 distance between the network output and the target value 0.75. We will randomly pull...
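The setup described above can be sketched in plain NumPy with hand-written gradients, which is a substitute for the TensorFlow graph that this recipe builds, not a copy of it. Everything specific in the sketch is an assumption made for illustration: the helper name `train`, the one-layer form `activation(a * x + b)`, the starting values of `a` and `b`, the learning rate, the batch size, and in particular the input distribution, which the text here does not fully specify.

```python
import numpy as np

TARGET = 0.75  # target value from the text

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    # Derivative of ReLU: 1 where z > 0, otherwise 0.
    return (z > 0).astype(float)

def train(activation, activation_grad, steps=500, lr=0.05, batch_size=50, seed=0):
    """Fit output = activation(a * x + b) to TARGET by gradient descent.

    Hypothetical helper: initial values, learning rate, and the input
    distribution are assumptions, not taken from the recipe.
    """
    rng = np.random.default_rng(seed)  # same seed, so both networks see the same batches
    a, b = 1.0, 0.0                    # assumed starting parameters
    losses = []
    for _ in range(steps):
        x = rng.normal(loc=2.0, scale=0.1, size=batch_size)  # assumed input data
        z = a * x + b
        out = activation(z)
        losses.append(float(np.mean((out - TARGET) ** 2)))   # mean squared (L2) loss
        # Chain rule: d(loss)/dz for each sample, averaged over the batch.
        dz = 2.0 * (out - TARGET) * activation_grad(z) / batch_size
        a -= lr * float(np.sum(dz * x))
        b -= lr * float(np.sum(dz))
    return a, b, losses

_, _, sig_losses = train(sigmoid, sigmoid_grad)
_, _, relu_losses = train(relu, relu_grad)
print("sigmoid: first loss %.4f, last loss %.4f" % (sig_losses[0], sig_losses[-1]))
print("relu:    first loss %.4f, last loss %.4f" % (relu_losses[0], relu_losses[-1]))
```

Using the same seed for both runs keeps the two networks identical except for the activation, so any difference in the loss curves comes from the activation function alone.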