Working with gates and activation functions
Now that we can link together operational gates, we want to run the computational graph output through an activation function. In this section, we will introduce common activation functions.
Getting ready
In this section, we will compare and contrast two different activation functions: sigmoid and rectified linear unit (ReLU). Recall that the two functions are given by the following equations:
$$\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}$$
$$\mathrm{ReLU}(x) = \max(0, x)$$
In this example, we will create two one-layer neural networks with the same structure, except that one will feed through the sigmoid activation and one will feed through the ReLU activation. The loss function will be the L2 distance between the network output and the value 0.75. We will draw random batches of data and then optimize the output toward 0.75.
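Before walking through the recipe, it may help to see the whole experiment in miniature. The following is a minimal NumPy sketch of the setup described above, not the recipe's own code. The data distribution, initial parameters (a = 1, b = 0), learning rate, batch size, and step count are all assumptions chosen for illustration, and the helper names (train, sigmoid_grad, relu_grad) are hypothetical. Gradients are written out by hand so that the sketch does not depend on any particular deep learning library.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    # Derivative of ReLU: 1 where z > 0, otherwise 0.
    return (z > 0).astype(float)

def train(activation, activation_grad, steps=750, batch_size=50,
          lr=0.01, target=0.75, seed=0):
    """Fit a one-layer model activation(a * x + b) toward `target`.

    All hyperparameters and the data distribution are illustrative
    assumptions, not values taken from the recipe.
    """
    rng = np.random.default_rng(seed)
    x_vals = rng.normal(2.0, 0.1, 500)   # assumed input data
    a, b = 1.0, 0.0                      # assumed initial parameters
    losses = []
    for _ in range(steps):
        batch = rng.choice(x_vals, size=batch_size)  # random batch
        z = a * batch + b
        out = activation(z)
        # L2 loss: mean squared distance from the target value.
        losses.append(np.mean((out - target) ** 2))
        # Chain rule: d(loss)/dz for each element of the batch.
        dz = 2.0 * (out - target) * activation_grad(z) / batch_size
        a -= lr * np.sum(dz * batch)
        b -= lr * np.sum(dz)
    return a, b, losses

_, _, sigmoid_losses = train(sigmoid, sigmoid_grad)
_, _, relu_losses = train(relu, relu_grad)
print("sigmoid: first loss", sigmoid_losses[0], "last loss", sigmoid_losses[-1])
print("ReLU:    first loss", relu_losses[0], "last loss", relu_losses[-1])
```

Both networks share the same structure and the same loss, so any difference in how quickly the loss falls comes from the activation function alone. Note that with a different starting point (for example, parameters that make a * x + b negative), the ReLU network could receive zero gradient and fail to learn, which is why the initial values here are fixed rather than random.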
How to do it...
We proceed with the recipe as follows:
- We will start by loading the necessary libraries. This is also a good point at which we can...