Selecting an activation function
Recall that, in the previous section, we used an activation function to map an input value to a particular output. We define an activation function as a mathematical function that determines the output of an individual node from its input value. In the analogy of the human brain, these functions act as gatekeepers, deciding what is fired off to the next neuron. An activation function should have the following features so that the model can learn effectively:
- The avoidance of a vanishing gradient
- A low computational expense
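To make these two properties concrete, the following is a minimal sketch in plain Python that compares two widely used activation functions, sigmoid and ReLU. The function names, including the helper `chained_grad`, are illustrative assumptions rather than anything defined in this chapter. Multiplying identical local gradients is a deliberate simplification of backpropagation that ignores the weights.

```python
import math


def sigmoid(x: float) -> float:
    """Squash a real input into the range (0, 1).

    Needs an exponential, so it costs more than ReLU. This simple form is
    meant for moderate inputs; very negative x can overflow math.exp.
    """
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid; its maximum value is 0.25, at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu(x: float) -> float:
    """Pass positive inputs through unchanged and block everything else."""
    return max(0.0, x)


def relu_grad(x: float) -> float:
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def chained_grad(grad_fn, x: float, depth: int) -> float:
    """Hypothetical helper: product of `depth` identical local gradients.

    A simplified stand-in for the chain of derivatives that appears when
    a gradient is propagated back through `depth` layers.
    """
    return grad_fn(x) ** depth


if __name__ == "__main__":
    for depth in (1, 5, 10):
        print(
            depth,
            chained_grad(sigmoid_grad, 0.0, depth),
            chained_grad(relu_grad, 1.0, depth),
        )
```

Even at its best point, the sigmoid gradient is 0.25, so the chained value shrinks quickly as the depth grows, while the ReLU gradient for a positive input stays at 1. ReLU also needs only a comparison, which is why it is usually regarded as the cheaper of the two.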
Artificial neural networks are trained using a process known as gradient descent. For this example, let's assume that there is a two-layer neural network:
The overall network can be represented as follows:
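As a sketch only, one common way to write such a network is given next. The notation is an assumption made for illustration: a scalar input x, weights w_1 and w_2, biases b_1 and b_2, and a single activation function sigma used in both layers.

```latex
\begin{align}
z_1 &= w_1 x + b_1, & f_1(x) &= \sigma(z_1) \\
z_2 &= w_2 f_1(x) + b_2, & f_2(f_1(x)) &= \sigma(z_2) \\
\hat{y} &= f_2(f_1(x)) = \sigma\bigl(w_2\,\sigma(w_1 x + b_1) + b_2\bigr) \\
\frac{\partial \hat{y}}{\partial w_1} &= \sigma'(z_2)\, w_2\, \sigma'(z_1)\, x
\end{align}
```

Under these assumptions, the last line follows from the chain rule: every layer between a weight and the output contributes one factor of the activation's derivative, so small derivatives multiply into an even smaller gradient.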
When the weights are calculated in a step known as a backward pass, the...