With this background on activation functions, we can see why a neural network needs nonlinearities: they are essential for modeling the complex data patterns involved in solving regression and classification problems accurately. Let's return to our initial example problem, where we established the activity of the hidden layer, and apply the sigmoid activation function to the activity of each node in that layer. This gives the second formula in our perceptron model:
- z(2) = XW(1)
- a(2) = f(z(2))
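As a rough illustration, the two formulas above can be sketched in NumPy. The number of input features (2) and the random values of `X` and `W1` are assumptions made only for this sketch; the 5 examples and 3 hidden nodes are chosen so that z(2) comes out as 5 x 3. The names `sigmoid`, `X`, `W1`, `z2`, and `a2` are illustrative and not taken from the original example.

```python
import numpy as np


def sigmoid(z):
    """Element-wise sigmoid activation: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


rng = np.random.default_rng(0)

# Assumed shapes for illustration: 5 examples, 2 input features, 3 hidden nodes.
X = rng.random((5, 2))            # input matrix X
W1 = rng.standard_normal((2, 3))  # weights W(1) between input and hidden layer

z2 = X @ W1       # z(2) = XW(1): activity of the hidden layer
a2 = sigmoid(z2)  # a(2) = f(z(2)): sigmoid applied to every entry

print(z2.shape, a2.shape)  # (5, 3) (5, 3)
```

Because the sigmoid is applied to each entry separately, `a2` keeps the shape of `z2`, and every value falls strictly between 0 and 1.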
Once we apply the activation function, f, the resulting matrix, a(2), will be the same size as z(2), that is, 5 x 3. The next step is to multiply the activities of the hidden layer by the weights on the synapses leading to the output layer. Refer to the diagram on ANN notations...