Two-layer neural networks
Let us look at the formal definition of a two-layer neural network. We follow the notation and description used by David MacKay (references 1, 2, and 3 in the References section of this chapter). The input to the NN is given by $\mathbf{x} = (x_1, \ldots, x_L)$. The input values are first multiplied by a set of weights to produce a weighted linear combination and then transformed using a nonlinear function $f^{(1)}$ to produce the values $h_j$ of the states of the neurons in the hidden layer:

$$a_j^{(1)} = \sum_{l} w_{jl}^{(1)} x_l + \theta_j^{(1)}, \qquad h_j = f^{(1)}\!\left(a_j^{(1)}\right)$$
A similar operation is done at the second layer to produce the final output values $y_i$:

$$a_i^{(2)} = \sum_{j} w_{ij}^{(2)} h_j + \theta_i^{(2)}, \qquad y_i = f^{(2)}\!\left(a_i^{(2)}\right)$$
The function $f^{(1)}$ is usually taken to be either a sigmoid function,

$$f^{(1)}(a) = \frac{1}{1 + e^{-a}},$$

or $f^{(1)}(a) = \tanh(a)$. Another common function, used for multiclass classification, is the softmax, defined as follows:

$$y_i = \frac{e^{a_i^{(2)}}}{\sum_{i'} e^{a_{i'}^{(2)}}}$$

This is a normalized exponential function: the outputs are positive and sum to one, so they can be interpreted as class probabilities.
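As an illustration (a minimal sketch, not taken from the references), the softmax can be implemented in a numerically stable way by shifting the activations by their maximum before exponentiating; the shift cancels between numerator and denominator, so the result is unchanged:

```python
import numpy as np

def softmax(a):
    # Subtracting max(a) leaves the result unchanged (the common
    # factor e^{-max} cancels) but prevents overflow in np.exp
    # for large activation values.
    e = np.exp(a - np.max(a))
    return e / e.sum()

# Class probabilities for three activation values:
print(softmax(np.array([2.0, 1.0, 0.1])))  # [0.659 0.242 0.099]
```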
All these are highly nonlinear functions, exhibiting the property that the output value rises sharply over a small range of input values. This nonlinearity gives neural networks more computational flexibility than standard linear or generalized linear models...
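To tie the pieces together, the following is a minimal NumPy sketch of the forward pass defined by the two equations above, assuming (as one common choice) a tanh nonlinearity in the hidden layer and a sigmoid at the output; the matrices `W1` and `W2` and the bias vectors `b1` and `b2` play the roles of $w^{(1)}$, $w^{(2)}$, $\theta^{(1)}$, and $\theta^{(2)}$:

```python
import numpy as np

def sigmoid(a):
    # Logistic sigmoid f(a) = 1 / (1 + e^{-a})
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, W1, b1, W2, b2):
    # Layer 1: weighted linear combination of the inputs,
    # then the nonlinearity f^(1) gives the hidden states h_j
    h = np.tanh(W1 @ x + b1)
    # Layer 2: the same operation on the hidden states,
    # with f^(2) producing the final outputs y_i
    return sigmoid(W2 @ h + b2)

# Toy example: 3 inputs, 4 hidden neurons, 2 outputs
rng = np.random.default_rng(0)
x = rng.normal(size=3)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)
print(forward(x, W1, b1, W2, b2))
```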