Building a neural layer in Swift
A fully-connected layer is easy to implement because it can be expressed as two operations (see the Swift sketch below):
- A matrix multiplication between the weight matrix W and the input vector x: z = Wx.
- A pointwise application of the activation function f: y = f(z).
Figure 8.5: One layer in detail
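Here is a minimal sketch of these two operations in plain Swift. The type name FullyConnectedLayer, the sigmoid helper, and the bias term added to the weighted sum are illustrative choices, not prescribed by the text:

```swift
import Foundation

// A fully-connected layer as two operations: the weighted sum W·x
// (plus a conventional bias term), followed by a pointwise activation f.
struct FullyConnectedLayer {
    var weights: [[Double]]              // one row of weights per output neuron
    var biases: [Double]                 // one bias per output neuron
    var activation: (Double) -> Double   // the nonlinearity f

    func forward(_ input: [Double]) -> [Double] {
        return zip(weights, biases).map { row, bias in
            // Operation 1: the weighted sum of the inputs.
            let weightedSum = zip(row, input).reduce(bias) { $0 + $1.0 * $1.1 }
            // Operation 2: the pointwise activation.
            return activation(weightedSum)
        }
    }
}

// The logistic sigmoid, a common choice of activation function.
func sigmoid(_ x: Double) -> Double {
    return 1.0 / (1.0 + exp(-x))
}
```

A single call to forward(_:) performs both operations in one pass.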
In many frameworks, the two operations are split: the matrix multiplication happens in the fully-connected layer, and the activation happens in a separate nonlinearity layer. This is handy because it lets us replace the weighted sum with a convolution without touching the activation. We will discuss convolutional NNs in the next chapter.
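One way this separation might look in Swift, assuming a simple Layer protocol (the protocol and the Dense and Nonlinearity names are assumptions for illustration, not a particular framework's API):

```swift
// Each layer only knows how to map an input vector to an output vector.
protocol Layer {
    func forward(_ input: [Double]) -> [Double]
}

// The fully-connected layer computes only the affine part W·x + b.
struct Dense: Layer {
    var weights: [[Double]]
    var biases: [Double]

    func forward(_ input: [Double]) -> [Double] {
        return zip(weights, biases).map { row, bias in
            zip(row, input).reduce(bias) { $0 + $1.0 * $1.1 }
        }
    }
}

// The nonlinearity is its own layer: a pointwise application of f.
struct Nonlinearity: Layer {
    var f: (Double) -> Double

    func forward(_ input: [Double]) -> [Double] {
        return input.map(f)
    }
}

// Layers compose by feeding each output into the next layer's input.
func run(_ layers: [Layer], on input: [Double]) -> [Double] {
    return layers.reduce(input) { $1.forward($0) }
}
```

With this structure, swapping Dense for a convolution layer requires no change to the nonlinearity.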
But for now, let's see how NNs can perform logical operations. One neuron is enough to model any logical gate except XOR. This finding, published in the late 1960s, contributed to the first AI winter; XOR, however, is trivial to model with a network of two layers.
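For instance, here is XOR built from the FullyConnectedLayer sketch above with hand-picked weights, a classic construction: the hidden layer computes OR and NAND, and the output neuron ANDs them together. The step activation and these particular weights are one standard textbook choice among many:

```swift
// The Heaviside step function: fires when the weighted sum is positive.
func step(_ x: Double) -> Double {
    return x > 0 ? 1 : 0
}

let hidden = FullyConnectedLayer(
    weights: [[1, 1],     // OR: fires when at least one input is 1
              [-1, -1]],  // NAND: fires unless both inputs are 1
    biases: [-0.5, 1.5],
    activation: step
)
let output = FullyConnectedLayer(
    weights: [[1, 1]],    // AND of (OR, NAND), which is exactly XOR
    biases: [-1.5],
    activation: step
)

for (a, b) in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)] {
    let y = output.forward(hidden.forward([a, b]))[0]
    print("XOR(\(Int(a)), \(Int(b))) = \(Int(y))")  // prints 0, 1, 1, 0
}
```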