Multilayer Perceptrons (MLPs)
The main limitation of a perceptron is its linearity. How can this kind of architecture be extended once that constraint is removed? The solution is simpler than you might expect: adding at least one non-linear hidden layer between the input and the output yields a highly non-linear combination of the inputs, parametrized with a larger number of variables. The resulting architecture, called a Multilayer Perceptron (MLP), is shown in the following diagram with a single hidden layer (just for simplicity):
Structure of a generic Multilayer Perceptron with a single hidden layer
This is a so-called feed-forward network, meaning that the flow of information begins in the first layer, always proceeds in the same direction, and ends at the output layer. Architectures that allow partial feedback (for example, to implement local memory) are called recurrent networks and will be analyzed in the next chapter.
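The feed-forward flow described above can be sketched as a single forward pass in NumPy. This is a minimal illustration, not a complete implementation: the layer sizes, the ReLU activation, and all variable names are illustrative assumptions, and no training step is shown.

```python
import numpy as np


def relu(z):
    # Non-linear activation; without it, stacking layers would
    # collapse back into a single linear (perceptron-like) model
    return np.maximum(0.0, z)


def mlp_forward(x, W1, b1, W2, b2):
    # Hidden layer: affine transformation followed by a non-linearity
    h = relu(x @ W1 + b1)
    # Output layer: affine transformation (no activation here,
    # as would be typical for a regression output)
    return h @ W2 + b2


# Illustrative shapes: a batch of 4 samples with 3 features,
# a hidden layer with 5 units, and 2 output units
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
W1 = rng.normal(size=(3, 5))
b1 = np.zeros(5)
W2 = rng.normal(size=(5, 2))
b2 = np.zeros(2)

y = mlp_forward(x, W1, b1, W2, b2)
print(y.shape)  # (4, 2)
```

Information flows strictly from `x` through the hidden representation `h` to the output `y`, with no feedback connections, which is exactly what makes this a feed-forward network.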
In this case, there...