Implementing the core deep learning models - MLPs, CNNs, and RNNs
We've already mentioned that we'll be using three advanced deep learning models, they are:
- MLPs: Multilayer perceptrons
- RNNs: Recurrent neural networks
- CNNs: Convolutional neural networks
These are the three networks that we will be using throughout this book. Despite the three networks being separate, you'll find that they are often combined together in order to take advantage of the strength of each model.
In the following sections of this chapter, we'll discuss these building blocks one by one in more detail. In the following sections, MLPs are covered together with other important topics such as loss function, optimizer, and regularizer. Following on afterward, we'll cover both CNNs and RNNs.
The difference between MLPs, CNNs, and RNNs
Multilayer perceptrons or MLPs are a fully-connected network. You'll often find them referred to as either deep feedforward networks or feedforward neural networks in some literature. Understanding these networks in terms of known target applications will help us get insights about the underlying reasons for the design of the advanced deep learning models. MLPs are common in simple logistic and linear regression problems. However, MLPs are not optimal for processing sequential and multi-dimensional data patterns. By design, MLPs struggle to remember patterns in sequential data and requires a substantial number of parameters to process multi-dimensional data.
For sequential data input, RNNs are popular because the internal design allows the network to discover dependency in the history of data that is useful for prediction. For multi-dimensional data like images and videos, a CNN excels in extracting feature maps for classification, segmentation, generation, and other purposes. In some cases, a CNN in the form of a 1D convolution is also used for networks with sequential input data. However, in most deep learning models, MLPs, RNNs, and CNNs are combined to make the most out of each network.
MLPs, RNNs, and CNNs do not complete the whole picture of deep networks. There is a need to identify an objective or loss function, an optimizer, and a regularizer. The goal is to reduce the loss function value during training since it is a good guide that a model is learning. To minimize this value, the model employs an optimizer. This is an algorithm that determines how weights and biases should be adjusted at each training step. A trained model must work not only on the training data but also on a test or even on unforeseen input data. The role of the regularizer is to ensure that the trained model generalizes to new data.