We now have a shared vocabulary. You have a notional understanding of what terms like layers, model weights, loss function, and optimizer mean. But how do they fit together? And how do we train a model on arbitrary data, whether to recognize cat pictures or to flag fraudulent reviews on Amazon?
Here is the rough outline of the steps that occur inside a training loop:
- Initialize:
- The network's weights are assigned small random values, typically drawn from a range such as (-1, 1) or (0, 1).
- The model's output is very far from the target, because the model is simply executing a series of random transformations.
- The loss is correspondingly very high.
- With every example (or batch of examples) that the network processes, the following occurs:
- The loss is computed, and each weight is adjusted a little in the direction that reduces it.
- As a result, the loss score gradually decreases.
This is the training loop. Repeated enough times, it yields weight values that minimize the loss, producing a model whose outputs are close to the targets.
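The steps above can be sketched in a few lines of NumPy. This is a minimal, hypothetical example (not from the text): a one-weight linear model fit to the toy relationship y = 2x + 1 with full-batch gradient descent. The learning rate, iteration count, and data are illustrative choices, but the loop follows the outline exactly: random initialization, compute the loss, nudge the weights in the direction that lowers it, repeat.

```python
import numpy as np

# Toy data for the hypothetical target y = 2x + 1
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(100, 1))
y = 2 * x + 1

# Initialize: weights get random values in (-1, 1),
# so the model starts as a random transformation
w = rng.uniform(-1, 1)
b = rng.uniform(-1, 1)
lr = 0.1  # learning rate: how big each adjustment is

for step in range(200):
    # Forward pass: the model's (initially bad) prediction
    pred = w * x + b
    # Loss: mean squared error between prediction and target
    loss = np.mean((pred - y) ** 2)
    # Gradients: which direction increases the loss
    grad_w = np.mean(2 * (pred - y) * x)
    grad_b = np.mean(2 * (pred - y))
    # Adjust the weights a little in the opposite (correct) direction
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b, loss)  # w and b end up near 2 and 1, loss near 0
```

A real framework replaces the hand-written gradient lines with automatic differentiation and the update lines with an optimizer object, but the shape of the loop is the same.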