Unraveling backpropagation
At this point, you may be wondering why weights, biases, and activation functions are so special. After all, they probably seem not much different from the parameters and hyperparameters of traditional ML models. However, understanding backpropagation will solidify your appreciation of how weights and biases work. This journey begins with a brief discussion of gradient descent.
Gradient descent
In short, gradient descent is a powerful optimization algorithm that’s widely used in ML and DL to minimize a cost or loss function. It is the name given to the iterative process of training a model on a task: make a prediction with the model, measure how far that prediction is from the target using the loss function, and then adjust the weights a small step in the direction that reduces the loss (that is, opposite the gradient of the loss with respect to the weights). Repeated over many iterations of training, these small adjustments allow the model to gradually make better predictions. It is used to train not only NNs but also other ML models, such as...
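To make this predict-measure-adjust loop concrete, here is a minimal sketch in Python with NumPy that fits a straight line to data by gradient descent on the mean squared error. The function name, learning rate, and epoch count are illustrative assumptions for this example, not part of any framework's API:

```python
import numpy as np

# A minimal sketch of gradient descent, fitting a line y = w * x + b to data
# by minimizing mean squared error. The names here (gradient_descent,
# learning_rate, and so on) are illustrative, not from any library.

def gradient_descent(x, y, learning_rate=0.1, epochs=500):
    w, b = 0.0, 0.0                           # start from arbitrary parameters
    n = len(x)
    for _ in range(epochs):
        y_pred = w * x + b                    # 1. make a prediction
        error = y_pred - y                    # 2. measure how far off it is
        grad_w = (2.0 / n) * np.dot(error, x) # 3. gradient of the MSE
        grad_b = (2.0 / n) * error.sum()      #    with respect to w and b
        w -= learning_rate * grad_w           # 4. step opposite the gradient
        b -= learning_rate * grad_b
    return w, b

# Recover a noisy line with slope 3 and intercept 2
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 100)
y = 3.0 * x + 2.0 + rng.normal(0.0, 0.1, 100)
print(gradient_descent(x, y))                 # roughly (3.0, 2.0)
```

Each pass through the loop performs exactly the steps described above: predict, measure, and nudge the parameters downhill. The learning rate controls the step size; too large and the updates overshoot, too small and training takes many more iterations.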