Classification loss function
The loss function is the objective function that training minimizes in order to obtain the best model. Many different loss functions exist.
In a classification problem, where the goal is to predict the correct class among k classes, cross-entropy is commonly used because it measures the difference between the real probability distribution, q, and the predicted one, p, for each class:
![Classification loss function](https://static.packt-cdn.com/products/9781786465825/graphics/graphics/B05525_02_02.jpg)
Here, i is the index of the sample in the dataset, n is the number of samples in the dataset, and k is the number of classes.
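As a rough illustration of what cross-entropy measures, the following sketch computes it for a single pair of distributions over k = 3 classes, using the standard definition (minus the sum over classes of q times log p). The function name `cross_entropy`, the `eps` clamp, and the example distributions are assumptions made for this illustration, not taken from the text:

```python
import math

def cross_entropy(q, p, eps=1e-12):
    """Cross-entropy between a real distribution q and a predicted one p."""
    if len(q) != len(p):
        raise ValueError("q and p must cover the same k classes")
    # Clamp p away from zero so log() stays finite (eps is an illustrative choice).
    return -sum(qc * math.log(max(pc, eps)) for qc, pc in zip(q, p))

q = [0.7, 0.2, 0.1]        # hypothetical "real" distribution over 3 classes
good_p = [0.6, 0.3, 0.1]   # prediction close to q
bad_p = [0.1, 0.2, 0.7]    # prediction far from q

print(cross_entropy(q, q))       # entropy of q: the lowest value any p can reach
print(cross_entropy(q, good_p))  # slightly higher
print(cross_entropy(q, bad_p))   # much higher
```

The closer the predicted distribution is to the real one, the lower the value, which is why minimizing it during training pushes p toward q.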
Although the real probability of each class is unknown, in practice it can be approximated by the empirical distribution, that is, by drawing samples from the dataset. In the same way, the cross-entropy of any predicted probability, p, can be approximated by the empirical cross-entropy:
![Classification loss function](https://static.packt-cdn.com/products/9781786465825/graphics/graphics/B05525_02_17.jpg)
Here, the term inside the logarithm is the probability estimated by the model for the correct class of example i.
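The empirical cross-entropy can be sketched in a few lines of plain Python. This version assumes the loss is averaged over the n examples (the exact normalization in the formula may differ), and the names `empirical_cross_entropy`, `probs`, and `labels`, as well as the `eps` clamp, are hypothetical choices for this illustration:

```python
import math

def empirical_cross_entropy(probs, labels, eps=1e-12):
    """Mean of -log(probability assigned to the correct class) over n examples."""
    if len(probs) != len(labels):
        raise ValueError("need one label per row of predicted probabilities")
    total = 0.0
    for row, label in zip(probs, labels):
        # Only the probability of the correct class contributes to the loss.
        total -= math.log(max(row[label], eps))
    return total / len(labels)

# Hypothetical predictions for n = 3 examples over k = 3 classes.
probs = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
]
labels = [0, 1, 2]  # index of the correct class for each example

loss = empirical_cross_entropy(probs, labels)
print(loss)
```

Because each label selects a single class, only one probability per example enters the sum. A model that assigns probability 1 to every correct class reaches a loss of 0, and the loss grows without bound as that probability approaches 0.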
Accuracy and cross-entropy both evolve in the same direction but measure...