Cross entropy loss, or log loss, measures the performance of a classification model whose output is a probability between 0 and 1. Cross entropy increases as the predicted probability of a sample diverges from its actual label. Therefore, predicting a probability of 0.05 when the actual label is 1 results in a high cross entropy loss.
Mathematically, for a binary classification setting, cross entropy is defined by the following equation:

$$\mathrm{CE} = -\big(y_i \log(p_i) + (1 - y_i)\log(1 - p_i)\big)$$

Here, $y_i$ is the binary indicator (0 or 1) denoting the class for sample $i$, while $p_i$ denotes the predicted probability, between 0 and 1, for that sample.
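As a quick illustration of this formula, the following is a minimal sketch that computes binary cross entropy for a single prediction; the function and variable names are chosen for this example only:

```python
import math

def binary_cross_entropy(y, p, eps=1e-12):
    """Binary cross entropy for a single sample.

    y: true label (0 or 1)
    p: predicted probability of the positive class
    eps: small constant to avoid taking log(0)
    """
    p = min(max(p, eps), 1 - eps)  # clip probability away from 0 and 1
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# A confident wrong prediction yields a large loss ...
print(binary_cross_entropy(1, 0.05))  # ~3.00
# ... while a confident correct prediction yields a small loss.
print(binary_cross_entropy(1, 0.95))  # ~0.05
```

Note how the loss for the confidently wrong prediction (0.05 when the label is 1) is roughly sixty times larger than for the confidently correct one, which is exactly the behavior described above.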
Alternatively, if there are more than two classes, we define a new term known as categorical cross entropy. It is calculated as the sum of the separate losses for each class label per observation. Mathematically:

$$\mathrm{CE} = -\sum_{c=1}^{M} y_{o,c}\,\log(p_{o,c})$$

where $M$ is the number of classes, $y_{o,c}$ is the binary indicator for whether class $c$ is the correct classification for observation $o$, and $p_{o,c}$ is the predicted probability that observation $o$ belongs to class $c$.
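A similar sketch for the multiclass case, assuming the true label is one-hot encoded and the prediction is a probability vector over the classes (again, the names here are illustrative only):

```python
import math

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Categorical cross entropy for a single observation.

    y_true: one-hot encoded true label, e.g. [0, 1, 0]
    y_pred: predicted probabilities over the classes, e.g. [0.1, 0.7, 0.2]
    eps: small constant to avoid taking log(0)
    """
    return -sum(
        y * math.log(min(max(p, eps), 1 - eps))
        for y, p in zip(y_true, y_pred)
    )

# Only the probability assigned to the true class contributes to the loss.
print(categorical_cross_entropy([0, 1, 0], [0.1, 0.7, 0.2]))  # ~0.36
```

Because the indicator $y_{o,c}$ is zero for every class except the true one, the sum collapses to the negative log probability assigned to the correct class, which is why the binary formula above is just the special case $M = 2$.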