Let's consider a linear regression example in which we have a set of training data. Based on that data, we use forward propagation to model a straight-line prediction function, h(x), as in the following diagram:
The difference between the actual and predicted values for each individual training sample contributes to the overall error of the prediction function. The goodness of fit of a neural network is defined by a cost function, which measures how well the network performed on the training dataset it modeled.
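To make the idea concrete, here is a minimal sketch of a cost computation for a straight-line prediction function. It assumes h(x) = w * x + b and uses mean squared error as the cost; the text above does not name a specific cost function, so that choice, the function names `predict` and `mse_cost`, and the sample data are all illustrative assumptions.

```python
def predict(x, w, b):
    """Straight-line prediction function h(x) = w * x + b (assumed form)."""
    return w * x + b


def mse_cost(xs, ys, w, b):
    """Mean squared error of h(x) over all training samples.

    Each sample contributes its own (predicted - actual) difference,
    and the result is a single number summarizing the overall error.
    """
    errors = [predict(x, w, b) - y for x, y in zip(xs, ys)]
    return sum(e * e for e in errors) / len(errors)


# Hypothetical training data that lies exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

perfect_cost = mse_cost(xs, ys, w=2.0, b=1.0)  # line matches the data
poor_cost = mse_cost(xs, ys, w=1.0, b=0.0)     # line fits the data badly

print(perfect_cost)  # 0.0
print(poor_cost)     # 13.5
```

A lower cost means the line sits closer to the training samples, so comparing the two values shows how the cost changes as the parameters w and b change.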
As you can imagine, the value of the cost function for a neural network depends on the weights and biases of each neuron. The cost function is a single value and it...