Evaluation metrics and evaluating performance
This section discusses how to set up a deep learning project and which evaluation metrics to select. We look at how to choose evaluation criteria and how to decide when the model is approaching optimal performance. We also discuss why deep learning models tend to overfit and how to manage the bias/variance tradeoff. Together, these give guidelines on what to do when a model has low accuracy.
Types of evaluation metric
Different evaluation metrics are used for classification and regression tasks. For classification, accuracy is the most commonly used evaluation metric. However, accuracy is only a reliable measure if the cost of errors is the same for all classes, which is not always the case. For example, in medical diagnosis, the cost of a false negative is usually much higher than the cost of a false positive. A false negative in this case says that the person is not sick when they are, and a delay in diagnosis can have serious, perhaps fatal, consequences...
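To make the point about unequal error costs concrete, here is a minimal sketch in plain Python. The data is invented for illustration: a hypothetical screening set of 100 patients, 5 of whom are sick, and a hypothetical model that always predicts "not sick". The helper names (confusion_counts, accuracy, recall) are ours, not taken from any library. Recall, the share of sick patients the model actually finds, is used here as one example of a metric that exposes false negatives.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = sick)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of all predictions that are correct."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def recall(y_true, y_pred):
    """Fraction of actual positives that the model found."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    # Return 0.0 when there are no positive cases at all.
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0


# Hypothetical data: 5 sick patients (1) and 95 healthy patients (0).
y_true = [1] * 5 + [0] * 95
# Hypothetical model that predicts "not sick" for everyone.
y_pred = [0] * 100

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)

print(f"accuracy = {acc:.2f}")   # accuracy = 0.95
print(f"recall   = {rec:.2f}")   # recall   = 0.00
print(f"false negatives = {fn}") # false negatives = 5
```

On this invented data the model scores 95% accuracy while missing every sick patient, which is why accuracy alone can mislead when one kind of error is far more costly than the other.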