Summary
In this chapter, we covered model evaluation and accuracy in depth. We learned why accuracy is not the most appropriate metric when the dataset is imbalanced. We also learned how to compute a confusion matrix using scikit-learn and how to derive other metrics from it, such as sensitivity, specificity, precision, and the false positive rate.
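As a quick recap of the confusion-matrix workflow, the following is a minimal sketch of one way to compute these metrics with scikit-learn; the labels and predictions (y_true, y_pred) are placeholder values for illustration, not data from the chapter.

```python
from sklearn.metrics import confusion_matrix

# Placeholder binary labels and predictions, for illustration only
y_true = [0, 0, 0, 0, 1, 1, 1, 0, 0, 1]
y_pred = [0, 0, 1, 0, 1, 0, 1, 0, 0, 1]

# For binary classification, ravel() unpacks the 2x2 matrix as tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

sensitivity = tp / (tp + fn)   # recall / true positive rate
specificity = tn / (tn + fp)   # true negative rate
precision   = tp / (tp + fp)
fpr         = fp / (fp + tn)   # false positive rate = 1 - specificity

print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}, "
      f"precision={precision:.2f}, FPR={fpr:.2f}")
```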
Finally, we saw how adjusting the classification threshold changes these metrics, and how ROC curves and AUC scores help us evaluate and compare our models. Imbalanced datasets are very common in real-life problems: credit card fraud detection, disease prediction, and spam email detection all involve imbalanced data in different proportions.
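To tie the threshold and ROC/AUC ideas together, here is a minimal illustrative sketch; the probability scores (y_scores) are made-up values, and in practice they would come from a classifier's predict_proba output.

```python
from sklearn.metrics import roc_curve, roc_auc_score

# Placeholder labels and predicted probabilities, for illustration only
y_true   = [0, 0, 1, 1, 0, 1, 0, 1]
y_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9]

# The ROC curve sweeps over all possible thresholds at once
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
auc = roc_auc_score(y_true, y_scores)
print(f"AUC = {auc:.2f}")

# Choosing a single threshold turns scores into hard predictions
threshold = 0.5
y_pred = [1 if score >= threshold else 0 for score in y_scores]
```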
In the next chapter, we will learn about a different kind of neural network architecture (convolutional neural networks) that performs well on image classification tasks. We will test its performance by classifying images into two classes and by experimenting with different architectures and activation functions.