Now that we know how to handle data in all shapes and forms, be it numerical, categorical, text, or image data, it is time to put our newly gained knowledge to good use.
In this chapter, we will learn how to build a machine learning system that can make a medical diagnosis. We aren't all doctors, but we've probably all been to one at some point in our lives. Typically, a doctor gathers as much information as possible about a patient's history and symptoms to make an informed diagnosis. We will mimic a doctor's decision-making process with the help of what are known as decision trees. We will also cover the Gini coefficient, information gain, and variance reduction, along with overfitting and pruning.
A decision tree is a simple yet powerful supervised learning algorithm that resembles a flow chart; we will...