Statistics deals with every aspect of data: collection, analysis, interpretation, inference, and presentation. It is a vast field that incorporates many methods for analyzing data. Covering all of it is beyond the scope of this book, but we will look at one concept that lies at the heart of machine learning: maximum likelihood estimation (MLE). As always, do not fear the terminology; the underlying concepts are simple and intuitive. To understand MLE, we first need to dive into probability theory, the cornerstone of statistics.
To start, let's look at why we need probabilities when we are already equipped with such great mathematical tooling. We use calculus to work with functions on an infinitesimal scale and to measure how they change. We developed algebra to solve equations, and we have dozens of other areas of mathematics that help...