Knowing the prerequisites
Machine learning, which mimics human intelligence, is a subfield of AI, a field of computer science concerned with creating systems that exhibit intelligent behavior. Software engineering is another field of computer science, and we can generally regard Python programming as a type of software engineering. Machine learning is also closely related to linear algebra, probability theory, statistics, and mathematical optimization. We usually build machine learning models on the basis of statistics, probability theory, and linear algebra, and then tune their parameters using mathematical optimization.
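To make that last point concrete, here is a minimal sketch of how these ingredients fit together: a linear model (linear algebra), a mean squared error objective (statistics), and gradient descent (mathematical optimization). The toy data, the learning rate, the iteration count, and the helper names predict and mse are all assumptions chosen for illustration; this is not the method used later in the book.

```python
# Toy data generated from y = 3x + 2 (an assumption made for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3 * x + 2 for x in xs]


def predict(w, b, x):
    """Linear model: a weighted input plus a bias term."""
    return w * x + b


def mse(w, b):
    """Mean squared error of the model over the toy dataset."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Start from arbitrary parameters and repeatedly step against the gradient.
w, b = 0.0, 0.0
learning_rate = 0.05  # hypothetical value, small enough to be stable here
for _ in range(2000):
    # Partial derivatives of the mean squared error with respect to w and b.
    grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (predict(w, b, x) - y) for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"w = {w:.4f}, b = {b:.4f}, error = {mse(w, b):.6f}")
```

Because the toy data is noise-free, the learned w and b should end up very close to the 3 and 2 used to generate it. With real data, the optimizer would settle on whichever parameters minimize the error rather than recover an exact rule.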
Most of you reading this book should have a good, or at least sufficient, command of Python programming. Those who aren’t confident about their mathematical knowledge might be wondering how much time they should spend learning or brushing up on the aforementioned subjects. Don’t panic; in this book, we will get machine learning to work for us without going into deep mathematical detail. All it requires is some basic knowledge of probability theory and linear algebra, which helps us to understand the mechanics of machine learning techniques and algorithms. And it gets easier, as we will build models both from scratch and with popular packages in Python, a language we like and are familiar with.
For those who want to learn or brush up on probability theory and linear algebra, there are a lot of resources available online. For probability 101, examples include https://people.ucsc.edu/~abrsvn/intro_prob_1.pdf and the online course Introduction to Probability by Harvard University (https://pll.harvard.edu/course/introduction-probability-edx). For basic linear algebra, see the following paper: http://www.maths.gla.ac.uk/~ajb/dvi-ps/2w-notes.pdf.
Those who want to study machine learning systematically can enroll in computer science, AI, and, more recently, data science master’s programs. There are also various data science boot camps. However, the selection for boot camps is usually stricter, as they’re more job-oriented and the program duration is often short, ranging from 4 to 10 weeks. Another option is free Massive Open Online Courses (MOOCs), such as Andrew Ng’s popular course on machine learning. Last but not least, industry blogs and websites are great resources for keeping up with the latest developments.
Machine learning is not only a skill but also a bit of a sport. We can compete in machine learning competitions, such as those hosted on Kaggle (www.kaggle.com), sometimes for decent cash prizes, sometimes for joy, but most of the time to play to our strengths. However, to win these competitions, we may need to utilize certain techniques that are only useful in the context of competitions and not in the context of trying to solve a business problem. This is where the no free lunch theorem (https://en.wikipedia.org/wiki/No_free_lunch_theorem) comes in. In the context of machine learning, this theorem suggests that no single algorithm is universally superior across all possible datasets and problem domains, so an approach that wins on one problem will not necessarily be the best choice for another.
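The idea that no single algorithm wins everywhere can be illustrated, though certainly not proven, with a small experiment. The sketch below fits two simple learners, a least-squares line and a 1-nearest-neighbor predictor, on two toy datasets. The datasets, the train and test points, and all function names are assumptions made up for this illustration.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = a * x + b on 1-D data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def fit_nearest_neighbor(xs, ys):
    """1-nearest neighbor: return the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Two hypothetical ground-truth rules: a straight line and a step function.
datasets = {
    "linear": lambda x: 2 * x + 1,
    "step": lambda x: 0 if x < 5 else 10,
}
learners = {
    "linear_fit": fit_linear,
    "nearest_neighbor": fit_nearest_neighbor,
}

train_x = [0, 2, 4, 6, 8]
test_x = [1, 3, 7, 9]

results = {}
for data_name, rule in datasets.items():
    train_y = [rule(x) for x in train_x]
    test_y = [rule(x) for x in test_x]
    results[data_name] = {
        learner_name: mean_squared_error(fit(train_x, train_y), test_x, test_y)
        for learner_name, fit in learners.items()
    }
    print(data_name, results[data_name])
```

On the linear data, the least-squares line has the lower test error, while on the step data the nearest-neighbor predictor does better. Neither learner is the better choice on both datasets, which is the practical takeaway of the theorem: the right algorithm depends on the problem at hand.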
Next, we’ll take a look at the three types of machine learning.