Preparing and analyzing data in machine learning workflows:
- Chapter 1, Gathering and Organizing Data, covers the gathering, organization, and parsing of data to and from local and remote sources. Once the reader is done with this chapter, they will understand how to interact with data stored in various places and in various formats, how to parse and clean that data, and how to output the cleaned and parsed data.
- Chapter 2, Matrices, Probability, and Statistics, covers the organization of data into matrices and matrix operations. Once the reader is done with this material, they will understand how to form matrices within Go programs and how to use these matrices to perform various types of matrix operations. This chapter also covers statistical measures and operations key to day-to-day data analysis work. Once the reader is done with this chapter, they will understand how to perform solid summary data analysis, describe and visualize distributions, quantify hypotheses, and transform datasets with, for example, dimensionality reduction.
- Chapter 3, Evaluation and Validation, covers evaluation and validation, which are key to measuring the performance of machine learning applications and ensuring that they generalize. Once the reader is done with this chapter, they will understand various metrics to gauge the performance of models (in other words, evaluate the model) as well as various techniques to validate the model more generally.
Machine learning techniques:
- Chapter 4, Regression, explains regression, a widely used technique to model continuous variables and a basis for other models. Regression produces models that are immediately interpretable. Thus, it can provide an excellent starting point when introducing predictive capabilities in an organization.
- Chapter 5, Classification, covers classification, a machine learning technique distinct from regression in that the target variable is typically categorical or labeled. For example, a classification model may classify emails into spam and not spam categories or classify network traffic as fraudulent or not fraudulent.
- Chapter 6, Clustering, covers clustering, an unsupervised machine learning technique used to form groupings of samples. At the end of this chapter, readers will be able to automatically form groupings of data points to better understand their structure.
- Chapter 7, Time Series and Anomaly Detection, introduces techniques utilized to model time series data, such as stock prices, user events, and so on. After reading this chapter, the reader will understand how to evaluate various terms in a time series, build up a model of the time series, and detect anomalies in a time series.
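As a small illustration of the first technique in this group, the following sketch fits a straight line to paired observations by ordinary least squares using only the Go standard library. The function name `FitLine` and the sample data are assumptions made for this example; the chapters themselves may use different packages and approaches.

```go
package main

import (
	"errors"
	"fmt"
)

// FitLine fits y = intercept + slope*x by ordinary least squares.
// (Hypothetical helper, named here for illustration only.)
func FitLine(xs, ys []float64) (slope, intercept float64, err error) {
	n := len(xs)
	if n < 2 || n != len(ys) {
		return 0, 0, errors.New("need at least two paired observations")
	}

	// Compute the means of x and y.
	var sumX, sumY float64
	for i := range xs {
		sumX += xs[i]
		sumY += ys[i]
	}
	meanX, meanY := sumX/float64(n), sumY/float64(n)

	// Accumulate the covariance-like and variance-like sums.
	var sxy, sxx float64
	for i := range xs {
		dx := xs[i] - meanX
		sxy += dx * (ys[i] - meanY)
		sxx += dx * dx
	}
	if sxx == 0 {
		return 0, 0, errors.New("all x values are identical")
	}

	slope = sxy / sxx
	intercept = meanY - slope*meanX
	return slope, intercept, nil
}

func main() {
	// Made-up data lying exactly on y = 1 + 2x.
	xs := []float64{1, 2, 3, 4}
	ys := []float64{3, 5, 7, 9}

	slope, intercept, err := FitLine(xs, ys)
	if err != nil {
		fmt.Println("fit failed:", err)
		return
	}
	fmt.Printf("y = %.2f + %.2f*x\n", intercept, slope)
}
```

The fitted slope and intercept can be read directly as "change in y per unit of x" and "value of y at x = 0", which is the sense in which regression models are immediately interpretable.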
Taking machine learning to the next level:
- Chapter 8, Neural Networks and Deep Learning, introduces techniques utilized to perform regression, classification, and image processing with neural networks. After reading this chapter, the reader will understand how and when to apply these more complicated modeling techniques.
- Chapter 9, Deploying and Distributing Analyses and Models, empowers readers to deploy the models that we have developed throughout the book to production environments and distribute processing over production-scale data. This chapter illustrates how both of these things can be done easily, without significant modifications to the code used throughout the book.
The Appendix, Algorithms/Techniques Related to Machine Learning, can be referenced throughout the text of the book and will provide information about algorithms, optimizations, and techniques that are relevant to machine learning workflows.