What this book covers
Chapter 1, An Overview of Regularization, provides a high-level introduction to what regularization is, along with the fundamental knowledge and vocabulary needed to follow the remaining chapters of this book.
Chapter 2, Machine Learning Refresher, guides you through a typical machine learning workflow and best practices, from data loading and splitting to model training and evaluation.
Chapter 3, Regularization with Linear Models, covers regularization with common linear models: linear regression and logistic regression. It covers L1 and L2 penalization, along with practical tips for choosing the right regularization method.
Chapter 4, Regularization with Tree-Based Models, reviews decision trees for both classification and regression and shows how to regularize them. It then covers ensemble methods, such as Random Forest and Gradient Boosting, and their regularization methods.
Chapter 5, Regularization with Data, introduces regularization with data, using feature hashing and feature aggregation. It then covers resampling methods for imbalanced datasets.
Chapter 6, Deep Learning Reminders, reviews deep learning, both conceptually and practically. It starts with a Perceptron, then moves on to training models for regression and classification.
Chapter 7, Deep Learning Regularization, covers regularization for deep learning models. Several techniques are explored and explained: L2 penalization, early stopping, network architecture, and dropout.
Chapter 8, Regularization with Recurrent Neural Networks, dives into Recurrent Neural Networks (RNNs) and Gated Recurrent Units (GRUs). It starts by explaining what they are and how to train them, then covers regularization techniques such as dropout and limiting the maximum sequence length.
Chapter 9, Advanced Regularization in Natural Language Processing, explores regularization methods specific to Natural Language Processing (NLP). It covers regularization using word2vec and BERT embeddings, explores data augmentation with word2vec and GPT-3, and proposes zero-shot inference solutions.
Chapter 10, Regularization in Computer Vision, dives into regularization for computer vision and Convolutional Neural Networks (CNNs). After explaining CNNs conceptually and applying them to classification, it provides regularization recipes for object detection and semantic segmentation.
Chapter 11, Regularization in Computer Vision – Synthetic Image Generation, dives deeper into synthetic image generation for regularization. Simple data augmentation is explored first. Then, a QR code object detector is built using only synthetic training data. Finally, we explore real-time style transfer trained on Stable Diffusion data and explain how to work with such a dataset yourself.