You're reading from Machine Learning with LightGBM and Python A practitioner's guide to developing production-ready machine learning systems

Product type Paperback

Published in Sep 2023

Publisher Packt

ISBN-13 9781800564749

Length 252 pages

Edition 1st Edition

Languages

Python

Tools

LightGBM

Concepts

Machine Learning

Author (1):

Andrich van Wyk

View More author details

Table of Contents (17) Chapters

Preface

1. Part 1: Gradient Boosting and LightGBM Fundamentals

2. Chapter 1: Introducing Machine Learning FREE CHAPTER

3. Chapter 2: Ensemble Learning – Bagging and Boosting

4. Chapter 3: An Overview of LightGBM in Python

5. Chapter 4: Comparing LightGBM, XGBoost, and Deep Learning

6. Part 2: Practical Machine Learning with LightGBM

7. Chapter 5: LightGBM Parameter Optimization with Optuna

8. Chapter 6: Solving Real-World Data Science Problems with LightGBM

9. Chapter 7: AutoML with LightGBM and FLAML

10. Part 3: Production-ready Machine Learning with LightGBM

11. Chapter 8: Machine Learning Pipelines and MLOps with LightGBM

12. Chapter 9: LightGBM MLOps with AWS SageMaker

13. Chapter 10: LightGBM Models with PostgresML

14. Chapter 11: Distributed and GPU-Based Learning with LightGBM

15. Index

Why subscribe?

16. Other Books You May Enjoy

What is machine learning?

Machine learning is a part of the broader artificial intelligence field that involves methods and techniques that allow computers to “learn” specific tasks without explicit programming.

Machine learning is just another way to write programs, albeit automatically, from data. Abstractly, a program is a set of instructions that transforms inputs into specific outputs. A programmer’s job is to understand all the relevant inputs to a computer program and develop a set of instructions to produce the correct outputs.

However, what if the inputs are beyond the programmer’s understanding?

For example, let’s consider creating a program to forecast the total sales of a large retail store. The inputs to the program would be various factors that could affect sales. We could imagine factors such as historical sales figures, upcoming public holidays, stock availability, any special deals the store might be running, and even factors such as the weather forecast or proximity to other stores.

In our store example, the traditional approach would be to break down the inputs into manageable, understandable (by a programmer) pieces, perhaps consult an expert in store sales forecasting, and then devise handcrafted rules and instructions to attempt to forecast future sales.

While this approach is certainly possible, it is also brittle (in the sense that the program might have to undergo extensive changes regarding the input factors) and wholly based on the programmer’s (or domain expert’s) understanding of the problem. With potentially thousands of factors and billions of examples, this problem becomes untenable.

Machine learning offers us an alternative to this approach. Instead of creating rules and instructions, we repeatedly show the computer examples of the tasks we need to accomplish and then get it to figure out how to solve them automatically.

However, where we previously had a set of instructions, we now have a trained model instead of a programmed one.

The key realization here, especially if you are coming from a software background, is that our machine learning program still functions like a regular program: it accepts input, has a way to process it, and produces output. Like all other software programs, machine learning software must be tested for correctness, integrated into other systems, deployed, monitored, and optimized. Collectively, this forms the field of machine learning engineering. We’ll cover all these aspects and more in later chapters.

Machine learning paradigms

Broadly speaking, machine learning has three main paradigms: supervised, unsupervised, and reinforcement learning.

With supervised learning, the model is trained on labeled data: each instance in the dataset has its associated correct output, or label, for the input example. The model is expected to learn to predict the label for unseen input examples.

With unsupervised learning, the examples in the dataset are unlabeled; in this case, the model is expected to discover patterns and relationships in the data. Examples of unsupervised approaches are clustering algorithms, anomaly detection, and dimensionality reduction algorithms.

Finally, reinforcement learning entails a model, usually called an agent, interacting with a particular environment and learning by receiving penalties or rewards for specific actions. The goal is for the agent to perform actions that maximize its reward. Reinforcement learning is widely used in robotics, control systems, or training computers to play games.

LightGBM and most other algorithms discussed later in this book are examples of supervised learning techniques and are the focus of this book.

The following section dives deeper into the machine learning terminology we’ll use throughout this book and the details of the machine learning process.

You're reading from Machine Learning with LightGBM and Python A practitioner's guide to developing production-ready machine learning systems

Table of Contents (17) Chapters

What is machine learning?

Machine learning paradigms

Authors (1)

Personalised recommendations for you