The Pipeline class sequences, or streamlines, the execution of the separate steps that lead to an estimated model: it chains multiple Transformers and Estimators into a single sequential workflow.
Pipelines are useful because they remove the need to explicitly create a new transformed dataset at each step as the data moves through the transformation and model estimation process. Instead, a Pipeline abstracts away the intermediate stages by automating the flow of data through the workflow. This makes the code more readable and maintainable, since it describes the system at a higher level of abstraction, and it makes the code easier to debug.
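To make the chaining idea concrete, here is a minimal pure-Python sketch of how such a workflow can pass data from one stage to the next. It is a toy illustration, not Spark's implementation: the classes `AddOne`, `CenterEstimator`, `Center`, `ToyPipeline`, and `ToyPipelineModel` are hypothetical names invented for this example, and plain lists stand in for DataFrames. The only assumption borrowed from the text is that a Transformer exposes `transform()` while an Estimator exposes `fit()` and yields a fitted Transformer.

```python
class AddOne:
    """Toy Transformer (hypothetical): only knows how to transform data."""

    def transform(self, rows):
        return [x + 1 for x in rows]


class Center:
    """Fitted model produced by CenterEstimator; it is itself a Transformer."""

    def __init__(self, mean):
        self.mean = mean

    def transform(self, rows):
        return [x - self.mean for x in rows]


class CenterEstimator:
    """Toy Estimator (hypothetical): fit() learns the mean and returns a Center."""

    def fit(self, rows):
        return Center(sum(rows) / len(rows))


class ToyPipelineModel:
    """Result of fitting a ToyPipeline: a chain of Transformers only."""

    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        # Push the data through every fitted stage in order.
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


class ToyPipeline:
    """Chains Transformers and Estimators into one sequential workflow."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            if hasattr(stage, "fit"):
                # Estimator: fit it on the data as transformed so far.
                stage = stage.fit(rows)
            fitted.append(stage)
            # The intermediate dataset is handled here, not by the caller.
            rows = stage.transform(rows)
        return ToyPipelineModel(fitted)


data = [1.0, 2.0, 3.0]
model = ToyPipeline([AddOne(), CenterEstimator()]).fit(data)
result = model.transform(data)
print(result)
```

The caller never holds the intermediate list produced by `AddOne`; the pipeline creates and consumes it internally, which is the convenience described above.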
In this recipe, we will streamline the execution of a generalized linear regression model.