Ensembling with Blending and Stacking Solutions
When you start competing on Kaggle, it doesn’t take long to realize that you cannot win with a single, well-devised model; you need to ensemble multiple models. The next question is immediately how to set up a working ensemble. Few guides exist, and more of the craft lives in Kaggle lore than in scientific papers.
The point here is that while ensembling is key to winning Kaggle competitions, in the real world it brings complexity, poor maintainability, difficult reproducibility, and hidden technical costs for little advantage. Often, the small boost that can move you from the lower ranks to the top of the leaderboard simply doesn’t matter in real-world applications, because the costs overshadow the gains. However, that doesn’t mean ensembling isn’t used at all in the real world. In a limited form, such as averaging and mixing a few diverse models, ensembling allows us...
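The simple averaging just mentioned can be sketched in a few lines. This is a minimal illustration, not a recipe from the text: the three prediction arrays are hypothetical placeholders for the out-of-fold or test-set probabilities that diverse models (say, a gradient-boosted tree, a linear model, and a neural network) would produce.

```python
import numpy as np

# Hypothetical predicted probabilities from three diverse models.
# In practice these would be each model's predictions on the same samples.
preds_model_a = np.array([0.2, 0.8, 0.6])
preds_model_b = np.array([0.3, 0.7, 0.5])
preds_model_c = np.array([0.1, 0.9, 0.7])

# Simple (unweighted) average blend: the most common limited form of ensembling.
blend = np.mean([preds_model_a, preds_model_b, preds_model_c], axis=0)

# A weighted variant, giving more trust to the stronger model;
# the weights here are arbitrary and would normally be tuned on validation data.
weighted_blend = np.average(
    [preds_model_a, preds_model_b, preds_model_c],
    axis=0,
    weights=[0.5, 0.3, 0.2],
)
```

Because averaging only requires storing and summing prediction vectors, it adds almost none of the maintenance burden that full stacking pipelines do, which is why it is the form of ensembling most likely to survive outside of competitions.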