As introduced in the previous chapters, scikit-learn is the dominant Python package for machine learning. In this chapter, we will also use XGBoost, LightGBM, and CatBoost; you'll find the instructions in the relevant sections.
There are multiple reasons for using scikit-learn, which was developed at Inria, the French Institute for Research in Computer Science and Automation (inria.fr/en/). The most important ones for the success of your data science project are the following:
- A consistent API (fit, predict, transform, and partial_fit) across models, which naturally helps you implement data science procedures correctly on data organized in NumPy arrays
- A complete selection of well-tested and scalable classical models for machine learning, offering many out-of-core implementations...
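
To make the consistent API concrete, here is a minimal sketch of how fit, transform, predict, and partial_fit look in practice. The dataset is synthetic and the choice of models (StandardScaler, LogisticRegression, and SGDClassifier) is ours, made only for illustration; any other scikit-learn estimator exposing the same methods would work the same way.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic toy data in NumPy arrays (an assumption, for illustration only)
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Transformers: fit learns the parameters, transform applies them
scaler = StandardScaler().fit(X)
X_scaled = scaler.transform(X)

# Predictors: fit learns from the data, predict returns the estimated labels
clf = LogisticRegression().fit(X_scaled, y)
pred = clf.predict(X_scaled)

# Incremental learners: partial_fit consumes the data one chunk at a time,
# which is the basis of out-of-core learning
sgd = SGDClassifier(random_state=0)
classes = np.unique(y)  # all class labels must be known at the first call
for start in range(0, len(X_scaled), 50):
    chunk = slice(start, start + 50)
    sgd.partial_fit(X_scaled[chunk], y[chunk], classes=classes)
sgd_pred = sgd.predict(X_scaled)
```

Because every model exposes the same few methods, swapping one algorithm for another usually means changing only the line that creates the estimator.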