In this book, we will cover various ensemble techniques and will learn how to ensemble multiple machine learning algorithms to enhance a model's performance. We will use pandas, NumPy, scikit-learn, and Matplotlib, all of which were built for working with Python, as we will do throughout the book. By now, you should be well aware of data manipulation and exploration.
In this chapter, we will recap how to read and manipulate data in Python, how to analyze and treat missing values, and how to explore data to gain deeper insights. We will use various Python packages, such as numpy and pandas, for data manipulation and exploration, and seaborn packages for data visualization. We will continue to use some or all of these libraries in the later chapters of this book as well. We will also use the Anaconda distribution for our Python coding. If you have not installed Anaconda, you need to download it from https://www.anaconda.com/download. At the time of writing this book, the latest version of Anaconda is 5.2, and comes with both Python 3.6 and Python 2.7. We suggest you download Anaconda for Python 3.6. We will also use the HousePrices dataset, which is available on GitHub.