Packt+ | Advance your knowledge in tech

You're reading from IPython Interactive Computing and Visualization Cookbook Over 100 hands-on recipes to sharpen your skills in high-performance numerical computing and data science in the Jupyter Notebook

Product type Paperback

Published in Jan 2018

Publisher Packt

ISBN-13 9781785888632

Length 548 pages

Edition 2nd Edition

Languages

Python

Tools

IPython

Concepts

Data Analysis

Author (1):

Cyrille Rossant

View More author details

Table of Contents (17) Chapters

Preface

1. A Tour of Interactive Computing with Jupyter and IPython FREE CHAPTER

2. Best Practices in Interactive Computing

3. Mastering the Jupyter Notebook

4. Profiling and Optimization

5. High-Performance Computing

6. Data Visualization

7. Statistical Data Analysis

8. Machine Learning

9. Numerical Optimization

10. Signal Processing

11. Image and Audio Processing

12. Deterministic Dynamical Systems

13. Stochastic Dynamical Systems

14. Graphs, Geometry, and Geographic Information Systems

15. Symbolic and Numerical Mathematics

Index

Using a random forest to select important features for regression

Decision trees are frequently used to represent workflows or algorithms. They also form a method for nonparametric supervised learning. A tree mapping observations to target values is learned on a training set and gives the outcomes of new observations.

Random forests are ensembles of decision trees. Multiple decision trees are trained and aggregated to form a model that is more performant than any of the individual trees. This general idea is the purpose of ensemble learning.

There are many types of ensemble methods. Random forests are an instance of bootstrap aggregating, also called bagging, where models are trained on randomly drawn subsets of the training set.

Random forests yield information about the importance of each feature for the classification or regression task. In this recipe, we will find the most influential features of Boston house prices using a classic dataset that contains a range of diverse indicators...