Random forest regression with the Boston dataset
In this section, we run a random forest regression on the Boston dataset to predict the median value of owner-occupied homes for the test data. The dataset describes 13 numerical properties of houses in Boston suburbs, and the goal is to model house prices in those suburbs, in thousands of dollars. As such, this is a regression predictive modeling problem. Input attributes include the crime rate, the proportion of non-retail business acres, chemical concentrations, and more.
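As a rough orientation, the workflow described here (split the data, fit a random forest on the training part, predict the response for the test part) can be sketched as follows. This is a minimal sketch, not the procedure used in this section: Python and scikit-learn are assumed choices, the helper name `fit_and_score` is hypothetical, and the arrays are a synthetic stand-in with the same shape as the Boston data (506 rows, 13 predictors), not the real values.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split


def fit_and_score(X, y, n_trees=200, seed=0):
    """Fit a random forest on a training split; return (model, test RMSE)."""
    # Hold out 30% of the rows as test data (the split ratio is an assumption).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    model.fit(X_train, y_train)
    # Root mean squared error of the predictions on the held-out rows.
    rmse = float(np.sqrt(mean_squared_error(y_test, model.predict(X_test))))
    return model, rmse


# Synthetic placeholder data, NOT the Boston values: 506 rows, 13 predictors.
# Replace X with the real predictors and y with the medv column.
rng = np.random.default_rng(0)
X = rng.normal(size=(506, 13))
y = 22.0 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=1.0, size=506)

model, rmse = fit_and_score(X, y)
print(f"test RMSE: {rmse:.2f}")
```

With the real data, `X` would hold the 13 input attributes and `y` the `medv` values; the test RMSE is then expressed in the same unit as `medv`, thousands of dollars.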
Note
To get the data, we draw on the large collection of data available in the UCI Machine Learning Repository at the following link: http://archive.ics.uci.edu/ml
The following list shows all the variables, followed by a brief description:
- Number of instances: 506
- Number of attributes: 14 (13 continuous attributes, including the class attribute medv, and one binary-valued attribute)
Each of the attributes is detailed as follows:
crim: Per capita crime...