Predicting traffic using an extremely random forest regressor
Let's apply the concepts learned in the previous sections to a real-world problem. We will use a dataset available at https://archive.ics.uci.edu/ml/datasets/Dodgers+Loop+Sensor. This dataset counts the number of vehicles passing by on the road during baseball games played at the Los Angeles Dodgers stadium. To make the data readily available for analysis, we need to pre-process it. The pre-processed data is in the file traffic_data.txt. In this file, each line contains comma-separated strings. Let's take the first line as an example:
Tuesday,00:00,San Francisco,no,3
The preceding line is formatted as follows:
Day of the week, time of the day, opponent team, binary value indicating whether a baseball game is currently going on (yes/no), number of vehicles passing by.
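To make the field layout concrete, here is a minimal sketch that splits one record into named fields. The function name parse_record and the dictionary keys are illustrative choices, not names taken from the dataset or from any library.

```python
def parse_record(line):
    """Split one comma-separated record into its five fields."""
    # Field order follows the format described above
    day, time_of_day, opponent, game_on, count = line.strip().split(',')
    return {
        'day': day,
        'time': time_of_day,
        'opponent': opponent,
        'game_on': game_on == 'yes',   # convert yes/no to a boolean
        'vehicles': int(count),        # the value we want to predict
    }

record = parse_record('Tuesday,00:00,San Francisco,no,3')
print(record)
```

Only the last field is numeric. The other four are strings, so they need to be converted to numbers before a regressor can use them.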
Our goal is to predict the number of vehicles passing by, using the given information...
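One possible way to set up this prediction is sketched below, assuming scikit-learn is installed. It label-encodes the string fields and fits an ExtraTreesRegressor. Apart from the first row, which is the sample line shown earlier, the rows are made up for illustration and are not taken from traffic_data.txt. The hyperparameter values are arbitrary assumptions too.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.preprocessing import LabelEncoder

# Illustrative rows only: the first is the sample line from the text,
# the others are invented so the model has something to fit.
rows = [
    'Tuesday,00:00,San Francisco,no,3',
    'Tuesday,00:05,San Francisco,no,8',
    'Tuesday,13:30,San Francisco,yes,35',
    'Wednesday,00:00,Arizona,no,5',
    'Wednesday,13:30,Arizona,yes,40',
    'Wednesday,18:00,Arizona,no,22',
]
data = np.array([row.split(',') for row in rows])

# Encode each string column as integers, keeping one encoder per column
# so that new queries can be encoded the same way later.
num_features = data.shape[1] - 1
X = np.empty((data.shape[0], num_features), dtype=int)
encoders = []
for i in range(num_features):
    encoder = LabelEncoder()
    X[:, i] = encoder.fit_transform(data[:, i])
    encoders.append(encoder)

# The last column is the vehicle count, which is already numeric
y = data[:, -1].astype(int)

# Hyperparameters here are assumed values, not tuned ones
regressor = ExtraTreesRegressor(n_estimators=100, max_depth=4, random_state=0)
regressor.fit(X, y)

# Encode a query with the fitted encoders and predict its traffic count
query = ['Tuesday', '13:30', 'San Francisco', 'yes']
encoded_query = [int(enc.transform([value])[0])
                 for enc, value in zip(encoders, query)]
predicted = regressor.predict([encoded_query])[0]
print('Predicted traffic:', int(predicted))
```

Label encoding assigns an arbitrary integer order to categories such as team names. Tree-based models tolerate this reasonably well, but one-hot encoding is a common alternative. On the real file, you would also hold out part of the data to measure the prediction error.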