Part 4 – Creating an ARIMA model for predicting flight delays
In Chapter 8, Analytics Study: Prediction - Financial Time Series Analysis and Forecasting, we used time series analysis to build a forecasting model for stock prices. The same technique applies to flight delays, since they also form a time series, so in this section we'll follow the same steps. For each destination airport, and optionally each airline, we'll build a pandas DataFrame that contains the matching flight information.
Note: We'll use the statsmodels library again. Make sure to install it if you haven't done so already, and refer to Chapter 8, Analytics Study: Prediction - Financial Time Series Analysis and Forecasting for more information.
As an example, let's focus on all the Delta (DL) flights with BOS as the destination:
df = flights[(flights["AIRLINE"] == "DL") & (flights["DESTINATION_AIRPORT"] == "BOS")]
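The same filter can be generalized to any destination airport with an optional airline, as described at the start of this section. The following is a minimal sketch rather than the chapter's own code: the helper name select_flights and the small sample DataFrame are hypothetical, and it assumes the dataset exposes AIRLINE and DESTINATION_AIRPORT columns.

```python
import pandas as pd


def select_flights(flights, destination, airline=None):
    """Return the flights arriving at `destination`, optionally for one airline.

    Hypothetical helper; assumes AIRLINE and DESTINATION_AIRPORT columns exist.
    """
    mask = flights["DESTINATION_AIRPORT"] == destination
    if airline is not None:
        # Narrow the selection to a single airline only when one is given
        mask = mask & (flights["AIRLINE"] == airline)
    return flights[mask]


# Tiny made-up sample standing in for the real flights DataFrame
sample = pd.DataFrame({
    "AIRLINE": ["DL", "AA", "DL", "DL"],
    "ORIGIN_AIRPORT": ["ATL", "JFK", "BOS", "LGA"],
    "DESTINATION_AIRPORT": ["BOS", "BOS", "ATL", "BOS"],
    "ARRIVAL_DELAY": [12.0, -3.0, 45.0, 0.0],
})

dl_to_bos = select_flights(sample, "BOS", airline="DL")  # Delta flights into BOS
all_to_bos = select_flights(sample, "BOS")               # every airline into BOS
```

Leaving airline as None keeps every carrier for the destination, which matches the "optional airline" case.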
Using the ARRIVAL_DELAY column as a value for our time series...