Stock prices have a tendency to go up and down. We want to use Spark ML and a Spark time-series library to explore historical stock price data going back a couple of years and come up with numbers such as the average closing price. We also want our stock price prediction model to forecast what the stock price will be over a timeframe of a few days.
This chapter presents an ML methodology to reduce the complexity associated with stock price prediction. We will obtain a smaller set of optimal financial indicators by feature selection and employ a Random Forest algorithm to build a price prediction pipeline.
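As a rough illustration of the methodology described above, the following is a minimal sketch of a pipeline that chains feature selection with a Random Forest regressor. It is not the chapter's actual implementation. The column names (`open`, `sma5`, `volume`, `close`, `nextClose`), the toy rows, and the object name `StockPipelineSketch` are all hypothetical, and the selector used here is Spark ML's `UnivariateFeatureSelector`, which assumes Spark 3.1.1 or later; the chapter may use a different selection technique.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.{UnivariateFeatureSelector, VectorAssembler}
import org.apache.spark.ml.regression.RandomForestRegressor
import org.apache.spark.sql.SparkSession

object StockPipelineSketch {

  // Hypothetical indicator columns; the real dataset's schema may differ.
  val indicatorCols: Array[String] = Array("open", "sma5", "volume", "close")

  // Builds an unfitted pipeline: assemble -> select top features -> random forest.
  def buildPipeline(): Pipeline = {
    // Pack the candidate indicators into a single vector column.
    val assembler = new VectorAssembler()
      .setInputCols(indicatorCols)
      .setOutputCol("allFeatures")

    // Keep the 2 indicators that score best against the continuous label.
    val selector = new UnivariateFeatureSelector()
      .setFeatureType("continuous")
      .setLabelType("continuous")
      .setSelectionMode("numTopFeatures")
      .setSelectionThreshold(2)
      .setFeaturesCol("allFeatures")
      .setLabelCol("nextClose")
      .setOutputCol("features")

    // Regress the next closing price on the selected indicators.
    val forest = new RandomForestRegressor()
      .setFeaturesCol("features")
      .setLabelCol("nextClose")
      .setNumTrees(20)
      .setSeed(42L)

    new Pipeline().setStages(Array(assembler, selector, forest))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("StockPipelineSketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Toy rows for illustration only; these are NOT taken from the chapter's dataset.
    val prices = Seq(
      (10.0, 10.1, 1200.0, 10.2, 10.4),
      (10.2, 10.2, 1500.0, 10.4, 10.1),
      (10.4, 10.2, 1100.0, 10.1, 10.6),
      (10.1, 10.3, 1700.0, 10.6, 10.9),
      (10.6, 10.4, 1600.0, 10.9, 10.7),
      (10.9, 10.5, 1300.0, 10.7, 11.0),
      (10.7, 10.7, 1800.0, 11.0, 11.3),
      (11.0, 10.9, 1400.0, 11.3, 11.1)
    ).toDF("open", "sma5", "volume", "close", "nextClose")

    // Fit the whole pipeline and show predictions next to the actual values.
    val model = buildPipeline().fit(prices)
    model.transform(prices).select("close", "nextClose", "prediction").show()

    spark.stop()
  }
}
```

Keeping the selector and the regressor in one `Pipeline` means the same feature subset is applied at both training and prediction time, which is the main reason for expressing the steps as pipeline stages rather than as separate transformations.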
We must first download the dataset from the ModernScalaProjects_Code folder.