Feature engineering
As discussed earlier, we want to predict the closing price of the DJIA index for a particular trading day. In this section, we select features for our baseline stock-price prediction model based on our intuition. We have already generated the training dataset, so we now load the saved .pkl dataset and perform feature selection along with some minor data processing. We also generate a sentiment score for each of the filtered NYTimes news articles and use this score to train our baseline model. We will use the following Python dependencies:
numpy
pandas
nltk
This section has the following steps:
Loading the dataset
Minor preprocessing
Feature selection
Sentiment analysis
So, let's begin coding!
Loading the dataset
We saved the data in pickle format, so we now need to load it back. You can refer to the following code snippet:
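As a minimal sketch of this step, assuming the dataset was saved as a pandas DataFrame with to_pickle, the loading could look like the following. The file name dataset.pkl, the column names, and the sample rows are placeholders made up for illustration, not the actual training data; the sketch writes a tiny stand-in file first so that it runs on its own.

```python
import os
import tempfile

import pandas as pd


def load_dataset(pickle_path):
    """Load a pickled pandas DataFrame from disk."""
    return pd.read_pickle(pickle_path)


# Stand-in for the training dataset saved earlier. The column names and
# values are assumptions made up for this example.
sample = pd.DataFrame(
    {"close": [17000.0, 17100.5], "articles": ["headline one", "headline two"]},
    index=pd.to_datetime(["2016-01-04", "2016-01-05"]),
)

# Hypothetical file name; substitute the path of your own saved dataset.
pickle_path = os.path.join(tempfile.mkdtemp(), "dataset.pkl")
sample.to_pickle(pickle_path)

# Load the dataset back and take a quick look at it.
df = load_dataset(pickle_path)
print(df.shape)
print(df.head())
```

With your own data, only the call to load_dataset is needed, pointed at the file you saved earlier. Checking the shape and the first few rows right after loading is a cheap way to confirm that the file holds what you expect before moving on to preprocessing.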
You can refer to the code by clicking...