Collecting the dataset
To build the model, we first need to collect the data. We will use the following two data sources:
Dow Jones Industrial Average (DJIA) index prices
News articles
DJIA index prices give us an overall picture of the stock market's movement on a particular day, whereas news articles help us find out how news affects stock prices. We will build our model using these two data sources. Now let's collect the data.
Collecting DJIA index prices
To collect the DJIA index prices, we will use Yahoo Finance. Visit this link: https://finance.yahoo.com/quote/%5EDJI/history?period1=1196706600&period2=1512325800&interval=1d&filter=history&frequency=1d. The page that opens shows the historical price data. You can change the time period and then click on the Download Data link to save all of the data in .csv file format. Refer to the following screenshot of the Yahoo Finance DJIA index price page:
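As a side note on the link itself: the period1 and period2 values in the URL look like Unix timestamps (seconds since 1 January 1970, UTC) marking the start and end of the requested date range. The following minimal sketch, which assumes that interpretation, decodes them using only the Python standard library. The function name decode_period is ours for illustration and is not part of any Yahoo Finance API.

```python
from datetime import datetime, timezone
from urllib.parse import urlparse, parse_qs

# The link given in the text, split across lines for readability.
URL = (
    "https://finance.yahoo.com/quote/%5EDJI/history"
    "?period1=1196706600&period2=1512325800"
    "&interval=1d&filter=history&frequency=1d"
)


def decode_period(url):
    """Return the (start, end) of the requested range as UTC datetimes.

    Assumption: period1 and period2 are Unix timestamps in seconds.
    """
    query = parse_qs(urlparse(url).query)
    start = datetime.fromtimestamp(int(query["period1"][0]), tz=timezone.utc)
    end = datetime.fromtimestamp(int(query["period2"][0]), tz=timezone.utc)
    return start, end


start, end = decode_period(URL)
print(start.isoformat())  # 2007-12-03T18:30:00+00:00
print(end.isoformat())    # 2017-12-03T18:30:00+00:00
```

Under this reading, the link requests roughly ten years of daily prices, from early December 2007 to early December 2017. To request a different range, you can change the time period on the page as described above rather than editing the URL by hand.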