Acquiring stock data
Our script to acquire the data will be based on the pandas-datareader Python package. It provides a simple abstraction over remote financial APIs that we can leverage later in the pipeline: given a data source such as Yahoo Finance, you provide the stock ticker/pair and a date range, and the data is returned in a DataFrame.
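To make the abstraction concrete, here is a minimal sketch of such a call, not the contents of load_raw_data.py. The helper names last_n_days and fetch_prices are hypothetical, the "yahoo" source string is an assumption, and whether a given remote source responds depends on the installed pandas-datareader version and on the provider.

```python
from datetime import date, timedelta


def last_n_days(days_back, end=None):
    """Return a (start, end) pair of dates covering the last `days_back` days.

    Hypothetical helper, introduced only for this illustration.
    """
    end = end or date.today()
    return end - timedelta(days=days_back), end


def fetch_prices(ticker, start, end, source="yahoo"):
    """Fetch historical prices for `ticker` as a pandas DataFrame.

    Hypothetical wrapper around pandas_datareader's DataReader.
    The package is imported lazily so that the date helper above
    can be used even when pandas-datareader is not installed.
    """
    import pandas_datareader.data as web  # pip install pandas-datareader

    # DataReader takes the symbol, the data source name, and the date range.
    return web.DataReader(ticker, data_source=source, start=start, end=end)
```

With network access and a working data source, fetch_prices("BTC-USD", *last_n_days(90)) would request roughly three months of daily prices. The ticker BTC-USD is only an example value.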
We will now create the load_raw_data.py file, which will be responsible for loading the data and saving it in the raw folder. You can look at the contents of the file in the repository at https://github.com/PacktPublishing/Machine-Learning-Engineering-with-MLflow/blob/master/Chapter07/psystock-data-features-main/load_raw_data.py. Execute the following steps to implement the file:
- We will start by importing the relevant packages:
    import mlflow
    from datetime import date
    from dateutil.relativedelta import relativedelta
    import pprint
    import pandas
    import pandas_datareader.data as web
- Next, you should...