Exploring and visualizing stock market data for Apple
Before any modeling or prediction is performed, it is important to first explore and visualize the data to uncover patterns that are not obvious from the raw values.
Getting ready
We will perform transformations and visualizations on the dataframe in this section. This will require importing the following libraries in Python:
pyspark.sql.functions
matplotlib
How to do it...
The following section walks through the steps to explore and visualize the stock market data.
- Transform the Date column in the dataframe by removing the timestamp using the following script:
import pyspark.sql.functions as f
df = df.withColumn('date', f.to_date('Date'))
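To make the effect of this step concrete, the following is a minimal plain-Python sketch of the same idea applied to a single value, without Spark. The sample value and the timestamp format are assumptions for illustration only; the actual contents of the Date column may differ.

```python
from datetime import datetime

# Hypothetical raw value, shaped like a timestamped 'Date' entry
raw_value = "2017-01-03 00:00:00"

# Parse the full timestamp (assumed format), then keep only the calendar date
parsed = datetime.strptime(raw_value, "%Y-%m-%d %H:%M:%S")
date_only = parsed.date()

print(date_only.isoformat())
```

The time portion is dropped and only the year-month-day part remains, which is roughly what f.to_date does for every row of the column.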
- Create a for loop to add three additional columns to the dataframe. The loop breaks apart the date field into year, month, and day, as seen in the following script:
date_breakdown = ['year', 'month', 'day']
for i in enumerate(date_breakdown):
    index = i[0]
    name = i[1]
    df = df.withColumn(name, f.split('date', '-')[index...
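The pairing of positions and column names in this loop can be sketched in plain Python on a single value. This is an illustration only, not the Spark script itself: the sample date and the row dictionary are hypothetical stand-ins for one dataframe row.

```python
date_breakdown = ['year', 'month', 'day']

# Hypothetical date value in year-month-day form
sample_date = "2017-01-03"

# Splitting on '-' yields the pieces in the same order as date_breakdown
parts = sample_date.split('-')

# A dict stands in for one dataframe row (assumption for illustration)
row = {'date': sample_date}

# enumerate yields (index, name) pairs; unpacking them directly is
# equivalent to reading i[0] and i[1] from each pair
for index, name in enumerate(date_breakdown):
    row[name] = parts[index]

print(row)
```

The pieces produced by a split are still strings, so they may need to be cast to integers before any numeric comparison or sorting.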