How Large Language Models Reshape Trading Stats

  • 15 min read
  • 21 Sep 2023

Introduction

Stock analysis is not just about numbers; it's a sophisticated dance of interpretation and prediction. Advanced techniques, such as the ones discussed here, offer deeper insights into the world of stocks. The journey begins with Volatility Analysis, utilizing Rolling Standard Deviation to grasp the extent of stock price movements, offering a window into the stock's inherent risk. Predictive Modeling then takes the stage, harnessing past data to provide a lens into potential future stock prices. Yet, for any analysis to stand on solid ground, the data itself must be pristine. This is where Data Cleaning comes into play, meticulously weeding out inconsistencies and errors and laying a trustworthy foundation. To make sense of this vast sea of data, Visualizations transform intricate datasets into digestible charts and graphs, bridging the gap between numbers and narratives. With these advanced steps in view, it's time to dive in. This guide is designed for those keen to explore a deeper layer of stock analysis, whether beginners eager to learn or veterans aiming to hone their skills.

Step 1. Volatility Analysis: Gauge the extent of stock price fluctuations over time using the Rolling Standard Deviation.

Step 2. Data Cleaning: Ensure the data is free of inconsistencies and errors before diving into analysis.

Step 3. Visualizations: Create intuitive charts and graphs to represent data, making it easily comprehensible at a glance.

Step 4. Heatmap: Utilize heatmaps to visualize monthly average closing prices, helping traders spot recurring patterns, seasonality, and stock price shifts for informed investment decisions.

Step 5. Predictive Modeling: Leverage historical data to attempt predictions of future stock prices.

You can access the data here (Link)
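
Before Step 1, the walkthrough assumes the dataset is already loaded into a pandas DataFrame named `data` and lightly prepared. Here is a minimal setup sketch; the file name `stock_data.csv` is an assumption, while the column names (Date, Close/Last, Volume, Open, High, Low) and the preparation steps mirror those used later in the article:

# Minimal setup sketch (file name is hypothetical; adjust to your download)
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv('stock_data.csv')

# Prices arrive as strings like '$123.45'; strip the '$' and convert to float
for col in ['Close/Last', 'Open', 'High', 'Low']:
    data[col] = data[col].str.replace('$', '', regex=False).astype(float)

# Use the Date column as a chronological DatetimeIndex
data['Date'] = pd.to_datetime(data['Date'])
data = data.sort_values('Date').set_index('Date')

# Derived columns referenced throughout the analysis
data['Daily Returns'] = data['Close/Last'].pct_change()
data['50-day MA'] = data['Close/Last'].rolling(window=50).mean()
data['200-day MA'] = data['Close/Last'].rolling(window=200).mean()

With this in place, the daily-return and moving-average columns that Step 2's missing-value discussion refers to exist from the start.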

Step 1: Volatility Analysis - Rolling Standard Deviation

Volatility Analysis employs the Rolling Standard Deviation to measure the magnitude of stock price fluctuations over time.
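
Concretely, for a window of \(N = 21\) trading days, the rolling volatility on day \(t\) is the sample standard deviation of the most recent 21 daily returns:

\[
\sigma_t = \sqrt{\frac{1}{N-1}\sum_{i=t-N+1}^{t}\left(r_i - \bar{r}_t\right)^2},
\qquad
\bar{r}_t = \frac{1}{N}\sum_{i=t-N+1}^{t} r_i,
\]

where \(r_i\) is the daily return on day \(i\). This is exactly what pandas' `rolling(window=21).std()` computes in the code below, since pandas uses the sample (\(N-1\)) denominator by default.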

Prompt 1: As a diligent stock analyst, recognizing and quantifying a stock's volatility is paramount in understanding its risk and potential reward. To delve into this aspect, let's employ the rolling standard deviation methodology on the daily returns. Using a 21-day window - a conventional choice for monthly volatility assessments - will yield a dynamic depiction of the stock's volatility. The subsequent plot showcasing the 21-day rolling volatility essentially captures the standard deviation of the stock's daily returns over this time frame. Such a visualization can aid investors in discerning periods of heightened uncertainty or stability, assisting in refined decision-making.

Response: Absolutely, the rolling standard deviation of daily returns is a commonly used metric to assess the volatility of a stock. A higher standard deviation signifies greater volatility, while a lower value indicates stability.

To assess the stock's volatility:

1. Compute the rolling standard deviation of the daily returns using a 21-day window.

2. Plot the resulting 21-day rolling volatility.

Let's proceed with these steps.

The graph showcases the 21-day rolling volatility of the stock, which is essentially the standard deviation of the stock's daily returns over a 21-day window.

# Compute the 21-day rolling standard deviation of the daily returns
data['21-day Volatility'] = data['Daily Returns'].rolling(window=21).std()
 
# Plotting the 21-day rolling volatility
plt.figure(figsize=(15, 7))
plt.plot(data.index, data['21-day Volatility'], label='21-day Volatility', color='purple')
 
plt.title("21-day Rolling Volatility")
plt.xlabel("Date")
plt.ylabel("Volatility (Standard Deviation of Daily Returns)")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

[Figure: 21-day Rolling Volatility]

Interpretation: The chart shows the stock's ups and downs over time using a purple line. When this line peaks, it means the stock had a lot of movement, possibly because of big company news or changes in the wider market. On the other hand, when the line is low, the stock was more stable. While some investors might see these peaks as chances to make quick gains, others might prefer the steadier, low points. Simply put, this chart helps investors decide when to buy or sell based on how much the stock is expected to move.

After analyzing the volatility, let’s clean the data for further analysis.

Step 2: Data Cleaning and Transformation

Data Cleaning and Transformation involves addressing missing values, normalizing or standardizing data for modeling purposes, and ensuring date uniformity for a robust time series analysis by setting it as the index.

Prompt 2: As a meticulous stock analyst, it's crucial to ensure the integrity and completeness of our dataset before deep-diving into further analysis. Start by scanning the dataset for any missing values, and addressing them appropriately, either through imputation or deletion based on the dataset's nature and the missing data's extent. Moreover, consistency is key; hence, confirm that the "Date" column adheres to a uniform format. This foundational data preparation step ensures the reliability of our subsequent analyses and insights.

Response: Absolutely, ensuring the data's integrity and completeness is a foundational step in any analysis. To maintain the reliability of our subsequent analyses, we'll:

1. Scan the dataset for missing values.

2. Address these missing values, deciding on either imputation or deletion based on the nature and extent of the missing data.

3. Confirm that the "Date" column is in a consistent format.

Let's start by checking for missing values in our dataset.

# Check for missing values in the dataset
missing_values = data.isnull().sum()
missing_values

The dataset has missing values in the following columns:

1. Daily Returns: 1 missing value - This is expected since the first entry won't have a previous day to compute the return from.

2. 50-day MA: 49 missing values - The 50-day moving average will have missing values for the initial 49 days since there aren't enough data points to compute the average.

3. 200-day MA: 199 missing values - Similarly, the 200-day moving average will lack values for the initial 199 days.

4. 21-day Volatility: 21 missing values - The 21-day rolling volatility will be absent for the initial 21 days.

Given the nature of these missing values (they're a result of our calculations and are expected due to the rolling window computations), we don't need to impute them. However, for some analyses, we might want to focus on dates after these initial missing values to ensure continuity.
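
If such a continuous slice is needed, one straightforward way to obtain it (a short sketch, relying on the fact that the 200-day MA has the longest warm-up period of the columns listed above) is to drop everything before its first valid value:

# Keep only rows from the first date on which every rolling column is populated.
# The 200-day MA has the longest warm-up, so its first valid index marks that date.
first_complete_date = data['200-day MA'].first_valid_index()
data_complete = data.loc[first_complete_date:]
print(data_complete.isnull().sum())  # should report zero missing values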

Next, let's ensure the "Date" column (which is now our index) is consistent in format.

# Check the data type and consistency of the "Date" column (which is now the index)
date_format_consistency = data.index
date_format_consistency

 The "Date" column, which is now our index, is of type `datetime64[ns]`. This confirms that the dates are in a consistent datetime format, ensuring that our subsequent analyses are accurate and reliable.

Step 3: Visualizations - Distribution Plots

Visualizations encompass the use of candlestick plots to intricately depict stock price movements, distribution plots to showcase price and volume variances, and heatmaps to represent monthly or yearly average stock prices.
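
Candlestick charts are mentioned here but not built in the article's own code. As a brief sketch using the third-party mplfinance package (an assumption: it is not part of the original walkthrough and must be installed separately), the same data could be viewed as candlesticks like this:

# Candlestick sketch with mplfinance (pip install mplfinance); not part of the
# original code. mplfinance expects Open/High/Low/Close/Volume column names.
import mplfinance as mpf

ohlc = data.rename(columns={'Close/Last': 'Close'})[['Open', 'High', 'Low', 'Close', 'Volume']]
mpf.plot(ohlc.tail(90), type='candle', volume=True, title='Last 90 Sessions')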

Prompt 3: In your role as a stock analyst, visualizing the distribution of data can offer profound insights into its behavior and characteristics. Now, direct your attention to the distribution plots of both the stock prices (specifically the Close/Last prices) and the trading volume. These plots will elucidate the spread, skewness, peaks, and central tendencies of the data.

1. Distribution of Close/Last Prices: By examining this plot, you can discern the most frequent closing prices, identify any outliers, and understand the distribution's symmetry or skewness. Does it lean towards higher or lower prices? Are there any significant peaks?

2. Distribution of Trading Volume: This visualization will shed light on the typical trading volume levels. Are there days of extraordinarily high or low trading volumes? Does the distribution hint at a particular trend in trading activity?

Through these plots, you're better positioned to understand the stock's price movements and the market's interest in it, guiding future analysis or investment recommendations.

Response: To visualize the distribution of data, we'll employ distribution plots (or histograms) for both the stock's closing prices and the trading volume.

1. Distribution of Close/Last Prices: This will help us understand the frequency distribution of the closing prices, offering insights into the most common price levels, the data's symmetry or skewness, and potential outliers.

2. Distribution of Trading Volume: This will shed light on the frequency distribution of trading volumes, allowing us to discern common volume levels and any anomalies in trading activity.

Let's begin by visualizing the distribution of the stock's Close/Last prices.

# Importing necessary libraries for visualization
import seaborn as sns
 
# Setting up the figure and axes
fig, ax = plt.subplots(nrows=2, ncols=1, figsize=(15, 10))
 
# Distribution of Close/Last Prices
sns.histplot(data['Close/Last'], bins=50, kde=True, ax=ax[0], color='blue')
ax[0].set_title("Distribution of Close/Last Prices")
ax[0].set_xlabel("Close/Last Price ($)")
ax[0].set_ylabel("Frequency")
 
# Distribution of Trading Volume
sns.histplot(data['Volume'], bins=50, kde=True, ax=ax[1], color='gray')
ax[1].set_title("Distribution of Trading Volume")
ax[1].set_xlabel("Trading Volume")
ax[1].set_ylabel("Frequency")
 
plt.tight_layout()
plt.show()

[Figure: Distributions of Close/Last Prices and Trading Volume]

Interpretation: The plots show how often the stock closes at different prices and how its trading volume varies. For closing prices, most often they land in a certain range, but sometimes they go a bit higher. There don't seem to be any unusual values, but a different chart could confirm this. The trading volume, or how much the stock is traded, usually stays within a specific range. However, there are rare days when it's traded a lot, possibly due to big news about the company. In short, these charts give a snapshot of the stock's usual behavior and any deviations from it, helping guide investment choices.
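
As the interpretation notes, a different chart can confirm whether any unusual values are present. Boxplots are the usual choice, since they flag points beyond 1.5 times the interquartile range; a short sketch:

# Boxplots flag points beyond 1.5x the interquartile range as potential outliers,
# confirming (or refuting) the visual impression from the histograms above.
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(12, 4))
sns.boxplot(x=data['Close/Last'], ax=ax[0], color='blue')
ax[0].set_title("Close/Last Prices")
sns.boxplot(x=data['Volume'], ax=ax[1], color='gray')
ax[1].set_title("Trading Volume")
plt.tight_layout()
plt.show()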

Step 4: Visualizations – Heatmap

The Heatmap visualization showcases monthly average closing prices, empowering traders to identify recurring patterns, discern seasonality, and detect stock price shifts, thereby guiding informed investment choices.

Prompt 4: As an astute stock analyst, appreciating the power of visualizations to detect patterns is vital. To enhance our understanding of the stock's behavior over time, let's craft a heatmap that displays the monthly average closing prices. Heatmaps, with their gradient color schemes, provide an intuitive way to quickly spot variations and trends. By focusing on monthly averages, this visualization will surface any recurring monthly patterns, seasonality, or discernible shifts in the stock's closing prices. Such insights can be instrumental for traders and investors seeking to capitalize on consistent temporal trends in the stock market.

Response: The heatmap showcases the monthly average closing prices for the stock:

# Note: the preparation below (price parsing, date indexing, daily returns)
# duplicates the one-time setup shown near the start of this walkthrough; if
# the data is already prepared, skip straight to the resampling step.
# Define the price columns
price_columns = ['Close/Last', 'Open', 'High', 'Low']
 
# Convert the price columns to float values by removing the '$' sign and converting to float
for col in price_columns:
    data[col] = data[col].str.replace('$', '', regex=False).astype(float)
 
# Set the 'Date' column as the index
data.set_index('Date', inplace=True)
 
# Compute the daily returns
data['Daily Returns'] = data['Close/Last'].pct_change()
 
# Resample the data for monthly average and transform for heatmap
monthly_avg = data['Close/Last'].resample('M').mean()
heatmap_data = monthly_avg.groupby([monthly_avg.index.year, monthly_avg.index.month]).mean().unstack()
 
# Plotting the heatmap
plt.figure(figsize=(12, 6))
sns.heatmap(heatmap_data, cmap="YlGnBu", annot=True, fmt=".2f", cbar_kws={'label': 'Average Closing Price ($)'})
plt.title("Monthly Average Closing Prices")
plt.xlabel("Month")
plt.ylabel("Year")
plt.show()

[Figure: Monthly Average Closing Prices heatmap]

Interpretation:

  • Color Gradient: The color gradient, ranging from light to dark shades of blue, represents the average closing price. Darker shades indicate higher average closing prices for that particular month, while lighter shades suggest lower averages.
  • Annotations: Each cell in the heatmap is annotated with the specific average closing price for that month and year.

By examining this heatmap:

  • Patterns & Seasonality: You can spot any recurring monthly patterns, seasonality, or discernible shifts in the stock's closing prices over the years.
  • Variations: The color variations can quickly help identify months with particularly high or low average closing prices.

This visualization provides a comprehensive and intuitive view of the stock's price movements on a monthly basis, aiding traders and investors in spotting consistent temporal trends and guiding future analyses or investment strategies. Now, let's turn to forecasting the future trend.

Step 5: Predictive Modeling (Time Series Forecasting)

In the realm of stock analysis, forecasting is crucial to anticipate market movements. Starting with chronological time series data, one should ensure its stationarity, utilize ACF and PACF for ARIMA parameter selection, then train, forecast, visualize the results, and continuously validate and refine, ensuring well-informed investment decisions.
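
The fitting code below jumps straight to initial parameters; the stationarity check and ACF/PACF inspection it presumes can be sketched as follows (using statsmodels, which the fitting code also relies on):

# Stationarity check and correlation plots that inform the choice of (p, d, q).
from statsmodels.tsa.stattools import adfuller
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Augmented Dickey-Fuller test: a p-value below ~0.05 suggests stationarity.
adf_stat, p_value = adfuller(data['Close/Last'].dropna())[:2]
print(f"ADF statistic: {adf_stat:.3f}, p-value: {p_value:.3f}")

# Price levels are usually non-stationary, so difference once (d = 1) and re-test.
diffed = data['Close/Last'].diff().dropna()
print(f"ADF p-value after differencing: {adfuller(diffed)[1]:.3f}")

# The ACF suggests the MA order (q); the PACF suggests the AR order (p).
fig, ax = plt.subplots(nrows=2, ncols=1, figsize=(12, 6))
plot_acf(diffed, lags=40, ax=ax[0])
plot_pacf(diffed, lags=40, ax=ax[1])
plt.tight_layout()
plt.show()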

Prompt 5: As a dedicated stock analyst, I understand the nuances and intricacies of time series forecasting. When working with stock price data, the ARIMA (Autoregressive Integrated Moving Average) model can be an invaluable tool. Begin with your time series data, ensuring chronological ordering and handling any missing values. It's crucial to ensure the data's stationarity, as ARIMA thrives on stationary datasets. This can be validated visually or with tests like the Augmented Dickey-Fuller. Diving deeper, ARIMA's efficacy lies in its parameters: \(p\) for the lagged values, \(d\) for differencing, and \(q\) for the moving average term. Helpful tools like ACF and PACF plots can guide the selection of these parameters. Once defined, train your ARIMA model on the dataset. With this trained model, predict future stock prices for your desired time frame. Visual representation can be achieved by plotting the original time series alongside the ARIMA predictions. For a more holistic view, overlay the forecast's confidence intervals. Remember, in the world of stock forecasting, model validation is paramount. Compare your model's predictions against a holdout set if available, employing metrics such as MAE or RMSE. Continual refinement might be necessary, so be prepared to iterate and adjust as needed for the most accurate forecasting.

Response: We can fit the ARIMA model to our data. However, it's worth noting that multiple combinations of \(p\), \(d\), and \(q\) might be suitable. Model selection criteria, like AIC or BIC, can help refine our choices.

Let's proceed by fitting the ARIMA model using the initial parameters and then forecast future stock prices.

from statsmodels.tsa.arima.model import ARIMA
from datetime import timedelta
 
# Initial parameters based on ACF and PACF plots
p, d, q = 1, 1, 1
 
# Fit the ARIMA model
model = ARIMA(data['Close/Last'], order=(p, d, q))
model_fit = model.fit()
 
# Forecast the next 30 days
forecast_steps = 30
forecast = model_fit.get_forecast(steps=forecast_steps)
# Build future dates by simple calendar offset; note this includes weekends,
# which a trading-day calendar would skip
forecast_index = [data.index[-1] + timedelta(days=i) for i in range(1, forecast_steps + 1)]
 
# Extract the forecast mean and confidence intervals
forecast_mean = forecast.predicted_mean
confidence_intervals = forecast.conf_int()
 
# Plot the original series and the forecast
plt.figure(figsize=(15, 7))
data['Close/Last'].plot(label='Observed', color='blue')
plt.plot(forecast_index, forecast_mean, color='red', label='Forecast')
plt.fill_between(forecast_index, confidence_intervals.iloc[:, 0], confidence_intervals.iloc[:, 1], color='pink', alpha=0.3)
plt.title('Stock Closing Price Forecast using ARIMA')
plt.xlabel('Date')
plt.ylabel('Close Price ($)')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

[Figure: Stock Closing Price Forecast using ARIMA]

Interpretation: The chart shows past stock prices (in blue) and what they might be in the next 30 days (in red), based on a prediction model called ARIMA. Around the red line is a pink area that suggests where the stock prices might realistically land. However, it's vital to remember that this is just an estimate. Predicting stock prices is tricky because so many different things can affect them. As time goes on, the pink area gets broader, meaning the predictions are less certain. While this model offers a glimpse into potential future prices, always be cautious when basing decisions on predictions, as the stock market is full of surprises.
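
Prompt 5 also asks for validation against a holdout set using metrics such as MAE or RMSE, which the walkthrough stops short of. A minimal validation sketch (the 30-day holdout length mirrors the forecast horizon; everything else reuses names defined above):

# Holdout validation sketch: train on all but the last 30 observations,
# forecast that span, and compare against the actual closes with MAE and RMSE.
import numpy as np

holdout = 30
train, test = data['Close/Last'][:-holdout], data['Close/Last'][-holdout:]

holdout_model = ARIMA(train, order=(p, d, q)).fit()
predictions = holdout_model.get_forecast(steps=holdout).predicted_mean

mae = np.mean(np.abs(predictions.values - test.values))
rmse = np.sqrt(np.mean((predictions.values - test.values) ** 2))
print(f"MAE: {mae:.2f}, RMSE: {rmse:.2f}")

Large errors relative to the price level would suggest revisiting the (p, d, q) choice, for instance by comparing AIC across candidate orders as the response above mentions.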

Conclusion

Stock analysis, often seen as a realm of pure numbers, is actually a delicate blend of art and science, interpretation paired with prediction. Over the course of this journey, advanced techniques like Volatility Analysis have provided clarity on the unpredictable nature of stocks, while Data Cleaning ensures that our foundation is rock-solid. Visual tools, especially intuitive heatmaps, act as a compass, highlighting subtle patterns and variations in monthly stock prices. At the heart of it all, Predictive Modeling stands as a beacon, illuminating potential future paths using the wisdom of past data. Whether one is just stepping into this vast ocean or is a seasoned navigator, the tools and techniques discussed here not only simplify the journey but also enhance the depth of understanding. In stock analysis, as in many fields, knowledge is power, and with these methods in hand, both newcomers and experts are well-equipped to make informed, strategic decisions in the dynamic world of stocks.

Author Bio

Dr. Anshul Saxena is an author, corporate consultant, inventor, and educator who assists clients in finding financial solutions using quantum computing and generative AI. He has filed over three Indian patents and has been granted an Australian Innovation Patent. Anshul is the author of two best-selling books in the realm of HR Analytics and Quantum Computing (Packt Publications). He has been instrumental in setting up new-age specializations like decision sciences and business analytics in multiple business schools across India. Currently, he is working as Assistant Professor and Coordinator – Center for Emerging Business Technologies at CHRIST (Deemed to be University), Pune Lavasa Campus. Dr. Anshul has also worked with reputed companies like IBM as a curriculum designer and trainer and has been instrumental in training 1000+ academicians and working professionals from universities and corporate houses like UPES, CRMIT, and NITTE Mangalore, Vishwakarma University, Pune & Kaziranga University, and KPMG, IBM, Altran, TCS, Metro CASH & Carry, HPCL & IOC. With a work experience of 5 years in the domain of financial risk analytics with TCS and Northern Trust, Dr. Anshul has guided master's students in creating projects on emerging business technologies, which have resulted in 8+ Scopus-indexed papers. Dr. Anshul holds a PhD in Applied AI (Management), an MBA in Finance, and a BSc in Chemistry. He possesses multiple certificates in the field of Generative AI and Quantum Computing from organizations like SAS, IBM, IISC, Harvard, and BIMTECH.

Author of the book: Financial Modeling Using Quantum Computing