Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon

Generative AI and Data Storytelling

Save for later
  • 11 min read
  • 01 Nov 2023

article-image

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

Introduction

As data volumes grow exponentially, converting vast amounts of data into digestible, human-friendly "information assets" has become a blend of art and science. This process requires understanding the data's context, the intended objective, the target audience. The final step of curating and converting information assets into a cohesive narrative that results in human understanding and persuasion is known as “Data Storytelling.”

The Essence of Data Storytelling

Crafting a compelling data story means setting a clear objective, curating information, and narrating relevant information assets. Good stories communicate effectively, leaving a lasting impression, and in some cases, sway opinions or reinforce previously accepted knowledge. A common tactic to tell an influential data story typically contains quantitative facts and statistics.

Modern enterprises utilize a myriad of techniques to convert accumulated data into valuable information assets.  The vast amounts of analytical and qualitative information assets get distributed in the form of reports and dashboards. These assets are informational to support process automation, decision support, and facilitate future predictions.  Many curated stories manifest as summarized text and charts, often taking the form of a document, spreadsheet, or presentation.

The Potential of Generative AI

While analytics and machine learning are adept at tackling large-scale computational challenges, the growing demand for curated explanations has elevated data storytelling to an important discipline. However, the number of skilled data storytellers remains limited due to the unique blend of technical and soft skills required. This has led to a high demand for quality data stories and a scarcity of adept professionals. Generative AI provides a promising solution by automating many of the data storytelling steps and simulating human interpretation and comprehension.

Crafting Dynamic Narratives with AI

Constructing a coherent narrative from data is intricate. However, the emergence of powerful generative AI models, such as BARD and ChatGPT, offers hope. These platforms, initially popularized by potent large language models, have recently incorporated computer vision to interpret and generate visual data. Given the right context and curated data, these AI models seem ready to address several challenges of data storytelling.

Presently, publicly available generative AI models can produce code-based data analysis, interpret and present findings, experiment with story structures, extract insights, and customize messages based on user personas. Generative AI isn’t adapted to prepare complete visual stories with quantitative data. Merging and chaining multiple AI agents will open the door to translating and delivering visual stories.

The essence of data storytelling lies in narration, and occasionally, in persuasion or creative representation of information. Unmonitored nuances in AI-generated data narratives can be misleading or flat-out incorrect. There is no scenario where data storytelling should go un-aided by a human reviewer.

How Generative AI Can Aid Traditional Descriptive Analytics Stories

Various data analysis techniques, including scatter plots, correlation mattresses, and histograms, help unveil the relationships and patterns within the data. Identifying trends and visualizing the data distribution process, the exploratory data analysis technique guides in selecting the appropriate analysis method. Large language models are not designed to perform computational and quantitative analysis. However, they are quite proficient at generating coding capable of and translating questions to code into quantitative analysis only when context and data are well defined.

generative-ai-and-data-storytelling-img-0

Understanding Data Context

Before choosing any technique to generate narratives, it's essential to understand the context of a dataset. Delving deep into domain-specific nuances ensures narratives that are both focused and relevant.

● Preprocessing and Data Cleaning

Clean and well-processed data form the bedrock of impactful analyses. Both human-built data stories and AI-augmented narratives falter with poor data. Advancements in data management, governance, and descriptive analytics are pivotal in paving the way for superior AI-enhanced stories. Proper preprocessing and quality data are prerequisites for crafting reliable narratives.

● Exploratory Data Analysis Techniques

Data exploration without context and understanding of the data is challenging. However, with a very good dictionary, business glossary, and good data large language models can easily write code capable of streamlining basic data exploration with a human in the loop steering the process.

● Machine Learning with Advanced Analytics

To create dynamic and impactful narratives, it's vital to employ a range of advanced analytical methods, such as regression analysis and clustering. These techniques help unearth detailed insights. Similarly, Generative AI can efficiently conduct basic statistical analyses. It can also deploy machine learning algorithms like decision trees and random forests to discern intricate patterns. Weaving these insights into narrative techniques imparts a layer of sophistication and depth to the storytelling.

● Interactive Data Visualization

Interactive visualization tools enable users to dynamically explore data. Traditionally, this area has been predominantly the domain of human analysis. While machines can interpret data patterns without visual aids, rapid advancements in computer vision and generative AI are bridging the gap. Today's AI can interpret visual representations of data, deriving meaning from visualizations. It's becoming increasingly common for dashboards to be reviewed and summarized by AI, a trend likely to continue.

Example

Here is a simple example in action where we have a data insight that we want to translate to a simple infographic:

Data Insight: Our customer demographics include 45% millennials, 30% Gen X, and 25% baby boomers.

Prompt: 

Create a visual representation to tell the story via wolfram

Result: 

The result with Wolram enabled is a conversion of my prompt and data insights into code that can be translated to a visual infographic

 REQUEST TO WOLFRAM

{ "input": "BarChart[{45, 30, 25}, ChartLabels -> {\"Millennials\", \"Gen X\", \"Baby Boomers\"}, ChartStyle -> {RGBColor[0, 0.5, 1], RGBColor[0, 0.8, 0], RGBColor[1, 0.5, 0]}, PlotLabel -> \"Customer Demographics\", AxesLabel -> {\"Demographic\", \"Percentage\"}]" }

RESPONSE FROM WOLFRAM

{ "output": "https://www.wolframcloud.com/obj/86f3ba22-6a46-412d-88e1-b90064001407" }
generative-ai-and-data-storytelling-img-1

How Generative AI Can Aid Narrative Creation

Natural language processing

 One of the techniques to generate dynamic narratives is the implementation of natural language processing. The technique helps in extracting insights from contextual data. Sometimes, sentiment analysis and topic modeling help in processing textual information while changing the outlook of unstructured data into valuable narratives. At the same time, integrating the same analysis can increase understanding of the narrative.

Data storytelling

As a user, if you want to draft a compelling data story, you need to structure the narrative coherently. Starting with a captivating introduction accompanied by data-driven insights and their implications enhances the method. Such compelling storytelling ensures that the focal message is conveyed with clarity.

Data Insight: Our website traffic increased by 25% after the recent marketing campaign. Sales of Product A saw a significant boost, contributing to 40% of total revenue.

Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at €18.99/month. Cancel anytime

Prompt: 

Create a visual representation to tell the story via wolfram and a data-driven story based on the following insights

Result:

REQUEST TO WOLFRAM:

{ "input": "GraphicsRow[{BarChart[{100, 125}, ChartLabels -> {\"Before Campaign\", \"After Campaign\"}, ChartStyle -> {Blue, Green}, AxesLabel -> {\"\", \"Traffic\"}, PlotLabel -> \"Website Traffic\"], PieChart[{40, 60}, ChartLabels -> {\"Product A\", \"Other Products\"}, ChartStyle -> {Red, Yellow}, PlotLabel -> \"Revenue Contribution\"]}]" }

RESPONSE FROM WOLFRAM:

{ "output": "https://www.wolframcloud.com/obj/7221f424-771c-4a02-997b-f9bf8fade843" }
generative-ai-and-data-storytelling-img-2

Narrative: "After launching our recent marketing campaign, we witnessed a remarkable surge in our website traffic. The numbers speak for themselves: a significant 25% increase! This not only brought more visitors to our platform but also translated into tangible business outcomes. A standout performer was Product A, which became a major revenue driver. It alone accounted for a whopping 40% of our total revenue, highlighting its immense popularity among our customers. The campaign not only amplified our reach but also reinforced the market's love for Product A. It's evident that our marketing efforts and Product A together created a winning combination for our business."

How can AI-aided storytelling drive operations efficiently?

AI-aided storytelling methods don’t always require creating information assets from raw data. Generative AI can assist in translating previously created information assets or summaries into the new explorative analyses.

Data Insight: In the last quarter, our sales increased by 20%. Customer engagement on social media platforms grew by 35%.

Prompt: 

How can we summarize the information to tell a good data story? What are some additional insights we should investigate?

Result from Large Language Model:

Additional Insights to Investigate:

1. Source of Traffic and Sales:
Which social media platforms contributed the most to the increased engagement?
Are there specific platforms that are driving more sales compared to others?

2. Content Analysis:
What type of content (videos, articles, images, etc.) received the most engagement?
Were there specific campaigns or promotions that were particularly successful?

3. Customer Demographics:
Has there been a change in the demographics of the customers who are purchasing?
Are younger or older age groups more engaged on social media?

From the above prompt and resulting information it is clear how generative AI helps in crafting the story and further exploring complex information relationships in ways that previously required expertise. As a result, large lanague models in the exploration phase can help:

  • Improve the exploration ideas and hypothesis.
  • Creating quck infographics
  • Automate the manual process of narrative creation and summarization.

Insights of leveraging AI for more effective data communication

 generative-ai-and-data-storytelling-img-3

There can be various ways of leveraging artificial intelligence to incorporate data communication effectively. These are the top areas where generative AI can help boost the operation.

  • Streamlined Unstructured Data Analysis

Large language models are very good at ingesting and interpreting bodies of text and interpreting and extracting summarized facts. The latest generative AI tools, including OpenAI, now support interpreting large documents including PDFs and spreadsheets as large as 100MB. For example, you can potentially import and interpret an annual statement and summarize it. For large data sets representing many documents or gigabytes of data, there are a number of ingest, indexing, and large language model integration techniques to effectively expand the body of knowledge and context your model has to formulate data stories.

  • Advancing analytics

Democratization of data science is further expanded thanks to large language models’ ability to interpret and write code, notably Python. Using code interpreter, we can easily feed AI tens or hundreds of thousands of data points and predict the most predictive variables that determine a target variable. For example, after importing 50,000 records and feeding the following prompt a modern AI model can automatically perform steps traditionally carried out by a data analyst or data scientist. To successfully use this method, you still need to understand and interpret results and direct any miscalculations. For example, the AI model will not help you understand variables that overfit.

The steps we'll follow are:

1. Data Preprocessing:

  • Handle missing values.
  • Encode categorical variables.

Split the dataset into training and testing sets.

2. Train a model (e.g., random forest) using the data.
3. Extract feature importance to determine the most predictive variables.

Resulting code example:
# Redefining the RandomForestClassifier
rf = RandomForestClassifier(n_estimators=100, random_state=42)
 
# Training a RandomForest model
rf.fit(X_train, y_train)
 
# Extracting feature importances
feature_importances = rf.feature_importances_
sorted_idx = feature_importances.argsort()[-10:][::-1]  # We'll display the top 10 features
 
top_features = X.columns[sorted_idx]
top_importances = feature_importances[sorted_idx]
 
top_features, top_importances

Resulting code example:

Based on the Random Forest model, the top 10 most predictive variables for "Bad" are…

Conclusion

There are numerous opportunities to utilize generative AI to aid data storytelling. While the techniques covered in this article help streamline the process, there is no magic easy button to automate data storytelling end to end. In the world of data storytelling, AI is a perfect augmentation tool to craft a narrative, perform analysis, and inspire new ideas. However, the competitive and disruptive world of AI is already merging computer vision, code interpretation, and generative AI further opening the door for new techniques. As images and text converge with code interpretation, we should see new distributive storytelling tools emerge. The most important ingredient is always the data itself. I am excited to continue to explore and experiment with these advancements to streamline my data storytelling abilities.

Author Bio

Ryan Goodman has dedicated 20 years to the business of data and analytics, working as a practitioner, executive, and entrepreneur. He recently founded DataTools Pro after 4 years at Reliant Funding, where he served as the VP of Analytics and BI. There, he implemented a modern data stack, utilized data sciences, integrated cloud analytics, and established a governance structure. Drawing from his experiences as a customer, Ryan is now collaborating with his team to develop rapid deployment industry solutions. These solutions utilize machine learning, LLMs, and modern data platforms to significantly reduce the time to value for data and analytics teams.