Introduction to Market Segmentation
Market segmentation is the division of customers into segments based on shared characteristics. Customer segmentation is used for the following:
Increasing customer conversion and retention
Developing new products for a particular segment by identifying it and its needs
Improving brand communication with a particular segment
Identifying gaps in marketing strategy and making new marketing strategies to increase sales
Exercise 4: Exploring the Wholesale Customer Dataset
In this exercise, we will have a look at the data in the wholesale customer dataset.
Note
For all the exercises and activities where we import external CSV or image files, go to RStudio -> Session -> Set Working Directory -> To Source File Location. You can see in the console that the path is set automatically.
To download the CSV file, go to https://github.com/TrainingByPackt/Applied-Unsupervised-Learning-with-R/tree/master/Lesson01/Exercise04/wholesale_customers_data.csv. Click on wholesale_customers_data.csv.
Note
This dataset is taken from the UCI Machine Learning Repository. You can find the dataset at https://archive.ics.uci.edu/ml/machine-learning-databases/00292/. We have downloaded the file and saved it at https://github.com/TrainingByPackt/Applied-Unsupervised-Learning-with-R/tree/master/Lesson01/Exercise04/wholesale_customers_data.csv.
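Save it to your current working directory (the folder set in the preceding note), because read.csv() resolves a relative filename against that directory. Now, to load it in R, use the following command: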
ws<-read.csv("wholesale_customers_data.csv")
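If the read fails with a "cannot open file" error, the file is most likely not in the working directory. The following is a minimal sketch of a defensive load; the variable name csv_path is introduced here for illustration only, and the filename is assumed to be unchanged from the download.

```r
# Hypothetical helper variable; the filename is assumed to match the download.
csv_path <- "wholesale_customers_data.csv"

# Show where R resolves relative paths from.
cat("Working directory:", getwd(), "\n")

if (file.exists(csv_path)) {
  # Same call as in the exercise, guarded by the existence check.
  ws <- read.csv(csv_path)
  cat("Loaded", nrow(ws), "rows and", ncol(ws), "columns\n")
} else {
  message("File not found here; call setwd() or move the file, then retry.")
}
```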
Now we can look at the columns and the first few rows of this dataset by using the following function in R:
head(ws)
The output is as follows:
The output shows the first six rows of the dataset: annual spending, in monetary units, for each product category.
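Beyond head(), a few other base R functions give a quick overview of a data frame. The sketch below uses a small stand-in data frame, ws_demo, so that it runs without the CSV file; the column names are assumed to match the UCI file, and the values are invented purely for illustration. With the real data, replace ws_demo with ws.

```r
# Stand-in data: column names assumed to match the UCI file,
# values are made up and are NOT taken from the dataset.
ws_demo <- data.frame(
  Channel          = c(1, 2, 1, 2),
  Region           = c(3, 3, 1, 2),
  Fresh            = c(12000, 7000, 6300, 13200),
  Milk             = c(9600, 9800, 8800, 1200),
  Grocery          = c(7500, 9500, 7600, 4200),
  Frozen           = c(200, 1700, 2400, 6400),
  Detergents_Paper = c(2600, 3200, 3500, 500),
  Delicassen       = c(1300, 1700, 7800, 1700)
)

dim(ws_demo)                                  # number of rows and columns
str(ws_demo)                                  # type of each column
summary(ws_demo[, c("Grocery", "Frozen")])    # ranges of two spending columns
colSums(is.na(ws_demo))                       # count of missing values per column
```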
Activity 2: Customer Segmentation with k-means
For this activity, we're going to use the wholesale customer dataset from the UCI Machine Learning Repository. It's available at: https://github.com/TrainingByPackt/Applied-Unsupervised-Learning-with-R/tree/master/Lesson01/Activity02/wholesale_customers_data.csv. We're going to use clustering to identify customers who belong to different market segments and like to spend on different types of goods. Try k-means clustering for values of k from 2 to 6.
Note
This dataset is taken from the UCI Machine Learning Repository. You can find the dataset at https://archive.ics.uci.edu/ml/machine-learning-databases/00292/. We have downloaded the file and saved it at https://github.com/TrainingByPackt/Applied-Unsupervised-Learning-with-R/tree/master/Lesson01/Activity02/wholesale_customers_data.csv.
These steps will help you complete the activity:
Read data downloaded from the UCI Machine Learning Repository into a variable. The data can be found at: https://github.com/TrainingByPackt/Applied-Unsupervised-Learning-with-R/tree/master/Lesson01/Activity02/wholesale_customers_data.csv.
Select only two columns, Grocery and Frozen, for easy visualization of clusters.
As in Step 2 of Exercise 4, Exploring the Wholesale Customer Dataset, change the value for the number of clusters to 2 and generate the cluster centers.
Plot the graph as in Step 4 in Exercise 4, Exploring the Wholesale Customer Dataset.
Save the graph you generate.
Repeat Steps 3, 4, and 5, changing the value for the number of clusters to 3, 4, 5, and 6.
Decide which value for the number of clusters best classifies the dataset.
The output will be a chart of six clusters, as follows:
Note
The solution for this activity can be found on page 201.
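As a rough starting point for the steps of this activity, the loop over k can be sketched as follows. This is not the book's solution. It runs on synthetic data (the data frame demo, with invented values) so that it works without the CSV file, saves each plot as a PDF in a temporary folder (out_dir is a hypothetical choice), and records the total within-cluster sum of squares as one possible aid for choosing k. To use the real data, swap in the commented read.csv() line, assuming the columns are named Grocery and Frozen.

```r
set.seed(42)  # k-means starts from random centers; fix the seed for repeatability

# Synthetic stand-in for the two spending columns (values are invented).
demo <- data.frame(
  Grocery = c(rnorm(30, 3000, 500), rnorm(30, 9000, 800), rnorm(30, 20000, 1500)),
  Frozen  = c(rnorm(30, 6000, 700), rnorm(30, 1500, 300), rnorm(30, 3000, 500))
)
# With the real file (column names assumed):
# demo <- read.csv("wholesale_customers_data.csv")[, c("Grocery", "Frozen")]

out_dir <- tempdir()            # hypothetical output folder; change as needed
ks      <- 2:6
wss     <- numeric(length(ks))  # total within-cluster sum of squares per k
fits    <- vector("list", length(ks))

for (i in seq_along(ks)) {
  # nstart = 10 reruns k-means from 10 random starts and keeps the best fit
  fit <- kmeans(demo, centers = ks[i], nstart = 10)
  fits[[i]] <- fit
  wss[i]    <- fit$tot.withinss

  # Save one scatter plot per k, colored by cluster, with centers marked
  pdf(file.path(out_dir, sprintf("kmeans_k%d.pdf", ks[i])))
  plot(demo$Grocery, demo$Frozen, col = fit$cluster, pch = 19,
       xlab = "Grocery", ylab = "Frozen",
       main = paste("k-means with k =", ks[i]))
  points(fit$centers, pch = 8, cex = 2)
  dev.off()
}

print(data.frame(k = ks, tot_withinss = wss))
```

The within-cluster sum of squares generally shrinks as k grows, so the smallest value alone does not identify the best k. A common heuristic is to look for the value of k after which the decrease levels off, and to compare that with a visual inspection of the saved plots.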