Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Applied Unsupervised Learning with R

You're reading from   Applied Unsupervised Learning with R Uncover hidden relationships and patterns with k-means clustering, hierarchical clustering, and PCA

Arrow left icon
Product type Paperback
Published in Mar 2019
Publisher
ISBN-13 9781789956399
Length 320 pages
Edition 1st Edition
Languages
Arrow right icon
Authors (2):
Arrow left icon
Bradford Tuckfield Bradford Tuckfield
Author Profile Icon Bradford Tuckfield
Bradford Tuckfield
Alok Malik Alok Malik
Author Profile Icon Alok Malik
Alok Malik
Arrow right icon
View More author details
Toc

Introduction to Market Segmentation


Market segmentation is dividing customers into different segments based on common characteristics. The following are the uses of customer segmentation:

  • Increasing customer conversion and retention

  • Developing new products for a particular segment by identifying it and its needs

  • Improving brand communication with a particular segment

  • Identifying gaps in marketing strategy and making new marketing strategies to increase sales

Exercise 4: Exploring the Wholesale Customer Dataset

In this exercise, we will have a look at the data in the wholesale customer dataset.

Note

For all the exercises and activities where we are importing an external CSV or image files, go to RStudio-> Session-> Set Working Directory-> To Source File Location. You can see in the console that the path is set automatically.

  1. To download the CSV file, go to https://github.com/TrainingByPackt/Applied-Unsupervised-Learning-with-R/tree/master/Lesson01/Exercise04/wholesale_customers_data.csv. Click on wholesale_customers_data.csv.

    Note

    This dataset is taken from the UCI Machine Learning Repository. You can find the dataset at https://archive.ics.uci.edu/ml/machine-learning-databases/00292/. We have downloaded the file and saved it at https://github.com/TrainingByPackt/Applied-Unsupervised-Learning-with-R/tree/master/Lesson01/Exercise04/wholesale_customers_data.csv.

  2. Save it to the folder in which you have installed R. Now, to load it in R, use the following function:

    ws<-read.csv("wholesale_customers_data.csv")
  3. Now we may have a look at the different columns and rows in this dataset by using the following function in R:

    head(ws)

    The output is as follows:

    Figure 1.19: Columns of the wholesale customer dataset

These six rows show the first six rows of annual spending in monetary units by category of product.

Activity 2: Customer Segmentation with k-means

For this activity, we're going to use the wholesale customer dataset from the UCI Machine Learning Repository. It's available at: https://github.com/TrainingByPackt/Applied-Unsupervised-Learning-with-R/tree/master/Lesson01/Activity02/wholesale_customers_data.csv. We're going to identify customers belonging to different market segments who like to spend on different types of goods with clustering. Try k-means clustering for values of k from 2 to 6.

Note

This dataset is taken from the UCI Machine Learning Repository. You can find the dataset at https://archive.ics.uci.edu/ml/machine-learning-databases/00292/. We have downloaded the file and saved it at https://github.com/TrainingByPackt/Applied-Unsupervised-Learning-with-R/tree/master/Lesson01/Activity02/wholesale_customers_data.csv.

These steps will help you complete the activity:

  1. Read data downloaded from the UCI Machine Learning Repository into a variable. The data can be found at: https://github.com/TrainingByPackt/Applied-Unsupervised-Learning-with-R/tree/master/Lesson01/Activity02/wholesale_customers_data.csv.

  2. Select only two columns, Grocery and Frozen, for easy visualization of clusters.

  3. As in Step 2 of Exercise 4, Exploring the Wholesale Customer Dataset, change the value for the number of clusters to 2 and generate the cluster centers.

  4. Plot the graph as in Step 4 in Exercise 4, Exploring the Wholesale Customer Dataset.

  5. Save the graph you generate.

  6. Repeat Steps 3, 4, and 5 by changing value for the number of clusters to 3, 4, 5, and 6.

  7. Decide which value for the number of clusters best classifies the dataset.

The output will be chart of six clusters as follows:

Figure 1.20: The expected chart for six clusters

Note

The solution for this activity can be found on page 201.

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at ₹800/month. Cancel anytime