Chapter 6: Clustering
Activity 12: k-means Clustering of Sales Data
In this activity, we will identify products whose sales behave similarly, so that we can recognize trends in product sales.
We will use the Sales Transactions Weekly Dataset, available at this URL:
https://archive.ics.uci.edu/ml/datasets/Sales_Transactions_Dataset_Weekly
Perform clustering on the dataset using the k-means algorithm. Make sure you prepare your data for clustering based on what you have learned in the previous chapters.
Use the default settings for the k-means algorithm.
- Load the dataset using pandas.
import pandas
pandas.read_csv('Sales_Transactions_Dataset_Weekly.csv')
- If you examine the data in the CSV file, you will notice that the first column contains product ID strings. These values just add noise to the clustering process. Also notice that for each of the weeks 0 to 51, there is both a W-prefixed label and a Normalized label. Using the normalized label makes more sense, so we can drop the regular...
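The column selection described in the step above can be sketched as follows. This is a minimal illustration, not the activity's reference solution. It assumes the CSV uses a Product_Code column for the product IDs, W0 to W51 for the raw weekly values, and "Normalized 0" to "Normalized 51" for the normalized values; check these names against your copy of the file. A small synthetic data frame stands in for the real CSV so that the sketch runs on its own, and random_state is set only to make the run repeatable.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic stand-in for the real CSV. The column names below are an
# assumption about the file layout, not something taken from the activity.
rng = np.random.default_rng(0)
n_products, n_weeks = 20, 52
raw = rng.integers(0, 50, size=(n_products, n_weeks))
row_min = raw.min(axis=1, keepdims=True)
row_max = raw.max(axis=1, keepdims=True)
normalized = (raw - row_min) / (row_max - row_min)  # min-max scaling per product

data_frame = pd.concat(
    [
        pd.DataFrame({"Product_Code": [f"P{i + 1}" for i in range(n_products)]}),
        pd.DataFrame(raw, columns=[f"W{w}" for w in range(n_weeks)]),
        pd.DataFrame(normalized, columns=[f"Normalized {w}" for w in range(n_weeks)]),
    ],
    axis=1,
)

# Keep only the normalized weekly columns; this leaves out the product ID
# strings and the raw W-prefixed columns.
feature_columns = [c for c in data_frame.columns if c.startswith("Normalized")]
features = data_frame[feature_columns]

# k-means with scikit-learn's default settings (8 clusters by default).
# random_state is an addition for reproducibility only.
model = KMeans(random_state=0)
labels = model.fit_predict(features)

print(features.shape)  # one row per product, one column per week
print(labels)          # cluster index assigned to each product
```

With the real data, replacing the synthetic frame with the result of pandas.read_csv should be enough, provided the column names match the assumption above.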