Summary
In this chapter, we discussed different ways to segment the customer base. We first looked at how new versus repeat customers contribute to revenue, and at how the month-over-month progression of new and repeat customer counts can tell us which group of customers to focus on in upcoming marketing campaigns. Then, we discussed how the K-means clustering algorithm can be used to programmatically build and identify different customer segments. We experimented with building these segments from the sales amount, order quantity, and refunds. Along the way, we touched on silhouette scores as a criterion for finding the best number of clusters and on how log transformation can be beneficial when dealing with highly skewed datasets. Lastly, we used word and sentence embedding vectors to convert the product descriptions into numerical vectors with contextual understanding and further built customer segments based on their product...
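
As a compact reminder of how the clustering pieces recapped above fit together, the following is a minimal sketch, not the chapter's exact code. It assumes scikit-learn and NumPy are available, and it uses synthetic, randomly generated values in place of real sales amount, order quantity, and refund data; the variable names (`features`, `scaled`, `scores`, `best_k`) are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic, illustrative data only: three right-skewed per-customer features
# standing in for sales amount, order quantity, and refunds.
rng = np.random.default_rng(42)
features = np.column_stack([
    rng.lognormal(mean=5.0, sigma=1.0, size=300),  # stand-in for sales amount
    rng.lognormal(mean=2.0, sigma=0.8, size=300),  # stand-in for order quantity
    rng.lognormal(mean=1.0, sigma=1.2, size=300),  # stand-in for refunds
])

# log1p compresses the long right tail of skewed features (and handles zeros);
# standardizing afterwards puts all features on a comparable scale for K-means.
log_features = np.log1p(features)
scaled = StandardScaler().fit_transform(log_features)

# Fit K-means for several candidate cluster counts and record the
# silhouette score of each clustering (higher is better, range -1 to 1).
scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(scaled)
    scores[k] = silhouette_score(scaled, labels)

# Pick the candidate with the highest silhouette score.
best_k = max(scores, key=scores.get)
print(f"best k by silhouette score: {best_k} ({scores[best_k]:.3f})")
```

On real data, the highest silhouette score is a starting point rather than a final answer: it is worth checking whether the resulting segments are also interpretable from a marketing perspective.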