Understanding clustering
So, what exactly is clustering, and when might it be helpful? Let's start with a very simple example. Imagine we have a group of people for whom we want to make T-shirts. We can make a T-shirt for each of them, in whatever size we choose; the main restriction is that we can only make one size for the whole group. The people's sizes are as follows: [1, 2, 3, 4, 5, 7, 9, 11]. Think about how you might tackle this problem. We will use the KMeans algorithm for that, so let's start right away, as follows:
- Import the required packages and models. NumPy will be imported as a package, but from sklearn we will import only the one model that we will be using for now, as illustrated in the following code snippet:

    import numpy as np
    from sklearn.cluster import KMeans
- Create a dataset of sizes in the required format. Note that each observation (person's size) should be represented as a list, so we use the reshape method of NumPy arrays to get the data in the required format, as follows...
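
To see what the reshaping step does on its own, here is a minimal sketch using the sizes listed earlier. The variable names sizes and X are assumptions chosen for illustration, not names taken from the text. It only shows how reshape(-1, 1) turns a flat array into one row per person with a single value in each row, which is the "one list per observation" layout described above.

```python
import numpy as np

# Sizes from the example above, as a flat 1-D array of shape (8,)
sizes = np.array([1, 2, 3, 4, 5, 7, 9, 11])

# reshape(-1, 1) keeps the values but arranges them as a single column:
# -1 lets NumPy infer the number of rows, 1 is the number of values per row
X = sizes.reshape(-1, 1)

print(sizes.shape)     # (8,)
print(X.shape)         # (8, 1)
print(X[:3].tolist())  # [[1], [2], [3]]
```

Each inner list now holds one person's size, so the array has one row per observation and one column for the only feature in this example.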