Coding a hierarchical clustering algorithm
Let’s see how to code a hierarchical clustering algorithm in Python:
- We will first import AgglomerativeClustering from the sklearn.cluster module, along with the pandas and numpy packages:
from sklearn.cluster import AgglomerativeClustering
import pandas as pd
import numpy as np
- Then we will create 20 data points in a two-dimensional problem space:
dataset = pd.DataFrame({
    'x': [11, 11, 20, 12, 16, 33, 24, 14, 45, 52, 51, 52, 55, 53, 55, 61, 62, 70, 72, 10],
    'y': [39, 36, 30, 52, 53, 46, 55, 59, 12, 15, 16, 18, 11, 23, 14, 8, 18, 7, 24, 70]
})
- Then we create the hierarchical clustering model by specifying the hyperparameters. Note that a hyperparameter is a configuration parameter of a machine learning model that is set before training and influences the model’s behavior and performance. We use the fit_predict function to actually process...
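As a minimal sketch of this step, the snippet below builds the model and calls fit_predict on the 20 data points. The hyperparameter values used here (n_clusters=2 and linkage='ward') are assumptions chosen for illustration, since the text above does not state them. The distance metric is left at the library default (Euclidean) so that the code does not depend on a keyword whose name has changed between scikit-learn versions.

```python
from sklearn.cluster import AgglomerativeClustering
import pandas as pd

# The same 20 two-dimensional data points used above.
dataset = pd.DataFrame({
    'x': [11, 11, 20, 12, 16, 33, 24, 14, 45, 52, 51, 52, 55, 53, 55, 61, 62, 70, 72, 10],
    'y': [39, 36, 30, 52, 53, 46, 55, 59, 12, 15, 16, 18, 11, 23, 14, 8, 18, 7, 24, 70]
})

# Hyperparameters are fixed before training.
# ASSUMPTION: n_clusters=2 and linkage='ward' are illustrative choices,
# not values taken from the text.
model = AgglomerativeClustering(n_clusters=2, linkage='ward')

# fit_predict builds the cluster hierarchy and returns one integer
# label per row of the dataset.
labels = model.fit_predict(dataset)
print(labels)
```

With these settings, each of the 20 points receives one of two labels. Which group is numbered 0 and which is numbered 1 is arbitrary, so it is the grouping that matters rather than the label values themselves.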