2. Hierarchical Clustering
Activity 2.01: Comparing k-means with Hierarchical Clustering
Solution:
- Import the necessary packages from scikit-learn (KMeans, AgglomerativeClustering, and silhouette_score), along with pandas and Matplotlib, as follows:
from sklearn.cluster import KMeans
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score
import pandas as pd
import matplotlib.pyplot as plt
- Read the wine dataset into a pandas DataFrame and print a small sample:
wine_df = pd.read_csv("wine_data.csv")
print(wine_df.head())
The output is as follows:
- Visualize the wine dataset to understand the data structure:
plt.scatter(wine_df.values[:,0], wine_df.values[:,1])
plt.title("Wine Dataset")
plt.xlabel("OD Reading")
plt.ylabel("Proline")
plt.show()
The output is as follows:
- Use the sklearn implementation of k-means on the wine dataset, knowing that...
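The comparison this activity builds toward can be sketched as follows. This is a minimal illustration, not the book's solution: it swaps in synthetic two-feature data from make_blobs in place of wine_data.csv so that it runs on its own, and the cluster count (N_CLUSTERS = 3), the Ward linkage, and the random seeds are all assumptions chosen for the sketch.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the wine data (assumption): 150 points, 2 features.
X, _ = make_blobs(n_samples=150, centers=3, n_features=2, random_state=0)

# Hypothetical cluster count for this sketch only.
N_CLUSTERS = 3

# Fit k-means and assign each point to a cluster.
kmeans = KMeans(n_clusters=N_CLUSTERS, n_init=10, random_state=0)
kmeans_labels = kmeans.fit_predict(X)

# Fit agglomerative (hierarchical) clustering; Ward linkage is an assumed choice.
agg = AgglomerativeClustering(n_clusters=N_CLUSTERS, linkage="ward")
agg_labels = agg.fit_predict(X)

# Silhouette score lies in [-1, 1]; higher means tighter, better-separated clusters.
kmeans_score = silhouette_score(X, kmeans_labels)
agg_score = silhouette_score(X, agg_labels)

print(f"k-means silhouette:       {kmeans_score:.3f}")
print(f"agglomerative silhouette: {agg_score:.3f}")
```

To try this on the wine data instead, replace X with wine_df.values; the scores will then depend on that dataset rather than on the synthetic blobs used here.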