1. Introduction to Clustering
Activity 1.01: Implementing k-means Clustering
Solution:
- Import the required libraries:
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import accuracy_score, silhouette_score
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from scipy.spatial.distance import cdist
import math
np.random.seed(0)
%matplotlib inline
- Load the seeds data file using pandas:
seeds = pd.read_csv('Seed_Data.csv')
- Return the first five rows of the dataset, as follows:
seeds.head()
The output is as follows:
- Separate the X features and the y target as follows:
X = seeds[['A','P','C','LK','WK','A_Coef','LKG']]
y = seeds['target']
- Check the features as follows:
X.head()
The output is as follows:
- Define the k_means function as follows...
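For orientation only, a from-scratch k-means loop built on cdist could be sketched as below. This is not the k_means function from the solution: the name k_means_sketch, its parameters (max_iter, tol, seed), the random choice of starting points, and the handling of empty clusters are all assumptions made for this illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist


def k_means_sketch(X, k, max_iter=100, tol=1e-6, seed=0):
    """Hypothetical from-scratch k-means; not the solution's k_means."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # Assign each point to its nearest centroid (Euclidean distance).
        labels = cdist(X, centroids).argmin(axis=1)
        # Move each centroid to the mean of its assigned points;
        # keep the old centroid if a cluster ends up empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Measure how far the centroids moved in this iteration.
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:
            break
    # Final assignment so the labels match the returned centroids.
    labels = cdist(X, centroids).argmin(axis=1)
    return labels, centroids
```

With the features separated earlier, a call might look like labels, centroids = k_means_sketch(X.values, k=3), where k=3 is an assumed cluster count rather than a value taken from the solution.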