Implementing the k-means clustering algorithm
The k-means clustering algorithm partitions data into k groups, called clusters. The locations of these clusters are adjusted iteratively: each point is assigned to its nearest cluster location, and we then compute the arithmetic mean of all the points in a cluster to obtain a centroid, which replaces the previous cluster location.
Hopefully, after this succinct explanation, the name k-means clustering no longer sounds completely foreign. One of the best places to learn more about this algorithm is on Coursera: https://class.coursera.org/ml-003/lecture/78.
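The centroid update described above can be sketched in a few lines. This is only a minimal illustration, assuming points are lists of Double values of equal dimension; the name centroid is hypothetical and is not necessarily the one used in the recipe's steps.

```haskell
import Data.List (transpose)

-- A point is a list of coordinates (same synonym the recipe defines).
type Point = [Double]

-- Hypothetical helper: the arithmetic mean of a group of points,
-- taken coordinate by coordinate. Assumes all points have the same
-- dimension; an empty group yields an empty point.
centroid :: [Point] -> Point
centroid ps = map mean (transpose ps)
  where
    mean xs = sum xs / fromIntegral (length xs)
```

For example, centroid [[0,0],[2,4]] averages the first coordinates (0 and 2) and the second coordinates (0 and 4) separately. Transposing first turns a list of points into a list of coordinate columns, so each column can be averaged independently.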
How to do it…
Create a new file, which we call Main.hs, and perform the following steps:
Import the following built-in libraries:
import Data.Map (Map)
import qualified Data.Map as Map
import Data.List (minimumBy, sort, transpose)
import Data.Ord (comparing)
Define a type synonym for points, as follows:
type Point = [Double]
Define the Euclidean distance function between two points:
dist :: Point -> Point -...