In discretization using k-means clustering, the intervals are the clusters identified by the k-means algorithm. The number of clusters (k) is defined by the user. The k-means clustering algorithm has two main steps. In the initialization step, k observations are chosen randomly as the initial centers of the k clusters, and the remaining data points are assigned to the cluster with the closest center. In the iteration step, the center of each cluster is re-computed as the mean of all of the observations within that cluster, and the observations are reassigned to the cluster whose updated center is now closest. The iteration step repeats until the centers stop changing or a maximum number of iterations is reached. The resulting centers are a local optimum that can depend on the initial random choice, so they are not guaranteed to be the best possible k centers. In this recipe, we will perform k-means discretization with scikit-learn, using the Boston House Prices dataset.
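To make the two steps concrete, the following is a minimal sketch of k-means discretization for a single variable, written in plain Python rather than with scikit-learn. The function names (`kmeans_1d`, `bin_edges`, `discretize`), the toy values, and the choice to place interval limits at the midpoints between adjacent centers are all assumptions made for illustration; they are not taken from the recipe.

```python
import random


def kmeans_1d(values, k, max_iter=100, seed=0):
    """Cluster 1-D values into k groups and return the sorted centers."""
    rng = random.Random(seed)
    # Initialization step: pick k distinct observations as starting centers.
    centers = rng.sample(sorted(set(values)), k)
    for _ in range(max_iter):
        # Assign every observation to the cluster with the nearest center.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Iteration step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # centers stopped changing
            break
        centers = new_centers
    return sorted(centers)


def bin_edges(centers):
    """Interval limits, assumed here to be midpoints between sorted centers."""
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]


def discretize(values, edges):
    """Map each value to the index of the interval it falls into."""
    return [sum(v > e for e in edges) for v in values]


# Hypothetical toy data, not the Boston House Prices dataset.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9, 9.8, 10.1, 10.4]
centers = kmeans_1d(values, k=3)
edges = bin_edges(centers)
labels = discretize(values, edges)
print(centers, edges, labels)
```

Changing `seed` can change the centers the loop settles on, which is why the intervals found by k-means may differ between runs unless the random state is fixed. In scikit-learn, `KBinsDiscretizer` offers a `strategy="kmeans"` option for this kind of discretization.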