Compressing an image using vector quantization
One of the main applications of k-means clustering is vector quantization. Simply put, vector quantization is the N-dimensional version of "rounding off". When we deal with 1D data, such as numbers, we round off to reduce the memory needed to store each value. For example, instead of storing 23.73473572, we store just 23.73 if we only need accuracy up to the second decimal place. Or we can store 24 if we don't care about the decimal places at all. The right choice depends on our needs and the trade-off between precision and storage that we are willing to make.
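As a quick illustration of that trade-off, the following minimal sketch rounds the number from the example above at two precisions and measures the error each choice introduces. The variable names are only illustrative.

```python
value = 23.73473572

# Keep two decimal places: small loss of precision.
two_places = round(value, 2)

# Keep no decimal places: fewer digits to store, larger error.
nearest_int = round(value)

# The error introduced by each choice.
error_two_places = abs(value - two_places)
error_nearest_int = abs(value - nearest_int)

print(two_places, error_two_places)
print(nearest_int, error_nearest_int)
```

The coarser the rounding, the less we need to store and the larger the error we accept.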
Similarly, when we extend this concept to N-dimensional data, it becomes vector quantization: each N-dimensional point is replaced by the nearest entry in a small set of representative points. Of course, there are more nuances to it! You can learn more about it at http://www.data-compression.com/vq.shtml. Vector quantization is popularly used in image compression, where we store each pixel using fewer bits than it takes in the original image.
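To make the idea concrete, here is a minimal sketch that uses only the standard library. It is not the recipe's code: the helper names `nearest` and `build_codebook`, the toy list of RGB pixels, and the simple seeding strategy are all assumptions made for illustration. It builds a small codebook of representative colors with a plain k-means loop and then replaces each pixel with the index of its nearest codebook entry.

```python
import math

def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec
    (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], vec)),
    )

def build_codebook(vectors, k, iterations=10):
    """Plain k-means (Lloyd's algorithm).
    Assumption: seed the centroids with the first k distinct vectors."""
    codebook = []
    for v in vectors:
        if v not in codebook:
            codebook.append(v)
        if len(codebook) == k:
            break
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        clusters = [[] for _ in codebook]
        for v in vectors:
            clusters[nearest(codebook, v)].append(v)
        # Move each centroid to the mean of its cluster;
        # keep the old centroid if the cluster is empty.
        codebook = [
            tuple(sum(c) / len(members) for c in zip(*members))
            if members else centroid
            for members, centroid in zip(clusters, codebook)
        ]
    return codebook

# Toy "image": six RGB pixels, three reddish and three bluish.
pixels = [
    (250, 10, 10), (240, 20, 15), (10, 10, 245),
    (5, 20, 250), (245, 15, 5), (12, 8, 240),
]

codebook = build_codebook(pixels, k=2)

# Compressed form: one small index per pixel plus the codebook.
indices = [nearest(codebook, p) for p in pixels]

# Decompression: look each index up in the codebook.
reconstructed = [codebook[i] for i in indices]

# Bits needed per pixel index, given the codebook size.
bits_per_pixel = math.ceil(math.log2(len(codebook)))

print(indices)
print(codebook)
print(bits_per_pixel)
```

With a two-entry codebook, each pixel needs a single bit for its index, compared with 24 bits for a pixel stored as three 8-bit channels (an assumption about the original format). The cost is that every pixel is reconstructed as one of only two colors.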
How to do it…
The full code for this recipe is given in...