The curse of dimensionality
There is one limitation that the Mahalanobis distance measure cannot overcome, though, and it is known as the curse of dimensionality. As the number of dimensions in a dataset rises, every point tends to become equally far from every other point. We can demonstrate this quite simply with the following code:
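(defn ex-6-27 []
  ;; For each dimension count d, record the smallest and largest distance.
  (let [distances (for [d (range 2 100)
                        :let [data (->> (dataset-of-dimension d)
                                        (s/mahalanobis-distance)
                                        (map first))]]
                    [(apply min data) (apply max data)])]
    ;; The x values use the same range as the loop above, so each
    ;; distance is paired with the dimension count that produced it.
    (-> (c/xy-plot (range 2 100) (map first distances)
                   :x-label "Number of Dimensions"
                   :y-label "Distance Between Points"
                   :series-label "Minimum Distance"
                   :legend true)
        (c/add-lines (range 2 100) (map second distances) ...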
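The same effect can be seen with nothing but the Clojure core library. The following is a minimal sketch rather than the example's own method: it assumes independent standard-normal coordinates, for which the Mahalanobis distance under the true (identity) covariance reduces to plain Euclidean distance, so Euclidean distance is used here as a stand-in. The names random-point, euclidean-distance, and distance-spread are hypothetical helpers introduced only for this illustration.

```clojure
(import '[java.util Random])

(defn random-point
  "Returns a vector of d coordinates, each drawn from a standard normal.
  Assumption: uncorrelated, unit-variance dimensions."
  [^Random rng d]
  (vec (repeatedly d #(.nextGaussian rng))))

(defn euclidean-distance
  "Straight-line distance between two equal-length coordinate vectors."
  [a b]
  (Math/sqrt (reduce + (map (fn [x y]
                              (let [diff (- x y)]
                                (* diff diff)))
                            a b))))

(defn distance-spread
  "Generates n random d-dimensional points and returns the ratio of the
  largest to the smallest distance from the first point to the others.
  A ratio close to 1 means the other points are almost equally far away."
  [^Random rng n d]
  (let [points (vec (repeatedly n #(random-point rng d)))
        origin (first points)
        dists  (map #(euclidean-distance origin %) (rest points))]
    (/ (apply max dists) (apply min dists))))
```

Calling it for a few dimension counts should show the ratio shrinking toward 1 as the dimension grows, although the exact values depend on the random draw:

```clojure
(let [rng (Random. 42)]
  (doseq [d [2 10 100 1000]]
    (println d (distance-spread rng 100 d))))
```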