6.7 Hellinger distance
This section examines how we can measure the similarity between two comparable collections using the concept of Hellinger distance. The idea is that if the two collections are “close to each other distance-wise,” they are similar.
Consider the game of pool, played with a cue and solid and striped balls on a table. Suppose I have a large box and place one hundred yellow pool balls, one hundred red pool balls, one hundred blue pool balls, and one hundred purple pool balls in the box. I mix the balls thoroughly, so if I reach in and take out a ball, I have the same probability of getting one color as any other. That is, I have a uniform distribution of the balls.
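The box described above can be modeled in a few lines of Python. This is only a minimal sketch to make the setup concrete: the names `make_box`, `color_probabilities`, and `draw` are hypothetical helpers introduced here for illustration, and drawing is assumed to be without replacement from a well-mixed box.

```python
import random
from collections import Counter

# Assumption: four colors with one hundred balls each, as in the text.
COLORS = ["yellow", "red", "blue", "purple"]
BALLS_PER_COLOR = 100


def make_box():
    """Return the box as a list with one entry per ball."""
    return [color for color in COLORS for _ in range(BALLS_PER_COLOR)]


def color_probabilities(box):
    """Return the probability of drawing each color with a single draw."""
    counts = Counter(box)
    total = len(box)
    return {color: counts[color] / total for color in counts}


def draw(box, n, seed=None):
    """Remove n balls at random (without replacement) from a well-mixed box.

    The box itself is not modified; a list of the drawn colors is returned.
    The optional seed only makes the illustration repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(box, n)


box = make_box()
probabilities = color_probabilities(box)
print(probabilities)

# Reach in and take out a single ball.
first_ball = draw(box, 1, seed=0)[0]
print(first_ball)
```

Each color comes out with probability 100/400 = 0.25, which is what is meant by the distribution being uniform. Calling `draw` with a larger `n` models removing several balls at once.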
I reach into the box and remove one hundred balls. I record the colors and the count of each:
I put the balls back in the box, stir them up well, and then you remove one hundred balls. Oddly, you pull out balls with the same...