The example data includes 272 observations on two variables for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA: eruption time in minutes and waiting time until the next eruption in minutes. This data is available in many places, including as the built-in faithful dataset in R, the free statistical software.
An original source is Härdle, W. (1991). Smoothing Techniques with Implementation in S. New York: Springer.
One reason this data is featured in so many examples is that charts show the distribution of each variable to be clearly bimodal. We therefore use it to illustrate K-means clustering with two clusters specified.
Our analysis follows the usual three steps:
- Descriptive analysis
- Cluster analysis
- Cluster profiling
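
The three steps above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration under stated assumptions, not the analysis used in this example. The rows come from make_demo_rows, a hypothetical helper that generates a synthetic two-group stand-in rather than the real Old Faithful observations. The z-score standardization is likewise an assumption, since the text does not say whether the inputs are scaled. All function names are illustrative, and the clustering is a plain Lloyd's-algorithm K-means with a few random restarts.

```python
import random
import statistics


def make_demo_rows(seed=42):
    """Synthetic stand-in, NOT the real Old Faithful data: two made-up
    groups of (eruption minutes, waiting minutes) pairs."""
    rng = random.Random(seed)
    short = [(rng.gauss(2.0, 0.3), rng.gauss(55, 6)) for _ in range(40)]
    long_ = [(rng.gauss(4.3, 0.4), rng.gauss(80, 6)) for _ in range(60)]
    return short + long_


def describe(rows, names=("eruption", "waiting")):
    """Step 1, descriptive analysis: summary statistics per variable."""
    return {
        name: {"mean": statistics.mean(col), "sd": statistics.stdev(col),
               "min": min(col), "max": max(col)}
        for name, col in zip(names, zip(*rows))
    }


def standardize(rows):
    """Z-score each column (an assumption, not stated in the text) so the
    larger-valued waiting time does not dominate the distances."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(row, means, sds))
            for row in rows]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k=2, n_init=10, max_iter=100, seed=0):
    """Step 2, cluster analysis: Lloyd's algorithm with random restarts,
    keeping the run with the lowest within-cluster sum of squares."""
    rng = random.Random(seed)
    best = None
    for _start in range(n_init):
        centers = rng.sample(points, k)  # k distinct observations as seeds
        labels = None
        for _step in range(max_iter):
            new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                          for p in points]
            if new_labels == labels:  # assignments stable: converged
                break
            labels = new_labels
            for j in range(k):
                members = [p for p, lab in zip(points, labels) if lab == j]
                if members:  # leave a center in place if its cluster is empty
                    centers[j] = tuple(statistics.mean(c)
                                       for c in zip(*members))
        wss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
        if best is None or wss < best[0]:
            best = (wss, labels, list(centers))
    return best[1], best[2]


def profile(rows, labels, k=2):
    """Step 3, cluster profiling: size and raw-unit means per cluster."""
    out = {}
    for j in range(k):
        members = [r for r, lab in zip(rows, labels) if lab == j]
        out[j] = {"n": len(members),
                  "mean_eruption": statistics.mean(m[0] for m in members),
                  "mean_waiting": statistics.mean(m[1] for m in members)}
    return out


rows = make_demo_rows()
print(describe(rows))
labels, _centers = kmeans(standardize(rows), k=2)
print(profile(rows, labels))
```

To run the same steps on the actual data, replace make_demo_rows with code that loads the 272 real observations as (eruption, waiting) pairs. Profiling is done on the raw rows rather than the standardized ones so that the cluster means read directly in minutes.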