Chapter 5 – How to Use Decision Trees to Enhance K-Means Clustering
The questions will focus on the hyperparameters.
- The number of k clusters is not that important. (Yes | No)
The answer is no. The number of clusters, k, must be chosen carefully, often by trial and error, and the best value differs from one project and dataset to the next.
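The trial-and-error approach mentioned in this answer can be sketched as a loop over candidate values of k that records the inertia (the sum of squared distances from each point to its nearest centroid) for each one. This is only a minimal illustration in pure Python, not the chapter's own program: the toy dataset of three blobs, the helper names (sq_dist, farthest_first, kmeans_inertia), and the farthest-first initialization are all assumptions chosen to keep the example deterministic.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def farthest_first(points, k):
    """Assumed initialization: start from the first point, then repeatedly
    add the point that is farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans_inertia(points, k, n_iter=20):
    """Run plain (full-batch) k-means and return the final inertia."""
    centers = farthest_first(points, k)
    for _ in range(n_iter):
        # Assignment step: attach every point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        centers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

# Hypothetical toy data: three well-separated blobs of 30 points each.
rng = random.Random(42)
blob_means = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
points = [(rng.gauss(mx, 0.5), rng.gauss(my, 0.5)) for mx, my in blob_means for _ in range(30)]

# Trial and error: try several values of k and compare the inertia.
inertias = {k: kmeans_inertia(points, k) for k in range(1, 7)}
for k, value in inertias.items():
    print(f"k={k}: inertia={value:.1f}")
```

On data like this, the inertia falls sharply up to k=3 and then flattens, which is the usual "elbow" signal. Real datasets rarely show such a clean break, which is why the choice of k remains a judgment call for each project.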
- Mini-batches and batches contain the same amount of data. (Yes | No)
The answer is no. "Batch" generally refers to the whole dataset, whereas a "mini-batch" is a subset of that data.
- K-means can run without mini-batches. (Yes | No)
The answer is yes and no. If the volume of data remains small, the training epochs can run on the whole dataset. If the volume of data exceeds what the available computing power (CPU or GPU) can reasonably process, the data must be split into mini-batches to keep the training computation manageable.
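To make the difference between a batch and a mini-batch concrete, the following sketch trains centroids on small random subsets of the data rather than on the whole dataset at each step. It is a simplified pure-Python illustration under stated assumptions, not the chapter's implementation: the two-blob dataset, the function name minibatch_kmeans, the batch size of 50, and the choice of one starting centroid per blob are all hypothetical. The update rule is a per-centroid running average, which is one common way to do mini-batch k-means.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def minibatch_kmeans(points, init_centers, batch_size, n_batches, seed=0):
    """Update centroids from random mini-batches instead of the full dataset."""
    rng = random.Random(seed)
    centers = [list(c) for c in init_centers]
    counts = [0] * len(centers)  # points seen so far by each centroid
    for _ in range(n_batches):
        # A mini-batch is a random subset of the full batch (the dataset).
        batch = rng.sample(points, batch_size)
        # Assign every point in the mini-batch to its nearest centroid first...
        nearest = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in batch
        ]
        # ...then nudge each centroid toward its points with a shrinking step,
        # so each centroid tracks the running mean of the points assigned to it.
        for p, j in zip(batch, nearest):
            counts[j] += 1
            step = 1.0 / counts[j]
            centers[j][0] += step * (p[0] - centers[j][0])
            centers[j][1] += step * (p[1] - centers[j][1])
    return centers

# Hypothetical dataset: two blobs of 1,000 points each.
data_rng = random.Random(1)
points = [(data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)) for _ in range(1000)]
points += [(data_rng.gauss(8.0, 1.0), data_rng.gauss(8.0, 1.0)) for _ in range(1000)]

# Assumed initialization: one data point from each blob.
centers = minibatch_kmeans(points, [points[0], points[-1]], batch_size=50, n_batches=20)

print("full batch size:", len(points))
print("mini-batch size:", 50)
print("centroids:", [[round(v, 2) for v in c] for c in centers])
```

Each step touches only 50 points, yet the centroids settle close to the two blob centers. On a dataset this small, running k-means on the full batch would work just as well; the mini-batch version matters when the whole dataset no longer fits comfortably in memory or compute.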
- Must centroids be optimized for result acceptance? (Yes | No) ...