- In what situations would you apply the k-means algorithm rather than hierarchical clustering?
- What is the difference between a regular Spark estimator and an estimator that calls SageMaker?
- For a dataset on which training takes a very long time, why would it not be a good idea to launch the training job using a SageMaker estimator?
- Research and identify alternative metrics for cluster evaluation.
- Why is string indexing not a good idea when encoding features for k-means?