Case study
One of the problems that often plagues data scientists working on machine learning applications is the amount of time it takes to "train" a model. In our specific example of the k-nearest neighbors implementation, training means performing the hyperparameter tuning to find an optimal value of k and the right distance algorithm. In the previous chapters of our case study, we've tacitly assumed there will be an optimal set of hyperparameters. In this chapter, we'll look at one way to locate the optimal parameters.
In more complex and less well-defined problems, the time spent training the model can be quite long. If the volume of data is immense, then very expensive compute and storage resources are required to build and train the model.
As an example of a more complex model, look at the MNIST dataset. See http://yann.lecun.com/exdb/mnist/ for the source data for this dataset and some kinds of analysis that have been performed. This problem requires...