Implementing facial keypoint detection
So far, we have learned about predicting classes that are binary (cats versus dogs) or multi-class (Fashion-MNIST). Let’s now look at a regression problem and, in doing so, a task where we predict not one but several continuous outputs (that is, multi-output regression).
Imagine a scenario where you are asked to predict the keypoints present on an image of a face; for example, the location of the eyes, nose, and chin. In this scenario, we need to employ a new strategy to build a model to detect the keypoints.
Before we dive further, let’s understand what we are trying to achieve through the following image:
Figure 5.8: (Left) Input image; (Right) Input image overlaid with facial keypoints
As you can observe in the preceding image, facial keypoints mark the locations of specific landmarks, such as the eyes, nose, and chin, on an image that contains a face.
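To make the idea of predicting several continuous outputs concrete, the following is a minimal sketch in plain Python (no deep learning library) of how a set of keypoints could be turned into a single regression target. The keypoint names, the pixel coordinates, the image size, the choice to scale coordinates to the range 0 to 1, and the use of mean absolute error are all illustrative assumptions rather than details taken from this chapter.

```python
from typing import Dict, List, Tuple

# Hypothetical (x, y) pixel locations of four keypoints on a 200 x 100 image.
# These names and values are made up for illustration only.
keypoints_px: Dict[str, Tuple[float, float]] = {
    "left_eye": (60, 40),
    "right_eye": (140, 40),
    "nose": (100, 60),
    "chin": (100, 90),
}


def encode_keypoints(
    keypoints: Dict[str, Tuple[float, float]], width: float, height: float
) -> List[float]:
    """Flatten keypoints into [x1, y1, x2, y2, ...], scaled by the image size."""
    target: List[float] = []
    for x, y in keypoints.values():
        # Assumption: dividing by width/height keeps every output in [0, 1],
        # so the target does not depend on the image resolution.
        target.extend([x / width, y / height])
    return target


def mean_absolute_error(pred: List[float], target: List[float]) -> float:
    """Average absolute difference across all continuous outputs."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


# Four keypoints become eight continuous values that a model would predict.
target = encode_keypoints(keypoints_px, width=200, height=100)

# A deliberately poor "prediction" (all zeros) to show how the error is scored.
naive_pred = [0.0] * len(target)
error = mean_absolute_error(naive_pred, target)
print(len(target), error)
```

Under these assumptions, each face contributes one target vector with two values per keypoint, and a single loss value summarizes how far all predicted coordinates are from the true ones.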
To solve this problem, we would first have to solve a few other problems...