So far, we have learned to predict classes that are binary (cats versus dogs) or multi-class (FashionMNIST). Let's now look at a regression problem and, in doing so, a task where we predict not one but several continuous outputs. Imagine that you are asked to predict the key points on an image of a face, for example, the locations of the eyes, nose, and chin. In this scenario, we need a new strategy to build a model that detects the key points.
Before we dive further, let's understand what we are trying to achieve through the following image:
As the preceding image shows, facial key points are markings placed at landmark locations on an image that contains a face.
To solve this problem, we first have to address a few issues:
- Images can be of different shapes:
  - This warrants an adjustment in the key point locations while adjusting images to bring them all to a standard image...
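To make the key point adjustment concrete, here is a minimal sketch of one way it could be done. It assumes that key points are stored as (x, y) pixel coordinates and that images are stretched to a fixed size without cropping or padding. The function name `rescale_keypoints` and the sizes used are hypothetical, chosen only for illustration.

```python
import numpy as np


def rescale_keypoints(keypoints, original_size, target_size):
    """Map (x, y) key points from the original image size to the target size.

    Both sizes are given as (height, width). This assumes a plain resize
    that stretches the image, with no cropping or padding.
    """
    kp = np.asarray(keypoints, dtype=np.float64)
    orig_h, orig_w = original_size
    new_h, new_w = target_size
    # x coordinates scale with the width ratio, y coordinates with the height ratio
    scale = np.array([new_w / orig_w, new_h / orig_h])
    return kp * scale


# Hypothetical example: a 480 x 640 (height x width) image resized to 224 x 224
keypoints = [(320.0, 240.0), (160.0, 120.0)]
resized = rescale_keypoints(keypoints, (480, 640), (224, 224))
print(resized)
```

The first key point sits at the center of the original image, so after rescaling it should land at the center of the resized image as well. A related option is to divide each coordinate by the original width or height, which expresses every key point as a fraction of the image size and makes the targets independent of the resolution.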