Rather than detecting the skin color and then the region with that skin color, we can use OpenCV's floodFill() function, which is similar to the bucket fill tool in most image editing software. We know that the regions in the middle of the screen should be skin pixels (since we asked the user to put their face in the middle), so to change the whole face to have green skin, we can just apply a green flood fill on the center pixel, which will always color some parts of the face green. In reality, the color, saturation, and brightness are likely to be different in different parts of the face, so a flood fill will rarely cover all the skin pixels of a face unless the threshold is so low that it also covers unwanted pixels outside of the face. So instead of applying a single flood fill in the center of the image, let's apply a flood fill on six different points around the face that should be skin pixels.
A nice feature of OpenCV's floodFill() is that it can draw the flood fill into an external mask image rather than modify the input image. This gives us a mask of the skin pixels, so we can adjust their color without necessarily changing their brightness or saturation, producing a more realistic image than if every skin pixel became the same flat green (losing significant face detail).
Skin color changing does not work so well in the RGB color space, because we want to allow the brightness to vary across the face while keeping the skin color fairly constant, and RGB does not separate brightness from color. One solution is to use the HSV color space, since it separates brightness from the color (hue) and the colorfulness (saturation). Unfortunately, HSV wraps the hue value around red, and since skin is mostly red, this means you would need to work with both hue < 10% and hue > 90%, as both ranges are red. So instead we will use the Y'CrCb color space (the variant of YUV that is in OpenCV), since it also separates brightness from color but has only a single range of values for typical skin color rather than two. Note that most cameras, images, and videos actually use some type of YUV as their color space before conversion to RGB, so in many cases you can get a YUV image for free without converting it yourself.
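As a rough illustration of why a single range is enough in Y'CrCb, a simple stand-alone skin check can be written as one inRange() call. The numeric bounds below are commonly used approximations chosen purely for illustration (an assumption, not values used by the alien filter itself):
Mat ycrcb, roughSkinMask;
cvtColor(srcBGR, ycrcb, CV_BGR2YCrCb);  // "srcBGR" is any BGR input image.
// One contiguous range per channel (Y, Cr, Cb); no hue wrap-around as in HSV.
// These bounds are approximate and only shown for illustration.
inRange(ycrcb, Scalar(0, 133, 77), Scalar(255, 173, 127), roughSkinMask);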
Since we want our alien mode to look like a cartoon, we will apply the alien filter after the image has already been cartoonified. In other words, we have access to the shrunken color image produced by the bilateral filter, and access to the full-sized edge mask. Skin detection often works better at low resolutions, since it is the equivalent of analyzing the average value of each high-resolution pixel's neighbors (or the low-frequency signal instead of the high-frequency noisy signal). So let's work at the same shrunken scale as the bilateral filter (half the width and half the height), and convert the painting image to YUV:
Mat yuv = Mat(smallSize, CV_8UC3);
cvtColor(smallImgBGR, yuv, CV_BGR2YCrCb);
We also need to shrink the edge mask so it is on the same scale as the painting image. There is a complication with OpenCV's floodFill() function, when storing to a separate mask image, in that the mask should have a one-pixel border around the whole image, so if the input image is W x H pixels in size, then the separate mask image should be (W+2) x (H+2) pixels in size. But the floodFill() function also allows us to initialize the mask with edges that the flood fill algorithm will ensure it does not cross. Let's use this feature, in the hope that it helps prevent the flood fill from extending outside of the face. So, we need to provide two mask images: one is the edge mask of W x H in size, and the other image is the exact same edge mask but (W+2) x (H+2) in size because it should include a border around the image. It is possible to have multiple cv::Mat objects (or headers) referencing the same data, or even to have a cv::Mat object that references a sub-region of another cv::Mat image. So, instead of allocating two separate images and copying the edge mask pixels across, let's allocate a single mask image including the border, and create an extra cv::Mat header of W x H (which just references the region of interest in the flood fill mask without the border). In other words, there is just one array of pixels of size (W+2) x (H+2) but two cv::Mat objects, where one is referencing the whole (W+2) x (H+2) image and the other is referencing the W x H region in the middle of that image:
auto sw = smallSize.width;
auto sh = smallSize.height;
Mat mask, maskPlusBorder;
maskPlusBorder = Mat::zeros(sh+2, sw+2, CV_8UC1);
mask = maskPlusBorder(Rect(1,1,sw,sh));
// mask is a ROI referencing the inner region of maskPlusBorder.
resize(edges, mask, smallSize); // Put edges in both of them.
The edge mask (shown on the left of the following diagram) contains both strong and weak edges, but we only want strong edges, so we will apply a binary threshold (resulting in the middle image in the following diagram). To close small gaps between edges, we will then combine the morphological operators dilate() and erode() (together known as the close operator), resulting in the image on the right:
const int EDGES_THRESHOLD = 80;
threshold(mask, mask, EDGES_THRESHOLD, 255, THRESH_BINARY);
dilate(mask, mask, Mat());
erode(mask, mask, Mat());
We can see the result of applying the threshold and morphological operations in the following image: the first image is the input edge map, the second is after the thresholding filter, and the last is after the dilate and erode morphological filters:
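As a side note, the dilate() followed by erode() pair above can also be written as a single morphologyEx() call; the following is just an equivalent sketch using the same default 3 x 3 structuring element, not a change to the filter:
// Equivalent one-call version of the dilate + erode pair (a morphological close),
// using the same default 3 x 3 structuring element.
morphologyEx(mask, mask, MORPH_CLOSE, Mat());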
As mentioned earlier, we want to apply flood fills in numerous points around the face, to make sure we include the various colors and shades of the whole face. Let's choose six points around the nose, cheeks, and forehead, as shown on the left-hand side of the following screenshot. Note that these values are dependent on the face outline being drawn earlier:
auto const NUM_SKIN_POINTS = 6;
Point skinPts[NUM_SKIN_POINTS];
skinPts[0] = Point(sw/2, sh/2 - sh/6);
skinPts[1] = Point(sw/2 - sw/11, sh/2 - sh/6);
skinPts[2] = Point(sw/2 + sw/11, sh/2 - sh/6);
skinPts[3] = Point(sw/2, sh/2 + sh/16);
skinPts[4] = Point(sw/2 - sw/9, sh/2 + sh/16);
skinPts[5] = Point(sw/2 + sw/9, sh/2 + sh/16);
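If you want to check that these seed points really land on the face for your particular camera and framing, one hypothetical debugging aid (not part of the filter itself) is to draw them onto a copy of the small image:
// Hypothetical debugging aid: draw the six seed points as small white circles
// on a copy of the shrunken image so their placement can be checked visually.
Mat debugImg = smallImgBGR.clone();
for (int i = 0; i < NUM_SKIN_POINTS; i++) {
    circle(debugImg, skinPts[i], 2, CV_RGB(255,255,255), CV_FILLED);
}
imshow("Skin seed points", debugImg);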
Now, we just need to find some good lower and upper bounds for the flood fill. Remember that this is being performed in the Y'CrCb color space, so we basically decide how much the brightness can vary, how much the red component can vary, and how much the blue component can vary. We want to allow the brightness to vary a lot, to include shadows as well as highlights and reflections, but we don't want the colors to vary much at all:
const int LOWER_Y = 60;
const int UPPER_Y = 80;
const int LOWER_Cr = 25;
const int UPPER_Cr = 15;
const int LOWER_Cb = 20;
const int UPPER_Cb = 15;
Scalar lowerDiff = Scalar(LOWER_Y, LOWER_Cr, LOWER_Cb);
Scalar upperDiff = Scalar(UPPER_Y, UPPER_Cr, UPPER_Cb);
We will use the floodFill() function with 4-connectivity and the FLOODFILL_FIXED_RANGE flag (so the allowed difference is measured relative to the seed pixel rather than to neighboring pixels), and since we want to store the result in an external mask instead of modifying the input image, we must also specify FLOODFILL_MASK_ONLY:
const int CONNECTED_COMPONENTS = 4; // To fill diagonally, use 8.
const int flags = CONNECTED_COMPONENTS | FLOODFILL_FIXED_RANGE
| FLOODFILL_MASK_ONLY;
Mat edgeMask = mask.clone(); // Keep a copy of the edge mask.
// "maskPlusBorder" is initialized with edges to block floodFill().
for (int i = 0; i < NUM_SKIN_POINTS; i++) {
floodFill(yuv, maskPlusBorder, skinPts[i], Scalar(), NULL,
lowerDiff, upperDiff, flags);
}
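With FLOODFILL_MASK_ONLY and no fill value specified in the flags, floodFill() writes the value 1 into the mask for every filled pixel, which is why the raw mask is hard to see on screen. If you wanted the skin regions written as 255 instead (so the mask is directly viewable with imshow()), the fill value can be packed into bits 8 to 15 of the flags. This is only a variation; the rest of this section keeps the default value of 1:
// Variation (not used by the alien filter): fill the mask with 255 instead of 1
// by packing the fill value into bits 8-15 of the flags.
const int visibleFlags = CONNECTED_COMPONENTS | FLOODFILL_FIXED_RANGE
        | FLOODFILL_MASK_ONLY | (255 << 8);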
The following image on the left-hand side shows the six flood fill locations (shown as circles), and the right-hand side of the image shows the external mask that is generated, where the skin is shown as gray and edges are shown as white. Note that the right-hand image was modified for this book so that skin pixels (of value 1) are clearly visible:
The mask image (shown on the right-hand side of the preceding image) now contains the following:
- Pixels of value 255 for the edge pixels
- Pixels of value 1 for the skin regions
- Pixels of value 0 for the rest
Meanwhile, edgeMask just contains edge pixels (as value 255). So, to keep just the skin pixels, we can subtract the edges from the mask:
mask -= edgeMask;
The mask variable now just contains 1s for skin pixels and 0s for non-skin pixels. To change the skin color and brightness of the original image, we can use the cv::add() function with the skin mask to increase the green component in the original BGR image:
auto Red = 0;
auto Green = 70;
auto Blue = 0;
add(smallImgBGR, CV_RGB(Red, Green, Blue), smallImgBGR, mask);
The following diagram shows the original image on the left and the final alien cartoon image on the right, where at least six parts of the face will now be green!
Notice that we have made the skin look green but also brighter (to look like an alien that glows in the dark). If you want to just change the skin color without making it brighter, you can use other color-changing methods, such as adding 70 to green while subtracting 70 from red and blue, or converting to the HSV color space using cvtColor(src, dst, CV_BGR2HSV_FULL) and adjusting the hue and saturation.
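For example, rough sketches of these two alternatives might look like the following; both are only illustrations of the ideas mentioned above, not code from the project:
// Sketch 1: shift the color toward green without adding much brightness, by
// adding to green while subtracting the same amount from red and blue.
add(smallImgBGR, CV_RGB(-70, 70, -70), smallImgBGR, mask);

// Sketch 2: adjust hue and saturation in the HSV color space instead.
Mat hsv;
cvtColor(smallImgBGR, hsv, CV_BGR2HSV_FULL);
// ... modify the H (hue) and S (saturation) channels of "hsv" here
// (for example, using split() and merge()), then convert back:
cvtColor(hsv, smallImgBGR, CV_HSV2BGR_FULL);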