Mastering OpenCV 3

Cartoonifier and Skin Changer for Raspberry Pi

This chapter will show how to write some image processing filters for desktop and for small embedded systems such as Raspberry Pi. First, we develop it for the desktop (in C/C++) and then port the project to Raspberry Pi, since this is the recommended scenario when developing for embedded devices. This chapter will cover the following topics:

How to convert a real-life image to a sketch drawing
How to convert to a painting and overlay the sketch to produce a cartoon
A scary evil mode to create bad characters instead of good characters
A basic skin detector and skin color changer, to give someone green alien skin
Finally, how to create an embedded system based on our desktop application

Note that an embedded system is basically a computer motherboard placed inside a product or device, designed to perform specific tasks, and Raspberry Pi is a very low-cost and popular motherboard for building an embedded system:

The preceding picture shows what you could make after this chapter: a battery-powered Raspberry Pi + screen you could wear to Comic Con, turning everyone into a cartoon!

We want to make the real-world camera frames automatically look like they are from a cartoon. The basic idea is to fill the flat parts with some color and then draw thick lines on the strong edges. In other words, the flat areas should become much more flat and the edges should become much more distinct. We will detect edges, smooth the flat areas, and draw enhanced edges back on top, to produce a cartoon or comic-book effect.

When developing an embedded computer vision system, it is a good idea to build a fully working desktop version first before porting it to an embedded system, since it is much easier to develop and debug a desktop program than an embedded system! So this chapter will begin with a complete Cartoonifier desktop program that you can create using your favorite IDE (for example, Visual Studio, XCode, Eclipse, QtCreator). After it is working properly on your desktop, the last section shows how to create an embedded system based on the desktop version. Many embedded projects require some custom code for the embedded system, such as to use different inputs and outputs or use some platform-specific code optimizations. However, for this chapter, we will actually be running identical code on the embedded system and the desktop, so we only need to create one project.

The application uses an OpenCV GUI window, initializes the camera, and with each camera frame it calls the function cartoonifyImage(), containing most of the code in this chapter. It then displays the processed image on the GUI window. This chapter will explain how to create the desktop application from scratch using a USB webcam, and the embedded system-based on the desktop application using a Raspberry Pi Camera Module. So first you would create a desktop project in your favorite IDE, with a main.cpp file to hold the GUI code given in the following sections such as the main loop, webcam functionality, and keyboard input, and you would create a cartoon.cpp file with the image processing operations with most of this chapter's code in a function called cartoonifyImage().

The full source code of this book is available at http://github.com/MasteringOpenCV/code.

Accessing the webcam

To access a computer's webcam or camera device, you can simply call the open() function on a cv::VideoCapture object (OpenCV's method of accessing your camera device), and pass 0 as the default camera ID number. Some computers have multiple cameras attached, or they do not work as default camera 0, so it is common practice to allow the user to pass the desired camera number as a command-line argument, in case they want to try camera 1, 2, or -1, for example. We will also try to set the camera resolution to 640x480 using cv::VideoCapture::set() to run faster on high-resolution cameras.

Depending on your camera model, driver, or system, OpenCV might not change the properties of your camera. It is not important for this project, so don't worry if it does not work with your webcam.

You can put this code in the main() function of your main.cpp file:

    int cameraNumber = 0; 
    if (argc> 1) 
    cameraNumber = atoi(argv[1]); 

    // Get access to the camera. 
    cv::VideoCapture camera; 
    camera.open(cameraNumber); 
    if (!camera.isOpened()) { 
      std::cerr<<"ERROR: Could not access the camera or video!"<< 
      std::endl; 
      exit(1); 
    } 

    // Try to set the camera resolution. 
    camera.set(cv::CV_CAP_PROP_FRAME_WIDTH, 640); 
    camera.set(cv::CV_CAP_PROP_FRAME_HEIGHT, 480);

After the webcam has been initialized, you can grab the current camera image as a cv::Mat object (OpenCV's image container). You can grab each camera frame by using the C++ streaming operator from your cv::VideoCapture object into a cv::Mat object, just like if you were getting input from a console.

OpenCV makes it very easy to capture frames from a video file (such as an AVI or MP4 file) or network stream instead of a webcam. Instead of passing an integer such as camera.open(0), pass a string such as camera.open("my_video.avi") and then grab frames just like it was a webcam. The source code provided with this book has an initCamera() function that opens a webcam, video file, or network stream.

Main camera processing loop for a desktop app

If you want to display a GUI window on the screen using OpenCV, you call the cv::namedWindow() function and then cv::imshow()function for each image, but you must also call cv::waitKey() once per frame, otherwise your windows will not update at all! Calling cv::waitKey(0) waits forever until the user hits a key in the window, but a positive number such as waitKey(20) or higher will wait for at least that many milliseconds.

Put this main loop in the main.cpp file, as the base of your real-time camera app:

     while (true) { 
      // Grab the next camera frame. 
      cv::Mat cameraFrame; 
      camera>>cameraFrame; 
      if (cameraFrame.empty()) { 
        std::cerr<<"ERROR: Couldn't grab a camera frame."<< 
        std::endl; 
        exit(1); 
      } 
      // Create a blank output image, that we will draw onto. 
      cv::Mat displayedFrame(cameraFrame.size(), cv::CV_8UC3); 

      // Run the cartoonifier filter on the camera frame. 
      cartoonifyImage(cameraFrame, displayedFrame); 

      // Display the processed image onto the screen. 
      imshow("Cartoonifier", displayedFrame); 

      // IMPORTANT: Wait for atleast 20 milliseconds, 
      // so that the image can be displayed on the screen! 
      // Also checks if a key was pressed in the GUI window. 
      // Note that it should be a "char" to support Linux. 
      char keypress = cv::waitKey(20);  // Needed to see anything! 
      if (keypress == 27) {   // Escape Key 
        // Quit the program! 
        break; 
      } 
    }//end while

Generating a black and white sketch

To obtain a sketch (black and white drawing) of the camera frame, we will use an edge detection filter, whereas to obtain a color painting, we will use an edge preserving filter (Bilateral filter) to further smoothen the flat regions while keeping edges intact. By overlaying the sketch drawing on top of the color painting, we obtain a cartoon effect, as shown earlier in the screenshot of the final app.

There are many different edge detection filters, such as Sobel, Scharr, Laplacian filters, or a Canny edge detector. We will use a Laplacian edge filter since it produces edges that look most similar to hand sketches compared to Sobel or Scharr, and are quite consistent compared to a Canny edge detector, which produces very clean line drawings but is affected more by random noise in the camera frames and therefore the line drawings would often change drastically between frames.

Nevertheless, we still need to reduce the noise in the image before we use a Laplacian edge filter. We will use a Median filter because it is good at removing noise while keeping edges sharp, but is not as slow as a Bilateral filter. Since Laplacian filters use grayscale images, we must convert from OpenCV's default BGR format to grayscale. In your empty cartoon.cpp file, put this code on the top so you can access OpenCV and STD C++ templates without typing cv:: and std:: everywhere:

    // Include OpenCV's C++ Interface 
    #include "opencv2/opencv.hpp" 

    using namespace cv; 
    using namespace std;

Put this and all remaining code in a cartoonifyImage() function in your cartoon.cpp file:

    Mat gray; 
    cvtColor(srcColor, gray, CV_BGR2GRAY); 
    const int MEDIAN_BLUR_FILTER_SIZE = 7; 
    medianBlur(gray, gray, MEDIAN_BLUR_FILTER_SIZE); 
    Mat edges; 
    const int LAPLACIAN_FILTER_SIZE = 5; 
    Laplacian(gray, edges, CV_8U, LAPLACIAN_FILTER_SIZE);

The Laplacian filter produces edges with varying brightness, so to make the edges look more like a sketch, we apply a binary threshold to make the edges either white or black:

    Mat mask; 
    const int EDGES_THRESHOLD = 80; 
    threshold(edges, mask, EDGES_THRESHOLD, 255, THRESH_BINARY_INV);

In the following figure, you see the original image (to the left) and the generated edge mask (to the right) that looks similar to a sketch drawing. After we generate a color painting (explained later), we also put this edge mask on top to have black line drawings:

Generating a color painting and a cartoon

A strong Bilateral filter smoothens flat regions while keeping edges sharp; and therefore, is great as an automatic cartoonifier or painting filter, except that it is extremely slow (that is, measured in seconds or even minutes, rather than milliseconds!). Therefore, we will use some tricks to obtain a nice cartoonifier, while still running in acceptable speed. The most important trick we can use is that we can perform Bilateral filtering at a lower resolution and it will still have a similar effect as a full resolution, but run much faster. Lets reduce the total number of pixels by four (for example, half width and half height):

    Size size = srcColor.size(); 
    Size smallSize; 
    smallSize.width = size.width/2; 
    smallSize.height = size.height/2; 
    Mat smallImg = Mat(smallSize, CV_8UC3); 
    resize(srcColor, smallImg, smallSize, 0,0, INTER_LINEAR);

Rather than applying a large Bilateral filter, we will apply many small Bilateral filters, to produce a strong cartoon effect in less time. We will truncate the filter (see the following figure) so that instead of performing a whole filter (for example, a filter size of 21x21, when the bell curve is 21 pixels wide), it just uses the minimum filter size needed for a convincing result (for example, with a filter size of just 9x9 even if the bell curve is 21 pixels wide). This truncated filter will apply the major part of the filter (gray area) without wasting time on the minor part of the filter (white area under the curve), so it will run several times faster:

Therefore, we have four parameters that control the Bilateral filter: color strength, positional strength, size, and repetition count. We need a temp Mat since the bilateralFilter()function can't overwrite its input (referred to as in-place processing), but we can apply one filter storing a temp Mat and another filter storing back the input:

    Mat tmp = Mat(smallSize, CV_8UC3); 
    int repetitions = 7;  // Repetitions for strong cartoon effect. 
    for (int i=0; i<repetitions; i++) { 
      int ksize = 9;     // Filter size. Has large effect on speed.  
      double sigmaColor = 9;    // Filter color strength. 
      double sigmaSpace = 7;    // Spatial strength. Affects speed. 
      bilateralFilter(smallImg, tmp, ksize, sigmaColor, sigmaSpace);    
      bilateralFilter(tmp, smallImg, ksize, sigmaColor, sigmaSpace); 
    }

Remember that this was applied to the shrunken image, so we need to expand the image back to the original size. Then we can overlay the edge mask that we found earlier. To overlay the edge mask sketch onto the Bilateral filter painting (left side of the following figure), we can start with a black background and copy the painting pixels that aren't edges in the sketch mask:

    Mat bigImg; 
    resize(smallImg, bigImg, size, 0,0, INTER_LINEAR); 
    dst.setTo(0); 
    bigImg.copyTo(dst, mask);

The result is a cartoon version of the original photo, as shown on the right side of the following figure, where the sketch mask is overlaid on the painting:

Generating an evil mode using edge filters

Cartoons and comics always have both good and bad characters. With the right combination of edge filters, a scary image can be generated from the most innocent looking people! The trick is to use a small-edge filter that will find many edges all over the image, then merge the edges using a small Median filter.

We will perform this on a grayscale image with some noise reduction, so the preceding code for converting the original image to grayscale and applying a 7x7 Median filter should still be used (the first image in the following figure shows the output of the grayscale Median blur). Instead of following it with a Laplacian filter and Binary threshold, we can get a more scary look if we apply a 3x3 Scharr gradient filter along x and y (second image in the figure), then a binary threshold with a very low cutoff (third image in the figure),and a 3x3 Median blur, producing the final evil mask (fourth image in the figure):

    Mat gray;
    cvtColor(srcColor, gray, CV_BGR2GRAY);
    const int MEDIAN_BLUR_FILTER_SIZE = 7;
    medianBlur(gray, gray, MEDIAN_BLUR_FILTER_SIZE);
    Mat edges, edges2;
    Scharr(srcGray, edges, CV_8U, 1, 0);
    Scharr(srcGray, edges2, CV_8U, 1, 0, -1);
    edges += edges2;
    // Combine the x & y edges together.
    const int EVIL_EDGE_THRESHOLD = 12
    threshold(edges, mask, EVIL_EDGE_THRESHOLD, 255,
    THRESH_BINARY_INV);
    medianBlur(mask, mask, 3)

Now that we have an evil mask, we can overlay this mask onto the cartoonified painting image like we did with the regular sketch edge mask. The final result is shown on the right side of the following figure:

Generating an alien mode using skin detection

Now that we have a sketch mode, a cartoon mode (painting + sketch mask), and an evil mode (painting + evil mask), for fun, let's try something more complex: an alien mode, by detecting the skin regions of the face and then changing the skin color to green.

Skin detection algorithm

There are many different techniques used for detecting skin regions, from simple color thresholds using RGB (Red-Green-Blue), HSV (Hue-Saturation-Brightness) values, or color histogram calculation and re-projection, to complex machine-learning algorithms of mixture models that need camera calibration in the CIELab color-space and offline training with many sample faces, and so on. But even the complex methods don't necessarily work robustly across various camera and lighting conditions and skin types. Since we want our skin detection to run on an embedded device, without any calibration or training, and we are just using skin detection for a fun image filter, it is sufficient for us to use a simple skin detection method. However, the color responses from the tiny camera sensor in the Raspberry Pi Camera Module tend to vary significantly, and we want to support skin detection for people of any skin color but without any calibration, so we need something more robust than simple color thresholds.

For example, a simple HSV skin detector can treat any pixel as skin if its hue color is fairly red, and saturation is fairly high but not extremely high, and its brightness is not too dark or extremely bright. But cameras in mobile phones or Raspberry Pi Camera Modules often have bad white balancing, therefore a person's skin might look slightly blue instead of red, and so on, and this would be a major problem for simple HSV thresholding.

A more robust solution is to perform face detection with a Haar or LBP cascade classifier (shown in Chapter 6, Face Recognition using Eigenfaces or Fisherfaces), then look at the range of colors for the pixels in the middle of the detected face, since you know that those pixels should be skin pixels of the actual person. You could then scan the whole image or nearby region for pixels of a similar color as the center of the face. This has the advantage that it is very likely to find at least some of the true skin region of any detected person, no matter what their skin color is or even if their skin appears somewhat blueish or redish in the camera image.

Unfortunately, face detection using cascade classifiers is quite slow on current embedded devices, so that method might be less ideal for some real-time embedded applications. On the other hand, we can take advantage of the fact that for mobile apps and some embedded systems, it can be expected that the user will be facing the camera directly from a very close distance, so it can be reasonable to ask the user to place their face at a specific location and distance, rather than try to detect the location and size of their face. This is the basis of many mobile phone apps, where the app asks the user to place their face at a certain position or perhaps to manually drag points on the screen to show where the corners of their face are in a photo. So let's simply draw the outline of a face in the center of the screen, and ask the user to move their face to the shown position and size.

Showing the user where to put their face

When the alien mode is first started, we will draw the face outline on top of the camera frame so the user knows where to put their face. We will draw a big ellipse covering 70% of the image height, with a fixed aspect ratio of 0.72, so that the face will not become too skinny or fat depending on the aspect ratio of the camera:

    // Draw the color face onto a black background. 
    Mat faceOutline = Mat::zeros(size, CV_8UC3); 
    Scalar color = CV_RGB(255,255,0);    // Yellow. 
    int thickness = 4; 

    // Use 70% of the screen height as the face height. 
    int sw = size.width; 
    int sh = size.height; 
    int faceH = sh/2 * 70/100;  // "faceH" is radius of the ellipse. 

    // Scale the width to be the same nice shape for any screen width.   
    int faceW = faceH * 72/100; 
    // Draw the face outline. 
    ellipse(faceOutline, Point(sw/2, sh/2), Size(faceW, faceH), 
        0, 0, 360, color, thickness, CV_AA);

To make it more obvious that it is a face, let's also draw two eye outlines. Rather than drawing an eye as an ellipse, we can give it a bit more realism (see the following figure) by drawing a truncated ellipse for the top of the eye and a truncated ellipse for the bottom of the eye, because we can specify the start and end angles when drawing with the ellipse() function:

    // Draw the eye outlines, as 2 arcs per eye. 
    int eyeW = faceW * 23/100; 
    int eyeH = faceH * 11/100; 
    int eyeX = faceW * 48/100; 
    int eyeY = faceH * 13/100; 
    Size eyeSize = Size(eyeW, eyeH); 

    // Set the angle and shift for the eye half ellipses. 
    int eyeA = 15; // angle in degrees. 
    int eyeYshift = 11; 

    // Draw the top of the right eye. 
    ellipse(faceOutline, Point(sw/2 - eyeX, sh/2 -eyeY), 
    eyeSize, 0, 180+eyeA, 360-eyeA, color, thickness, CV_AA); 

    // Draw the bottom of the right eye. 
    ellipse(faceOutline, Point(sw/2 - eyeX, sh/2 - eyeY-eyeYshift), 
    eyeSize, 0, 0+eyeA, 180-eyeA, color, thickness, CV_AA); 

    // Draw the top of the left eye. 
    ellipse(faceOutline, Point(sw/2 + eyeX, sh/2 - eyeY), 
    eyeSize, 0, 180+eyeA, 360-eyeA, color, thickness, CV_AA); 

    // Draw the bottom of the left eye. 
    ellipse(faceOutline, Point(sw/2 + eyeX, sh/2 - eyeY-eyeYshift), 
      eyeSize, 0, 0+eyeA, 180-eyeA, color, thickness, CV_AA);

We can do the same to draw the bottom lip of the mouth:

    // Draw the bottom lip of the mouth. 
    int mouthY = faceH * 48/100; 
    int mouthW = faceW * 45/100; 
    int mouthH = faceH * 6/100; 
    ellipse(faceOutline, Point(sw/2, sh/2 + mouthY), Size(mouthW, 
      mouthH), 0, 0, 180, color, thickness, CV_AA);

To make it even more obvious that the user should put their face where shown, let's write a message on the screen!

    // Draw anti-aliased text. 
    int fontFace = FONT_HERSHEY_COMPLEX; 
    float fontScale = 1.0f; 
    int fontThickness = 2; 
    char *szMsg = "Put your face here"; 
    putText(faceOutline, szMsg, Point(sw * 23/100, sh * 10/100), 
    fontFace, fontScale, color, fontThickness, CV_AA);

Now that we have the face outline drawn, we can overlay it onto the displayed image by using alpha blending, to combine the cartoonified image with this drawn outline:

    addWeighted(dst, 1.0, faceOutline, 0.7, 0, dst, CV_8UC3);

This results in the outline in the following figure, showing the user where to put their face, so we don't have to detect the face location:

Implementation of the skin color changer

Rather than detecting the skin color and then the region with that skin color, we can use OpenCV's floodFill() function, which is similar to the bucket fill tool in many image editing software. We know that the regions in the middle of the screen should be skin pixels (since we asked the user to put their face in the middle), so to change the whole face to have green skin, we can just apply a green flood fill on the center pixel, which will always color some parts of the face green. In reality, the color, saturation, and brightness is likely to be different in different parts of the face, so a floodfill will rarely cover all the skin pixels of a face unless the threshold is so low that it also covers unwanted pixels outside of the face. So instead of applying a single flood fill in the center of the image, let's apply a flood fill on six different points around the face that should be skin pixels.

A nice feature of OpenCV's floodFill() is that it can draw the floodfill into an external image rather than modify the input image. So this feature can give us a mask image for adjusting the color of the skin pixels without necessarily changing the brightness or saturation, producing a more realistic image than if all the skin pixels became an identical green pixel(losing significant face detail).

Skin color changing does not work so well in the RGB color-space, because you want to allow brightness to vary in the face but not allow skin color to vary much, and RGB does not separate brightness from color. One solution is to use the HSV color-space, since it separates brightness from the color (Hue) as well as the corlorful-ness (Saturation). Unfortunately, HSV wraps the Hue value around red, and since skin is mostly red, it means that you need to work both with Hue < 10% and Hue > 90%, since these are both red. So, instead we will use the Y'CrCb color-space (the variant of YUV that is in OpenCV), since it separates brightness from color, and only has a single range of values for typical skin color rather than two. Note that most cameras, images, and videos actually use some type of YUV as their color-space before conversion to RGB, so in many cases you can get a YUV image free without converting it yourself.

Since we want our alien mode to look like a cartoon, we will apply the alien filter after the image has already been cartoonified. In other words, we have access to the shrunken color image produced by the Bilateral filter, and access to the full-sized edge mask. Skin detection often works better at low resolutions, since it is the equivalent of analyzing the average value of each high-resolution pixel's neighbors (or the low-frequency signal instead of the high-frequency noisy signal). So let's work at the same shrunk scale as the Bilateral filter (half-width and half-height). Let's convert the painting image to YUV:

    Mat yuv = Mat(smallSize, CV_8UC3); 
    cvtColor(smallImg, yuv, CV_BGR2YCrCb);

We also need to shrink the edge mask so it is at the same scale as the painting image. There is a complication with OpenCV's floodFill() function, when storing to a separate mask image, in that the mask should have a 1 pixel border around the whole image, so if the input image is WxH pixels in size then the separate mask image should be (W+2) x (H+2) pixels in size. But the floodFill() function also allows us to initialize the mask with edges, that the flood fill algorithm will ensure it does not cross. Let's use this feature, in the hope that it helps prevent the flood fill from extending outside of the face. So we need to provide two mask images: one is the edge mask of WxH in size, and the other image is the exact same edge mask but (W+2)x(H+2) in size because it should include a border around the image. It is possible to have multiple cv::Mat objects (or headers) referencing the same data, or even to have a cv::Mat object that references a sub-region of another cv::Mat image. So, instead of allocating two separate images and copying the edge mask pixels across, let's allocate a single mask image including the border, and create an extra cv::Mat header of WxH (that just references the region-of-interest in the flood fill mask without the border). In other words, there is just one array of pixels of size (W+2)x(H+2) but two cv::Mat objects, where one is referencing the whole (W+2)x(H+2) image and the other is referencing the WxH region in the middle of that image:

    int sw = smallSize.width; 
    int sh = smallSize.height; 
    Mat mask, maskPlusBorder; 
    maskPlusBorder = Mat::zeros(sh+2, sw+2, CV_8UC1);
    mask = maskPlusBorder(Rect(1,1,sw,sh));
    // mask is now in maskPlusBorder.
    resize(edges, mask, smallSize);     // Put edges in both of them.

The edge mask (shown on the left of the following figure) is full of both strong and weak edges, but we only want strong edges, so we will apply a binary threshold (resulting in the middle image in the following figure). To join some gaps between edges, we will then combine the morphological operators dilate() and erode() to remove some gaps (also referred to as the close operator), resulting in the right of the figure:

    const int EDGES_THRESHOLD = 80; 
    threshold(mask, mask, EDGES_THRESHOLD, 255, THRESH_BINARY); 
    dilate(mask, mask, Mat()); 
    erode(mask, mask, Mat());

As mentioned earlier, we want to apply flood fills in numerous points around the face, to make sure we include the various colors and shades of the whole face. Let's choose six points around the nose, cheeks, and forehead, as shown on the left-hand side of the following figure. Note that these values are dependent on the face outline drawn earlier:

    int const NUM_SKIN_POINTS = 6; 
    Point skinPts[NUM_SKIN_POINTS]; 
    skinPts[0] = Point(sw/2,          sh/2 - sh/6); 
    skinPts[1] = Point(sw/2 - sw/11,  sh/2 - sh/6); 
    skinPts[2] = Point(sw/2 + sw/11,  sh/2 - sh/6); 
    skinPts[3] = Point(sw/2,          sh/2 + sh/16); 
    skinPts[4] = Point(sw/2 - sw/9,   sh/2 + sh/16); 
    skinPts[5] = Point(sw/2 + sw/9,   sh/2 + sh/16);

Now we just need to find some good lower and upper bounds for the flood fill. Remember that this is being performed in Y'CrCb color-space, so we basically decide how much the brightness can vary, how much the red component can vary, and how much the blue component can vary. We want to allow the brightness to vary a lot, to include shadows as well as highlights and reflections, but we don't want the colors to vary much at all:

    const int LOWER_Y = 60; 
    const int UPPER_Y = 80; 
    const int LOWER_Cr = 25; 
    const int UPPER_Cr = 15; 
    const int LOWER_Cb = 20; 
    const int UPPER_Cb = 15; 
    Scalar lowerDiff = Scalar(LOWER_Y, LOWER_Cr, LOWER_Cb); 
    Scalar upperDiff = Scalar(UPPER_Y, UPPER_Cr, UPPER_Cb);

We will use the floodFill() function with its default flags, except that we want to store to an external mask, so we must specify FLOODFILL_MASK_ONLY:

    const int CONNECTED_COMPONENTS = 4;  // To fill diagonally, use 8.       
    const int flags = CONNECTED_COMPONENTS | FLOODFILL_FIXED_RANGE  
      | FLOODFILL_MASK_ONLY; 
    Mat edgeMask = mask.clone();    // Keep a copy of the edge mask. 
    // "maskPlusBorder" is initialized with edges to block floodFill(). 
    for (int i = 0; i < NUM_SKIN_POINTS; i++) { 
      floodFill(yuv, maskPlusBorder, skinPts[i], Scalar(), NULL, 
        lowerDiff, upperDiff, flags); 
    }

The following figure on the left-side shows the six flood fill locations (shown as circles), and the right-side of the figure shows the external mask that is generated, where skin is shown as gray and edges are shown as white. Note that the right-side image was modified for this book so that skin pixels (of value 1) are clearly visible:

The mask image (shown on the right side of the previous figure) now contains the following:

Pixels of value 255 for the edge pixels
Pixels of value 1 for the skin regions
Pixels of value 0 for the rest

Meanwhile, edgeMask just contains edge pixels (as value 255). So to get just the skin pixels, we can remove the edges from it:

    mask -= edgeMask;

The mask variable now just contains 1's for skin pixels and 0's for non-skin pixels. To change the skin color and brightness of the original image, we can use the cv::add()function with the skin mask, to increase the green component in the original BGR image:

    int Red = 0; 
    int Green = 70; 
    int Blue = 0; 
    add(smallImgBGR, CV_RGB(Red, Green, Blue), smallImgBGR, mask);

The following figure shows the original image on the left, and the final alien cartoon image on the right, where at least six parts of the face will now be green!

Notice that we have made the skin look green but also brighter (to look like an alien that glows in the dark). If you want to just change the skin color without making it brighter, you can use other color changing methods, such as adding 70 to green while subtracting 70 from red and blue, or convert to HSV color space using cvtColor(src, dst, "CV_BGR2HSV_FULL"), and adjust the hue and saturation.

Reducing the random pepper noise from the sketch image

Most of the tiny cameras in smartphones, RPi Camera Modules, and some webcams have significant image noise. This is normally acceptable, but it has a large effect on our 5x5 Laplacian edge filter. The edge mask (shown as the sketch mode) will often have thousands of small blobs of black pixels called pepper noise, made of several black pixels next to each other in a white background. We are already using a Median filter, which is usually strong enough to remove pepper noise, but in our case it may not be strong enough. Our edge mask is mostly a pure white background (value of 255) with some black edges (value of 0) and the dots of noise (also values of 0). We could use a standard closing morphological operator but it will remove a lot of edges. So instead, we will apply a custom filter that removes small black regions that are surrounded completely by white pixels. This will remove a lot of noise while having little effect on actual edges.

We will scan the image for black pixels, and at each black pixel, we'll check the border of the 5x5 square around it to see if all the 5x5 border pixels are white. If they are all white then we know we have a small island of black noise, so then we fill the whole block with white pixels to remove the black island. For simplicity in our 5x5 filter, we will ignore the two border pixels around the image and leave them as they are.

The following figure shows the original image from an Android tablet on the left-side, with a sketch mode in the center, showing small black dots of pepper noise, and the result of our pepper-noise removal shown on the right-side, where the skin looks cleaner:

The following code can be named the removePepperNoise()function to edit the image in-place for simplicity:

    void removePepperNoise(Mat &mask) 
    { 
      for (int y=2; y<mask.rows-2; y++) { 
        // Get access to each of the 5 rows near this pixel. 
        uchar *pUp2 = mask.ptr(y-2); 
        uchar *pUp1 = mask.ptr(y-1); 
        uchar *pThis = mask.ptr(y); 
        uchar *pDown1 = mask.ptr(y+1); 
        uchar *pDown2 = mask.ptr(y+2); 

        // Skip the first (and last) 2 pixels on each row. 
        pThis += 2; 
        pUp1 += 2; 
        pUp2 += 2; 
        pDown1 += 2; 
        pDown2 += 2; 
        for (int x=2; x<mask.cols-2; x++) { 
          uchar value = *pThis;  // Get pixel value (0 or 255). 
          // Check if it's a black pixel surrounded bywhite 
          // pixels (ie: whether it is an "island" of black). 
          if (value == 0) { 
            bool above, left, below, right, surroundings; 
            above = *(pUp2 - 2) && *(pUp2 - 1) && *(pUp2) && 
            *(pUp2 + 1) && *(pUp2 + 2); 
            left = *(pUp1 - 2) && *(pThis - 2) && *(pDown1 - 2); 
            below = *(pDown2 - 2) && *(pDown2 - 1) && *(pDown2) 
              &&*(pDown2 + 1) && *(pDown2 + 2); 
            right = *(pUp1 + 2) && *(pThis + 2) && *(pDown1 + 2); 
            surroundings = above && left && below && right; 
            if (surroundings == true) { 
              // Fill the whole 5x5 block as white. Since we  
              // knowthe 5x5 borders are already white, we just 
              // need tofill the 3x3 inner region. 
              *(pUp1 - 1) = 255; 
              *(pUp1 + 0) = 255; 
              *(pUp1 + 1) = 255; 
              *(pThis - 1) = 255; 
              *(pThis + 0) = 255; 
              *(pThis + 1) = 255; 
              *(pDown1 - 1) = 255; 
              *(pDown1 + 0) = 255; 
              *(pDown1 + 1) = 255; 
              // Since we just covered the whole 5x5 block with 
              // white, we know the next 2 pixels won't be 
              // black,so skip the next 2 pixels on the right. 
              pThis += 2; 
              pUp1 += 2; 
              pUp2 += 2; 
              pDown1 += 2; 
              pDown2 += 2; 
            } 
          } 
          // Move to the next pixel on the right. 
          pThis++; 
          pUp1++; 
          pUp2++; 
          pDown1++; 
          pDown2++; 
        } 
      } 
    }

That's all! Run the app in the different modes until you are ready to port it to embedded!

Porting from desktop to embedded

Now that the program works on desktop, we can make an embedded system from it. The details given here are specific to Raspberry Pi, but similar steps apply when developing for other embedded Linux systems such as BeagleBone, ODROID, Olimex, Jetson, and so on.

There are several different options for running our code on an embedded system, each with some advantages and disadvantages in different scenarios.

There are two common methods for compiling the code for an embedded device:

Copy the source code from the desktop onto the device and compile it directly onboard the device. This is often referred to as native compilation, since we are compiling our code natively on the same system that it will eventually run on.
Compile all the code on the desktop but using special methods to generate code for the device, and then you copy the final executable program onto the device. This is often referred to as cross-compilation since you need a special compiler that knows how to generate code for other types of CPUs.

Cross-compilation is often significantly harder to configure than native compilation, especially if you are using many shared libraries, but since your desktop is usually a lot faster than your embedded device, cross-compilation is often much faster at compiling large projects. If you expect to be compiling your project hundreds of times so as to work on it for months, and your device is quite slow compared to your desktop, such as the Raspberry Pi 1 or Raspberry Pi Zero that are very slow compared to a desktop, then cross-compilation is a good idea. But in most cases, especially for small simple projects, you should just stick with native compilation since it is easier.

Note that all the libraries used by your project will also need to be compiled for the device, so you will need to compile OpenCV for your device. Natively compiling OpenCV on a Raspberry Pi 1 can take hours, whereas, cross-compiling OpenCV on a desktop might take just 15 minutes. But you usually only need to compile OpenCV once and then you'll have it for all your projects, so it is still worth sticking with native compilation of your project (including native compilation of OpenCV) in most cases.

There are also several options for how to run the code on an embedded system:

Use the same input and output methods you used on desktop, such as the same video files or USB webcam or keyboard as input, and display text or graphics to an HDMI monitor in the same way you were doing on desktop.
Use special devices for input and output. For example, instead of sitting at a desk using a USB webcam and keyboard as input and displaying the output to a desktop monitor, you could use the special Raspberry Pi Camera Module for video input, use custom GPIO push-buttons or sensors for input, and use a 7-inch MIPI DSI screen or GPIO LED lights as the output, and then by powering it all with a common portable USB charger, you can be wearing the whole computer platform in your backpack or attach it on your bicycle!
Another option is to stream data in or out of the embedded device to other computers, or even use one device to stream out the camera data and one device to use that data. For example, you can use the Gstreamer framework to configure the Raspberry Pi to stream H.264 compressed video from its Camera Module onto the Ethernet network or through Wi-Fi, so that a powerful PC or server rack on the local network or Amazon AWS cloud-computing services can process the video stream somewhere else. This method allows a small and cheap camera device to be used in a complex project requiring large processing resources located somewhere else.

If you do wish to perform computer vision onboard the device, beware that some low-cost embedded devices such as Raspberry Pi 1, Raspberry Pi Zero, and BeagleBone Black have significantly slower computing power than desktops or even cheap netbooks or smartphones, perhaps 10-50 times slower than your desktop, so depending on your application you might need a powerful embedded device or to stream video to a separate computer as mentioned previously. If you don't need much computing power (for example, you only need to process one frame every 2 seconds, or you only need to use 160x120 image resolution), then a Raspberry Pi Zero running some Computer Vision onboard might be fast enough for your requirements. But many Computer Vision systems need far more computing power, and so if you want to perform Computer Vision onboard the device, you will often want to use a much faster device with a CPU in the range of 2 GHz, such as a Raspberry Pi 3, ODROID-XU4, or Jetson TK1.

Equipment setup to develop code for an embedded device

Let's begin by keeping it as simple as possible, by using a USB keyboard and mouse and a HDMI monitor just like our desktop system, compiling the code natively on the device, and running our code on the device. Our first step will be to copy the code onto the device, install the build tools, and compile OpenCV and our source code on the embedded system.

Many embedded devices such as Raspberry Pi have an HDMI port and at least one USB port. Therefore, the easiest way to start using an embedded device is to plug in a HDMI monitor and USB keyboard and mouse for the device, to configure settings and see output, while doing the code development and testing using your desktop machine. If you have a spare HDMI monitor, plug that into the device, but if you don't have a spare HDMI monitor, you might consider buying a small HDMI screen just for your embedded device.

Also, if you don't have a spare USB keyboard and mouse, you might consider buying a wireless keyboard and mouse that has a single USB wireless dongle, so you only use up a single USB port for both the keyboard and mouse. Many embedded devices use a 5V power supply, but they usually need more power (electrical current) than a desktop or laptop will provide in its USB port. So you should obtain either a separate 5V USB charger (atleast 1.5 Amps, ideally 2.5 Amps), or a portable USB battery charger that can provide atleast 1.5 Amps of output current. Your device might only use 0.5 Amps most of the time, but there will be occasional times when it needs over 1 Amps, so it's important to use a power supply that is rated for at least 1.5 Amps or more, otherwise your device will occasionally reboot or some hardware could behave strangely at important times or the filesystem could become corrupt and you lose your files! A 1 Amp supply might be good enough if you don't use cameras or accessories, but 2.0-2.5 Amps is safer.

For example, the following photographs show a convenient setup containing a Raspberry Pi 3, a good quality 8 GB micro-SD card for $10 (http://ebay.to/2ayp6Bo), a 5-inch HDMI resistive-touchscreen for $30-$45 (http://bit.ly/2aHQO2G), a wireless USB keyboard and mouse for $30 (http://ebay.to/2aN2oXi), a 5V 2.5A power supply for $5 (http://ebay.to/2aCBLVK), a USB webcam such as the very fast PS3 Eye for just $5 (http://ebay.to/2aVWCUS), a Raspberry Pi Camera Module v1 or v2 for $15-$30 (http://bit.ly/2aF9PxD), and an Ethernet cable for $2 (http://ebay.to/2aznnjd),connecting the Raspberry Pi into the same LAN network as your development PC or laptop. Notice that this HDMI screen is designed specifically for the Raspberry Pi, since the screen plugs directly into the Raspberry Pi below it, and has a HDMI male-to-male adapter (shown in the right-hand side photo) for the Raspberry Pi so you don't need an HDMI cable, whereas other screens may require an HDMI cable (http://ebay.to/2aW4Fko) or MIPI DSI or SPI cable. Also note that some screens and touch panels need configuration before they will work, whereas most HDMI screens should work without any configuration:

Notice the black USB webcam (on the far left of the LCD), the Raspberry Pi Camera Module (green and black board sitting on the top-left corner of the LCD), Raspberry Pi board (underneath the LCD), HDMI adapter (connecting the LCD to the Raspberry Pi below it), a blue Ethernet cable (plugged into a router), a small USB wireless keyboard and mouse dongle, and a micro-USB power cable (plugged into a 5V 2.5A power supply).

Configuring a new Raspberry Pi

The following steps are specific to Raspberry Pi (also referred to as an RPi), so if you are using a different embedded device or you want a different type of setup, search the Web about how to setup your board. To setup an RPi 1, 2, or 3 (including their variants such as RPi Zero, RPi2B, 3B, and so on, and RPi 1A+ if you plug in a USB Ethernet dongle):

Get a fairly new, good-quality micro-SD card of at least 8 GB. If you use a cheap micro-SD card or an old micro-SD card that you already used many times before and it has degraded in quality, it might not be reliable enough to boot the RPi, so if you have trouble booting the RPi, you should try a good quality Class 10 micro-SD card (such as SanDisk Ultra or better) that says it handles at least 45 MB/s or can handle 4K video.
Download and burn the latest Raspbian IMG (not NOOBS) to the micro-SD card. Note that burning an IMG is different to simply copying the file to SD. Visit https://www.raspberrypi.org/documentation/installation/installing-images/ and follow the instructions for your desktop's OS, to burn Raspbian to a micro-SD card. Be aware that you will lose any files that were previously on the card.
Plug a USB keyboard and mouse and HDMI display into the RPi, so you can easily run some commands and see the output.
Plug the RPi into a 5V USB power supply with atleast 1.5A, ideally 2.5A or higher. Computer USB ports aren't powerful enough.
You should see many pages of text scrolling while it is booting up Raspbian Linux, then it should be ready after 1 or 2 minutes.
If, after booting, it's just showing a black console screen with some text (such as if you downloaded Raspbian Lite), you are at the text-only login prompt. Log in by typing pi as the username and then hit Enter. Then type raspberry as the password and hit Enter again.
Or if it booted to the graphical display, click on the black Terminal icon at the top to open a shell (Command Prompt).
Initialize some settings in your RPi:

Type sudo raspi-config and hit Enter (see the following screenshot).
First, run Expand Filesystem and then finish and reboot your device, so the Raspberry Pi can use the whole micro-SD card.
If you use a normal (US) keyboard, not a British keyboard, in Internationalization Options, change to Generic 104-key keyboard, Other, English (US), and then for the AltGr and similar questions just hit Enter unless you are use a special keyboard.
In Enable Camera, enable the RPi Camera Module.
In Overclock Options, set to RPi2 or similar so the device runs faster (but generates more heat).
In Advanced Options, enable SSH server.
In Advanced Options, if you are using Raspberry Pi 2 or 3, change Memory Split to 256MB so the GPU has plenty of RAM for video processing. For Raspberry Pi 1 or Zero, use 64 MB or the default.
Finish then Reboot the device.

(Optional) Delete Wolfram, to save 600 MB of space on your SD card:

      sudo apt-get purge -y wolfram-engine

It can be installed back using sudo apt-get install wolfram-engine

To see the remaining space on your SD card, run df -h | head -2

Assuming you plugged the RPi into your Internet router, it should already have Internet access. So update your RPi to the latest RPi firmware, software locations, OS, and software. Warning: Many Raspberry Pi tutorials say you should run sudo rpi-update; however, in recent years it's no longer a good idea to run rpi-update since it can give you an unstable system or firmware. The following instructions will update your Raspberry Pi to have stable software and firmware (note that these commands might take up to 1 hour):

      sudo apt-get -y update
      sudo apt-get -y upgrade
      sudo apt-get -y dist-upgrade
      sudo reboot

Find the IP address of the device:

      hostname -I

Try accessing the device from your desktop.

For example, assuming the device's IP address is 192.168.2.101.

On a Linux desktop:

      ssh-X pi@192.168.2.101

Or on a Windows desktop:

Download, install, and run PuTTY
Then in PuTTY, connect to the IP address (192.168.2.101),
As user pi with password raspberry

(Optional) If you want your Command Prompt to be a different color than the commands and show the error value after each command:

      nano ~/.bashrc

Add this line to the bottom:

      PS1="[e[0;44m]u@h: w ($?) $[e[0m] "

Save the file (hit Ctrl + X, then hit Y, and then hit Enter).

Start using the new settings:

      source ~/.bashrc

To disable the screensaver/screen blank power saving feature in Raspbian from turning off your screen on idle:

      sudo nano /etc/lightdm/lightdm.conf

Look for the line that says #xserver-command=X (jump to line 87 by pressing Alt + G and then typing 87 and hitting Enter).
Change it to: xserver-command=X -s 0 dpms

Save the file (hit Ctrl + X then hit Y then hit Enter).

    sudo reboot

You should be ready to start developing on the device now!

Installing OpenCV on an embedded device

There is a very easy way to install OpenCV and all its dependencies on a Debian-based embedded device such as Raspberry Pi:

    sudo apt-get install libopencv-dev

However, that might install an old version of OpenCV from 1 or 2 years ago.

To install the latest version of OpenCV on an embedded device such as Raspberry Pi, we need to build OpenCV from the source code. First we install a compiler and build system, then libraries for OpenCV to use, and finally OpenCV itself. Note that the steps for compiling OpenCV from source on Linux is the same whether you are compiling for desktop or for embedded. A Linux script install_opencv_from_source.sh is provided with this book; it is recommended you copy the file onto your Raspberry Pi (for example, with a USB flash stick) and run the script to download, build, and install OpenCV including potential multi-core CPU and ARM NEON SIMD optimizations (depending on hardware support):

    chmod +x install_opencv_from_source.sh
./install_opencv_from_source.sh

The script will stop if there is any error; for example, if you don't have Internet access, or a dependency package conflicts with something else you already installed. If the script stops with an error, try using info on the Web to solve that error, then run the script again. The script will quickly check all the previous steps and then continue from where it finished last time. Note that it will take between 20 minutes to 12 hours depending on your hardware and software!

It's highly recommended to build and run a few OpenCV samples every time you've installed OpenCV, so when you have problems building your own code, at least you will know whether the problem is the OpenCV installation or a problem with your code.

Let's try to build the simple edge sample program. If we try the same Linux command to build it from OpenCV 2, we get a build error:

cd ~/opencv-3.*/samples/cpp
g++ edge.cpp -lopencv_core -lopencv_imgproc -lopencv_highgui 
-o edge
/usr/bin/ld: /tmp/ccDqLWSz.o: undefined reference to symbol '_ZN2cv6imreadERKNS_6StringEi'
/usr/local/lib/libopencv_imgcodecs.so.3.1: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status

The second to last line of that error message tells us that a library was missing from the command line, so we simply need to add -lopencv_imgcodecs in our command next to the other OpenCV libraries we linked to. Now you know how to fix the problem anytime you are compiling an OpenCV 3 program and you see that error message. So let's do it correctly:

cd ~/opencv-3.*/samples/cpp
g++ edge.cpp -lopencv_core -lopencv_imgproc -lopencv_highgui 
-lopencv_imgcodecs -o edge

It worked! So now you can run the program:

    ./edge

Hit Ctrl + C on your keyboard to quit the program. Note that the edge program might crash if you try running the command in an SSH terminal and you don't redirect the window to display on the device's LCD screen. So if you are using SSH to remotely run the program, add DISPLAY=:0 before your command:

    DISPLAY=:0 ./edge

You should also plug a USB webcam into the device and test that it works:

g++ starter_video.cpp -lopencv_core -lopencv_imgproc
-lopencv_highgui -lopencv_imgcodecs -lopencv_videoio \
-o starter_video
DISPLAY=:0 ./starter_video 0

Note: If you don't have a USB webcam, you can test using a video file:

    DISPLAY=:0 ./starter_video ../data/768x576.avi

Now that OpenCV is successfully installed on your device, you can run the Cartoonifier applications we developed earlier. Copy the Cartoonifier folder onto the device (for example, by using a USB flash stick, or using scp to copy files over the network). Then build the code just like you did for desktop:

cd ~/Cartoonifier
export OpenCV_DIR="~/opencv-3.1.0/build"
mkdir build
cd build
cmake -D OpenCV_DIR=$OpenCV_DIR ..
make

And run it:

DISPLAY=:0 ./Cartoonifier

Using the Raspberry Pi Camera Module

While using a USB webcam on Raspberry Pi has the convenience of supporting identical behavior and code on desktop as on embedded device, you might consider using one of the official Raspberry Pi Camera Modules (referred to as the RPi Cams). They have some advantages and disadvantages over USB webcams.

The RPi Cams use the special MIPI CSI camera format, designed for smartphone cameras to use less power. They have smaller physical size, faster bandwidth, higher resolutions, higher frame rates, and reduced latency, compared to USB. Most USB 2.0 webcams can only deliver 640x480 or 1280x720 30 FPS video, since USB 2.0 is too slow for anything higher (except for some expensive USB webcams that perform onboard video compression) and USB 3.0 is still too expensive. Whereas, smartphone cameras (including the RPi Cams) can often deliver 1920x1080 30 FPS or even Ultra HD/4K resolutions. The RPi Cam v1 can in fact deliver upto 2592x1944 15 FPS or 1920x1080 30 FPS video even on a $5 Raspberry Pi Zero, thanks to the use of MIPI CSI for the camera and a compatible video processing ISP and GPU hardware inside the Raspberry Pi. The RPi Cams also support 640x480 in 90 FPS mode (such as for slow-motion capture), and this is quite useful for real-time computer vision so you can see very small movements in each frame, rather than large movements that are harder to analyze.

However, the RPi Cam is a plain circuit board that is highly sensitive to electrical interference, static electricity, or physical damage (simply touching the small orange flat cable with your finger can cause video interference or even permanently damage your camera!). The big flat white cable is far less sensitive but it is still very sensitive to electrical noise or physical damage. The RPi Cam comes with a very short 15 cm cable. It's possible to buy third-party cables on eBay with lengths between 5 cm to 1 m, but cables 50cm or longer are less reliable, whereas USB webcams can use 2 m to 5 m cables and can be plugged into USB hubs or active extension cables for longer distances.

There are currently several different RPi Cam models, notably the NoIR version that doesn't have an internal infrared filter; therefore, a NoIR camera can easily see in the dark (if you have an invisible infrared light source), or see infrared lasers or signals far clearer than regular cameras that includes an infrared filter inside them. There are also two different versions of RPi Cam: RPi Cam v1.3 and RPi Cam v2.1, where the v2.1 uses a wider angle lens with a Sony 8 Mega-Pixel sensor instead of a 5 Mega-Pixel OmniVision sensor, and has better support for motion in low lighting conditions, and adds support for 3240x2464 video at 15 FPS and potentially upto 120 FPS video at 720p. However, USB webcams come in thousands of different shapes and versions, making it easy to find specialized webcams such as waterproof or industrial-grade webcams, rather than requiring you to create your own custom housing for an RPi Cam.

IP cameras are also another option for a camera interface that can allow 1080p or higher resolution videos with Raspberry Pi, and IP cameras support not just very long cables, but potentially even work anywhere in the world using the Internet. But IP cameras aren't quite as easy to interface with OpenCV as USB webcams or the RPi Cam.

In the past, RPi Cams and the official drivers weren't directly compatible with OpenCV; you often used custom drivers and modified your code in order to grab frames from RPi Cams, but it's now possible to access an RPi Cam in OpenCVin the exact same way as a USB webcam! Thanks to recent improvements in the v4l2 drivers, once you load the v4l2 driver the RPi Cam will appear as a /dev/video0 or /dev/video1 file like a regular USB webcam. So traditional OpenCV webcam code such as cv::VideoCapture(0) will be able to use it just like a webcam.

Installing the Raspberry Pi Camera Module driver

First let's temporarily load the v4l2 driver for the RPi Cam to make sure our camera is plugged in correctly:

    sudo modprobe bcm2835-v4l2

If the command failed (if it printed an error message to the console, or it froze, or the command returned a number besides 0), then perhaps your camera is not plugged in correctly. Shutdown and then unplug power from your RPi and try attaching the flat white cable again, looking at photos on the Web to make sure it's plugged in the correct way around. If it is the correct way around, it's possible the cable wasn't fully inserted before you closed the locking tab on the RPi. Also check whether you forgot to click Enable Camera when configuring your Raspberry Pi earlier, using the sudoraspi-config command.

If the command worked (if the command returned 0 and no error was printed to the console), then we can make sure the v4l2 driver for the RPi Cam is always loaded on bootup, by adding it to the bottom of the /etc/modules file:

sudo nano /etc/modules
# Load the Raspberry Pi Camera Module v4l2 driver on bootup:
bcm2835-v4l2

After you save the file and reboot your RPi, you should be able to run ls /dev/video* to see a list of cameras available on your RPi. If the RPi Cam is the only camera plugged into your board, you should see it as the default camera (/dev/video0), or if you also have a USB webcam plugged in then it will be either /dev/video0 or /dev/video1.

Let's test the RPi Cam using the starter_video sample program we compiled earlier:

    cd ~/opencv-3.*/samples/cpp
DISPLAY=:0 ./starter_video 0

If it's showing the wrong camera, try DISPLAY=:0 ./starter_video 1.

Now that we know the RPi Cam is working in OpenCV, let's try Cartoonifier:

    cd ~/Cartoonifier
DISPLAY=:0 ./Cartoonifier 0

Or DISPLAY=:0 ./Cartoonifier 1 for the other camera.

Making Cartoonifier to run full screen

In embedded systems, you often want your application to be full screen and hide the Linux GUI and menu. OpenCV offers an easy method to set the full screen window property, but make sure you created the window using the NORMAL flag:

// Create a fullscreen GUI window for display on the screen.
namedWindow(windowName, WINDOW_NORMAL);
setWindowProperty(windowName, WND_PROP_FULLSCREEN,
CV_WINDOW_FULLSCREEN);

Hiding the mouse cursor

You might notice the mouse cursor is shown on top of your window even though you don't want to use a mouse in your embedded system. To hide the mouse cursor, you can use the xdotool command to move it to the bottom-right corner pixel, so it's not noticeable, but is still available if you want to occasionally plug in your mouse to debug the device. Install xdotool and create a short Linux script to run it with Cartoonifier:

sudo apt-get install -y xdotool
cd ~/Cartoonifier/build
nano runCartoonifier.sh
#!/bin/sh
# Move the mouse cursor to the screen's bottom-right pixel.
xdotoolmousemove 3000 3000
# Run Cartoonifier with any arguments given.
/home/pi/Cartoonifier/build/Cartoonifier "$@"

Finally, make your script executable:

chmod +x runCartoonifier.sh

Try running your script, to make sure it works:

DISPLAY=:0 ./runCartoonifier.sh

Running Cartoonifier automatically after bootup

Often when you build an embedded device, you want your application to be executed automatically after the device has booted up, rather than requiring the user to manually run your application. To automatically run our application after the device has fully booted up and logged into the graphical desktop, create an autostart folder with a file in it with certain contents including the full path to your script or application:

    mkdir ~/.config/autostart
nano ~/.config/autostart/Cartoonifier.desktop
        [Desktop Entry]
        Type=Application
        Exec=/home/pi/Cartoonifier/build/runCartoonifier.sh
        X-GNOME-Autostart-enabled=true

Now, whenever you turn the device on or reboot it, Cartoonifier will begin running!

Speed comparison of Cartoonifier on Desktop versus Embedded

You will notice that the code runs much slower on Raspberry Pi than on your desktop! By far the two easiest ways to run it faster are to use a faster device or use a smaller camera resolution. The following table shows some frame rates, Frames per Seconds (FPS) for both the Sketch and Paint modes of Cartoonifier on a desktop, RPi 1, RPi 2, RPi 3, and Jetson TK1. Note that the speeds don't have any custom optimizations and only run on a single CPU core, and the timings include the time for rendering images to the screen. The USB webcam used is the fast PS3 Eye webcam running at 640x480 since it is the fastest low-cost webcam on the market.
It's worth mentioning that Cartoonifier is only using a single CPU core, but all the devices listed have four CPU cores except for RPi 1 which has a single core, and many x86 computers have hyperthreading to give roughly eight CPU cores. So if you wrote your code to efficiently make use of multiple CPU cores (or GPU), the speeds might be 1.5 to 3 times faster than the single-threaded figures shown:

Computer	Sketch mode	Paint mode
Intel Core i7 PC	20 FPS	2.7 FPS
Jetson TK1ARM CPU	16 FPS	2.3 FPS
Raspberry Pi 3	4.3 FPS	0.32 FPS (3 seconds/frame)
Raspberry Pi 2	3.2 FPS	0.28 FPS (4 seconds/frame)
Raspberry Pi Zero	2.5 FPS	0.21 FPS (5 seconds/frame)
Raspberry Pi 1	1.9 FPS	0.12 FPS (8 seconds/frame)

Notice that Raspberry Pi is extremely slow at running the code, especially the Paint mode, so we will try simply changing the camera and the resolution of the camera.

Changing the camera and camera resolution

The following table shows how the speed of the Sketch mode compares on Raspberry Pi 2 using different types of cameras and different camera resolutions:

Hardware	640x480 resolution	320x240 resolution
RPi 2 with RPi Cam	3.8 FPS	12.9 FPS
RPi 2 with PS3 Eye webcam	3.2 FPS	11.7 FPS
RPi 2 with unbranded webcam	1.8 FPS	7.4 FPS

As you can see, when using the RPi Cam in 320x240, it seems we have a good enough solution to have some fun, even if it's not in the 20-30 FPS range that we would prefer.

Power draw of Cartoonifier running on desktop versus embedded system

We've seen that various embedded devices are slower than desktop, from the RPi 1 being roughly 20 times slower than a desktop, up to Jetson TK1 being roughly 1.5 times slower than a desktop. But for some tasks, low speed is acceptable if it means there will also be significantly lower battery draw, allowing for small batteries or low year-round electricity costs for a server or low heat generated.

Raspberry Pi has different models even for the same processor, such as Raspberry Pi 1B, Zero, and 1A+ that all run at similar speeds but have significantly different power draw. MIPI CSI cameras such as the RPi Cam also use less electricity than webcams. The following table shows how much electrical power is used by different hardware running the same Cartoonifier code. Power measurements of Raspberry Pi were performed as shown in the following photo using a simple USB current monitor (for example, J7-T Safety Tester--h t t p ://b i t . l y /2a S Z a 6H--for $5), and a DMM multimeter for the other devices:

Idle Power measures power when the computer is running but no major applications are being used, whereas Cartoonifier Power measures power when Cartoonifier is running. Efficiency is Cartoonifier Power / Cartoonifier Speed in a 640x480 Sketch mode.

Hardware	Idle Power	Cartoonifier Power	Efficiency
RPi Zero with PS3 Eye	1.2 Watts	1.8 Watts	1.4 Frames per Watt
RPi 1A+ with PS3 Eye	1.1 Watts	1.5 Watts	1.1 Frames per Watt
RPi 1B with PS3 Eye	2.4 Watts	3.2 Watts	0.5 Frames per Watt
RPi 2B with PS3 Eye	1.8 Watts	2.2 Watts	1.4 Frames per Watt
RPi 3B with PS3 Eye	2.0 Watts	2.5 Watts	1.7 Frames per Watt
Jetson TK1 with PS3 Eye	2.8 Watts	4.3 Watts	3.7 Frames per Watt
Core i7 laptop with PS3 Eye	14.0 Watts	39.0 Watts	0.5 Frames per Watt

We can see that RPi 1A+ uses the least power, but the most power-efficient options are Jetson TK1 and Raspberry Pi 3B. Interestingly, the original Raspberry Pi (RPi1B) has roughly the same efficiency as an x86 laptop. All later Raspberry Pis are significantly more power-efficient than the original (RPi 1B).

Disclaimer: The author is a former employee of NVIDIA that produced the Jetson TK1, but the results and conclusions are believed to be authentic.

Lets also look at the power draw of different cameras that work with Raspberry Pi:

Hardware	Idle Power	Cartoonifier Power	Efficiency
RPi Zero with PS3 Eye	1.2 Watts	1.8 Watts	1.4 Frames per Watt
RPi Zero with RPi Cam v1.3	0.6 Watts	1.5 Watts	2.1 Frames per Watt
RPi Zero with RPi Cam v2.1	0.55 Watts	1.3 Watts	2.4 Frames per Watt

We see that RPi Cam v2.1 is slightly more power-efficient than RPi Cam v1.3, and significantly more power-efficient than a USB webcam.

Streaming video from Raspberry Pi to a powerful computer

Thanks to the hardware-accelerated video encoders in all modern ARM devices including Raspberry Pi, a valid alternative to performing Computer Vision onboard an embedded device is to use the device to just capture video and stream it across a network in realtime to a PC or server rack. All Raspberry Pi models contain the same video encoder hardware, so an RPi 1A+ or RPi Zero with a Pi Cam is quite a good option for a low-cost, low-power portable video streaming server. Raspberry Pi 3 adds Wi-Fi for additional portable functionality.

There are numerous ways live camera video can be streamed from a Raspberry Pi, such as using the official RPi V4L2 camera driver to allow the RPi Cam to appear like a webcam, then use Gstreamer, liveMedia, netcat, or VLC to stream the video across a network. However, these methods often introduce 1 or 2 seconds of latency and often require customizing the OpenCV client code or learning how to use Gstreamer efficiently. So instead, the following section will show how to perform both the camera capture and network streaming using an alternative camera driver named UV4L:

Install UV4L on the Raspberry Pi by following h t t p ://w w w . l i n u x - p r o j e c t s . o r g /u v 4l /i n s t a l l a t i o n /:

      curl http://www.linux-projects.org/listing/uv4l_repo/lrkey.asc 
      sudo apt-key add -
      sudo su
      echo "# UV4L camera streaming repo:">> /etc/apt/sources.list
      echo "deb http://www.linux-   
        projects.org/listing/uv4l_repo/raspbian/jessie main">> 
        /etc/apt/sources.list
      exit
      sudo apt-get update
      sudo apt-get install uv4l uv4l-raspicam uv4l-server

Run the UV4L streaming server manually (on the RPi) to check that it works:

      sudo killall uv4l
sudo LD_PRELOAD=/usr/lib/uv4l/uv4lext/armv6l/libuv4lext.so 
uv4l -v7 -f --sched-rr --mem-lock --auto-video_nr
--driverraspicam --encoding mjpeg
--width 640 --height 480 --framerate15

Test the camera's network stream from your desktop:

Install VLC Media Player.
File | Open Network Stream | visit h t t p ://192. 168. 2. 111:8080/s t r e a m /v i d e o . m j p e g.
Adjust the URL to the IP address of your Raspberry Pi. Run hostname -I on RPi to find its IP address.

Now get the UV4L server to run automatically on bootup:

      sudo apt-get install uv4l-raspicam-extras

Edit any UV4L server settings you want in uv4l-raspicam.conf such as resolution and frame rate:

      sudo nano /etc/uv4l/uv4l-raspicam.conf
      drop-bad-frames = yes 
      nopreview = yes
      width = 640
      height = 480
      framerate = 24
      sudo reboot

Now we can tell OpenCV to use our network stream as if it was a webcam. As long as your installation of OpenCV can use FFMPEG internally, OpenCV will be able to grab frames from an MJPEG network stream just like a webcam:

      ./Cartoonifier http://192.168.2.101:8080/stream/video.mjpeg

Your Raspberry Pi is now using UV4L to stream the live 640x480 24 FPS video to a PC that is running Cartoonifier in Sketch mode, achieving roughly 19 FPS (with 0.4 seconds of latency). Notice this is almost the same speed as using the PS3 Eye webcam directly on the PC (20 FPS)!

Note that when you are streaming the video to OpenCV, it won't be able to set the camera resolution; you need to adjust the UV4L server settings to change the camera resolution. Also note that instead of streaming MJPEG, we could have streamed H.264 video that uses lower bandwidth, but some computer vision algorithms don't handle video compression such as H.264 very well, so MJPEG will cause less algorithm problems than H.264.

If you have both the official RPi V4L2 driver and the UV4L driver installed, they will both be available as cameras 0 and 1 (devices /dev/video0 and /dev/video1), but you can only use one camera driver at a time.

Customizing your embedded system!

Now that you have created a whole embedded Cartoonifier system, and you know the basics of how it works and which parts do what, you should customize it! Make the video full screen, change the GUI, or change the application behavior and workflow, or change the Cartoonifier filter constants, or the skin detector algorithm, or replace the Cartoonifier code with your own project ideas. Or stream the video to the cloud and process it there!

You can improve the skin detection algorithm in many ways, such as a more complex skin detection algorithm (for example, using trained Gaussian models from many recent CVPR or ICCV conference papers at http://www.cvpapers.com), or add face detection (see the Face detection section of Chapter 6, Face Recognition using Eigenfaces and Fisherfaces) to the skin detector, so it detects where the user's face is, rather than asking the user to put their face in the center of the screen. Beware that face detection may take many seconds on some devices or high-resolution cameras, so they may be limited in their current real-time uses. But embedded system platforms are getting faster every year, so this may be less of a problem over time.

The most significant way to speed up embedded computer vision applications is to reduce the camera resolution absolutely as much as possible (for example, 0.5 mega pixel instead of 5 megapixels), allocate and free images as rarely as possible, and do image format conversions as rarely as possible. In some cases, there might be some optimized image processing or math libraries, or optimized version of OpenCV from the CPU vendor of your device (for example, Broadcom, NVIDIA Tegra, Texas Instruments OMAP, Samsung Exynos), or for your CPU family (for example, ARM Cortex-A9).

To make customizing embedded and desktop image processing code easier, this book comes with the files, ImageUtils.cpp and ImageUtils.h, to help you experiment. They include functions such as printMatInfo() that prints a lot of info about a cv::Mat object, making debugging OpenCV much easier. There are also timing macros to easily add detailed timing statistics to your C/C++ code. For example:

    DECLARE_TIMING(myFilter); 

    void myImageFunction(Mat img) { 
      printMatInfo(img, "input"); 

      START_TIMING(myFilter); 
      bilateralFilter(img, ...); 
      STOP_TIMING(myFilter); 
      SHOW_TIMING(myFilter, "My Filter"); 
    }

You would then see something like the following printed to your console:

    input: 800w600h 3ch 8bpp, range[19,255][17,243][47,251] 
    My Filter: time: 213ms (ave=215ms min=197ms max=312ms, across 57 runs).

This is useful when your OpenCV code is not working as expected, particularly for embedded development where it is often difficult to use an IDE debugger.