As input, we access pixelBuffer, which contains the pixels of the current video frame from the device's camera. We run our face detection model and obtain faceObservations, which holds the detection results. If the array is empty, no face was detected and we return early from the function:
// Run face detection on the current frame, passing the buffer's EXIF orientation.
try faceDetectionHandler.perform([faceDetectionRequest], on: pixelBuffer, orientation: exifOrientation)
// Bail out early if the request produced no face observations.
guard let faceObservations = faceDetectionRequest.results as? [VNFaceObservation],
      !faceObservations.isEmpty else {
    return
}
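The snippet above relies on a few properties defined elsewhere in the class. As a minimal sketch of what they could look like (the names follow the text, but the exact setup in your project may differ), we can use a VNSequenceRequestHandler, whose perform(_:on:orientation:) variant matches the call above and is designed to be reused across the frames of a video feed:

import Vision

// Sequence handler: reused across frames; accepts a pixel buffer per call.
private let faceDetectionHandler = VNSequenceRequestHandler()
// Plain face-rectangle detection; its results arrive as [VNFaceObservation].
private let faceDetectionRequest = VNDetectFaceRectanglesRequest()
// Assumed helper: maps the capture setup to an EXIF orientation. For the back
// camera in portrait, the buffer is delivered landscape and needs .right.
private var exifOrientation: CGImagePropertyOrientation { .right }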
Then, for each faceObservation in faceObservations, we classify the image region containing the face:
let classificationHandler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .right, options: [:])
let box = faceObservation.boundingBox
// boundingBox is normalized to the oriented image; rotate it 90° so it matches
// the .right orientation given to the classification handler. A point (x, y)
// maps to (y, 1 - x), so the rectangle's width and height swap.
let region = CGRect(x: box.minY, y: 1 - box.maxX, width: box.height, height: box.width)
self.classificationRequest.regionOfInterest = region
try classificationHandler.perform([self.classificationRequest])
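Once perform returns, the classification for that face region can be read from the request's results. As a hedged sketch, assuming classificationRequest is a VNCoreMLRequest wrapping an image classifier (the actual labels depend on the model used):

// Read the top classification produced for this face region.
if let results = self.classificationRequest.results as? [VNClassificationObservation],
   let best = results.first {
    print("Top label: \(best.identifier) (confidence: \(best.confidence))")
}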