Summary
We started this chapter by learning about the basic building blocks of every Vision feature: how to use a VNRequest instance, its corresponding VNRequestHandler instances, and the resulting VNObservation instances.
After learning the basics, we applied them to text recognition. We compared different recognition levels by using .fast and .accurate. We also learned about regions of interest and how they can affect the performance of Vision requests. We then improved our text recognition results by applying domain knowledge to fix potential errors and misreads from Vision.
Finally, we learned about the new hand landmarks recognition capability. This time, we also learned how to apply Vision requests to real-time video streams. We were able to detect hand landmarks in a video feed from a device's front camera and display an overlay to show the results. This chapter also provided a similar example that could be applied to body pose recognition.
In the...