Modern cameras are very good at mimicking how the human eye captures light and color, but capturing an image well is only the first stage of understanding it. After capture, we need technology that interprets what was captured and builds context around it, which is what the human brain does when the eyes see something. Here lies the challenge: a computer sees an image as a large array of integer values representing intensities across color channels, and it has no context associated with the image itself. This is where machine learning (ML) comes into play. ML lets us train a model on a dataset so that a computer can learn which objects particular patterns of numbers actually represent.
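To make the "array of integers" point concrete, here is a minimal sketch in plain Python. The 2x2 image, its pixel values, and the helper name `flatten` are all made up for illustration; real images are far larger and are usually loaded through an imaging library rather than typed in by hand.

```python
# A hypothetical 2x2 RGB image. Each pixel is three integers in 0..255
# giving the red, green, and blue intensities. Values are illustrative only.
image = [
    [(255, 0, 0), (0, 255, 0)],      # top row: a red pixel, a green pixel
    [(0, 0, 255), (255, 255, 255)],  # bottom row: a blue pixel, a white pixel
]

def flatten(img):
    """Return every channel value in row-major order as one flat list."""
    return [channel for row in img for pixel in row for channel in pixel]

# This flat list of numbers is all the computer has to work with.
raw_values = flatten(image)
print(raw_values)
# prints [255, 0, 0, 0, 255, 0, 0, 0, 255, 255, 255, 255]
```

Nothing in those twelve numbers says "red", "corner", or "cat". Any such meaning has to be learned from examples, which is the role ML plays here.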
Computer vision is one of the...