Computing depth from stereo images
Humans view the world in three dimensions using their two eyes. Robots can do the same when they are equipped with two cameras. This is called stereovision. A stereo rig is a pair of cameras mounted on a device, looking at the same scene and separated by a fixed baseline (the distance between the two cameras). This recipe shows how a depth map can be computed from two stereo images by establishing dense correspondences between the two views.
Getting ready
A stereovision system is generally made of two side-by-side cameras pointing in the same direction. The following figure illustrates such a stereo system in a perfectly aligned configuration:
Under this ideal configuration, the cameras are separated only by a horizontal translation, and therefore all epipolar lines are horizontal. This means that corresponding points have the same y coordinate, which reduces the search for matches to a 1D line. The difference in their x coordinates depends on the depth of...
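To make the 1D search concrete, here is a minimal sketch in plain Python of matching a single pixel along one image row. It is an illustration only, not the method used by this recipe. The function name match_along_row, the sum-of-absolute-differences cost, the window size, and the sample rows are all assumptions chosen for the example. It assumes the ideal aligned configuration described above, so a point at column x in the left image can only appear at column x - d in the same row of the right image, for some shift d >= 0.

```python
def match_along_row(left_row, right_row, x, half_window=1, max_disparity=8):
    """Return the horizontal shift d that best matches left_row[x] in right_row.

    The cost is the sum of absolute differences (SAD) over a small window.
    Only shifts along the same row are tested, which is the 1D search that
    horizontal epipolar lines make possible.
    """
    if x - half_window < 0 or x + half_window >= len(left_row):
        raise ValueError("window around x falls outside the left row")

    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        xr = x - d  # candidate column in the right row
        if xr - half_window < 0:
            break  # window would fall off the left edge of the right row
        cost = sum(
            abs(left_row[x + k] - right_row[xr + k])
            for k in range(-half_window, half_window + 1)
        )
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Hypothetical sample rows: the same bright feature, shifted by 3 pixels.
right_row = [10, 10, 10, 200, 50, 10, 10, 10, 10, 10]
left_row = [10, 10, 10, 10, 10, 10, 200, 50, 10, 10]

shift = match_along_row(left_row, right_row, x=6, half_window=1, max_disparity=5)
print(shift)
```

Repeating this search for every pixel of every row yields a dense map of horizontal shifts. A practical implementation would use an optimized library routine rather than this per-pixel loop, and would have to handle textureless regions and occlusions, which this sketch ignores.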