POSIT
After we have found the 2D positions of our landmark points, we can derive the 3D pose of our model using POSIT. The pose P of a 3D object is defined by the 3 x 3 rotation matrix R and the 3D translation vector T; hence, P is equal to [ R | T ].
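As a minimal sketch of what P = [ R | T ] means in practice, the following pure-Python snippet builds the 3 x 4 pose matrix and uses it to map a model point into camera coordinates. The helper names (make_pose, apply_pose) and the sample rotation and translation are illustrative assumptions. They are not part of OpenCV or of the POSIT implementation.

```python
import math


def make_pose(R, T):
    """Stack a 3x3 rotation R and a 3-vector T into the 3x4 matrix [R | T]."""
    return [list(R[i]) + [T[i]] for i in range(3)]


def apply_pose(P, point):
    """Map a 3D model point into camera coordinates: X_cam = R * X + T."""
    x, y, z = point
    homogeneous = (x, y, z, 1.0)
    return [sum(P[i][j] * homogeneous[j] for j in range(4)) for i in range(3)]


# Hypothetical pose: rotate 90 degrees about the Z axis, then shift 10 units along Z.
angle = math.pi / 2
R = [
    [math.cos(angle), -math.sin(angle), 0.0],
    [math.sin(angle), math.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
]
T = [0.0, 0.0, 10.0]

P = make_pose(R, T)
camera_point = apply_pose(P, (1.0, 0.0, 0.0))
```

With this assumed pose, the model point (1, 0, 0) lands at approximately (0, 1, 10) in camera coordinates, up to floating-point rounding.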
Note
Most of this section is based on the OpenCV POSIT tutorial by Javier Barandiaran.
As the name implies, POSIT uses the Pose from Orthography and Scaling (POS) algorithm in several iterations, so it is an acronym for POS with Iterations. It assumes that we can detect and match four or more non-coplanar feature points of the object in the image and that we know their relative geometry on the object.
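The non-coplanarity requirement can be checked before running the algorithm. The sketch below is one way to do that: it tests whether any three difference vectors taken from the first model point have a non-zero scalar triple product. The function name are_non_coplanar, the absolute tolerance, and the sample point sets are assumptions made for illustration; they do not come from OpenCV.

```python
from itertools import combinations


def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def are_non_coplanar(points, tol=1e-9):
    """Return True if the 3D points do not all lie in a single plane.

    The tolerance is absolute and assumes model units of roughly order one.
    """
    if len(points) < 4:
        return False  # three or fewer points always share a plane
    origin = points[0]
    vectors = [sub(p, origin) for p in points[1:]]
    # A non-zero scalar triple product means three vectors span 3D space.
    for u, v, w in combinations(vectors, 3):
        if abs(dot(u, cross(v, w))) > tol:
            return True
    return False


# Hypothetical model points: a tetrahedron and a flat square.
tetrahedron = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
flat_square = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]

tetrahedron_ok = are_non_coplanar(tetrahedron)
flat_square_ok = are_non_coplanar(flat_square)
```

In this example the tetrahedron passes the check and the flat square does not, so only the first set would satisfy the stated hypothesis.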
The main idea of the algorithm is that we can find a good approximation to the object pose by supposing that all the model points lie in the same plane, since their depths do not differ much from one another compared with the distance from the camera to a face. After the initial pose is obtained, the rotation matrix and...