Summary
In this chapter, we presented a brief introduction to OCR applications. We saw that the preprocessing phase of such systems must be adjusted to the type of document we plan to recognize. We learned the common operations used when preprocessing images of text, such as thresholding, cropping, skew correction, and text region segmentation. Finally, we learned how to install and use Tesseract OCR to convert an image to text.
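As a compact reminder of the thresholding step, the following is a minimal sketch of Otsu's method written with the C++ standard library only, so that it runs without OpenCV. The function names otsuThreshold and binarize are hypothetical and are not part of OpenCV or Tesseract, and a flat vector of 8-bit grayscale pixels is assumed as input. In an OpenCV program, cv::threshold with the THRESH_OTSU flag would normally do this job.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the Otsu threshold t for 8-bit grayscale
// pixels. Pixels with value <= t form the dark class, the rest the light one.
int otsuThreshold(const std::vector<std::uint8_t>& pixels) {
    // Histogram of the 256 possible gray levels.
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t p : pixels) ++hist[p];

    const double total = static_cast<double>(pixels.size());
    double sumAll = 0.0;
    for (int v = 0; v < 256; ++v) sumAll += v * static_cast<double>(hist[v]);

    double weightLow = 0.0;      // number of pixels with value <= t
    double sumLow = 0.0;         // sum of gray values of those pixels
    double bestVariance = -1.0;  // best between-class variance so far
    int best = 0;

    for (int t = 0; t < 256; ++t) {
        weightLow += static_cast<double>(hist[t]);
        sumLow += t * static_cast<double>(hist[t]);
        if (weightLow == 0.0) continue;       // dark class still empty
        const double weightHigh = total - weightLow;
        if (weightHigh == 0.0) break;         // light class empty from here on

        const double meanLow = sumLow / weightLow;
        const double meanHigh = (sumAll - sumLow) / weightHigh;
        const double diff = meanLow - meanHigh;
        // Between-class variance, up to a constant factor of 1 / total^2.
        const double variance = weightLow * weightHigh * diff * diff;
        if (variance > bestVariance) {
            bestVariance = variance;
            best = t;
        }
    }
    return best;
}

// Hypothetical helper: maps pixels above the threshold to 255, the rest to 0.
std::vector<std::uint8_t> binarize(const std::vector<std::uint8_t>& pixels,
                                   int threshold) {
    std::vector<std::uint8_t> out;
    out.reserve(pixels.size());
    for (std::uint8_t p : pixels) out.push_back(p > threshold ? 255 : 0);
    return out;
}
```

Calling binarize(pixels, otsuThreshold(pixels)) on a scanned page with dark text on a light background would be expected to leave the text at 0 and the background at 255. Whether a global threshold like this is sufficient depends on how evenly the document is lit.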
In the next chapter, we'll use a more sophisticated OCR technique to identify text in a casually taken picture or video, a situation known as scene text recognition. This is a much more complex scenario, since the text can appear anywhere, in any font, and under varying illumination and orientation. The image may even contain no text at all. We'll also learn how to use the OpenCV 3.0 text contribution module, which is fully integrated with Tesseract.