In the preceding section, we succeeded in extracting text from images that contain well-typeset text, such as scanned documents. However, our application does not work well on text that appears in photos of everyday scenes. In this section, we are going to fix that.
To do so, we will use the EAST text detector with OpenCV to detect the presence of text in an image. EAST is short for Efficient and Accurate Scene Text detector; the paper describing it can be found at https://arxiv.org/abs/1704.03155. It is a neural network-based algorithm, but the architecture of its model and the training process are beyond the scope of this chapter. Here, we will focus on how to use a pretrained EAST model with OpenCV.
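As a rough preview of the overall flow, the following Python sketch shows one way a pretrained EAST model might be run through OpenCV's `dnn` module. It is not the application code developed in this chapter. The names `east_input_size`, `detect_text_scores`, and `model_path` are hypothetical, and the two output layer names and the mean values assume the commonly distributed frozen TensorFlow EAST model; a different model file may need different values.

```python
def east_input_size(width, height, multiple=32):
    """Round an image size down to the nearest multiple of 32 (minimum 32).

    EAST expects input dimensions that are multiples of 32.
    """
    new_w = max(multiple, (width // multiple) * multiple)
    new_h = max(multiple, (height // multiple) * multiple)
    return new_w, new_h


def detect_text_scores(image, model_path):
    """Run EAST on a BGR image and return the raw (scores, geometry) maps.

    model_path is a hypothetical path to a pretrained EAST model file.
    """
    import cv2  # imported lazily so east_input_size works without OpenCV

    net = cv2.dnn.readNet(model_path)

    h, w = image.shape[:2]
    in_w, in_h = east_input_size(w, h)

    # Assumption: mean values used by the commonly distributed EAST model.
    blob = cv2.dnn.blobFromImage(
        image, 1.0, (in_w, in_h), (123.68, 116.78, 103.94),
        swapRB=True, crop=False)
    net.setInput(blob)

    # Assumption: output layer names of the commonly distributed EAST model.
    scores, geometry = net.forward(
        ["feature_fusion/Conv_7/Sigmoid", "feature_fusion/concat_3"])
    return scores, geometry
```

The two returned maps still have to be decoded into text boxes and filtered, for example with non-maximum suppression, before they are useful to an application.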
Before starting with the code, let's get the pretrained model ready...