Introducing optical character recognition
Identifying text in an image is a very popular application of computer vision. This process is commonly called optical character recognition (OCR), and it is divided into the following steps:
- Text preprocessing and segmentation: During this step, the computer must deal with image noise and rotation (skew), and identify which areas are candidate text.
- Text identification: This is the process of identifying each letter in the text, which will be covered in later chapters.
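To make the segmentation step more concrete, here is a minimal sketch of one simple way to find candidate text areas: binarize the image with a fixed threshold, then look for horizontal bands of rows that contain dark pixels. This is only an illustration and not necessarily the technique used later in this book. The function names `binarize` and `candidate_text_rows`, the threshold of 128, and the toy image are all assumptions made for the example, and the sketch ignores noise and skew entirely.

```python
def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 for dark "ink" pixels
    and 0 for background. The fixed threshold is an assumption for this sketch."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def candidate_text_rows(binary, min_ink=1):
    """Return (start, end) row ranges, end exclusive, in which every row
    contains at least min_ink ink pixels."""
    bands = []
    start = None
    for y, row in enumerate(binary):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                  # a new band of candidate text begins
        elif not has_ink and start is not None:
            bands.append((start, y))   # the band ended on the previous row
            start = None
    if start is not None:              # a band that reaches the bottom edge
        bands.append((start, len(binary)))
    return bands


# Toy 6x5 "scan": 255 is white background, low values are dark ink.
page = [
    [255, 255, 255, 255, 255],
    [255,  20,  30, 255, 255],
    [255,  25, 255,  10, 255],
    [255, 255, 255, 255, 255],
    [ 40,  35,  30, 255, 255],
    [255, 255, 255, 255, 255],
]
bands = candidate_text_rows(binarize(page))
print(bands)  # [(1, 3), (4, 5)]
```

Each returned range marks a band of rows that may hold a line of text. A real pipeline would first reduce noise and correct skew, since a rotated page smears the bands together.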
The preprocessing and segmentation phase can vary greatly depending on the source of the text. Let's take a look at common situations where preprocessing is done:
- Production OCR applications with a scanner: This is a very reliable source of text. In this scenario, the background of the image is usually white and the document is almost perfectly aligned with the scanner margins. The scanned content consists mostly of text, with almost no noise. This kind of application relies on simple preprocessing techniques...