Transcribing handwritten images
Imagine that you must extract information from a scanned document, for example, the keys and values in a picture of an ID card or of a manually filled-in form. To do so, you first have to transcribe the text in the image. This problem is tricky because of the variety in the following:
- Handwriting
- Quality of the scan/picture
- Lighting conditions
In this section, we will learn a technique for transcribing handwritten images.
Let us first understand how the encoder-decoder architecture of a transformer can be applied to transcribe a handwritten image.
Handwriting transcription workflow
We will leverage the TrOCR architecture (source: https://arxiv.org/abs/2109.10282) to transcribe handwritten information.
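Before going through the workflow step by step, it can help to see what an end-to-end call looks like. The following is a minimal sketch, not the implementation used in this chapter: it assumes the Hugging Face transformers and Pillow packages are installed, and the checkpoint name microsoft/trocr-base-handwritten, the constant MODEL_NAME, and the helper name transcribe are illustrative choices that do not come from the text.

```python
# Assumption: a publicly available TrOCR checkpoint fine-tuned on handwriting.
MODEL_NAME = "microsoft/trocr-base-handwritten"


def transcribe(image_path, model_name=MODEL_NAME):
    """Return the text transcribed from a single-line handwritten image."""
    # Imported lazily so that this sketch can be loaded without the heavy
    # dependencies; they are only needed when the function is called.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    # The processor bundles the image preprocessing and the text tokenizer.
    processor = TrOCRProcessor.from_pretrained(model_name)
    # The model pairs an image encoder with an autoregressive text decoder.
    model = VisionEncoderDecoderModel.from_pretrained(model_name)

    # The encoder expects a three-channel image.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # The decoder generates token IDs one at a time, conditioned on the
    # encoder output; the IDs are then decoded back into a string.
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

Calling transcribe("sample_line.png"), where sample_line.png is a hypothetical file name, downloads the checkpoint on first use and returns the predicted string. The sketch assumes the image contains a single line of text, so a full page or form would first need to be cropped into lines.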
The following diagram shows the workflow:
Figure 15.9: TrOCR workflow
As shown in the preceding diagram, the workflow is as follows:
- We take an...