- How is it possible to recognize characters in non-English languages with Tesseract?
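Specify the corresponding language name when initializing the TessBaseAPI instance, that is, pass it as the language argument of its Init method. The trained data file for that language must also be available to Tesseract.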
- When we used the EAST model to detect text areas, the detected areas are actually rotated rectangles, and we simply use their bounding rectangles instead. Is this always correct? If not, how can this approach be rectified?
It is correct in the sense that the bounding rectangle always contains the whole text area, but it is not the best approach: for tilted text, the bounding rectangle also includes extra background, and the text inside it remains slanted. Instead, we can copy the regions inside the bounding rectangles of the rotated rectangles to new images, and then rotate and crop them so that the rotated rectangles become regular, upright rectangles. Sending these upright rectangles to Tesseract to extract the text will generally give better results.
- Try to figure out a way to allow users to adjust the selected region after dragging the mouse...