Add rotation preprocessing option #648

Closed
Balearica opened this issue Aug 23, 2022 · 2 comments

Comments

@Balearica
Member

Tesseract performs extremely poorly when text is at an angle. For example, the scan below has ~5 degrees of rotation. The first image shows the text Tesseract recognized without preprocessing, while the second image shows what Tesseract recognized after the scan was rotated.

The maintainers of the main Tesseract repo frequently suggest adding image preprocessing steps (including auto-rotation) to workflows to address this; however, this option is not ideal for web users. Given that we already include the Leptonica image processing library, we should be able to expose a rotation option without much effort. Auto-rotation would be ideal, but is likely significantly more difficult to implement.

Possibly related to #588, which requests high-level functions that expose processed (binarized) images.
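
To make the idea concrete, here is a rough sketch of what a fixed-angle rotation step does to a grayscale pixel buffer. This is not how the feature would be implemented here (that would go through Leptonica); `degToRad` and `rotateGray` are hypothetical names used only for illustration, and nearest-neighbor sampling is assumed for brevity.

```javascript
// Hypothetical helper: convert a skew angle in degrees (e.g. ~5) to radians.
const degToRad = (deg) => (deg * Math.PI) / 180;

// Hypothetical helper: rotate a row-major grayscale image about its center.
// Positive angles rotate clockwise on screen (the y axis points down).
// Pixels with no source inside the original image are set to `fill` (white).
function rotateGray(pixels, width, height, angleRad, fill = 255) {
  const out = new Uint8Array(width * height).fill(fill);
  const cx = (width - 1) / 2;
  const cy = (height - 1) / 2;
  const cos = Math.cos(angleRad);
  const sin = Math.sin(angleRad);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      // Inverse mapping: find the source pixel that lands on (x, y).
      const dx = x - cx;
      const dy = y - cy;
      const sx = Math.round(cos * dx + sin * dy + cx);
      const sy = Math.round(-sin * dx + cos * dy + cy);
      if (sx >= 0 && sx < width && sy >= 0 && sy < height) {
        out[y * width + x] = pixels[sy * width + sx];
      }
    }
  }
  return out;
}
```

A scan skewed by about 5 degrees clockwise would be corrected with something like `rotateGray(pixels, width, height, degToRad(-5))`. Estimating that angle automatically is the harder part, which is why auto-rotation is called out separately above.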

@Balearica
Member Author

This feature has been added in the development branch for version 4 and will be included in that release. That branch is functional at present if you would like to try it out, and is described in more detail in #662. An example has also been included to demonstrate usage.

@Balearica
Member Author

Closing as this was added in Version 4.
