Significantly wrong recognition result for images from iPhone #783
Comments
Thanks for reporting. I was able to replicate this in Tesseract.js using the images provided. .jpeg files are processed by Tesseract itself rather than by anything specific to Tesseract.js. Therefore, this is definitely caused by issues with Tesseract dependencies rather than any JavaScript code in this repo. I think the next question is whether this bug exists in the latest version of Tesseract (and the image processing libraries it uses) or if it has already been patched. It would be great if you could try to replicate it with the latest version of the Tesseract CLI and, if the issue can be replicated there, open an issue in the main Tesseract project. Alternatively, if this does not occur with the latest version of the Tesseract CLI, that indicates the issue has already been resolved, and we can update the dependencies we use for Tesseract.js. Out of curiosity, do you know of any particular place where files with this format come from (aside from producing them on purpose in editing software)? I'm curious whether this is a newer thing we should expect to see more of, or a niche format used only within particular applications.
Hi @Balearica, I got these images with the Display P3 color space in our web application by importing images from the gallery on iOS devices (iPhone 14 on iOS 16.2, iPhone XR on iOS 14.4). One notable point is that, by default, photos taken with the iOS camera are saved as HEIC with the Display P3 color space. When they are imported into a browser (Safari, Chrome), they are automatically converted to JPEG. That is how I got JPEG images with the Display P3 color space. As for verifying this in the main Tesseract project, I have never run it before, so it could be quite complicated for me. Could you help me reproduce this bug in that project? Thanks in advance.
@hieunguyen2211 To clarify, does this mean that all photos taken with an iPhone/iPad (using default settings) do not work? If you are not sure how to install/compile Tesseract, I can check that myself at some point this week.
Hi @Balearica, I have not had a chance to test with an iPad. "All photos taken with an iPhone using default settings do not work" -> correct!
I looked into this further, and it looks like the color space is not the core issue, but rather the orientation metadata. This can be verified by checking the intermediate images (this example shows how to do this). Unfortunately for developers, Apple does not actually rotate images depending on the orientation of your phone. Instead, it adds a metadata tag indicating the orientation, offloading the work of rotating the image onto the image viewer program. Some image processing programs have a step that recognizes this metadata and rotates the image, while others do not. Leptonica (the image processing library used by Tesseract) does not have a step for this. The issue was presumably resolved when you saved as a different format, as almost all image processing programs not created by Apple will save the image with the correct rotation so it can be viewed correctly by all programs without extra steps. Tesseract.js does have code intended to detect orientation metadata and rotate images; however, it must not be working in this instance. I will investigate further this week.
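To make the orientation metadata described above concrete, here is a minimal sketch of reading the EXIF orientation tag (0x0112) from the bytes of a JPEG. It is not the code Tesseract.js uses; `readExifOrientation` and `CLOCKWISE_DEGREES` are hypothetical names chosen for illustration, and the sketch only looks at the first IFD of an APP1 "Exif" segment.

```javascript
// Hypothetical helper, not part of Tesseract.js: returns the EXIF orientation
// value (1-8) stored in a JPEG's APP1 segment, or null if none is found.
// `bytes` is assumed to be a Uint8Array holding the whole file.
function readExifOrientation(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  try {
    if (view.getUint16(0) !== 0xffd8) return null; // missing JPEG SOI marker
    let offset = 2;
    while (offset + 4 <= view.byteLength) {
      const marker = view.getUint16(offset);
      // Stop on a malformed marker or once the compressed image data starts.
      if ((marker & 0xff00) !== 0xff00 || marker === 0xffda) return null;
      const length = view.getUint16(offset + 2); // counts its own two bytes
      const isExif = marker === 0xffe1 &&
        view.getUint32(offset + 4) === 0x45786966 && // ASCII "Exif"
        view.getUint16(offset + 8) === 0;
      if (isExif) {
        const tiff = offset + 10; // start of the embedded TIFF header
        const little = view.getUint16(tiff) === 0x4949; // "II" = little-endian
        const ifd = tiff + view.getUint32(tiff + 4, little);
        const entries = view.getUint16(ifd, little);
        for (let i = 0; i < entries; i++) {
          const entry = ifd + 2 + i * 12; // each IFD entry is 12 bytes
          if (view.getUint16(entry, little) === 0x0112) {
            return view.getUint16(entry + 8, little);
          }
        }
        return null;
      }
      offset += 2 + length; // skip to the next segment
    }
  } catch (err) {
    return null; // truncated file: a DataView read went out of bounds
  }
  return null;
}

// Clockwise rotation a viewer must apply for the non-mirrored orientations.
const CLOCKWISE_DEGREES = { 1: 0, 3: 180, 6: 90, 8: 270 };
```

In a browser, the bytes could come from `new Uint8Array(await file.arrayBuffer())`. A value other than 1 means the stored pixels are not upright, so a library that ignores the tag will run recognition on a rotated image.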
Unfortunately, it looks like there was a regression in our handling of images with orientation metadata. Notably, we did not have unit tests for this, so the fact that the feature was broken was not caught automatically. In #784 I fixed the issue and added unit tests for .jpeg images with 90/180/270 degrees of rotation specified in metadata, so if there are similar issues in the future we should catch them.
Thanks a lot for your help. Could you give me some information about when the next release with this fix will be?
I just published v4.1.1, so updating to the latest version should resolve this.
I just gave it a try and it worked as expected. Thanks a lot.
Tesseract.js version (version number for npm/GitHub release, or specific commit for repo)
Still occurs in the latest version (v4.11.0) and in the showcase (2.0.0).
Describe the bug
The OCR recognition result is significantly wrong when using images with the Display P3 color profile. When I tested in the showcase, all of the recognized text boxes were marked in the wrong places. However, with the same image converted to the sRGB color profile, everything works correctly (all text boxes are marked correctly).
To Reproduce
Steps to reproduce the behavior:
Images
Expected behavior
Device Version: