Improve loadImage performance by between 20% and 100% #726
Conversation
Thank you for this thoughtful contribution and explanation. Recognizing extremely small images is outside of my personal use case, so I was unaware of the large performance overhead in this specific situation. My initial thoughts are below.
Additionally, if Tesseract.js switches to using bmp more, I will have to look into the impact of this part of tesseract.js/src/worker-script/utils/setImage.js (lines 16 to 26 at d7d0c2e).
At present all bmp images are re-encoded at this step to account for the fact that Leptonica (the image processing library used by Tesseract) cannot process certain bmp images. Presumably this adds some overhead. If this overhead is meaningful, and the bmp images produced by toBlob are not subject to these issues, we should think about bypassing this step.
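For what it's worth, here is a hypothetical sketch of what such a bypass could look like; the helper names, the option flag, and the magic-byte check are assumptions, not the current setImage.js code.

```js
// Hypothetical sketch of the bypass idea (not the actual setImage.js code):
// detect the bmp magic bytes and skip the re-encode round trip when the bmp
// is known to come from a source Leptonica can already handle.
const isBmp = (bytes) =>
  bytes.length > 1 && bytes[0] === 0x42 && bytes[1] === 0x4d; // ASCII 'BM'

const prepareImage = (bytes, { bmpNeedsReencode = true } = {}) => {
  if (isBmp(bytes) && !bmpNeedsReencode) {
    return bytes; // hand the bmp straight through, no decode/encode pass
  }
  // ...the existing re-encode path would run here...
  return bytes;
};
```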
Great points, I will follow up on these later this week! 👌
I looked into this more today and am now confused regarding whether converting from canvas elements to bmp files is supported by browsers at all. Running the following in Chrome returns an object where the
By setting the second argument of
However, by setting the second argument to
Additionally, I did not find any indication in documentation for
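For context, a minimal console check of this fallback behavior (this is an illustration, not the original snippet referenced above; the 100x100 size is arbitrary):

```js
// toBlob silently falls back to image/png when the requested type is not
// supported, so inspecting blob.type shows what the browser actually produced.
const canvas = document.createElement('canvas');
canvas.width = 100;
canvas.height = 100;
canvas.toBlob((blob) => {
  console.log(blob.type); // 'image/bmp' if supported, otherwise 'image/png'
}, 'image/bmp');
```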
I just realized a major problem with my benchmark... I was using the API from

But what caused the significant difference in performance then? There must be some kind of caching, because the first call to

Between this and your notes about bitmap weirdness in Tesseract, I think I might need to rethink the whole approach. I found another format called PNM in the Leptonica source code:

```c
/**
 * The pnm formats are exceedingly simple, because they have
 * no compression and no colormaps. They support images that
 * are 1 bpp; 2, 4, 8 and 16 bpp grayscale; and rgb.
 */
```

Source:

I assume Tesseract.js uses all the Leptonica stuff under the hood. In theory it would be trivial to take a canvas
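As a small illustration of that last point (the helper name is made up), getting raw pixels out of a canvas involves no encoder at all:

```js
// Sketch of the "take a canvas" step: raw RGBA pixels come straight out of
// getImageData, with no image encoder involved.
const canvasToRawPixels = (canvas) => {
  const ctx = canvas.getContext('2d');
  // Returns { width, height, data }, where data is a Uint8ClampedArray of RGBA bytes.
  return ctx.getImageData(0, 0, canvas.width, canvas.height);
};
```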
Branch updated from e3e2e20 to 8a586ab.
I've updated the implementation.
Yes. I've guarded each usage of

```js
} else if (OffscreenCanvas && image instanceof OffscreenCanvas) {
  // ...
}
```

If

In addition to these fallbacks,
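For illustration, a hedged sketch of how such a guarded branch can be wired up end to end; the helper name and structure are assumptions, not the PR's exact code:

```js
// Hedged sketch: convert either canvas flavor to an ArrayBuffer, guarding each
// instanceof check so a missing global can't throw inside a Worker.
const canvasToArrayBuffer = async (image) => {
  if (typeof HTMLCanvasElement !== 'undefined' && image instanceof HTMLCanvasElement) {
    const blob = await new Promise((resolve) => image.toBlob(resolve, 'image/bmp'));
    return blob.arrayBuffer();
  }
  if (typeof OffscreenCanvas !== 'undefined' && image instanceof OffscreenCanvas) {
    const blob = await image.convertToBlob({ type: 'image/bmp' });
    return blob.arrayBuffer();
  }
  throw new Error('Unsupported canvas-like input');
};
```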
Yes, I've added a test that duplicates the regular canvas test exactly, using
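Roughly, such a duplicated test might look like the sketch below; the test framework, worker setup, and fixture names are assumptions rather than the actual test code:

```js
// Hedged sketch of a duplicated OffscreenCanvas test (mocha/chai style assumed;
// `source`, `worker`, and `expectedText` stand in for the existing fixtures).
it('recognizes text from an OffscreenCanvas', async () => {
  const offscreen = new OffscreenCanvas(source.width, source.height);
  offscreen.getContext('2d').drawImage(source, 0, 0);
  const { data: { text } } = await worker.recognize(offscreen);
  expect(text).to.contain(expectedText);
});
```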
Between this and the questionable behavior of

```js
const imageDataToPBM = (imageData) => {
  const { width, height, data } = imageData;
  const DEPTH = 4;    // channels per pixel (RGBA = 4)
  const MAXVAL = 255; // range of each channel (0-255)
  const TUPLTYPE = 'RGB_ALPHA';
  let header = 'P7\n';
  header += `WIDTH ${width}\n`;
  header += `HEIGHT ${height}\n`;
  header += `DEPTH ${DEPTH}\n`;
  header += `MAXVAL ${MAXVAL}\n`;
  header += `TUPLTYPE ${TUPLTYPE}\n`;
  header += 'ENDHDR\n';
  const encoder = new TextEncoder();
  const binaryHeader = encoder.encode(header);
  const binary = new Uint8Array(binaryHeader.length + data.length);
  binary.set(binaryHeader);
  binary.set(data, binaryHeader.length);
  return binary;
};
```
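For illustration (not part of the PR), wiring the helper up from a canvas could look like the following, assuming a canvas variable in scope; note that despite the PBM-flavored name, the P7 header written above is the PAM variant of the Netpbm family:

```js
// Illustrative usage: pull pixels from a canvas, build the P7/PAM buffer, and
// wrap it in a Blob so it can be handed along like any other image payload.
const ctx = canvas.getContext('2d');
const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
const pamBytes = imageDataToPBM(imageData); // Uint8Array: text header + raw RGBA
const pamBlob = new Blob([pamBytes]);       // no image encoder involved
```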
Here is an updated sample run of the Canvas unit tests, before and after this PR. It includes both the total runtime of the tests, as well as a

Before

After

Like before, the performance improvement is impressive percentage-wise, but only about a 10ms difference in absolute terms. That means it will be most noticeable on very small input images, which already run end-to-end very fast (100ms or less).
Thanks for updating, I will review at some point this week.
I tested this today with the benchmark images, and for the larger images this branch appears to run significantly slower. For example, when I loaded the largest benchmark image (
Interesting. I'll take a look and try to track down the source of the slowness on large images. Maybe it's something like an unnecessary alpha channel that I'm always including? I'll also take a closer look at the PBM format I'm generating and make sure the resulting images appear as expected (visually). It could also be some kind of unintended color-shift under the hood if I've got the image format slightly off. Thanks for your time looking into this.
I tried removing the alpha channel. It's a bit faster than before, but still slower than
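For reference, a hedged sketch of one way the alpha channel can be dropped (DEPTH 3, TUPLTYPE RGB); this is illustrative and not necessarily the exact change benchmarked here:

```js
// Variant of the earlier helper that writes a 3-channel P7/PAM buffer,
// discarding the alpha byte from each RGBA pixel.
const imageDataToPAMWithoutAlpha = (imageData) => {
  const { width, height, data } = imageData;
  const header = new TextEncoder().encode(
    `P7\nWIDTH ${width}\nHEIGHT ${height}\nDEPTH 3\nMAXVAL 255\nTUPLTYPE RGB\nENDHDR\n`
  );
  const rgb = new Uint8Array(width * height * 3);
  for (let src = 0, dst = 0; src < data.length; src += 4, dst += 3) {
    rgb[dst] = data[src];         // R
    rgb[dst + 1] = data[src + 1]; // G
    rgb[dst + 2] = data[src + 2]; // B (alpha at data[src + 3] is discarded)
  }
  const out = new Uint8Array(header.length + rgb.length);
  out.set(header);
  out.set(rgb, header.length);
  return out;
};
```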
I ran your branch with

Runs with master:

Runs with pr:

I also ran the updated branch with
Let's ditch all this custom image loading stuff for now. It's too unclear what is causing the performance differences. I'd like to recover a small piece of working functionality from this PR: support for

Take a look at your convenience. #766
Canvas .toBlob() Benchmarks

I tested the performance of toBlob with 4 different lossless image formats:

- image/png
- image/x-dcraw
- image/tif
- image/bmp

There's a list of other potential mime types in this thread; however, image/bmp is commonly supported and gave me the best results on average (see below).

On a 1955x3036px Canvas image (using pixel data from meditations.jpg from the tesseract.js repo):

Average of 19.9% speedup for BMP

On a 100x100px Canvas image:

Average of 97.7% speedup for BMP (!!)
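A rough sketch of the kind of toBlob timing comparison behind these numbers (the actual benchmark harness is not included here; timeToBlob is illustrative):

```js
// Time how long a single toBlob encode takes for a given mime type.
const timeToBlob = (canvas, type) =>
  new Promise((resolve) => {
    const start = performance.now();
    canvas.toBlob(() => resolve(performance.now() - start), type);
  });

// e.g. compare: await timeToBlob(canvas, 'image/bmp')
//      against: await timeToBlob(canvas, 'image/png')
```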
Summary

The benefits of a faster image encoder are most noticeable on a small image size (100x100px or smaller). In addition, Tesseract.js can achieve truly lightning-fast speeds on small inputs -- I'm seeing single-digit millisecond calls to recognize in some of my testing. It is plenty fast enough for fully realtime use cases when PNG encoding is avoided, since PNG encoding could take 4-5x longer than the actual text recognition at these sizes.

Other changes
Add support for OffscreenCanvas

This enables Tesseract.js to be started from inside another Web Worker, if desired (for example, after some image pre-processing which already happens off of the main thread). It uses the OffscreenCanvas.convertToBlob method, which is an exact parallel of Canvas.toBlob.

Also, since HTMLElement is not defined inside a web worker, I added a check for that before using the instanceof operator, in order to avoid errors when calling loadImage from inside a Worker.
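A minimal sketch of that guard (the helper name is illustrative, not the PR's exact code):

```js
// HTMLElement does not exist in a Worker scope, so only attempt the
// instanceof check when the global is actually defined.
const isDomElement = (image) =>
  typeof HTMLElement !== 'undefined' && image instanceof HTMLElement;
```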
Update ImageLike types

The type ImageLike currently contains two types for which there is no implementation:

- CanvasRenderingContext2D, which is not the intended way to read from <canvas> and would have no effect
- ImageData, which similarly would need to be converted to a blob somehow, and it is easier to start from the <canvas> directly

Both options would currently result in runtime errors if they were used. So I've removed these two types, while also adding the OffscreenCanvas type as an option.

(not implemented) Better support for <video>
Support for <video> sources could be improved, to obtain individual video frames instead of the static video.poster which is currently used. This would involve capturing video frames via canvasContext.drawImage(video, 0, 0) and then calling canvas.toBlob() like any other source. However, this would technically be a breaking change if anyone relies on the existing video.poster behavior, so I've left that out of this PR for now. I think reading video frames would be a more intuitive behavior than reading the video poster/thumbnail, but for now it is enough for users to implement frame capture manually, which also keeps backwards compatibility.
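For illustration only (not part of this PR), grabbing a single frame could look roughly like this; the helper name is made up:

```js
// Draw the current video frame onto a scratch canvas, then encode it with
// toBlob just like any other canvas source.
const videoFrameToBlob = (video) => {
  const canvas = document.createElement('canvas');
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  canvas.getContext('2d').drawImage(video, 0, 0);
  return new Promise((resolve) => canvas.toBlob(resolve, 'image/bmp'));
};
```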
Checks

npm run test:all passes ✔