47% of the time is spent in `BoolReader::read_bool` when decoding lossy images #71

Shnatsel · 2024-04-30T21:58:44Z

Decoding this file with image-webp is 1.8x slower than dwebp -noasm -nofancy:
Puente_de_Don_Luis_I,_Oporto,_Portugal,_2019-06-02,_DD_29-31_HDR.webp.gz

(source: Wikipedia, converted to lossy WebP using ImageMagick)

Profiling with samply shows image_webp::vp8::BoolReader::read_bool being responsible for 47% of the total runtime when decoding this image: https://share.firefox.dev/49ZwlOo

This seems to be a similar issue to #55

fintelia · 2024-05-01T03:04:11Z

Equivalent function in libwebp

Unfortunately, unlike with Huffman compression, I don't think it is possible to avoid bit-by-bit decoding for arithmetic coding.

Shnatsel · 2024-10-16T21:54:39Z

Unfortunately #72 didn't seem to make a difference on benchmarks on my machine. Decoding the linked image is still 1.8x slower than dwebp -noasm -nofancy.

I've grepped the libwebp source code for VP8GetBit and couldn't find any specialized assembly routines for it. They do, however, have a dedicated variant for prob=0x80 (128 in decimal), which we could probably crib for easy gains. prob=128 is used here and here.

There is also a VP8GetBitAlt function, the purpose of which isn't clear to me at a glance.
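
To illustrate why a dedicated prob=128 path might be cheaper, here is a rough sketch of how the split computation simplifies in that case. The function names (`split_general`, `split_half`, `renorm_shifts`) are made up for this example, and the general formula follows the bool decoder described in RFC 6386. This is not taken from libwebp or image-webp, and whether libwebp's variant works exactly this way is not verified here.

```rust
/// General split computation, following the bool decoder in RFC 6386.
/// `range` is expected to be in 128..=255 between reads.
fn split_general(range: u32, probability: u8) -> u32 {
    1 + (((range - 1) * u32::from(probability)) >> 8)
}

/// Hypothetical specialization for probability == 128:
/// (x * 128) >> 8 equals x >> 1, so the multiply disappears.
fn split_half(range: u32) -> u32 {
    1 + ((range - 1) >> 1)
}

/// Number of doublings needed to bring `range` back to at least 128.
fn renorm_shifts(mut range: u32) -> u32 {
    let mut shifts = 0;
    while range < 128 {
        range <<= 1;
        shifts += 1;
    }
    shifts
}
```

For every valid `range`, both possible new ranges (`split` and `range - split`) land in 64..=128 when the probability is 128, so renormalization needs at most one shift. That in turn suggests the renormalization loop could be replaced by a single conditional in such a variant.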

Shnatsel · 2024-12-10T16:37:50Z

This might be useful as reference for optimizing this: https://fgiesen.wordpress.com/2018/02/20/reading-bits-in-far-too-many-ways-part-2/

fintelia · 2024-12-11T07:16:08Z

That whole series of articles is phenomenal and I've referred to it many times. Unfortunately it isn't directly applicable to this function. The problem is that while read_bool sounds like it is just reading a bit from the input stream, it is actually doing range coding to extract a compressed 1-bit symbol based on the current state of the range coder and the given probability.
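
For readers unfamiliar with the distinction, a minimal sketch of a boolean range decoder in the style of RFC 6386 may help. It is not the code in image-webp or libwebp. The `BoolDecoder` type and its field names are invented for this example, and treating reads past the end of the input as zero bytes is an assumption.

```rust
/// Minimal boolean (range) decoder sketch in the style of RFC 6386.
/// Illustrative only; names and end-of-input handling are assumptions.
struct BoolDecoder<'a> {
    input: &'a [u8],
    pos: usize,
    value: u32,     // two-byte window into the compressed stream
    range: u32,     // kept in 128..=255 between calls
    bit_count: u32, // bits shifted out of `value` since the last byte load
}

impl<'a> BoolDecoder<'a> {
    fn new(input: &'a [u8]) -> Self {
        let mut d = BoolDecoder { input, pos: 0, value: 0, range: 255, bit_count: 0 };
        // Prime the window with the first two bytes.
        let hi = d.next_byte();
        let lo = d.next_byte();
        d.value = (hi << 8) | lo;
        d
    }

    fn next_byte(&mut self) -> u32 {
        // Assumption: reads past the end of the input yield zero bytes.
        let b = self.input.get(self.pos).copied().unwrap_or(0);
        self.pos += 1;
        u32::from(b)
    }

    /// Decodes one symbol; `probability` / 256 is the chance of `false`.
    fn read_bool(&mut self, probability: u8) -> bool {
        // Divide the current range in proportion to the probability.
        let split = 1 + (((self.range - 1) * u32::from(probability)) >> 8);
        let big_split = split << 8;
        let bit = if self.value >= big_split {
            self.range -= split;
            self.value -= big_split;
            true
        } else {
            self.range = split;
            false
        };
        // Renormalize: the number of input bits consumed depends on how
        // much the range shrank, i.e. on the decoded symbol itself.
        while self.range < 128 {
            self.value <<= 1;
            self.range <<= 1;
            self.bit_count += 1;
            if self.bit_count == 8 {
                self.bit_count = 0;
                self.value |= self.next_byte();
            }
        }
        bit
    }
}
```

Each call depends on the `value` and `range` left behind by the previous call, and the number of input bits consumed varies with the symbol just decoded. That serial dependency is why bulk bit-reading techniques do not carry over directly.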

fintelia mentioned this issue May 1, 2024

High compression ratio mode for lossless WebP image-rs/image#2221

Open

okaneco mentioned this issue May 7, 2024

Optimize Frame::fill_rgb/fill_rgba clip function, reduce code in BoolReader::read_bool match statement #72

Merged

Shnatsel mentioned this issue Jul 28, 2024

Tracking issue for improvements in our dependencies Shnatsel/wondermagick#1

Open

Shnatsel mentioned this issue Oct 22, 2024

Decoding animated WebP is 4x slower than libwebp-sys + webp-animation #119

Open

This was referenced Oct 30, 2024

Parallelization opportunities #120

Open

Eliminate bounds checks in read_coefficients() (for zero performance gain?) #121

Merged
