47% of the time is spent in BoolReader::read_bool when decoding lossy images #71

Open
Shnatsel opened this issue Apr 30, 2024 · 4 comments

Comments

@Shnatsel
Contributor

Shnatsel commented Apr 30, 2024

Decoding this file with image-webp is 1.8x slower than dwebp -noasm -nofancy:
Puente_de_Don_Luis_I,_Oporto,_Portugal,_2019-06-02,_DD_29-31_HDR.webp.gz

(source: Wikipedia, converted to lossy WebP using ImageMagick)

Profiling with samply shows image_webp::vp8::BoolReader::read_bool being responsible for 47% of the total runtime when decoding this image: https://share.firefox.dev/49ZwlOo

This seems to be a similar issue to #55

@fintelia
Contributor

fintelia commented May 1, 2024

Equivalent function in libwebp

Unfortunately, unlike with Huffman compression, I don't think it is possible to avoid bit-by-bit decoding for arithmetic coding.

@Shnatsel
Contributor Author

Shnatsel commented Oct 16, 2024

Unfortunately #72 didn't seem to make a difference on benchmarks on my machine. Decoding the linked image is still 1.8x slower than dwebp -noasm -nofancy.

I've grepped libwebp source code for VP8GetBit and I couldn't find any specialized assembly routines for it, but they do have a dedicated variant for prob=0x80 (128 in decimal), so that's something we could probably crib for easy gains. prob=128 is used here and here.

There is also a VP8GetBitAlt function whose purpose isn't clear to me at a glance.
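To illustrate why a dedicated prob=128 path can be cheaper, here is a minimal sketch based on the split formula from the reference decoder in RFC 6386. The function names `split_general`, `split_prob_128` and `specializations_agree` are hypothetical and are not taken from libwebp or image-webp. The sketch only shows that the multiplication reduces to a shift when the probability is fixed at 128. Whether this matches what libwebp does internally, or yields a measurable speedup here, is not something this sketch establishes.

```rust
/// General split computation, following the reference `read_bool` in RFC 6386.
/// `range` is assumed to be normalized, i.e. in 128..=255.
pub fn split_general(range: u32, probability: u8) -> u32 {
    1 + (((range - 1) * u32::from(probability)) >> 8)
}

/// Hypothetical specialization for probability == 128.
/// (range - 1) * 128 >> 8 equals (range - 1) >> 1, so no multiply is needed.
pub fn split_prob_128(range: u32) -> u32 {
    1 + ((range - 1) >> 1)
}

/// Returns true if both computations agree for every normalized range value.
pub fn specializations_agree() -> bool {
    (128u32..=255).all(|range| split_general(range, 128) == split_prob_128(range))
}
```

Since the two functions agree over the whole normalized range, a specialized reader could call the shift-only version wherever the probability is known to be 128 at compile time.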

@Shnatsel
Contributor Author

This might be useful as reference for optimizing this: https://fgiesen.wordpress.com/2018/02/20/reading-bits-in-far-too-many-ways-part-2/

@fintelia
Contributor

That whole series of articles is phenomenal and I've referred to it many times. Unfortunately it isn't directly applicable to this function. The problem is that while read_bool sounds like it is just reading a bit from the input stream, it is actually doing range coding to extract a compressed 1-bit symbol based on the current state of the range coder and the given probability.
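For readers unfamiliar with the distinction, the following is a minimal sketch of a boolean range decoder modelled on the reference decoder described in RFC 6386. It is not the actual image-webp implementation: the type name `SketchBoolReader` and its fields are assumptions made for illustration, and running out of input is handled by feeding zero bytes, which is a simplification.

```rust
/// Illustrative boolean (range) decoder, modelled on the RFC 6386 reference decoder.
/// Not the image-webp implementation; names and layout are hypothetical.
pub struct SketchBoolReader<'a> {
    input: &'a [u8],
    pos: usize,
    value: u32,    // window into the compressed stream
    range: u32,    // current interval width, kept in 128..=255 between calls
    bit_count: u8, // bits shifted out of `value` since the last byte refill
}

impl<'a> SketchBoolReader<'a> {
    pub fn new(input: &'a [u8]) -> Self {
        let mut r = SketchBoolReader { input, pos: 0, value: 0, range: 255, bit_count: 0 };
        // The initial value is formed from the first two input bytes.
        for _ in 0..2 {
            r.value = (r.value << 8) | r.next_byte();
        }
        r
    }

    /// Returns the next input byte, or 0 once the input is exhausted (a simplification).
    fn next_byte(&mut self) -> u32 {
        let b = self.input.get(self.pos).copied().unwrap_or(0);
        self.pos += 1;
        u32::from(b)
    }

    /// Decodes one symbol; `probability` is the chance out of 256 that it is `false`.
    pub fn read_bool(&mut self, probability: u8) -> bool {
        // Divide the current range in proportion to the probability.
        let split = 1 + (((self.range - 1) * u32::from(probability)) >> 8);
        let big_split = split << 8;

        let bit = if self.value >= big_split {
            self.range -= split;
            self.value -= big_split;
            true
        } else {
            self.range = split;
            false
        };

        // Renormalize: shift until the range is at least 128, refilling a byte every 8 shifts.
        while self.range < 128 {
            self.value <<= 1;
            self.range <<= 1;
            self.bit_count += 1;
            if self.bit_count == 8 {
                self.bit_count = 0;
                self.value |= self.next_byte();
            }
        }
        bit
    }
}
```

Every call updates both `range` and `value`, and the next call's split depends on the updated `range`, so each symbol has to be decoded after the previous one. That dependency is why the bulk bit-reading techniques from the linked articles do not carry over directly.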
