Improve image decompression speed and handling #2414

matejcik · 2022-07-26T12:22:28Z

comment thread from a different PR: #2339 (comment)

measurement results:

Some results from the testing:

the fastest is to decompress as a whole, without window usage

slowest is to decompress byte by byte

This is where the intuitive results end, though. When decompressing by chunks, the fastest option seems to be very small chunks of 2 or 4 bytes; beyond that, the speed actually goes down as the chunk size increases. I tested this on two images, and the best chunk size differs between them. I also tried compiling uzlib with -O3 for one image, which sped up decompression quite nicely but didn't change the dependence on the chunk size. Even decompressing in a single chunk with a window is way slower than decompressing without a window or with small chunks.
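To make the two strategies concrete, here is a minimal sketch of whole-stream versus chunked decompression. It is not the firmware code: Python's standard zlib module stands in for uzlib, the helper names are made up for illustration, and the uzlib window (dictionary ring buffer) is not modeled, so desktop timings from this sketch say nothing about the device.

```python
import zlib


def decompress_whole(data: bytes) -> bytes:
    """Decompress the entire stream in one call."""
    return zlib.decompress(data)


def decompress_chunked(data: bytes, chunk_size: int) -> bytes:
    """Decompress while producing at most `chunk_size` output bytes per call."""
    decomp = zlib.decompressobj()
    out = bytearray()
    pending = data
    while pending:
        # max_length caps the output of this call; input that was not
        # consumed yet is exposed via unconsumed_tail.
        out += decomp.decompress(pending, chunk_size)
        pending = decomp.unconsumed_tail
    # Collect any output still buffered inside the decompressor.
    out += decomp.flush()
    return bytes(out)
```

Both helpers return the same bytes; only the number of calls into the decompressor differs, which is the variable the measurements above were exploring.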

Just to have some idea what increases we can get:

Decompressing by best chunk was 17% to 65% faster than per byte

Decompressing by 128 byte chunk was 7% to 40% faster than per byte.

Decompressing whole image without window was 70% to 110% faster than per byte.

This will differ a lot by image. Maybe testing on more images would bring better results.

So based on this, I would suggest using chunks of 4 bytes for decompression of large images (that means modifying the display_icon and display_image functions) and implementing a new display_small_icon with some reasonable size limit (the loader uses 64x64, so perhaps that) where we could do the whole decompression at once.

semantic issues:
B&W models (T1, TR) use display_icon to render the homescreen. The homescreen is certainly not an icon, and the important feature of "icon" is that its format is 4bpp alpha and so can be blended into different colored foregrounds/backgrounds -- unlike a "homescreen" which is an image at the display's native color resolution.
This is also incompatible with actual T1 homescreens, which are stored as a raw 1bpp data stream.

We probably need a separate function for rendering TR/T1 homescreens.
When we have that, we can limit display_icon accepted size: largest icons currently used are 24x24, so 576 pixels, or 288 bytes. That allows us to use window-less decompression into a static buffer.

Loaders are using a larger 64x64 center glyph, which is 4096 pixels or 2048 bytes. This could still be acceptable for window-less decompression.
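The buffer sizes quoted above follow from width x height x bits per pixel. A small sketch of that arithmetic (the function name is hypothetical, and 4bpp is taken from the icon format described earlier):

```python
def decompressed_size(width: int, height: int, bits_per_pixel: int = 4) -> int:
    """Bytes needed to hold a fully decompressed image, rounded up."""
    bits = width * height * bits_per_pixel
    return (bits + 7) // 8


icon_bytes = decompressed_size(24, 24)    # 576 pixels at 4bpp
loader_bytes = decompressed_size(64, 64)  # 4096 pixels at 4bpp
```

With these inputs, `icon_bytes` is 288 and `loader_bytes` is 2048, matching the figures in the text.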

possibly worth folding in: the new popups are using relatively large full-color icons. These could be split into smaller shape+glyph combinations: #2287 (comment)

TychoVrahe · 2022-07-26T13:59:52Z

Re the #2287 issue: I agree that splitting could be a way, though the shape is just circle and rounded triangle for which procedural implementation would make more sense. The glyph could then be greyscale, though it wouldn't fit the discussed 24x24 limit.

Maybe the display_image and display_icon functions are also badly named; wouldn't it be more fitting to call them display_16bpp_image and display_4bpp_image? The color depth is the real difference, and neither cares about the size in the current implementation. Then it would actually make sense to also introduce a display_1bpp_image function and small variants (either separate or hidden inside, as discussed) with optimized decompression.

matejcik · 2022-07-26T14:26:11Z

display_16bpp_image and display_4bpp_image?

That is an option -- but the real difference IMO is that display_icon takes fg and bg color and uses a blending table instead of just dumping pixel values on screen. I don't think we ever want to display a 4bpp image as-is, but we do want to display the T1/TR homescreen as-is.
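As a rough illustration of what "takes fg and bg color and uses a blending table" means, here is a sketch that builds a 16-entry table and maps 4bpp alpha values through it. This is not the display_icon implementation: colors are plain RGB tuples rather than the display's native format, the linear interpolation is a simplification, and the high-nibble-first pixel order is an assumption.

```python
def blend_table(fg, bg):
    """16 colors from bg (alpha 0) to fg (alpha 15), linearly interpolated."""
    return [
        tuple((f * a + b * (15 - a)) // 15 for f, b in zip(fg, bg))
        for a in range(16)
    ]


def render_icon_4bpp(data: bytes, fg, bg):
    """Map each 4-bit alpha value to a blended color.

    Assumption: each byte holds two pixels, high nibble first.
    """
    table = blend_table(fg, bg)
    pixels = []
    for byte in data:
        pixels.append(table[byte >> 4])
        pixels.append(table[byte & 0x0F])
    return pixels
```

The same icon data can then be drawn on any background by passing different `fg` and `bg` values, which is the property that distinguishes an icon from a native-format image.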

FWIW, display_icon does essentially the same thing as rendering a single glyph in display_text_render right?

the shape is just circle and rounded triangle for which procedural implementation would make more sense.

could be, but would the procedural implementation actually be smaller? :)

TychoVrahe · 2022-07-26T14:52:06Z

ever want to display a 4bpp image as-is

Yeah, that doesn't seem very useful. But this blending could be similar to 1bpp image usage on TT: in T1/TR homescreens the FG and BG colors are just implicit, and theoretically you could want them flipped for some reason.
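A sketch of that idea: a 1bpp renderer where the foreground and background are explicit parameters, so flipping them is just a matter of swapping arguments. The function name and the MSB-first bit order are assumptions made for illustration, not taken from the firmware.

```python
def unpack_1bpp(data: bytes, fg=1, bg=0):
    """Expand a raw 1bpp stream into one value per pixel.

    Assumption: the most significant bit of each byte is the leftmost pixel.
    """
    pixels = []
    for byte in data:
        for bit in range(7, -1, -1):
            pixels.append(fg if (byte >> bit) & 1 else bg)
    return pixels


normal = unpack_1bpp(b"\xA0")
inverted = unpack_1bpp(b"\xA0", fg=0, bg=1)
```

Here `fg` and `bg` can be any values (color tuples, palette indexes); the defaults reproduce the implicit black-and-white case.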

display_icon does essentially the same thing as rendering a single glyph in display_text_render right?

essentially, but the text has more complexity due to the bearing, and it also handles the different BPP values internally (at compile time)

could be, but would the procedural implementation actually be smaller? :)

who knows without trying, but a procedural implementation of basic geometric shapes can easily be reused in different contexts, sizes, etc., which should ultimately lead to a smaller flash footprint
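For a sense of what a procedural shape looks like, here is a minimal sketch of a filled circle generated from its radius instead of stored bitmap data. It is purely illustrative (hypothetical function, no anti-aliasing) and says nothing about the actual flash cost on the device.

```python
def circle_mask(radius: int):
    """Rows of booleans for a filled circle of the given radius.

    The result is a (2 * radius + 1) square grid; a cell is True when
    its center lies within `radius` of the grid center.
    """
    size = 2 * radius + 1
    return [
        [(x - radius) ** 2 + (y - radius) ** 2 <= radius ** 2 for x in range(size)]
        for y in range(size)
    ]
```

One function covers every size, which is the reuse argument made above; whether the code ends up smaller than a few compressed bitmaps would still need to be measured.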

TychoVrahe · 2022-07-26T14:55:15Z

One more thing to consider when optimizing decompression in animations, such as the loader with an icon, is to do the decompression only once instead of on every frame. The decompressed data would (should) be stored on the heap, or GC-allocated if used from Rust.
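A minimal sketch of the decompress-once idea, assuming a hypothetical wrapper class and using Python's zlib in place of uzlib. The call counter exists only to make the caching visible.

```python
import zlib


class CachedIcon:
    """Holds compressed icon data and decompresses it on first use only."""

    def __init__(self, compressed: bytes):
        self._compressed = compressed
        self._pixels = None
        self.decompress_calls = 0  # illustration only

    def pixels(self) -> bytes:
        if self._pixels is None:
            self._pixels = zlib.decompress(self._compressed)
            self.decompress_calls += 1
        return self._pixels


icon = CachedIcon(zlib.compress(b"\x0f" * 2048))
for _frame in range(10):
    frame_data = icon.pixels()  # decompressed on the first frame only
```

The trade-off is memory: the decompressed buffer (2048 bytes for a 64x64 4bpp glyph) has to stay allocated for the duration of the animation.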

matejcik · 2023-01-19T13:53:11Z

what remains to be done here? could we close, or transfer knowledge to Notion and then close?

TychoVrahe · 2023-01-19T14:24:14Z

Waiting for the micropython update #2341; after that I want to set the optimization level of the deflate function to -O3.

matejcik added the code Code improvements label Jul 26, 2022

matejcik mentioned this issue Jul 26, 2022

Success animation for Model R #2339

Merged

hynek-jina added this to Firmware Jul 26, 2022

TychoVrahe moved this to 🏃‍♀️ In progress in Firmware Aug 9, 2022

TychoVrahe self-assigned this Aug 9, 2022

sime moved this from 🏃‍♀️ In progress to 🎯 To do in Firmware Oct 13, 2022

matejcik added the blocked Blocked by external force. Third party inputs required. label Jan 19, 2023

Hannsek removed the blocked Blocked by external force. Third party inputs required. label Apr 25, 2023

TychoVrahe moved this from 🎯 To do to 🏃‍♀️ In progress in Firmware Apr 25, 2023

Hannsek mentioned this issue Apr 25, 2023

uzlib(trezor): partially optimize decompression for speed trezor/micropython#9

Merged

Hannsek closed this as completed Apr 26, 2023

Hannsek reopened this Apr 26, 2023

TychoVrahe mentioned this issue Apr 26, 2023

optimize uzlib (via micropython) #2983

Merged

TychoVrahe closed this as completed in #2983 Apr 26, 2023
