added improvements to color scaling and blurring #3904
DedeHai commented on Apr 15, 2024
- Made blurring faster by keeping the color in a variable instead of writing it and then reading it back: on a C3, the Black Hole FX goes from 55 FPS to 71 FPS.
- Added an optional parameter (smear) to the blur function that can be used in combination with SEGMENT.clear() to blur the frame without dimming the current display (repeated calls without clearing will flood the segment). This is useful for blurring without adding 'motion blur'.
- scale8 is inlined, so repeated calls use flash, and it is slower than native 32-bit math, so I added a 'color_scale' function which is native 32-bit and scales 32-bit colors (RGBW).
- Bonus: the changes save roughly 600 bytes of flash memory.
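To illustrate the kind of native 32-bit scaling the last bullets describe, here is a minimal sketch. It is not the code from this PR: the name `scaleColor32`, the `0xWWRRGGBB` channel layout, and the `scale + 1` rounding choice are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, NOT WLED's color_fade()/color_scale().
// Assumes the color is packed as 0xWWRRGGBB.
inline uint32_t scaleColor32(uint32_t color, uint8_t scale) {
  // Using scale + 1 (range 1..256) lets a plain shift replace the division,
  // and scale == 255 returns the color unchanged.
  const uint32_t factor = static_cast<uint32_t>(scale) + 1;
  const uint32_t w = (((color >> 24) & 0xFF) * factor) >> 8;
  const uint32_t r = (((color >> 16) & 0xFF) * factor) >> 8;
  const uint32_t g = (((color >> 8) & 0xFF) * factor) >> 8;
  const uint32_t b = ((color & 0xFF) * factor) >> 8;
  return (w << 24) | (r << 16) | (g << 8) | b;
}

// Small usage example.
inline void scaleColorDemo() {
  assert(scaleColor32(0x11223344, 255) == 0x11223344); // full scale keeps the color
  assert(scaleColor32(0x00FF8040, 127) == 0x007F4020); // roughly half brightness
}
```

All four channels are handled in one call with ordinary 32-bit integer math, so no per-channel 8-bit helper needs to be inlined at each call site.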
In general I approve of the changes, with minor issues regarding formatting and unnecessary modifications.
I would avoid introducing a new function color_scale() when color_fade() is intended to do the same. If you find the name fade misleading (as opposed to scale), then please change the original function name instead. There are only a handful of references to it, but I would recommend against renaming.
Thanks for the review (and for spotting that bug in fx_fcn; it should indeed be 255, a leftover from testing). I will fix the unnecessary changes; that was my auto-format plugin, which sometimes does this if I am not careful. The new function should indeed be integrated into color_fade; the resulting code is what remained after a lot of code changes that did not make it into the final version.
also replaced scale8_video with 32bit calculation in color_fade for consistency and speed.
Undo indent
FYI I do not see any performance gain on ESP32 or ESP8266.
You can test it using the 'Black Hole' FX. Here are my test results:
Well, in my case it is:
Limit was set to 60 FPS.
added improvements to color scaling and blurring
@DedeHai thanks for this PR 😃 I've also merged it into the MoonModules fork. It does give me the performance gain you describe, at least on larger fixtures (tested with 32x32 on 4 LED pins). The main speedup comes from your optimizations of 2D blur; it looks like color_add only plays a minor part in the speedup. Good improvement, as blur() and fade_out() are indeed on a very critical path 👍
@softhack007 you are welcome. I discovered this 'insufficiency' during my particle system ventures, where I noticed that getPixelColorXY() and setPixelColorXY() are quite slow due to the many checks (and conversions) that need to be performed. Blurring accessed each pixel twice, which I changed to only once per pixel, making it much faster.
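The "once per pixel" idea can be sketched in one dimension as follows. This is an illustration only, not the PR's implementation: it works on a plain `std::vector` instead of a segment, the helper names (`scale32`, `addSat32`, `blur1dSketch`) are hypothetical, and the `keep`/`seep` weights are an assumption chosen so that `smear` leaves the pixel's own brightness undimmed, as the PR description states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers; colors are assumed to be packed as 0xWWRRGGBB.
static uint32_t scale32(uint32_t c, uint8_t s) {
  uint32_t out = 0;
  for (int shift = 0; shift < 32; shift += 8) {
    const uint32_t ch = (c >> shift) & 0xFF;
    out |= ((ch * (s + 1u)) >> 8) << shift;
  }
  return out;
}

static uint32_t addSat32(uint32_t a, uint32_t b) {
  uint32_t out = 0;
  for (int shift = 0; shift < 32; shift += 8) {
    const uint32_t sum = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF);
    out |= (sum > 255 ? 255u : sum) << shift;
  }
  return out;
}

// Each pixel is read once and written once; intermediate values stay in locals.
void blur1dSketch(std::vector<uint32_t>& px, uint8_t amount, bool smear = false) {
  if (px.empty()) return;
  const uint8_t keep = smear ? 255 : static_cast<uint8_t>(255 - amount);
  const uint8_t seep = amount >> 1;
  uint32_t carry = 0;    // share of the previous pixel that seeps forward
  uint32_t prevNew = 0;  // previous pixel's new value, not stored yet
  for (std::size_t i = 0; i < px.size(); ++i) {
    const uint32_t cur = px[i];                        // the only read of pixel i
    const uint32_t part = scale32(cur, seep);          // share seeping to both neighbors
    const uint32_t curNew = addSat32(scale32(cur, keep), carry);
    if (i > 0) px[i - 1] = addSat32(prevNew, part);    // the only write of pixel i-1
    prevNew = curNew;
    carry = part;
  }
  px.back() = prevNew;                                 // last pixel has no right neighbor
}
```

With a single lit pixel of value 0x80 and `amount = 128`, the sketch produces 32, 64, 32; with `smear = true` the center stays at 128 while the neighbors still receive 32.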
@DedeHai I think this is already happening in 🤔 Maybe conversions and several layers of abstraction - that happen between Segment::setPixelColor() and BusDigital::setPixelColor() - are still slowing down the process, so you see a 2X speedup when adding your own buffer on top... (off-topic) Just for comparison, you could try to run your particle code in the MoonModules fork.
I would be interested to know if the MM code performs better for you. Maybe we can find the pieces that could be brought back into "upstream AC" to further optimize speed.
FYI a future version of NPB does away with double/triple buffering and If we reintroduce my infamous
The issue is not the buffers themselves but the checking that setPixelColor() (rightfully) does. Code using a local buffer with a known size can skip all those checks and simply copy the data, which is a lot faster. The blur function acts on a known segment size and could safely do so.
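A rough sketch of that trade-off, using a made-up `Canvas` type rather than WLED's `Segment` or any NeoPixelBus class: the checked setter pays for range tests and offset translation on every call, while a local frame of known size is validated once and then copied in bulk. The sketch assumes the range is not shared with another writer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a segment; all names here are illustrative.
struct Canvas {
  std::vector<uint32_t> leds;  // the whole strip
  std::size_t start;           // active range is [start, stop)
  std::size_t stop;

  std::size_t length() const { return stop - start; }

  // Checked setter: every call repeats the range tests and the translation.
  void setPixel(int i, uint32_t color) {
    if (i < 0) return;
    const std::size_t idx = start + static_cast<std::size_t>(i);
    if (idx >= stop || idx >= leds.size()) return;
    leds[idx] = color;
  }

  // Bulk path: validate sizes once, then copy the local frame straight in.
  void commit(const std::vector<uint32_t>& frame) {
    if (stop > leds.size() || frame.size() != length()) return;
    std::copy(frame.begin(), frame.end(),
              leds.begin() + static_cast<std::ptrdiff_t>(start));
  }
};

inline void canvasDemo() {
  Canvas c{std::vector<uint32_t>(8, 0), 2, 6};
  std::vector<uint32_t> frame(c.length());
  for (std::size_t i = 0; i < frame.size(); ++i) {
    frame[i] = static_cast<uint32_t>(i + 1);  // unchecked indexing, size is known
  }
  c.commit(frame);
  assert(c.leds[2] == 1 && c.leds[5] == 4);
  assert(c.leds[1] == 0 && c.leds[6] == 0);   // pixels outside the range untouched
}
```

The effect writes into `frame` with plain indexing and the per-pixel checks collapse into the single size comparison inside `commit()`.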
That's what
The Pixels() method returns a pointer to the buffer directly. You can use that to bypass the range checking.
@Makuna we have a translation from Segment (a virtual canvas) to the pixel strip. Segments can overlap and can also encompass nonexistent pixels, so quite a few translations have to be made to set/get the exact pixel. EDIT: @DedeHai overlapping Segments will wreak havoc if you do not blend local buffers correctly, which you can't do from within the Segment environment (effect function).
How so? Or: how are overlapping segments handled in general? What I am wondering: with overlapping segments, will setPixelColor() overwrite what the other segment put there, or is there a check for that so that it blends?
Segments are individual and do not share any data between themselves - except actual LED color returned by Non-local LED buffer is either global buffer or NPB's edit buffer. The problem with (current) NPB's edit buffer is the modified LED data if brightness is set to anything other than 255 (or 65535 for 16bit) as when you apply
How overlapping segments behave (with a non-local buffer) very much depends on how the effects are written. Some will take previous/underlying pixel values into account, some will not. So you can expect anything from only one effect showing to uncontrolled flickering, and since segments are stored in a vector, their order may get shifted when vectors are reallocated in memory (though that should rarely be the case).
Ok, I think I understand how it's done. I just tested overlapping segments: FX that use additive colors overlap nicely, and FX that use setPixelColor just overwrite the lower-numbered segment. Edit: I was just thinking about whether I could somehow make the particle system render on overlapping segments, and I think it is currently not possible. It relies on a black frame to which each particle adds its color, so I cannot use getPixelColor to blend with other FX. A global rendering function would have to take care of blending the segments, as described above.
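The difference between the two behaviors observed above can be sketched like this. The helper names and the `0xWWRRGGBB` layout are assumptions for the example, and the saturating add is a generic technique rather than a copy of WLED's color_add().

```cpp
#include <cassert>
#include <cstdint>

// Saturating per-channel add of two packed 0xWWRRGGBB colors (hypothetical helper).
inline uint32_t addColors(uint32_t a, uint32_t b) {
  uint32_t out = 0;
  for (int shift = 0; shift < 32; shift += 8) {
    uint32_t sum = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF);
    if (sum > 255) sum = 255;  // clamp instead of wrapping around
    out |= sum << shift;
  }
  return out;
}

// Two ways an effect in the upper segment can treat a pixel
// that a lower segment has already drawn.
inline void drawOverwrite(uint32_t& pixel, uint32_t color) { pixel = color; }
inline void drawAdditive(uint32_t& pixel, uint32_t color) { pixel = addColors(pixel, color); }

inline void overlapDemo() {
  const uint32_t red = 0x00FF0000;
  const uint32_t green = 0x0000FF00;

  uint32_t shared = red;           // lower segment drew red here
  drawOverwrite(shared, green);
  assert(shared == green);         // the lower segment's color is lost

  shared = red;
  drawAdditive(shared, green);
  assert(shared == 0x00FFFF00);    // both contributions survive as yellow
}
```

An additive effect only blends correctly if it can read what is already in the pixel, which is why an effect that starts from its own black frame cannot blend with other segments by itself.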
Out-of-band question: when you render your buffers, are you thinking of multiple buffers being blended into a destination? How many source buffers? And would the destination buffer be read by something like a GetPixelColor? I was just experimenting with two source buffers for a render (in NeoPixelBus terms, which are similar to PC game graphics terms), and was considering how many sources would be useful.
FastLED implemented the + operator (and +=) to simplify blending of two colors (it just added the respective RGB channels), so many FastLED examples will exhibit something like:
CRGB leds[200];
CRGB col1 = CRGB::Red;
CRGB col2 = CRGB::Green;
CRGB yellow = col1 + col2;
...
leds[100] = CRGB::Blue;
leds[100] += yellow;
So the + operator was actually
Currently in WLED we don't do any special blending (though I did have a branch that blended two local buffers), as Segments are rendered one after the other and effect functions are all over the place regarding how they treat existing pixels (some will
Thanks, enough with my side bar conversations.
I wonder if that snippet you provided is a bilinear blend?