Document performance caveats of RGB image formats versus RGBA #79771
Most of these changes say " |
@@ -107,6 +107,7 @@
<param index="0" name="format" type="int" enum="Image.Format" />
<description>
Converts the image's format. See [enum Format] constants.
[b]Note:[/b] Converting to [constant FORMAT_RGBA8] is slow, as it is not aligned to 1, 2 or 4 bytes. If you need to frequently convert an image, consider using [constant FORMAT_RGBA8], or better, [constant FORMAT_RG8] or [constant FORMAT_R8] if possible. The same applies to [constant FORMAT_RGBAH] and [constant FORMAT_RGBAF] versus their RGB/RG/R counterparts.
[b]Note:[/b] Converting to [constant FORMAT_RGBA8] is slow, as it is not aligned to 1, 2 or 4 bytes. If you need to frequently convert an image, consider using [constant FORMAT_RGBA8], or better, [constant FORMAT_RG8] or [constant FORMAT_R8] if possible. The same applies to [constant FORMAT_RGBAH] and [constant FORMAT_RGBAF] versus their RGB/RG/R counterparts.
[b]Note:[/b] Converting to [constant FORMAT_RGB8] is slow, as it is not aligned to 1, 2 or 4 bytes. If you need to frequently convert an image, consider using [constant FORMAT_RGBA8], or for images without a blue channel, [constant FORMAT_RG8] or [constant FORMAT_R8] if possible. The same applies to [constant FORMAT_RGBH] and [constant FORMAT_RGBF] versus their RGBA/RG/R counterparts.
I think we can at least do without the bit-alignment piece of knowledge.
[b]Note:[/b] Converting to [constant FORMAT_RGBA8] is slow, as it is not aligned to 1, 2 or 4 bytes. If you need to frequently convert an image, consider using [constant FORMAT_RGBA8], or better, [constant FORMAT_RG8] or [constant FORMAT_R8] if possible. The same applies to [constant FORMAT_RGBAH] and [constant FORMAT_RGBAF] versus their RGB/RG/R counterparts.
[b]Note:[/b] Converting to [constant FORMAT_RGB8] is [i]very[/i] slow. If you need to frequently convert an image, consider using [constant FORMAT_RGBA8], [constant FORMAT_RG8], or [constant FORMAT_R8], which are faster. The same applies to [constant FORMAT_RGBH] and [constant FORMAT_RGBF] compared to their RGBA/RG/R counterparts.
Or
[b]Note:[/b] Converting to [constant FORMAT_RGBA8] is slow, as it is not aligned to 1, 2 or 4 bytes. If you need to frequently convert an image, consider using [constant FORMAT_RGBA8], or better, [constant FORMAT_RG8] or [constant FORMAT_R8] if possible. The same applies to [constant FORMAT_RGBAH] and [constant FORMAT_RGBAF] versus their RGB/RG/R counterparts.
[b]Note:[/b] Converting to [constant FORMAT_RGB8] is [i]very[/i] slow. If you need to frequently convert an image, consider using [constant FORMAT_RGBA8]. If possible, [constant FORMAT_RG8] or [constant FORMAT_R8] are faster. The same applies to [constant FORMAT_RGBH] and [constant FORMAT_RGBF] compared to their RGBA/RG/R counterparts.
Or similar.
I feel like we can make the text both informative and user-friendly by taking your suggestion but prepending "Due to memory alignment reasons," or appending "This slowness happens due to memory alignment reasons."
Before documenting this, I'd really like to see a performance analysis of what exactly is slow. I don't see an image conversion in the texture update codepath. We need to understand whether the slowness is in the GPU driver (in which case we need to document it) or comes from something we are doing (in which case we may be able to fix the underlying issue).
@clayjohn I did some digging during #74238 when we noticed that sometimes custom image mipmaps were wiped out in the renderer/storage: #66848 (comment) There are a few places in texture storage where images are converted, for example this one (same also for FORMAT_RGBF and FORMAT_RGBH): godot/servers/rendering/renderer_rd/storage_rd/texture_storage.cpp Lines 1467 to 1481 in 1c1524a
And a few other places: godot/servers/rendering/renderer_rd/storage_rd/texture_storage.cpp Lines 1267 to 1271 in 1c1524a
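As a rough illustration of the conversions described above, the following C++ sketch maps each 3-component format to its 4-component counterpart before upload. The `PixelFormat` enum and both functions are hypothetical stand-ins, not the code in `texture_storage.cpp`, and the sketch assumes the RGB8/RGBH/RGBF formats are always promoted.

```cpp
// Hypothetical stand-in for the engine's image format enum (not Godot's Image::Format).
enum class PixelFormat { R8, RG8, RGB8, RGBA8, RGBH, RGBAH, RGBF, RGBAF };

// Assumed promotion rule: every 3-component format is uploaded as its
// 4-component counterpart; all other formats are uploaded unchanged.
constexpr PixelFormat upload_format_for(PixelFormat format) {
    switch (format) {
        case PixelFormat::RGB8:
            return PixelFormat::RGBA8;
        case PixelFormat::RGBH:
            return PixelFormat::RGBAH;
        case PixelFormat::RGBF:
            return PixelFormat::RGBAF;
        default:
            return format;
    }
}

// True when an image in this format would need a CPU-side conversion
// before it can be uploaded under the rule above.
constexpr bool needs_conversion(PixelFormat format) {
    return upload_format_for(format) != format;
}
```

Under this assumption, every update of an RGB8 texture pays for a full-image conversion, while an RGBA8 texture is uploaded as-is.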
@bitsawer Thanks for pointing that out. Looks like the condition was set to always be false in 42b44f4. Although, it probably doesn't matter much as actual GPU support is really poor I think we still need to check whether the call to |
This is a
godot/drivers/vulkan/rendering_device_vulkan.cpp Line 2598 in 6588a4a
Image::convert Line 552 in 6588a4a
Line 640 in 6588a4a
Something to note: When I run the benchmark with an official 4.1.1 build, a single iteration takes about 70 ms. With my custom build I get around 70 to 80 ms. I wonder if the difference can be explained by me using MSVC to compile the build? I copied the scons arguments from https://github.com/godotengine/godot-build-scripts/blob/main/build-windows/build.sh |
Official builds are compiled with a recent MinGW-GCC using
Unfortunately, no. This is planned for a future release, but since we use MinGW, you'd have to convert the DWARF debug symbols to PDB format (there's a tool for that out there).
Is this still relevant now? I feel like the note in the current PR is particularly excessive.
@Mickeon Considering how extreme the performance disparity is, a strongly worded note is important.
Both of the problematic functions have been massively optimized in the past year, so this needs to be re-evaluated.
I ran the benchmark again using an updated MRP on 4.4.dev fd4c29a: test_image_performance.zip PC specifications
Using an optimized editor build for the tests.
The RGB functions are on average 4 times slower than the RGBA ones, despite storing less data.
The optimization is incredible, but the stark difference in performance with RGB is still way too noticeable to ignore, yeah.
@Calinou I found your results surprising, so I tested on 4.4 dev4 on my device (i7-1165G7). I suspect the difference comes from the graphics drivers
For context, @lyuma ran this benchmark on a 4096×4096 texture:
These are the results they got depending on the Image format used:
Counterintuitively, the more memory-intensive RGBA formats always outperform the RGB format by a significant margin (by a 5× factor on average on this test). This is due to memory alignment, as RGB uses 3 components whereas other formats use 1, 2 or 4 components (powers of 2).
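To make the alignment argument concrete, here is a minimal C++ sketch. It is not Godot's `Image::convert`; `image_bytes` and `expand_rgb8_to_rgba8` are hypothetical names, and the sketch assumes tightly packed 8-bit channels with no mipmaps or row padding. A 4096×4096 image takes 50,331,648 bytes as RGB8 and 67,108,864 bytes as RGBA8, yet RGB8 pixels start at byte offsets 0, 3, 6, and so on, so only every fourth pixel begins on a 4-byte boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes needed for a width x height image with the given number of
// 8-bit channels per pixel (assumes no mipmaps and no row padding).
constexpr std::size_t image_bytes(std::size_t width, std::size_t height, std::size_t channels) {
    return width * height * channels;
}

// Hypothetical helper (not Godot's Image::convert): expand tightly packed
// RGB8 pixels to RGBA8 by appending an opaque alpha byte to each pixel.
std::vector<std::uint8_t> expand_rgb8_to_rgba8(const std::vector<std::uint8_t> &rgb) {
    assert(rgb.size() % 3 == 0); // Input must hold whole RGB pixels.
    const std::size_t pixel_count = rgb.size() / 3;
    std::vector<std::uint8_t> rgba(pixel_count * 4);
    for (std::size_t i = 0; i < pixel_count; ++i) {
        // Source pixel i starts at byte 3 * i (4-byte aligned only when i is a
        // multiple of 4); destination pixel i starts at byte 4 * i (always aligned).
        rgba[i * 4 + 0] = rgb[i * 3 + 0];
        rgba[i * 4 + 1] = rgb[i * 3 + 1];
        rgba[i * 4 + 2] = rgb[i * 3 + 2];
        rgba[i * 4 + 3] = 255; // Opaque alpha.
    }
    return rgba;
}
```

This only illustrates the memory layout; it does not show where the time is actually spent in the engine or the graphics driver.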
Considering how important the difference is, I think it's better to add notes everywhere it's relevant, even if it's a bit redundant.