Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix uninitialized variable ending up sent to Vulkan #80034

Merged
merged 1 commit into from
Jul 30, 2023

Conversation

darksylinc
Copy link
Contributor

@darksylinc darksylinc commented Jul 29, 2023

The first time a shader is compiled Godot performs the following:

for (uint32_t i = 0; i < SHADER_STAGE_MAX; i++) {
	if(spirv_data.push_constant_stages_mask.has_flag((ShaderStage)(1 << i))) {
		binary_data.push_constant_vk_stages_mask |= shader_stage_masks[i];
	}
}

However binary_data.push_constant_vk_stages_mask is never initialized to 0 and thus contains garbage data or'ed with the good data.

This value is used by push constants (and many other things) thus it can be a big deal.

Fortunately because the relevant flags are always guaranteed to be set (but not guaranteed to be unset), the damage is restricted to:

  1. Performance (unnecessary flushing & over-excessive barriers)
  2. Overwriting push descriptors already set (this would be serious, doesn't seem to be an issue)
  3. Driver implementations going crazy when they see bits set they don't expect (unknown if this is an issue)

This uninitialized value is later saved into the binary cache.

Valgrind is able to detect this bug on the first run, but not on the subsequent ones because the data comes from a file.

cache_file_version has been bumped to force rebuild of all cached shaders. Because the ones generated so far are compromised.

Can this fix TDRs? (aka DEVICE_LOST)

Possibly.

Should this be backported to older versions?

Probably, I didn't check

The first time a shader is compiled Godot performs the following:

```cpp
for (uint32_t i = 0; i < SHADER_STAGE_MAX; i++) {
	if
(spirv_data.push_constant_stages_mask.has_flag((ShaderStage)(1 << i))) {
		binary_data.push_constant_vk_stages_mask |=
shader_stage_masks[i];
	}
}
```

However binary_data.push_constant_vk_stages_mask is never initialized to
0 and thus contains garbage data or'ed with the good data.

This value is used by push constants (and many other things) thus it can
be a big deal.

Fortunately because the relevant flags are always guaranteed to be set
(but not guaranteed to be unset), the damage is restricted to:

1. Performance (unnecessary flushing & over-excessive barriers)
2. Overwriting push descriptors already set (this would be serious,
doesn't seem to be an issue)
3. Driver implementations going crazy when they see bits set they don't
expect (unknown if this is an issue)

This uninitialized value is later saved into the binary cache.

Valgrind is able to detect this bug on the first run, but not on the
subsequent ones because they data comes from a file.

cache_file_version has been bumped to force rebuild of all cached
shaders. Because the ones generated so far are compromised.
@darksylinc darksylinc requested a review from a team as a code owner July 29, 2023 21:31
@clayjohn clayjohn added this to the 4.2 milestone Jul 29, 2023
@clayjohn clayjohn requested a review from RandomShaper July 29, 2023 21:56
@Calinou
Copy link
Member

Calinou commented Jul 29, 2023

I wonder if this can help resolve #77427.

@darksylinc
Copy link
Contributor Author

I wonder if this can help resolve #77427.

That bug is still present. But I thought it was on purpose / a limitation (zero-initializing 3D volume data takes time).

I can look into it if that's an issue, but it looks like it's just the 3D buffer not being initialized.

Copy link
Contributor

@BastiaanOlij BastiaanOlij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auch...

@clayjohn clayjohn merged commit 262d1ea into godotengine:master Jul 30, 2023
@clayjohn
Copy link
Member

Looks great! Congratulations on your first (of many) merged PR.

Note for maintainers, this should probably not be cherry picked to 4.1. technically it's fine, the only thing it will do is invalidate the previous cache.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants