Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GLSL/MSL: Support VK_KHR_zero_initialize_workgroup_memory #2444

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

dboyan
Copy link
Contributor

@dboyan dboyan commented Feb 8, 2025

The extension allows initializing workgroup memory with OpConstantNull, and is a part of Vulkan 1.3.

In GLSL, the support is realized via GL_EXT_null_initializer. While in MSL, we have to generate code snippets to let the threads initialize the memory. Currently I don't have a corresponding implementation for HLSL due to my lack of knowledge.

When connected with MoltenVK, the implementation can pass all Vulkan CTS compute tests enabled by the extension when an independent fix for specialization constants (KhronosGroup/MoltenVK#2434) is applied. The MoltenVK code (with downstream fix) is available at: https://github.com/dboyan/MoltenVK/tree/zero_init_wg_mem-specfix (need to manually specify SPIRV-Cross tree with this PR branch when using fetchDependencies while building).

Vulkan CTS compute cases before enabling VK_KHR_zero_initialize_workgroup_memory:

  Passed:        81/45160 (0.2%)
  Failed:        0/45160 (0.0%)
  Not supported: 45079/45160 (99.8%)

Vulkan CTS compute cases after enabling VK_KHR_zero_initialize_workgroup_memory (with KhronosGroup/MoltenVK#2434 applied):

  Passed:        712/45160 (1.6%)
  Failed:        0/45160 (0.0%)
  Not supported: 44448/45160 (98.4%)

Closes: #2443

@CLAassistant
Copy link

CLAassistant commented Feb 8, 2025

CLA assistant check
All committers have signed the CLA.

spirv_msl.cpp Outdated Show resolved Hide resolved
spirv_glsl.cpp Outdated Show resolved Hide resolved
// In the latter case, we cannot take the value as-is, as it can be changed to anything.
// Rather, we assume it to be *one* for the sake of initializer.
bool is_literal_array_size = constant_type.array_size_literal.back();
uint32_t count = is_literal_array_size ? constant_type.array.back() : 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work only because we kinda throw away the array size of an OpConstantNull later when initializing the threadgroup variable anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the case for shared (threadgroup) variable, but not for other kinds. In those cases, the initializer list is still used. In fact, I'm not 100% sure if the handling is correct for those cases where initializer is used (although I hope it is if the languages act like C/C++). At least Apple's MSL compiler does not complain about it. For GLSL, I'm not an expert to say if it is right. We could again use GL_EXT_null_initializer to eliminate doubt but it seems to require some extra work I don't quite know how. I tried to set the array size to 0, but it seems to turn the array into a scalar value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discovered the logic this part is a little problematic for glsl. I'm adding a test and trying to resolve it.

Copy link
Contributor

@HansKristian-Work HansKristian-Work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Biggest concern is lack of test.

…antNull

The usage appears in Vulkan CTS related to extension
VK_KHR_zero_initialize_workgroup_memory.
The extension allows initializing workgroup memory with OpConstantNull,
and is a part of Vulkan 1.3.

In GLSL, the support is realized via GL_EXT_null_initializer. While in MSL,
we have to generate code snippets to let the threads initialize the memory.
Copy link
Contributor

@HansKristian-Work HansKristian-Work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable. Mostly nits.

res += join(" = ", to_initializer_expression(variable));
else if (options.force_zero_initialized_variables && type_can_zero_initialize(type))
res += join(" = ", to_zero_initialized_expression(get_variable_data_type_id(variable)));
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Brace on next line.


// "Real" workgroup variables in compute shaders needs extra caretaking.
// They need to be initialized with an extra routine as they come in arbitrary form.
if (var.storage == StorageClassWorkgroup && var.initializer) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop braces here.

if (is_boolean) {
type.basetype = SPIRType::Boolean;
}
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Brace on next line.

// We use short to represent bool for threadgroup variable to workaround compiler bug,
// so we do a temporary fixup here. Alas. (see the type_to_glsl method)
bool is_boolean = type.basetype == SPIRType::Boolean;
if (is_boolean) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop brace.

string var_stride_name = join(var_name, "_stride");
string var_ptr2_name = join(var_name, "_ptr2");

statement("threadgroup uint *", var_ptr_name, " = (threadgroup uint *)&", var_name, ";");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gross, but it's certainly not worse than the alternative ...

begin_scope();
statement(join(to_name(var.self), " = ", to_initializer_expression(var), ";"));
end_scope();
if (is_boolean) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop brace.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for VK_KHR_zero_initialize_workgroup_memory
3 participants