Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a way to fix struct offsets automatically #52

Closed
wants to merge 2 commits into from

Conversation

Swoorup
Copy link

@Swoorup Swoorup commented Feb 2, 2024

No description provided.

@ScanMountGoat
Copy link
Owner

See the discussion for #50 for why this will not be merged. I understand that this feels wrong since on the CPU we have a lot of control over the memory layout of types and more relaxed alignment requirements. Adding padding fields to match the GPU layout encourages wasting memory that could be used by adding a padding field to the WGSL type itself. Packing multiple fields into a single vec4 or rarely a vec2 field is not uncommon in shader code for saving space. There are also packing/unpacking functions you can use if you really want to save memory. vec3 is something of a trap since it's essentially a vec4 with an inaccessible w component.

If you're willing to zero pad smaller types, you should just use encase. I would expect the performance impact compared to bytemuck to be minimal since uniform buffers tend to be small. You also won't have to manually initialize padding fields.

This is for OpenGL, but the tips still apply to wgpu/WebGPU:
https://stackoverflow.com/questions/38172696/should-i-ever-use-a-vec3-inside-of-a-uniform-buffer-or-shader-storage-buffer-o

@Swoorup
Copy link
Author

Swoorup commented Feb 3, 2024

I appreciate your feedback, and I completely understand the concerns about maintaining consistent structures. I'd like to discuss a specific example to better understand the reasoning behind the rejection.

Consider the following structures:

For example.

struct Test1 {
   mat3: mat3x3<f32>,
   eye: vec3<f32>,
}
@group(2) @binding(0) var<storage> test1: array<Test1>;

struct Test2 {
   mat3: mat4x4<f32>,
   eye: vec4<f32>,
}
@group(3) @binding(0) var<storage> test2: array<Test2>;

In this case, the first structure occupies a total of 64 bytes, while the second one occupies 80 bytes. From a performance perspective, it might be more optimal to prefer the first structure, as it aligns with the shape type and avoids the need to promote data to vec4 and mat4 on the CPU side.

I understand that issues may arise in scenarios involving arrays with non-aligned data, like below

struct Test3 {
   mat3: array<mat3x3<f32>, 4>,
   eye: vec4<f32>,
}

While I could potentially use encase, my codebase heavily relies on bytemuck, and my generic utilities are designed around it. On other note, it does seem that the user should only be allowed to choose either bytemuck or encase but not both. Wonder if there was a reason for allowing to enable both on?

@ScanMountGoat
Copy link
Owner

There are no performance advantages for the first example. Each column within a matrix is also aligned. A struct with mat3x4 and vec4 has equivalent size and memory layout. It's possible to use both encase and bytemuck without issue if the types already have the correct layout.
https://www.w3.org/TR/WGSL/#alignment-and-size

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants