Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSL: runtime array over argument buffers #2179

Closed
wants to merge 11 commits into from

Conversation

Try
Copy link
Contributor

@Try Try commented Jul 17, 2023

This PR adds support for Metal3-style bindless. This is last bit that is missing for my engine to enable ray-query on Metal3.

For now works only, if descriptor-set over argument-buffer emulation (flag --msl-argument-buffers) is disabled. It's possible to extend, if required in future.

Translation example:
uniform texture2D textures[]; -> const device spvDescriptor<texture2d<float>>* textures
uniform Ubo { uint val; } ubo[] -> const device spvDescriptor<constant Ubo*>* ubo

spvDescriptor<T> is simple wrapper structure, unfortunately MSL doesn't allow to have flat declaration.
const device address space is only option here. While Metal3 now allows any(device/constant/managed) space, device is most flexible.

related to #2112

@billhollings
Copy link
Contributor

This PR adds support for Metal3-style bindless.

Metal3-style bindless (ie- consistent 64-bit argbuff elements) seems to be working already. What Metal3 features are you adding here?

uniform texture2D textures[]; -> const device spvDescriptor<texture2d<float>>* textures

I guess this implies argument buffers can contain only homogeneous types? Is that the reason for excluding descriptor-set structures?

A typical descriptor set use would be:

struct spvDescriptorSetBuffer0
{
    array<sampler, 4> m_17 [[id(0)]];
    array<texture2d<float>, 16> m_12 [[id(4)]];
};

A runtime array would only be used for the last member, and should be usable by declaring that member as array<texture2d<float>, 1> [[id(4)]];, while knowing the shader will access it out-of-bounds. I'm wondering if that would solve the issue you are addressing here, in a generic way that would also include descriptor set member use?

@Try
Copy link
Contributor Author

Try commented Jul 18, 2023

Hi, @billhollings ! Glad, that you jumping in discussion)

I'm wondering if that would solve the issue you are addressing here, in a generic way that would also include descriptor set member use?
A runtime array would only be used for the last member

No, this is actually not a solution at all. Engine setup I'm working on requires multiple runtime arrays (not just one), where each array can be modified/resized at runtime.
In DX12: each array can be moved in unique space, with NumDescriptors==-1, with dedicated root-parameter
In Metal: each array maps straight to array of descriptor buffers
In Vulkan: no good solutions - recompiling pipeline on draw-time with exact sizes.

Also to be clear: I'm not using MoltenVK, proffering native Metal support on engine level.

Is that the reason for excluding descriptor-set structures?

Just don't need them :)
It's legal in Metal3 to have:

struct spvDescriptorSetBuffer0
{
    const device spvDescriptor<sampler>* m_17;
    const device spvDescriptor<texture2d<float>>* m_12;
};

Yet, don't know if MoltenVK needs something, like this.

@Try Try marked this pull request as draft July 19, 2023 20:20
@cdavis5e
Copy link
Contributor

A runtime array would only be used for the last member, and should be usable by declaring that member as array<texture2d<float>, 1> [[id(4)]];, while knowing the shader will access it out-of-bounds.

I thought that didn't work, and you had to use a bare array for that instead: texture2d<float> [[id(4)]] [1]; (Note the position of the id attribute, before the array length.)

@billhollings
Copy link
Contributor

A runtime array would only be used for the last member, and should be usable by declaring that member as array<texture2d<float>, 1> [[id(4)]];, while knowing the shader will access it out-of-bounds.

I thought that didn't work, and you had to use a bare array for that instead: texture2d<float> [[id(4)]] [1]; (Note the position of the id attribute, before the array length.)

Oops! My bad. I was quoting from memory, based on the pattern of using a length-1 array, and clearly got the syntax wrong. Thanks for the correction.

@Try Try marked this pull request as ready for review July 20, 2023 21:42
@HansKristian-Work
Copy link
Contributor

I've been on vacation for some time and I'm back from vacation on Aug 6th ish to look at this, so please excuse any inactivity.

@HansKristian-Work
Copy link
Contributor

spvDescriptor is simple wrapper structure, unfortunately MSL doesn't allow to have flat declaration.

Can you explain what you mean here?

@HansKristian-Work
Copy link
Contributor

I thought that didn't work, and you had to use a bare array for that instead: texture2d [[id(4)]] [1]; (Note the position of the id attribute, before the array length.)

Do we know if the 1-sized array trick will work for descriptors? (It's silly that there is no C99 unsized array).

@HansKristian-Work
Copy link
Contributor

The spvBufferDescriptor feels awkward. I assume this forms an ABI that users must be aware of? I think we'd need an option to control the layout of array-of-SSBO to deal with this. Also, it seems useful to be able to use the array style even for non-runtime sized arrays. The current implementation of array-of-SSBO and UBO for discrete descriptors is extremely awkward, having to unroll them and reroll them back into a thread local array is pretty disgusting.

@HansKristian-Work
Copy link
Contributor

I also wonder if it's possible to wrap the entire array into a type that deals with the .value stuff. E.g. something like:

template <typename T>
struct spvDescriptor
{
    T value;
};

template <typename T>
struct spvDescriptorArray
{
    const device spvDescriptor<T> *array;
    T operator[](uint index) { return array[index].value; }
};

void main0(spvDescriptorArray<texture2d<float>> array
   /* array itself is in thread storage */
  [[buffer(N)]]
  /* Not sure if buffer is needed here? This should be equivalent of push constant a pointer. */)
{
    array[10].sample();
}

That should simplify a lot of things and we might be able to wrap most of this into the type declaration rather than special casing this everywhere.

@Try
Copy link
Contributor Author

Try commented Aug 9, 2023

[[MSL doesn't allow to have flat declaration]] Can you explain what you mean here?

It's not legal in MSL to declare array of pointers/descriptors straight:
const device texture2d<float>* textures [[buffer(0)]], so I had to introduce a wrapper structure.

The spvBufferDescriptor feels awkward. I assume this forms an ABI that users must be aware of?

Yes, engine have to pass buffer size to shader, so OpArrayLength can work. On C++ side it's quite simple:

struct spvBufferDescriptor {
  uint64_t gpuAddress; 
  uint32_t byteSize;
  // 4 bytes of padding
  };

the array style even for non-runtime sized arrays.

Yes, yet keep in mind that this feature is Metal3-Tier2 - quite high-end.

I think we'd need an option to control the layout of array-of-SSBO to deal with this.

Can you provide some details, on how do you want it?

I also wonder if it's possible to wrap the entire array into a type that deals with the .value stuff.

Almost. It' not legal in declaration of main to use spvDescriptorArray directly, but possible to have it in following way:

fragment main0_out main0(
    sampler smp,
    const device spvDescriptor<texture2d<float>>* textures [[buffer(0)]]
    )
{
    spvDescriptorArray<texture2d<float>> tex {textures}; // wrap argument buffer
    use(tex[i]);

@HansKristian-Work
Copy link
Contributor

Almost. It' not legal in declaration of main to use spvDescriptorArray directly, but possible to have it in following way:

I think that style would be preferable. Having a fixup hook to declare that object is easy, and all other code can consider the wrapped type, avoiding any special hooks.

Can you provide some details, on how do you want it?

I think we'd just want an option to either enable or disable the length field. Using OpArrayLength is somewhat esoteric and I can see the desire to just have a flat array of pointers. If ArrayLength is used without it enabled, it should throw an error.

@Try
Copy link
Contributor Author

Try commented Aug 13, 2023

Addressed issues:

  • add spvDescriptorArray wrapper with nice operator []
  • added CLI option "--msl-runtime-array-rich-decriptor" to toggle spvBufferDescriptor/spvDescriptor
  • generalize a little runtime array of buffers and fixed sized buffers, so no more need for buffer_arrays_discrete

@HansKristian-Work HansKristian-Work mentioned this pull request Aug 17, 2023
@HansKristian-Work
Copy link
Contributor

Merged on master with some nits applied + squash. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants