Proposal: Low level struct improvements #3936

jaredpar · 2020-09-23T23:33:55Z

This design covers adding the following features to C# for low level
struct performance improvements:

Allowing ref struct to contain ref fields.
Allowing struct and ref struct to return ref to their fields.
Allowing safe fixed buffers for unmanaged and managed types.

Note: the section on providing parameter escape notations is incomplete
at this time. I will send a PR in the future once I finalize this
portion. Feel free to still leave comments though.

This design covers adding the following features to C# for low level struct performance improvements: - Allowing `ref struct` to contain `ref` fields. - Allowing `struct` and `ref struct` to return `ref` to their fields. - Allowing safe fixed buffers for unmanaged and managed types. Note: the section on providing parameter escape notations is incomplete at this time. I will send a PR in the future once I finalize this portion. Feel free to still leave comments though.

VSadov · 2020-09-24T00:21:45Z

I like this!!! 👍

JamesNK · 2020-09-24T00:22:20Z

I think I have a situation where this would be useful. Would it solve this highlighted problem? - https://github.com/protocolbuffers/protobuf/blob/a690227398d307419d45b1f352494a176873897b/csharp/src/Google.Protobuf/ParseContext.cs#L77-L80

One issue is the field is being updated outside of the constructor/initialization. From the spec:

There are designs choices we could make to allow more flexible ref re-assignment of fields. For example it could be allowed in cases where we knew the receiver had a safe-to-escape scope that was not outside the current method scope. Further we could provide syntax for making such downward facing values easier to declare: essentially values that have safe-to-escape scopes restricted to the current method. Such a design is discussed here). However extra complexity of such rules do not seem to be worth the limited cases this enables. Should compelling samples come up we can revisit this decision.

This means though that ref fields are largely in practice ref readonly. The one exception being inside object initializers.

VSadov · 2020-09-24T00:28:51Z

proposals/low-level-struct-improvements.md

+
+- If the parameter is the `this` parameter of a `struct` type, it is 
+*ref-safe-to-escape* to the top scope of the enclosing method unless the 
+method is annotated with `[RefEscapes]` in which case it is *ref-safe-to-escape*


I think the ref-escape of [RefEscapes] members will have to match ref-escape of the container.
If we are permitting returning refs to fields, we must ensure the containing variable is not dead by the time the ref is used.

I.E accessing such ref-returning members would work as if taking a ref to a field or the variable itself.

Agree but I believe that falls out from the proposed rules. Essentially this is meant to change the ref-safe-to-escape scope for this when the [RefEscapes] attribute is applied. that will require an update to the Parameters section of the span safety rules. That should then feed into the existing Field rules for determining appropriate scopes.

proposals/low-level-struct-improvements.md

YairHalberstadt · 2020-09-24T03:58:46Z

To me the rules around how I can or can't use a ref field in a ref struct are extremely detailed and arcane. I can't imagine remembering them all, and I like remembering language details :-) I'm worried that for any particular use case that doesn't work straight out the box developers are going to waste hours fumbling around trying to find a way around it.

I think either of the following would be much more palatable:

Take the breaking change. If a method accepts a ref, and returns a ref struct or vice versa, the returned scope is equivalent to the passed in scope.
Any method which reads or writes a ref field in a struct must be marked with a modreq. Any method which uses those must be marked with a modreq. For methods marked with this modreq, if a method accepts a ref, and returns a ref struct or vice versa, the returned scope is equivalent to the passed in scope. It is a language decision whether or not you need to explicitly mark this via an attribute since it is a breaking change, but at least the error message is dead simple - add this attribute.

tannergooding · 2020-09-24T05:49:49Z

proposals/low-level-struct-improvements.md

+outside the enclosing method.
+
+Misc Notes:
+- A member marked as `[RefEscapes]` can not implement an `interface` method.


Is this true even if the interface is annotated as RefEscapes?

My intent for [RefEscapes] when writing this proposal was that should apply to struct only. Having it work on struct and interface is fine though; it's straight forward to add consistent rules here. I didn't add interface originally because the scenario seemed pretty limited. Then again I have an other proposal in the works to expand the use of ref struct into generics so it's possible this will be more common.

Think I'll change this so that it can apply to struct or interface and see how that conversation goes.

davkean · 2020-09-24T05:51:05Z

proposals/low-level-struct-improvements.md

+```
+
+Members which contain the `[RefEscapes]` attribute cannot be used to implement 
+interface members. This would hide the lifetime nature of the member at 


And overrides/virtuals?

Well, the only overrides that can exist in structs are GetHashCode() and Equals(object).....

Given the exact structure of object today I don't think it's a problem to have [RefEscapes] on override. There is no method on object that has a ref return hence it can't really feed into this problem.

Generally though the compiler does not make assumptions about object (or most other types). Hence we do need to consider a hypothetical corelib where a ref returning method is added to object. Lets say the following:

class Object { public virtual ref int GetIt() => ...; }

Even in this case I believe it would be safe for a struct to override this method and return ref to its internal state. That is because there are only two ways in which this method can be invoked:

Through a reference to the struct. In that case the rules outlined here would come into effect and everything would be safe.

Through a reference to object. At that point the struct is no longer on the stack because it's a boxed object. Hence the compiler interpretation of the ref as being safe-to-escape outside the enclosing method scope is correct because it's returning a ref to the boxed object, not a stack value.

Even after writing that justification out I'm unsure if it should still be allowed. It feels cute and doesn't really open up any scenarios. Going to think on this a bit and see how I feel later in the day.

YairHalberstadt · 2020-09-24T05:53:24Z

After discussing with @tannergooding on discord, I think what's more important to me from a usability perspective than the actual rules, are the diagnostics which going along with them.

For example, let's say I write:

  public Span<int> M(ref int x) {
        return new Span<int>(ref x);
    }

I think it's important that the error message is really simple and points me to the correct solution:

parameter 'ref int x' must be marked with [RefEscapes]

Instead of:

Span<int>(ref x) cannot be returned from M since x has a scope greater than the current method.

Similiarly if I then use M:

public Span<int> UseM(ref int x)
{
    return M(ref x);
}

The error message should be:

Parameter 'ref int x' must be marked with [RefEscapes]

YairHalberstadt · 2020-09-24T05:55:30Z

proposals/low-level-struct-improvements.md

+ensuring compatibility with `ref struct` returning methods like `CreateSpan<T>` 
+is a key pivot point in the design of `ref` fields.
+
+To do this we will change the escape rules for a constructor invocation, which


Why not instead of changing the rules for ref struct constructors (which is a breaking change), mark them with [RefEscapes] explicitly, and consider this to be the return type of a constructor?

These rules only apply to constructors on ref struct which directly contain ref fields. Given that ref fields don't exist today this can't be a breaking change because there are no constructors that they presently apply to.

Except for Span<T>, right?

Ruikuan

fix some typos.

proposals/low-level-struct-improvements.md

Co-authored-by: Ruikuan <liruikuan@outlook.com>

jaredpar · 2020-09-24T16:23:09Z

Thanks @Ruikuan !

jnm2

Yay for English

proposals/low-level-struct-improvements.md

jaredpar · 2020-09-24T16:34:22Z

@JamesNK

I think I have a situation where this would be useful. Would it solve this highlighted problem? -

Not as presently written. The rules are pretty restrictive, they assert that any use of a constructor that has a ref parameter on a ref struct with a ref field is always limited to the current method scope. Hence in this case you wouldn't be able to assign that into ctx here.

In this case though the invocation is safe. It's passing a ref to a known heap location hence there is no reason to restrict the return. An earlier version of the doc had this exception but I pulled it out because I thought it might be too much of a special case. This is the second time though this has come up so I'm leaning towards finding a way to add it back.

jaredpar · 2020-09-24T16:37:34Z

@YairHalberstadt

I think what's more important to me from a usability perspective than the actual rules, are the diagnostics which going along with them.

Agree. Good diagnostics will be important.

tannergooding · 2020-09-24T17:02:52Z

Just wanted to confirm something that was discussed in Discord (as its my understanding this is the case).

RefEscapes would allow the following code to work (it is illegal today):

public Span<int> M([RefEscapes]ref int x) {
    return new Span<int>(ref x);
}

While DoesNotEscape would allow the following, currently also illegal:

public ref int M([DoesNotEscape] Span<int> x, int[] y)
{
    return ref y[0];
}

public ref int X(int[] y)
{
    return ref M(stackalloc int[5], y);
}

In the case of the latter, DoesNotEscape would do basic analysis to validate it doesn't escape (I'm not sure if any validation could be done or should be done for RefEscapes).

Would these be just attributes, modreq, modopt?
Would these attributes only be valid on in, out, ref and ref struct?

Co-authored-by: Joseph Musser <me@jnm2.com>

jaredpar · 2020-09-24T17:21:29Z

Thanks @jnm2. With only three typo suggestions before you, I was beginning to worry that maybe I'd gotten better at English. 😄

jaredpar · 2020-09-24T17:32:44Z

@tannergooding

RefEscapes would allow the following code to work (it is illegal today):

No it would not. The [RefEscapes] attribute can only be applied to instance members of struct (explicitly called out in the proposal). It cannot be applied to individual parameters of members. Further the [RefEscapes] attribute only impacts the ref-safe-to-escape scope of this. That in turn causes a tiny tweak to the ref-safe-to-escape rules for method invocations returning a ref value. What your discussing requires changes to the safe-to-escape return of all methods.

Essentially the behavior you are looking for here is changing how safe-to-escape is interpreted for method calls. That is a very different feature than [RefEscapes]. Sure it's possible to add such annotations to the language but my intuition is that it's not really worth it.

One thing to remember is that the rules today around how this return pattern is calculated came in large part to our own usability feedback implementing Span<T>. If we were starting from scratch today I think that we would very likely end up with the exact same rules because it optimizes for the majority use case.

While DoesNotEscape would allow the following, currently also illegal:

Correct. This is the exact intended purpose of DoesNotEscape.

If you have real world examples of this pattern I would be eager to see them. It will help ensure the feature is solving the intended problem as well as making me feel more confident that it's worth adding to the language.

Would these be just attributes, modreq, modopt?

modreq

Would these attributes only be valid on in, out, ref and ref struct?

Applicability:

RefEscapes applicable to struct instance members only
DoesNotEscape likely applicable to parameters and instance members. This section is still a work in progress so less certainty in all the places it will apply

jnm2 · 2020-09-24T17:44:25Z

@jaredpar I think the usability of the English language is highly questionable :D

tannergooding · 2020-09-24T17:54:36Z

The [RefEscapes] attribute can only be applied to instance members of struct (explicitly called out in the proposal). It cannot be applied to individual parameters of members.

Ah, I see. I had seen it applied to the getter and misinterpreted that.

RefEscapes applicable to struct instance members only

So this is Methods that have a return, or fields and such as well? And it basically just indicates that the ref return is going to be this or a subset of this?

Essentially the behavior you are looking for here is changing how safe-to-escape is interpreted for method calls. That is a very different feature than [RefEscapes].

Hmmm, I had interpreted it as changing how safe-to-escape is interpreted because that seems like a useful feature.
Being able to annotate that a ref is this allows ref returns to non ref-fields. However, being able to annotate that a ref escapes allows things like MemoryMarshal.CreateSpan to exist and to exist "safely" without leaving a hole around something that can't otherwise be expressed.

jaredpar · 2020-09-24T19:23:16Z

@tannergooding

So this is Methods that have a return, or fields and such as well? And it basically just indicates that the ref return is going to be this or a subset of this?

It indicates that the ref return is potentially returning state inside of this. That effectively makes the following edit to method invocation rules for calculating ref-safe-to-escape when the target method is annotated with [RefEscapes]

the ref-safe-to-escape of all ref and out argument expressions ~~(excluding the receiver)~~

This is a very specific and narrow fix because it's a narrow and specific problem. 😄

Being able to annotate that a ref is this allows ref returns to non ref-fields

No it does not. I understand on the surface they seem related but in terms of how the span safety rules work they are very different problems and require different solutions. The [RefEscapes] proposal is all about changing how ref-safe-to-escape validation works (and it's a very targeted change). The behavior you are asking for is about how safe-to-escape returns are calculated. Attempting to use one solution both unnecessarily complicates the space and very likely creates other side effects that you almost certainly do not want.

I think the name [RefEscapes] is adding to the confusion here. Think of the name as [ThisRefEscapes] instead.

However, being able to annotate that a ref escapes allows things like MemoryMarshal.CreateSpan to exist and to exist "safely" without leaving a hole around something that can't otherwise be expressed.

Annotating that a ref escapes though is somewhat rhetorical because ref can escape today, and they can escape by ref:

ref int Identity(ref int p) => ref p;

The issue is more that ref cannot be smuggled in the proposal. Or rather a ref that sneakily escapes by ref inside a value.

Again though, this is solvable with another annotation. The issue is that when looking at the issues and samples the extra complexity did not seem warranted. All of the features in this proposal line up to specific customer asks based on samples, real scenarios, etc ... The same does not exist for this request. Further the existing rules, and the decisions we made about them originally WRT usability, provide reasonable evidence this is not actually needed as a feature.

If you feel passionately about this I think the way to move it forward is with samples, evidence, etc ....

tannergooding · 2020-09-24T19:31:18Z

If you feel passionately about this I think the way to move it forward is with samples, evidence, etc ....

Not super passionate and I don't think I have any specific examples today. I'd have to go back through the various hacks I'm using right now and double-check.

I think I have a better understanding of the differences now though, so thanks!

I think the name [RefEscapes] is adding to the confusion here. Think of the name as [ThisRefEscapes] instead.

Any reason why we couldn't use this name instead, to help reduce confusion?

jaredpar · 2020-09-24T20:28:48Z

@tannergooding

Any reason why we couldn't use this name instead, to help reduce confusion?

Because once proposals are published they're immutable. 😄

Joking aside it's one of the changes I'm considering.

gdkchan · 2020-09-27T05:31:52Z

Haven't seen anything about multi-dimensional fixed buffers in the proposal, but it would be useful too.
Example:

struct Matrix4x4
{
    public fixed float M[4][4];
}

It would be equivalent to the following C struct:

typedef struct {
    float M[4][4];
} Matrix4x4;

I'm not sure how one would pass it around though. According to the current proposal, I believe the safe method to pass the fixed buffers around is passing as Span<T>. However there's no multi-dimensional Span<T>, so it wouldn't work for this case.

Thealexbarney · 2020-09-27T06:14:18Z

Haven't seen anything about multi-dimensional fixed buffers in the proposal, but it would be useful too.

I agree. They would be useful.

Multi-dimensional arrays could work without relatively major changes, but only if accessing a single element or getting a span of the lowest dimension are the only things allowed.

Example:

struct MyStruct
{
    public fixed int MyArray[4][5][6];
}

int element = myStruct.MyArray[2][3][4];

Could be equivalent to (Bounds checks skipped for brevity):

struct MyStruct
{
    public fixed int MyArray[120];

    public int GetMyArray(int dim1, int dim2, int dim3)
    {
        return MyArray[dim1 * 5 * 6 + dim2 * 6 + dim3];
    }
}

int element = myStruct.GetMyArray(2, 3, 4);

Getting a single dimension of the array would be more difficult. Getting a span from the least significant dimension wouldn't be difficult:

public Span<int> Get(int dim1, int dim2)
{
    return MyArray.Slice(dim1 * 5 * 6 + dim2 * 6, 6);
}

Getting higher dimensions might be done by encoding the dimension sizes in the type. C# can't do anything like MultiDim<int, 4, 5, 6>, so the only way I see would be to generate a struct for each dimension. However the types would get ugly rather quickly, not to mention that these compiler-generated types would need to be accessible by user-facing code.

Pseudocode example:

public ref struct Array4_5_6
{
    private Span<int> _span;

    public Array5_6 GetDim2(int index)
    {
        return new Array5_6(_span.Slice(index * 5 * 6, 5 * 6));
    }

    public Span<int> GetDim3(int dim1, int dim2)
    {
        return _span.Slice(dim1 * 5 * 6 + dim2 * 6, 6);
    }
}

public ref struct Array5_6
{
    private Span<int> _span;

    public Span<int> GetDim2(int index)
    {
        return _span.Slice(index * 6, 6);
    }

    public int Get(int dim1, int dim2)
    {
        return _span[dim1 * 6 + dim2];
    }
}

The name `[RefEscapes]` was causing a lot of confusion in the discussions. People were mis-interpretting it as a general feature that could apply to other constructs when in reality it's a very specific and targetted feature. Renaming to `[ThisRefEscapes]` to make it clear this is a very targetted feature.

Based on feedback from several people I've decided that we need to make the rules for `ref` fields more flexible when the values being passed around are known to refer to the heap. In those cases there is simply no reason to restrict the created `ref struct` as it is always *ref-safe-to-escape* to the enclosing method as well as not requiring any adjustment to the method invocation rules. This does mean that we need to add the notion of "refers to the heap" to the span safety document. That doesn't change any assumptions there. It's just that before this change the notion of "refers to the heap" wasn't necessary to describe our safety rules.

jaredpar · 2020-09-28T14:17:36Z

@gdkchan

I had not considered multi-dimensional array support here. The idea of this proposal was essentially to extend the existing support for fixed sized buffers to include supporting all element types and not just unmanaged types. Multi-dimensional support is essentially expanding the scope of the original feature quite a bit. Given that it's existed for the entirety of C# and there hasn't been a big push to add this support I'm fairly skeptical that it's worth the extra complexity.

Added it to the open issue section though so we can discuss in LDM when we discuss the rest of the feature (this specific section hasn't been covered yet).

jaredpar · 2020-09-28T14:21:42Z

Thanks for the feedback everyone! Have updated the proposal based on various discussions here. Particularly:

Renamed [RefEscapes] to [ThisRefEscapes] to make the semantics of the feature much clearer.
Made constructors and ref field assignment more flexible when the values being used are known to refer to the heap. Scenarios provided by @JamesNK and @davidwrighton made me feel the extra complexity in the rules are justified (and really it's actually a fairly small change).
Responded to several feedback points by @VSadov on how fixed buffers should be emitted to make them more closely resemble existing arrays.

jaredpar · 2020-09-29T15:29:35Z

Thanks again for the feedback everyone. At this point the initial design of ref field, [ThisRefEscapes] and safe fixed buffers is a good place. The ref field portion has largely been presented to LDM while the [ThisRefEscapes] + safe fixed buffers should be presented this week / next week. Likely another round of edits will follow after that.

Merging now because I want to make progress on the [DoesNotEscape] section. Using a separate PR for that as I noted that section should not be considered in this PR as it was decidedly incomplete.

YairHalberstadt · 2020-12-24T07:20:58Z

@jaredpar is there any championed issue for this proposal?

Thanks

333fred · 2020-12-24T08:07:39Z

There are a few. #1147 is ref fields.

YairHalberstadt · 2020-12-24T08:09:47Z

Would it be useful to have a single one for this proposal? Are there any for returning a field of a struct by ref?

jaredpar · 2021-01-04T15:37:31Z

I've been using #1147 as the primary championed issue here.

CodingMadness · 2022-03-12T12:14:42Z

@jaredpar is something of this possible by current (.NET 6.0 version, VS2022) ?Would be nice to hear :)

jaredpar mentioned this pull request Sep 23, 2020

Allow ref struct to contain ref fields dotnet/runtime#32060

Closed

VSadov reviewed Sep 24, 2020

View reviewed changes

proposals/low-level-struct-improvements.md Show resolved Hide resolved

VSadov reviewed Sep 24, 2020

View reviewed changes

proposals/low-level-struct-improvements.md Show resolved Hide resolved

tannergooding reviewed Sep 24, 2020

View reviewed changes

davkean reviewed Sep 24, 2020

View reviewed changes

YairHalberstadt reviewed Sep 24, 2020

View reviewed changes

Ruikuan reviewed Sep 24, 2020

View reviewed changes

proposals/low-level-struct-improvements.md Outdated Show resolved Hide resolved

proposals/low-level-struct-improvements.md Outdated Show resolved Hide resolved

proposals/low-level-struct-improvements.md Outdated Show resolved Hide resolved

jaredpar and others added 2 commits September 24, 2020 09:21

Update proposals/low-level-struct-improvements.md

b83c02c

Co-authored-by: Ruikuan <liruikuan@outlook.com>

Apply suggestions from code review

8fff276

Co-authored-by: Ruikuan <liruikuan@outlook.com>

jnm2 reviewed Sep 24, 2020

View reviewed changes

Apply suggestions from code review

610ce4b

Co-authored-by: Joseph Musser <me@jnm2.com>

jaredpar added 3 commits September 28, 2020 06:40

Fixed buffer updates

a34b6d7

modreq clarification

da1b2a4

Small update

67e7e40

jaredpar merged commit 77b2ab8 into dotnet:master Sep 29, 2020

ufcpp mentioned this pull request Oct 20, 2020

allow references to structs to be stored in fields of ref structs #1147

Open

huoyaoyuan mentioned this pull request Dec 24, 2020

[Proposal]: Allow struct to return instance members by reference (avoid CS8170) #4260

Closed

4 tasks

Proposal: Low level struct improvements #3936

Proposal: Low level struct improvements #3936

Conversation

jaredpar commented Sep 23, 2020

VSadov commented Sep 24, 2020

JamesNK commented Sep 24, 2020 • edited Loading

VSadov Sep 24, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

YairHalberstadt commented Sep 24, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

davkean Sep 24, 2020 • edited Loading

Choose a reason for hiding this comment

Joe4evr Sep 24, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

YairHalberstadt commented Sep 24, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Ruikuan left a comment • edited Loading

Choose a reason for hiding this comment

jaredpar commented Sep 24, 2020

jnm2 left a comment

Choose a reason for hiding this comment

jaredpar commented Sep 24, 2020

jaredpar commented Sep 24, 2020

tannergooding commented Sep 24, 2020 • edited Loading

jaredpar commented Sep 24, 2020

jaredpar commented Sep 24, 2020

jnm2 commented Sep 24, 2020

tannergooding commented Sep 24, 2020

jaredpar commented Sep 24, 2020

tannergooding commented Sep 24, 2020

jaredpar commented Sep 24, 2020

gdkchan commented Sep 27, 2020 • edited Loading

Thealexbarney commented Sep 27, 2020 • edited Loading

jaredpar commented Sep 28, 2020

jaredpar commented Sep 28, 2020

jaredpar commented Sep 29, 2020

YairHalberstadt commented Dec 24, 2020

333fred commented Dec 24, 2020 • edited Loading

YairHalberstadt commented Dec 24, 2020

jaredpar commented Jan 4, 2021

CodingMadness commented Mar 12, 2022

JamesNK commented Sep 24, 2020 •

edited

Loading

VSadov Sep 24, 2020 •

edited

Loading

YairHalberstadt commented Sep 24, 2020 •

edited

Loading

davkean Sep 24, 2020 •

edited

Loading

Joe4evr Sep 24, 2020 •

edited

Loading

Ruikuan left a comment •

edited

Loading

tannergooding commented Sep 24, 2020 •

edited

Loading

gdkchan commented Sep 27, 2020 •

edited

Loading

Thealexbarney commented Sep 27, 2020 •

edited

Loading

333fred commented Dec 24, 2020 •

edited

Loading