-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: Low level struct improvements #3936
Conversation
This design covers adding the following features to C# for low level struct performance improvements: - Allowing `ref struct` to contain `ref` fields. - Allowing `struct` and `ref struct` to return `ref` to their fields. - Allowing safe fixed buffers for unmanaged and managed types. Note: the section on providing parameter escape notations is incomplete at this time. I will send a PR in the future once I finalize this portion. Feel free to still leave comments though.
I like this!!! 👍 |
I think I have a situation where this would be useful. Would it solve this highlighted problem? - https://github.com/protocolbuffers/protobuf/blob/a690227398d307419d45b1f352494a176873897b/csharp/src/Google.Protobuf/ParseContext.cs#L77-L80 One issue is the field is being updated outside of the constructor/initialization. From the spec:
|
|
||
- If the parameter is the `this` parameter of a `struct` type, it is | ||
*ref-safe-to-escape* to the top scope of the enclosing method unless the | ||
method is annotated with `[RefEscapes]` in which case it is *ref-safe-to-escape* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the ref-escape of [RefEscapes]
members will have to match ref-escape of the container.
If we are permitting returning refs to fields, we must ensure the containing variable is not dead by the time the ref is used.
I.E accessing such ref-returning members would work as if taking a ref to a field or the variable itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree but I believe that falls out from the proposed rules. Essentially this is meant to change the ref-safe-to-escape scope for this
when the [RefEscapes]
attribute is applied. that will require an update to the Parameters
section of the span safety rules. That should then feed into the existing Field
rules for determining appropriate scopes.
To me the rules around how I can or can't use a ref field in a ref struct are extremely detailed and arcane. I can't imagine remembering them all, and I like remembering language details :-) I'm worried that for any particular use case that doesn't work straight out the box developers are going to waste hours fumbling around trying to find a way around it. I think either of the following would be much more palatable:
|
outside the enclosing method. | ||
|
||
Misc Notes: | ||
- A member marked as `[RefEscapes]` can not implement an `interface` method. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this true even if the interface is annotated as RefEscapes
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My intent for [RefEscapes]
when writing this proposal was that should apply to struct
only. Having it work on struct
and interface
is fine though; it's straight forward to add consistent rules here. I didn't add interface
originally because the scenario seemed pretty limited. Then again I have an other proposal in the works to expand the use of ref struct
into generics so it's possible this will be more common.
Think I'll change this so that it can apply to struct
or interface
and see how that conversation goes.
``` | ||
|
||
Members which contain the `[RefEscapes]` attribute cannot be used to implement | ||
interface members. This would hide the lifetime nature of the member at |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And overrides/virtuals?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the only overrides that can exist in structs are GetHashCode()
and Equals(object)
.....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the exact structure of object
today I don't think it's a problem to have [RefEscapes]
on override
. There is no method on object
that has a ref
return hence it can't really feed into this problem.
Generally though the compiler does not make assumptions about object
(or most other types). Hence we do need to consider a hypothetical corelib where a ref
returning method is added to object
. Lets say the following:
class Object
{
public virtual ref int GetIt() => ...;
}
Even in this case I believe it would be safe for a struct
to override
this method and return ref
to its internal state. That is because there are only two ways in which this method can be invoked:
- Through a reference to the
struct
. In that case the rules outlined here would come into effect and everything would be safe. - Through a reference to
object
. At that point thestruct
is no longer on the stack because it's a boxed object. Hence the compiler interpretation of theref
as being safe-to-escape outside the enclosing method scope is correct because it's returning aref
to the boxed object, not a stack value.
Even after writing that justification out I'm unsure if it should still be allowed. It feels cute and doesn't really open up any scenarios. Going to think on this a bit and see how I feel later in the day.
After discussing with @tannergooding on discord, I think what's more important to me from a usability perspective than the actual rules, are the diagnostics which going along with them. For example, let's say I write: public Span<int> M(ref int x) {
return new Span<int>(ref x);
} I think it's important that the error message is really simple and points me to the correct solution:
Instead of:
Similiarly if I then use M: public Span<int> UseM(ref int x)
{
return M(ref x);
} The error message should be:
|
ensuring compatibility with `ref struct` returning methods like `CreateSpan<T>` | ||
is a key pivot point in the design of `ref` fields. | ||
|
||
To do this we will change the escape rules for a constructor invocation, which |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not instead of changing the rules for ref struct constructors (which is a breaking change), mark them with [RefEscapes] explicitly, and consider this
to be the return type of a constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These rules only apply to constructors on ref struct
which directly contain ref
fields. Given that ref
fields don't exist today this can't be a breaking change because there are no constructors that they presently apply to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except for Span<T>
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix some typos.
Co-authored-by: Ruikuan <liruikuan@outlook.com>
Co-authored-by: Ruikuan <liruikuan@outlook.com>
Thanks @Ruikuan ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay for English
Not as presently written. The rules are pretty restrictive, they assert that any use of a constructor that has a In this case though the invocation is safe. It's passing a |
Agree. Good diagnostics will be important. |
Just wanted to confirm something that was discussed in Discord (as its my understanding this is the case).
public Span<int> M([RefEscapes]ref int x) {
return new Span<int>(ref x);
} While public ref int M([DoesNotEscape] Span<int> x, int[] y)
{
return ref y[0];
}
public ref int X(int[] y)
{
return ref M(stackalloc int[5], y);
} In the case of the latter, Would these be just attributes, |
Co-authored-by: Joseph Musser <me@jnm2.com>
Thanks @jnm2. With only three typo suggestions before you, I was beginning to worry that maybe I'd gotten better at English. 😄 |
No it would not. The Essentially the behavior you are looking for here is changing how safe-to-escape is interpreted for method calls. That is a very different feature than One thing to remember is that the rules today around how this return pattern is calculated came in large part to our own usability feedback implementing
Correct. This is the exact intended purpose of If you have real world examples of this pattern I would be eager to see them. It will help ensure the feature is solving the intended problem as well as making me feel more confident that it's worth adding to the language.
Applicability:
|
@jaredpar I think the usability of the English language is highly questionable :D |
Ah, I see. I had seen it applied to the
So this is Methods that have a return, or fields and such as well? And it basically just indicates that the
Hmmm, I had interpreted it as changing how safe-to-escape is interpreted because that seems like a useful feature. |
It indicates that the
This is a very specific and narrow fix because it's a narrow and specific problem. 😄
No it does not. I understand on the surface they seem related but in terms of how the span safety rules work they are very different problems and require different solutions. The I think the name
Annotating that a ref int Identity(ref int p) => ref p; The issue is more that Again though, this is solvable with another annotation. The issue is that when looking at the issues and samples the extra complexity did not seem warranted. All of the features in this proposal line up to specific customer asks based on samples, real scenarios, etc ... The same does not exist for this request. Further the existing rules, and the decisions we made about them originally WRT usability, provide reasonable evidence this is not actually needed as a feature. If you feel passionately about this I think the way to move it forward is with samples, evidence, etc .... |
Not super passionate and I don't think I have any specific examples today. I'd have to go back through the various hacks I'm using right now and double-check. I think I have a better understanding of the differences now though, so thanks!
Any reason why we couldn't use this name instead, to help reduce confusion? |
Because once proposals are published they're immutable. 😄 Joking aside it's one of the changes I'm considering. |
Haven't seen anything about multi-dimensional fixed buffers in the proposal, but it would be useful too. struct Matrix4x4
{
public fixed float M[4][4];
} It would be equivalent to the following C struct: typedef struct {
float M[4][4];
} Matrix4x4; I'm not sure how one would pass it around though. According to the current proposal, I believe the safe method to pass the fixed buffers around is passing as |
I agree. They would be useful. Multi-dimensional arrays could work without relatively major changes, but only if accessing a single element or getting a span of the lowest dimension are the only things allowed. Example: struct MyStruct
{
public fixed int MyArray[4][5][6];
}
int element = myStruct.MyArray[2][3][4]; Could be equivalent to (Bounds checks skipped for brevity): struct MyStruct
{
public fixed int MyArray[120];
public int GetMyArray(int dim1, int dim2, int dim3)
{
return MyArray[dim1 * 5 * 6 + dim2 * 6 + dim3];
}
}
int element = myStruct.GetMyArray(2, 3, 4); Getting a single dimension of the array would be more difficult. Getting a span from the least significant dimension wouldn't be difficult: public Span<int> Get(int dim1, int dim2)
{
return MyArray.Slice(dim1 * 5 * 6 + dim2 * 6, 6);
} Getting higher dimensions might be done by encoding the dimension sizes in the type. C# can't do anything like Pseudocode example: public ref struct Array4_5_6
{
private Span<int> _span;
public Array5_6 GetDim2(int index)
{
return new Array5_6(_span.Slice(index * 5 * 6, 5 * 6));
}
public Span<int> GetDim3(int dim1, int dim2)
{
return _span.Slice(dim1 * 5 * 6 + dim2 * 6, 6);
}
}
public ref struct Array5_6
{
private Span<int> _span;
public Span<int> GetDim2(int index)
{
return _span.Slice(index * 6, 6);
}
public int Get(int dim1, int dim2)
{
return _span[dim1 * 6 + dim2];
}
} |
The name `[RefEscapes]` was causing a lot of confusion in the discussions. People were mis-interpretting it as a general feature that could apply to other constructs when in reality it's a very specific and targetted feature. Renaming to `[ThisRefEscapes]` to make it clear this is a very targetted feature.
Based on feedback from several people I've decided that we need to make the rules for `ref` fields more flexible when the values being passed around are known to refer to the heap. In those cases there is simply no reason to restrict the created `ref struct` as it is always *ref-safe-to-escape* to the enclosing method as well as not requiring any adjustment to the method invocation rules. This does mean that we need to add the notion of "refers to the heap" to the span safety document. That doesn't change any assumptions there. It's just that before this change the notion of "refers to the heap" wasn't necessary to describe our safety rules.
I had not considered multi-dimensional array support here. The idea of this proposal was essentially to extend the existing support for fixed sized buffers to include supporting all element types and not just unmanaged types. Multi-dimensional support is essentially expanding the scope of the original feature quite a bit. Given that it's existed for the entirety of C# and there hasn't been a big push to add this support I'm fairly skeptical that it's worth the extra complexity. Added it to the open issue section though so we can discuss in LDM when we discuss the rest of the feature (this specific section hasn't been covered yet). |
Thanks for the feedback everyone! Have updated the proposal based on various discussions here. Particularly:
|
Thanks again for the feedback everyone. At this point the initial design of Merging now because I want to make progress on the |
@jaredpar is there any championed issue for this proposal? Thanks |
There are a few. #1147 is ref fields. |
Would it be useful to have a single one for this proposal? Are there any for returning a field of a struct by ref? |
I've been using #1147 as the primary championed issue here. |
@jaredpar is something of this possible by current (.NET 6.0 version, VS2022) ?Would be nice to hear :) |
This design covers adding the following features to C# for low level
struct performance improvements:
ref struct
to containref
fields.struct
andref struct
to returnref
to their fields.Note: the section on providing parameter escape notations is incomplete
at this time. I will send a PR in the future once I finalize this
portion. Feel free to still leave comments though.