Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid using Span<T> on TFMs older than net45 #117

Closed
jnm2 opened this issue Feb 17, 2021 · 72 comments · Fixed by #140
Closed

Avoid using Span<T> on TFMs older than net45 #117

jnm2 opened this issue Feb 17, 2021 · 72 comments · Fixed by #140
Assignees

Comments

@jnm2
Copy link
Contributor

jnm2 commented Feb 17, 2021

This causes the project to fail to build for any target frameworks.

For example, I'm targeting a library to net35;net48;net5.0-windows.

@AArnott AArnott changed the title BITMAPINFO.g.cs uses the Span<T> type on target frameworks where it does not exist Avoid using Span<T> on TFMs older than net45 Feb 17, 2021
@AArnott
Copy link
Member

AArnott commented Feb 17, 2021

Only net35 is a problem in this case, since all the others you list have the type available.
I don't know that we'll support this. If we can trivially replace all uses of Span<T> with ArraySegment<T> or perhaps T[] then maybe. But we have a lot of work to do for the more current frameworks that will have to be prioritized before exhaustive net35 support.

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 18, 2021

This is the generated code. Would it be acceptable to omit the bmiColors member if compilation.GetTypeByMetadataName(typeof(Span<T>).FullName)) is null?

internal partial struct BITMAPINFO
{
    internal BITMAPINFOHEADER bmiHeader;
    
    internal unsafe Span<RGBQUAD> bmiColors
    {
        get
        {
            fixed (void *p = &this.__bmiColors)
                return new Span<RGBQUAD>(p, 1);
        }
    }

    [StructLayout(LayoutKind.Sequential, Pack = 1)]
    private struct __bmiColors_1
    {
        private RGBQUAD _1;
    }

    [DebuggerBrowsable(DebuggerBrowsableState.Never)]
    private __bmiColors_1 __bmiColors;
}

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 18, 2021

Separate issue, but I'm pretty sure this member is unusable because the number of elements is wrong:

The number of entries in the array depends on the values of the biBitCount and biClrUsed members of the BITMAPINFOHEADER structure.

(https://docs.microsoft.com/en-us/windows/win32/api/wingdi/ns-wingdi-bitmapinfo)

@AArnott
Copy link
Member

AArnott commented Feb 19, 2021

The number of elements isn't of concern. It is the number and order of fields, and the generated struct has just two, in the same order as the two documented, so we're ok there.

And no, simply dropping Span<T> members alone would be insufficient. For one thing, the field it wraps is still private so you'd need a way to access it. Secondly, Span<T> appears in method signatures as well.

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 19, 2021

In this specific case, I'm having trouble understanding how the one-element span is usable even if you do access it. The docs I quoted seem to say the number of elements is not actually 1. What is a usage example? Without that, I have a hard time knowing what a pre-net45 equivalent would be.

@AArnott
Copy link
Member

AArnott commented Feb 19, 2021

Oh, I think I misunderstood your prior comment. sorry.
Now that I'm looking more at this struct, I see it's not a fixed size array at all. It's just helping C code find the start of an array whose length is really specified by the BITMAPINFOHEADER struct. Goo. In this case the metadata is wrong. I filed microsoft/win32metadata#266 to track that.

For a workaround, you can just define your own struct with a RGBQUAD field in place of the span.
If you're initializing the memory yourself, you'll have to allocate enough memory for the struct plus any additional "array" elements. When accessing this "array" you'll have to use pointers. Something like:

RGBQUAD* pFirst = &bitmapInfo.bmiColors`
pFirst[0] = something;
pFirst[1] = else`

It'll be on you to make sure you don't write to more memory than you allocated.

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 19, 2021

So from the perspective of whether Span<T> should be used in this particular struct, would you agree that a span should not be used and that the only valid way to use the bmiColors field is to use pointers? (In all the years that I've used BITMAPINFO, I've only used it with CreateDIBSection and always defined it to only have the header field.)

That will be an answer that informs my original question, which is what code should be generated on net35 instead of the current code. If it turns out that Span<T> shouldn't even be used in a scenario like this when it is available, that makes BITMAPINFO a non-example for the purposes of this thread.

Possibly if the metadata is sophisticated enough, a span-returning member could still be exposed which reads the header fields to determine what length the color table should be. But it won't know how much memory was actually allocated.

@AArnott
Copy link
Member

AArnott commented Feb 20, 2021

Possibly if the metadata is sophisticated enough, a span-returning member could still be exposed which reads the header fields to determine what length the color table should be.

Yes, that's exactly what I was thinking of. And by the time someone has initialized the header's count field, the memory should have already been allocated.

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 20, 2021

Cool, so if we tentatively assume that there will still be a span-returning internal property and a private field on newer frameworks, what would the ideal codegen be when Span<T> is not available? Internal property returning a pointer?

internal partial struct BITMAPINFO
{
    internal BITMAPINFOHEADER bmiHeader;

#if NET35
    internal unsafe RGBQUAD* bmiColors => &this.__bmiColors;
#else
    internal unsafe Span<RGBQUAD> bmiColors
    {
        get
        {
            fixed (void *p = &this.__bmiColors)
                return new Span<RGBQUAD>(p, 1);
        }
    }
#endif

    [StructLayout(LayoutKind.Sequential, Pack = 1)]
    private struct __bmiColors_1
    {
        private RGBQUAD _1;
    }

    [DebuggerBrowsable(DebuggerBrowsableState.Never)]
    private __bmiColors_1 __bmiColors;
}

It would probably be better to generate the lowest-common-denominator code for all frameworks you're targeting so that you don't have to use preprocessor in the code that consumes it though.

internal partial struct BITMAPINFO
{
    internal BITMAPINFOHEADER bmiHeader;

    internal unsafe RGBQUAD* bmiColors => &this.__bmiColors;

    [StructLayout(LayoutKind.Sequential, Pack = 1)]
    private struct __bmiColors_1
    {
        private RGBQUAD _1;
    }

    [DebuggerBrowsable(DebuggerBrowsableState.Never)]
    private __bmiColors_1 __bmiColors;
}

@AArnott
Copy link
Member

AArnott commented Feb 20, 2021

A property returning a pointer to another field doesn't go down well, since the CLR cannot guarantee the caller has already pinned the struct. So we would likely do something like the metadata does today: expose the first element and leave it to the caller to get a pointer to it.

@AArnott
Copy link
Member

AArnott commented Feb 20, 2021

We wouldn't need to use #if, since we generate code per compilation and know what its target framework is.

Does your last code snippet even compile? I'm surprised C# would let you return a pointer to your own field like that.

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 20, 2021

We wouldn't need to use #if, since we generate code per compilation and know what its target framework is.

How nice is that from the standpoint of <EmbedUntrackedSources>true</EmbedUntrackedSources> or stepping into the code with the debugger, though?

And regardless of that, it would be annoying to have to use #if to use the member, if it varies between Span<RGBQUAD> and RGBQUAD* between your target frameworks. So generating the lowest common denominator seems better, doesn't it?

Does your last code snippet even compile? I'm surprised C# would let you return a pointer to your own field like that.

No, you're right. It would have to be closer to what you have already, just without the new Span part:

    internal unsafe RGBQUAD* bmiColors
    {
        get
        {
            fixed (RGBQUAD* p = &__bmiColors)
                return p;
        }
    }

A property returning a pointer to another field doesn't go down well, since the CLR cannot guarantee the caller has already pinned the struct.

The current span-based code has this same problem, right? Just so I'm understanding.

So we would likely do something like the metadata does today: expose the first element and leave it to the caller to get a pointer to it.

If I'm following this part correctly, you're saying all target frameworks should do this, regardless of whether they have Span<T>:

internal partial struct BITMAPINFO
{
    internal BITMAPINFOHEADER bmiHeader;
    internal RGBQUAD bmiColors;
}

@AArnott
Copy link
Member

AArnott commented Feb 20, 2021

So generating the lowest common denominator seems better, doesn't it?

IMO, not if the lowest common denominator we're talking about is pre-net45 days, particularly if net45+ affords us the opportunity to greatly improve the API and/or improve runtime performance that we'd lose by limiting ourselves to very old APIs and language constructs.

Now, if we can support net35 and simply add members for later TFMs that improve accessibility, that's certainly worth exploring. But in general, we haven't even made a commitment to support prior to net45 nor C# language versions before 9. The fact that they sometimes work is convenient only at this point.

How nice is that from the standpoint of <EmbedUntrackedSources>true</EmbedUntrackedSources> or stepping into the code with the debugger, though?

Just fine, I expect. Whether the generated code is the same or different between TFMs, the code the debugger brings up will match the TFM.

If I'm following this part correctly, you're saying all target frameworks should do this, regardless ...

Yes, I think so. But I'd like to verify that before I say definitely.

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 20, 2021

I'm only asking for lowest-common-denominator support among the target frameworks listed in the project. You'd still get Span<T> members in your other target frameworks as soon as you remove net35 from the list of target frameworks.

The alternatives to this are either forcing the use of preprocessor on the project any time it touches that member, because the type of the member varies across target frameworks, or effectively blocking net35 entirely.

@AArnott
Copy link
Member

AArnott commented Feb 20, 2021

I see another alternative: define members that work on net35, and add members when targeting newer frameworks. Never use #if in the generated code, and always generate the best for a given TFM.
The means the consumer can choose whether to write code against the common API or against the API that is unique to the newer ones, just like you can for any other BCL API.

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 20, 2021

Nice, I like that even better! The starting point would then be to generate all code without Span<T>, then add that back in later. What would you do in cases where a struct field is currently exposed as a span property?

@AArnott
Copy link
Member

AArnott commented Feb 22, 2021

The starting point would then be to generate all code without Span<T>, then add that back in later.

If net35 was the primary target, I would agree. But since we're prioritizing getting accurate, complete API coverage that targets where most codebases are at now, that isn't where we started, nor where we're going in the short term. We have lots of work to do to help the masses get off the ground such as define more enums, fix the string parameters that show up as ushort*, etc.

Once we get the basics going for the majority of folks, we can circle back and fill in for the more rare cases like net35 targeting. I estimate that's a month or two out, at least. And when we circle back, we won't take away the goodness that net45+ has, but likely just expose the private field that's already there.

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 22, 2021

I was going to offer to implement. Is now a bad time?

@AArnott
Copy link
Member

AArnott commented Feb 22, 2021

Thank you. Please wait for #121 to close as that is somewhat disruptive and I don't want anyone to have to deal with merge conflicts. I am working on it now.

When you do it:

  1. please avoid unnecessary refactorings.
  2. I'm not sure that the compiler (and thus the source generator) actually knows the TFM itself, so consider searching the Compilation for the Span<T> symbol to see whether to generate those members. I want to avoid generating #if sections.
  3. Please keep the Span member with the current naming (the one that matches the docs). At least for now, although I might bend on that point. Perhaps you can give the field a name that matches the docs with an added _Ptr suffix or something like that. What are your thoughts?
  4. I assume based on the conversation here that you're focus is on structs. Let's keep it that way for now (i.e. don't change how Span<T> friendly method overloads are generated at this point).

Sound good?

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 22, 2021

Yes, thanks! All points sound good. On 2, that is my understanding also. I'm used to using compilation.GetTypeByMetadataName to decide what to do. (I've contributed to Roslyn analyzers a fair number of times.) Generating preprocessor should not be necessary.

On 3, the suffix idea sounds good. Underscores aren't my favorite, and neither are contractions of words, but it's up to you. What do you think about the suffix Raw or Pointer, or even Field since being a field is a significant thing? Being a field means you can take a ref to it, and I imagine that a span-returning property would never be a ref-returning property. So raw field access could sometimes be what you are looking for.

On 4, I'm good with that but who knows how quickly I (or someone else) will be back asking for Span<T> overloads to be omitted. It all depends on what APIs I might start using in certain projects in the future.

@AArnott
Copy link
Member

AArnott commented Feb 22, 2021

who knows how quickly I (or someone else) will be back asking for Span<T> overloads to be omitted.

Agreed. Friendly overloads are going to be in serious flux to fix #26 and I don't want them any more complicated than they are already. There's also a good chance we can avoid use of span where necessary as part of that work anyway.

All those suffixes are reasonable. I'm not sure which is best, but I'm leaning against field since most members on the structs are fields anyway, so I'd rather we distinguish based on why this is present and possibly redundant with something else. I'm not a fan of underscores in general either, but we want something that is not expected to ever collide with other members in the struct, and ideally be distinguishable as something out of the ordinary in order to attract attention (whether to draw use or warn away, depending on what the user needs). IMO Raw or Pointer meet those goals.

As for the ref use, I think a property that returns a Span<T> can in fact be used to obtain a ref to an individual element within the span.

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 22, 2021

I like your reasoning and prefer Raw simply using length as a tiebreaker. Is _Raw then what you would want?

As for the ref use, I think a property that returns a Span<T> can in fact be used to obtain a ref to an individual element within the span.

I meant a ref to the field itself which could be used to set the pointer to some other pointer. in/out/ref. E.g.:

Foo(out someStruct.SomePointer);
// Foo has now assigned a different pointer to the field someStruct.SomePointer

Or:

someStruct.SomePointer = otherPointer;

The point being, if you're looking for something to assign or take a ref (in/ref/out) to, it likely won't be a span-returning property. Because the span-returning property would have to be ref-returning itself, and I don't see how that could be implemented without having an additional span field in the struct which would mess up the layout and certainly not update the pointer field. Maybe there's an Unsafe method that converts a ref to a pointer into a ref to a span though, not sure.

...so, in a (rare?) scenario like this, you'll be specifically expecting and looking for raw field access. That's all.

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 22, 2021

@tannergooding noticed that this code is unsafe because the span is assumed to have a lifetime greater than this:

internal unsafe Span<RGBQUAD> bmiColors
{
    get
    {
        fixed (void *p = &this.__bmiColors)
            return new Span<RGBQUAD>(p, 1);
    }
}

You could then end up with a pointer to invalid memory with code like:

var x = new BITMAPINFO();
return x.bmiColors;

This makes me think that the best thing for now is to drop span properties altogether.

@tannergooding
Copy link
Member

From a functionality and usability perspective, my opinion is that the correct thing to do here is to not expose properties for most things and to instead just expose the raw fields publicly.

The type definition in um/wingdi.h is as follows:

typedef struct tagBITMAPINFO {
    BITMAPINFOHEADER    bmiHeader;
    RGBQUAD             bmiColors[1];
} BITMAPINFO, FAR *LPBITMAPINFO, *PBITMAPINFO;

This means that, from a metadata perspective, there is nothing to indicate that bmiColors is anything other than length == 1. However, there is no code (AFAIK) in the runtime where the given length is 1 and it is actually 1.
This also means that [StructLayout(LayoutKind.Sequential, Pack = 1)] is unnecessary as the underlying header does not modify the packing of bmiColors[1]. They are naturally packed, the same as any array containing a T.

win32metadata is itself generated from ClangSharp but turns off several of the helper features I added around spans and ref returns since they are unimportant from the metadata perspective and are instead only relevant from the CsWin32 level perspective.

If you use ClangSharp with all the optional helpers, you will get something similar to:

public partial struct BITMAPINFO
{
    public BITMAPINFOHEADER bmiHeader;

    [NativeTypeName("RGBQUAD [1]")]
    public _bmiColors_e__FixedBuffer bmiColors;

    public partial struct _bmiColors_e__FixedBuffer
    {
        public RGBQUAD e0;

        public ref RGBQUAD this[int index]
        {
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            get
            {
                return ref AsSpan(int.MaxValue)[index];
            }
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public Span<RGBQUAD> AsSpan(int length) => MemoryMarshal.CreateSpan(ref e0, length);
    }
}

See also: https://source.terrafx.dev/#TerraFX.Interop.Windows/um/wingdi/BITMAPINFO.cs,0a6f50687ac11b42, https://github.com/terrafx/terrafx.interop.windows/blob/main/sources/Interop/Windows/um/wingdi/BITMAPINFO.cs

The key points about this codegen are that:

  1. The bindings are 1-to-1 with the underlying definition and therefore it is blittable
  2. The fields are all publicly exposed allowing near 1-to-1 porting and workarounds at any level
  3. The helpers are all unsafe and keep the onus of correctness on the consumer

I think point 1 is fairly self explanatory. It avoids the need to convert the struct when going from managed to native or vice-versa.

For 2, this allows the user to directly access the field meaning there is no abstraction on top of it.
This namely means they can take the address of said field and so &bitmapinfo.bmiHeader is valid as is &bitmapInfo.bmiColors.e0 or say ((uint*)&bitmapInfo.bmiColors)[5]. This also means that when nested structs are involved, you can just do bitmapInfo.bmiHeader.biSizeor&bitmapInfo.bmiHeader.biPlanes`, you don't have a property which prevents this or which requires additional copies before assignment.

For 3, this is partially a limitation of the language.
Structs are considered unpinned by default because you may have myClass._bitmapInfo.SomeProperty which means this is on the heap. This means you must use fixed to get a pointer from within the context of a property and likewise means any pointer returned might be invalid as soon as the fixed scope has been left.
ref returns (and returns of ref struct) do not currently support the relevant semantics to be able to return a ref to a struct field from within the struct itself. C# 10 "might" be getting some improvements in that area via https://github.com/dotnet/csharplang/blob/master/proposals/low-level-struct-improvements.md, but that will be C# 10+ only.
Without that, the only safe thing to do is to return a ref by utilizing one or more of System.Runtime.CompilerServices.Unsafe, System.Runtime.InteropServices.MemoryMarshal, and using fixed but converting to a ref and returning that while the pointer is still pinned.

The sample I gave above is assuming .NET Core and is utilizing MemoryMarshal to return a span. The C# language rules for that span will allow it to outlive the actual bitmapInfo and so users either need to be aware of this or some kind of analyzer needs to be provided to help maintain the correct scoping. I opted for this approach because it should be the easiest to convert once C# 10 (or later, depending on when it goes in) gets the better scoping support.
If this were to be done on full framework, you can't have the AsSpan method but you can still have the this indexer because you can do something similar to:

public unsafe ref RGBQUAD this[int index]
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    get
    {
        fixed (RGBQUAD* p0 = &e0)
        {
            return ref *(p0 + index);
        }
    }
}

The remaining downside to this it that it doesn't correctly track that the length is actually based on bitmapHeader.biBitCount. You could add this logic, but like with most of the variable-sized buffer, the ability to do this depends on information which isn't available in the C header files or easily extractable from the documentation.

Likewise, the entirety of a variable-sized buffer working is dependent on the containing type (BITMAPINFO) never itself being passed or handled by value. It must always be passed and handled by reference (given this is interop code, this likely means by pointer).
This is important because as far as the type system is concerned (and the same goes for C/C++) the actual struct only contains a single RGBQUAD. The type system has no understanding that the actual struct extends for bitmapHeader.biBitCount bits nor that RGBQUAD may actually be an inline array of ushort

@AArnott
Copy link
Member

AArnott commented Feb 22, 2021

Thanks, @tannergooding.

I think I can adapt to what you're advocating for. I can see how it works well for fixed length arrays and how it is an improvement on CsWin32's current codegen, which is:

internal partial struct MainAVIHeader
{
    internal uint dwMicroSecPerFrame;
    internal uint dwMaxBytesPerSec;
    internal uint dwPaddingGranularity;
    internal uint dwFlags;
    internal uint dwTotalFrames;
    internal uint dwInitialFrames;
    internal uint dwStreams;
    internal uint dwSuggestedBufferSize;
    internal uint dwWidth;
    internal uint dwHeight;
    internal unsafe Span<uint> dwReserved
    {
        get
        {
            fixed (void *p = &this.__dwReserved)
                return new Span<uint>(p, 4);
        }
    }

    [StructLayout(LayoutKind.Sequential, Pack = 1)]
    private struct __dwReserved_4
    {
        private uint _1, _2, _3, _4;
    }

    [DebuggerBrowsable(DebuggerBrowsableState.Never)]
    private __dwReserved_4 __dwReserved;
}

The case @jnm2 and I have spent most of this thread discussing however is one of variable length arrays where the struct only serves as a sort of header. For such, being on the GC heap is invalid, since the GC moving it around would never move the memory after it. What if we define such structs as ref struct so they are not allowed syntactically by C# to be on the managed heap?
If we did that, how do you feel about returning a span from a property on the struct?

@tannergooding
Copy link
Member

What if we define such structs as ref struct so they are not allowed syntactically by C# to be on the managed heap?

I think it might be better to annotate them somehow and to construct an analyzer to help catch misuse.
In particular, that would prevent it from being used as a field in a class, but it would not prevent the equally invalid case of var x = new BITMAPINFO(); or BITMAPINFO x = *bitmapInfo;.
Likewise, ref struct will prevent it from being used with generics, which means cases such as Unsafe.AsRef, Unsafe.SizeOf, Unsafe.SkipInit and friends will all become blocked for these types.

If we did that, how do you feel about returning a span from a property on the struct?

I return a span from all of my own structs so I think it's fine. That being said....

The main "issue" with returning a span referencing a field on the struct is that it is "unsafe". That is, it's the case @jnm2 called out above: #117 (comment)

If you have something like:

internal unsafe Span<RGBQUAD> bmiColors
{
    get
    {
        fixed (void *p = &this.__bmiColors)
            return new Span<RGBQUAD>(p, properLength);
    }
}

You can still do:

var bitmapInfo = new BITMAPINFO();
Span<RGBQUAD> result = bitmapInfo.bmiColors;
return result;

and the lifetime of the result has now outlived x. The same issue exists with pointers if you end up calling the relevant close/free command on the BITMAPINFO* without first ensuring that the Span<RGBQUAD> is still unused. It's slightly less problematic though since it isn't an implicit step for freeing to occur.

@AArnott
Copy link
Member

AArnott commented Feb 22, 2021

I think it might be better to annotate them somehow and to construct an analyzer to help catch misuse.

That's a very interesting idea.

Likewise, ref struct will prevent it from being used with generics

Good point.

In your case of:

public Span<RGBQUAD> AsSpan(int length) => MemoryMarshal.CreateSpan(ref e0, length);

what prevents the result from that being kept by the caller beyond the lifespan of the struct itself? How is that better?

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 24, 2021

@tannergooding
Copy link
Member

tannergooding commented Feb 24, 2021

I know that ref Unsafe.AsRef<RGBQUAD>(p0 + index); is safe, this modifiers a pointer and directly reinterprets that pointer as a ref.

The concern was only with return ref *(p0 + index);, which I'm fairly certain is fine (particularly in the context of p0 being from a fixed statement).
But someone like JaredPar or someone from the runtime team might be able to confirm there are no edge cases like with say *value = new T() where, depending on where value is coming from, it may result in a hidden copy before assignment (rather than directly passing value as this)

@tannergooding
Copy link
Member

I think the thing that makes return ref _1[index]; always fine is that:

  1. It is to a local pointer (it is local because of how fixed works)
  2. The language has scoping rules for ref returns and blocks a ref return to a local, so it can't return a ref to a hidden copy without erroring

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 24, 2021

but there could always be some edge case I'm not remembering and that ref T Unsafe.AsRef<T>(void*) exists to cover

On that point, its existence does allow you to use managed types for T, which I don't think you can do with the other two syntaxes because you'd need impossible pointer types. Maybe there was no other reason for it to exist.

@AArnott
Copy link
Member

AArnott commented Feb 24, 2021

Do you want bounds checking?

Sure. Let's start with that. We can go with your inclination as far as syntax, which sounds to be the shorter one, and check with JaredPar later on the end result. Can you include a sample of the generated code in the PR description and then we can invite him to review it.

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 24, 2021

@AArnott I implemented everything except the bounds checks, and I can't keep my eyes open so I'll look in tomorrow. I think you should be able to ask Jared to review sans the bounds checks, right? I updated the PR body with the code taken from the actual test.

@AArnott
Copy link
Member

AArnott commented Feb 24, 2021

Yes, thanks!

@AArnott
Copy link
Member

AArnott commented Feb 24, 2021

just noticed this, which is exciting: Safe fixed-size buffers

@jaredpar
Copy link
Member

But someone like JaredPar or someone from the runtime team might be able to confirm there are no edge cases like with say *value = new T() where, depending on where value is coming from, it may result in a hidden copy before assignment (rather than directly passing value as this)

The case I'm more worried about is the GC holes that this creates. Consider the following:

void M() {
  record R(__dwReserved_4 Prop); 
  var r = new R(default);
  ref r2 = ref r[0]; 
  Other(ref r2); 
}

This is unsafe no matter how you implement the indexer.

@tannergooding
Copy link
Member

tannergooding commented Feb 24, 2021

The case I'm more worried about is the GC holes that this creates

There should be no GC hole because you never have a pointer to unpinned memory, you only have a ref which will be correctly tracked by the GC.

In your scenario, the issue is that record doesn't expose the field and so due to a hidden copy you end up referring to a copy of Prop that is creating in the this indexer you inevitably have to expose for the code to be valid.
It's definitely a consideration for invalid code but is no more fundamentally broken than the following. That is, it leaks stack memory (and should hopefully be solved with the low level struct improvements):

var x = new BITMAPINFO();
return x.bmiColors;

https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA+BLAdgFxlLAQwBsACAVywGdCAzGUqnKcsHUgfQ4BMB3AJRhV8ANxjcOKALAAoAN6zSS0gAEAzBWzsOARgA0nAEwGOakygDcsxcvWlYtTblI4AFhioBtLaWzcYCAC6NkoKMsoRpADmMDghkWGRSaS0GAjipAAU5FoAVKQADjqkALykAGS6AJTxyaSJdckqAOz2MI5F3lj+QVbhjUoAvrXKw/1DsmOysJBQ3KT8mVx8gsJQYhIopAAKUBAFVfXxdg5O7G4eXex+AcHjR/cRMXGPyg0DAPQfO3sFvlSkQiFX74HAAT0YEBcrnwDA8gNI7m4/iwpEgBTBIySX0RsJcULcDCYhDAAGsDG5yADCb5ugF8KpmlQsZEcUCHLCsGAGDgCTDGDgSaSUtBSAADZ4cACSdIQYpZERabUcu32XR6gT6dTGETGUxkdhUhlIAGFZO9gRgRIQ8JweAIhKJxJIfvstcpjhpTjlnABxWKCWiZHw3BA1e4WiI4gAq7gBrggvBgYig/xS2BgBmAMDAhCpcJwAHIAUDuBgZi86jiObAuTyoUDUjBiPMII4gUaAHQspWnVUFTu6d0TGSDIA==

@AArnott
Copy link
Member

AArnott commented Feb 24, 2021

@jaredpar suggested on another thread that we declare extension methods to define a workaround indexer:

public static ref int Indexer(this in _theStruct s) { 
  return // indexer magic here 
}

The usage pattern then becomes:

myStruct.InlineArrayField.Indexer(3) = unmanagedValue;
// which is equivalent to:
myStruct.InlineArrayField._3 = unmanagedValue;

At least until this is delivered on, which may or may not work on downlevel runtimes.

@AArnott
Copy link
Member

AArnott commented Feb 24, 2021

But I'd love for @tannergooding and @jaredpar go at it here till they agree, just in case we can do something better.

@tannergooding
Copy link
Member

I think @jaredpar's proposal for using an external indexer is fine. It should allow the compiler to correctly track the lifetime and help avoid bugs until the nicer feature can get added.
I think it needs to be this ref _theStruct s though, since in should be restricting it to only return a ref readonly int...

I was just disagreeing about whether or not the issue was a GC hole vs a potential leak to stack memory (which are both undesirable, but I think one is a bit worse 😄)

@AArnott
Copy link
Member

AArnott commented Feb 24, 2021

@jnm2 How are you feeling at this point? I don't want to abuse your willingness to contribute, and can take over your PR if you would rather not make this design change. But if you want to continue, you're welcome to. It sounds like we'll need to collect extension method declarations into a new collection and emit them into some static class. Maybe named PInvokeExtensions?

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 24, 2021

@AArnott I'm never sorry if someone takes over and frees me up to work on something else 😁 Bit busy at the moment, so I really ought to say go ahead and take over the PR, but I'll keep it moving if it's still there when I come back to it.

@hangy
Copy link

hangy commented Feb 24, 2021

PCWSTR and PWSTR seem to suffer from the same issue:

// internal Span<char> AsSpan() => this.Value is null ? default : new Span<char>(this.Value, this.Length);

If you use this code in a .NET 4.x program, you also need to reference the System.Memory package to get Span<T> and ReadOnlySpan<T>, even if you don't need it. Could one alternative to adding the package might be to surround the AsSpan() method with some #ifdef?

It's a bit weird having to add and distribute the System.Memory package DLLs even if you don't use the Span<T>.

@jaredpar
Copy link
Member

@tannergooding

There should be no GC hole because you never have a pointer to unpinned memory, you only have a ref which will be correctly tracked by the GC.

Yes you're right. I had that example wrong.

There is still a hole here. The returned ref isn't tracked by the compiler and can be freely returned from the method. If it points to a struct on the stack then that is a problem.

@tannergooding
Copy link
Member

There is still a hole here. The returned ref isn't tracked by the compiler and can be freely returned from the method. If it points to a struct on the stack then that is a problem.

Right, its a stack hole rather than a GC hole; which is still bad; just not corrupt the GC bad 😄

@AArnott
Copy link
Member

AArnott commented Feb 24, 2021

@hangy: We want to generate the best APIs available given your compiler version and target framework by default, and adding a System.Memory transitive dependency in our nuget package (when the TFM is at least net45) is how we can accomplish that.

For those who don't want to ship System.Memory with their app, you can always add this to your project file:

<PackageReference Include="System.Memory" Version="4.5.4" IncludeAssets="none" PrivateAssets="all" />

This should prevent the assembly from being referenced or copied to your bin folder.
So long as the generated code doesn't need a Span<T> type, you'll be fine. If you ask for an API that needs it though, you'll get a compile error till you allow the assembly to be referenced. @jnm2's goal with this issue is to make the codegen smarter to suppress references to span when the assembly reference is missing, so that may help support you as well.

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 24, 2021

I was going to ask for that next. :-} I'd rather not have System.Memory added by default and have to opt out. I would want to maintain the package reference and version on my own, not tied to the source generator.

Opt-out is better than nothing, but it's also convoluted and not obvious that the source generator considers the dependency to be optional.

@AArnott
Copy link
Member

AArnott commented Feb 24, 2021

I would want to maintain the package reference and version on my own

To be clear, you can set the version yourself, and you can suppress it.

not obvious that the source generator considers the dependency to be optional.

It doesn't consider it to be optional. If you suppress it, some of our codegen will produce compile errors in your code.

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 24, 2021

I would like to ask for it to be optional. That's for another thread perhaps.

@AArnott
Copy link
Member

AArnott commented Feb 24, 2021

@jnm2 Modifying all the codegen to accommodate a missing System.Memory reference sounds more broadly scoped than this issue, which has now had several conversation forks, etc. So ya, I suggest we get this one taken care of and then if you hit a case where other System.Memory dependencies show up where you don't want, you can open an issue that expresses the high level goal and gives a concrete example of where you can't meet it.
Again, I'm not sure how or if we'll fold that into what we officially support, as we're still pretty early and a lot is changing, but it's a reasonable request to consider.

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 24, 2021

I'll be the workhorse if that makes it more likely to happen. It comes down to my consuming project being a library vs being an app, and not wanting the net4.8 target to drag along a slow span dependency just to create internal helpers that I will not likely even have a reason to use. Feels wasteful. I'll do as you suggest and follow up after this issue is closed.

@AArnott
Copy link
Member

AArnott commented Feb 25, 2021

@jnm2 I will take on finishing your current PR, since we have a defect right now in how we do inline arrays I am eager to fix it and it'll close this issue at the same time.

@jnm2
Copy link
Contributor Author

jnm2 commented Feb 25, 2021

@AArnott Sounds good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants