Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Workaround users having strongly typed HRESULTs on instance methods such as PreserveSig'd COM members #23955

Conversation

jkoritzinsky
Copy link
Member

Some of our users (such as WPF), use a struct to wrap their HRESULT return types on COM members when they use PreserveSig. When we updated CoreCLR to correctly handle the Windows calling convention, we broke this behavior.

This PR adds a workaround to marshal structs that have only one member that is a (struct that has only one member of a...) primitive type as though it was the primitive type.

Add tests verifying behavior.

cc: @jeffschwMSFT @davidwrighton

…uch as PreserveSig'd COM members. Add tests verifying behavior.
@@ -618,6 +618,41 @@ class ILMarshaler
#else
byrefNativeReturn = true;
#endif
// We need to adjust the treatment of a struct with one field to be the type of the field
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we going to provide some way for the user to indicate they want this marshaled correctly?

Is this only violated for the COM case or also for the explicit CallingConvention.ThisCall case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s for both COM and ThisCall. There’s a pretty easy way to disable this workaround. Make your strict explicit and make a second field of the same type as the first field at the same offset. It’ll match the layout of the original type and opt-out of this workaround.

The test updates in this PR have 2 examples of this workaround.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we going to provide some way for the user to indicate they want this marshaled correctly?

At this point probably not.

Is this only violated for the COM case or also for the explicit CallingConvention.ThisCall case?

This was discovered in a COM scenario, but would impact the CallingConvention.ThisCall case as well.

Copy link
Member

@tannergooding tannergooding Apr 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make your strict explicit and make a second field of the same type as the first field at the same offset.

Isnt this more expensive for a number of cases? IIRC, we cant optimize this as well and it will cause both fields to be copied/handled in some scenarios.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s only more expensive (at least in the interop case) if your structure has one field that is (after unwrapping the single-field structs) a non-blittable primitive. So only structs that have a single field of type bool or an ANSI char.

@tannergooding
Copy link
Member

Have we considered a compat switch for this in particular? It seems like it would be goodness to push people towards doing the correct thing here rather than permanently hardcoding ourselves to do the wrong thing.

CC. @jkotas

@AaronRobinsonMSFT
Copy link
Member

Have we considered a compat switch for this in particular?

At this point, we aren't going to provide a quirk style switch. Our plan is potentially going to involve creating an attribute to alter behavior but that is still TBD. More than likely this will require a spec and an official issue.

@jkotas
Copy link
Member

jkotas commented Apr 13, 2019

This break makes me worried. I think we should consider undoing the fix for COM and leave it broken as it always was. I do not think that there is net-positive value in trying to fix 15 year old bugs in build-in COM that people depend on.

@jkotas
Copy link
Member

jkotas commented Apr 13, 2019

people towards doing the correct thing here rather than permanently hardcoding ourselves to do the wrong thing.

The build-in COM support does what it does. It has number of bugs or designs that do not quite make sense. It is incredibly hard to evolve it or fix anything in it.

I think we should leave it alone, and instead focus our energy on enabling interop solutions that are not built into the runtime, for example https://github.com/SharpGenTools/SharpGenTools.

@jkotas
Copy link
Member

jkotas commented Apr 13, 2019

Is this only violated for the COM case or also for the explicit CallingConvention.ThisCall case

I would keep the fix for non-COM ThisCall. Non-COM ThisCall is used a lot less frequently than COM. I expect that the breakage from it is going to be a lot less severe.

@AaronRobinsonMSFT
Copy link
Member

I do not think that there is net-positive value in trying to fix 15 year old bugs in build-in COM that people depend on.

@jkotas I agree this has the potential to be annoying and we need to monitor it carefully. The general philosophy has been to not guarantee bug for bug compat and as such I am inclined to see where this takes us. I hear what you are saying about impacting existing customers but as we have spoken about before there is very little possibility that COM code coming from .NET Framework to .NET Core will "just work" without some modifications. Documenting these kind of changes through guidance documents for such a transitions seems to be a reasonable cost.

@jkotas
Copy link
Member

jkotas commented Apr 13, 2019

.NET Framework to .NET Core will "just work" without some modifications

I agree that the modifications are likely required around activation type of things by design. This is not that.

With this PR, .NET Framework will have one incorrect behavior and .NET Core will have a different incorrect behavior. I think it would be better for both .NET Framework and .NET Core to have same incorrect behavior. It is much easier to explain.

@AaronRobinsonMSFT
Copy link
Member

.NET Core will have a different incorrect behavior

I agree that .NET Framework will have incorrect but accepted behavior. In the .NET Core scenario the behavior will be entirely correct except for a minor caveat for what we believe to be a not uncommon pattern - we only have WPF at the moment though. The proposal here was to accept we can get .NET Core almost perfectly correct except for a single scenario - which we would be able to alter/skip with a potential attribute or document as a temporary quirk in .NET 3.0. Post 3.0 we have the opportunity to make it 100 % correct with no caveats.

@jkoritzinsky What is the cost to revert the semantics only for COM scenarios to those found in .NET Framework?

@AaronRobinsonMSFT
Copy link
Member

I agree that the modifications are likely required around activation type of things by design. This is not that.

There may be other modifications required other than activation. Particularly around class interface scenarios. That is my general point here that changes will more than likely need to happen regardless of what we do.

@jkotas
Copy link
Member

jkotas commented Apr 13, 2019

Particularly around class interface scenarios.

This falls into activation type of things category in my mind.

@jkoritzinsky
Copy link
Member Author

It’s really cheap to revert to the old behavior for COM scenarios and only do correct behavior for ThisCall. Cheaper than the workaround I implemented here. I can put that change together on Monday if we decide to take that route.

@weltkante
Copy link

Wait, am I reading that right that using a HRESULT struct wapper is a bug not a feature? We are also using a struct around an int (with explicit layout attribute) and were assuming the builtin marshalling just did its magic. Using structs makes the interop code so much more readable.

@AaronRobinsonMSFT
Copy link
Member

@weltkante It is indeed a bug that we have had for a long time. Thank you for saying something though, because you just pushed me over the edge that we can't change this behavior.

@jkoritzinsky Please prep a fix to revert the behavior from applying to COM as @jkotas suggests.

@jkoritzinsky
Copy link
Member Author

Closing in favor of #23974

@saucecontrol
Copy link
Member

Using structs makes the interop code so much more readable.

Wouldn't this

using HRESULT = System.Int32;

allow for the same readability as this

struct HRESULT {
    public int value;
}

without requiring the calling conventions to stay broken for COM?

@weltkante
Copy link

weltkante commented Apr 14, 2019

The point why we did that in our codebase was that you can add members to the struct, which makes code more readable. More importantly it is strongly typed and thus avoids a lot of simple accidental mistakes happening in the first place. You just have much more control over the operations and conversions allowed on it.

Anyways, as I wrote on the follow-up PR I'd really like to know why this prevents fixing a COM interop bug (or what the bug is in the first place). I always assumed that an explicit layout struct with a single member at offset zero is equivalent to that member, as far as memory layout is concerned. Do the calling conventions in question differentiate (on the native side) between plain integers and C structs containing a single integer? Otherwise I don't know what would stand in the way of fixing it (besides the work required to implement it).

@jkotas
Copy link
Member

jkotas commented Apr 14, 2019

Do the calling conventions in question differentiate (on the native side) between plain integers and C structs containing a single integer?

Yes: #23974 (comment)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants