Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Champion: by value vs in overload tiebreaker (15.6, as a 7.2 bug fix) #945

Open
2 of 5 tasks
gafter opened this issue Sep 28, 2017 · 32 comments
Open
2 of 5 tasks

Champion: by value vs in overload tiebreaker (15.6, as a 7.2 bug fix) #945

gafter opened this issue Sep 28, 2017 · 32 comments
Assignees
Labels
Implemented Needs ECMA Spec This feature has been implemented in C#, but still needs to be merged into the ECMA specification Proposal champion Proposal
Milestone

Comments

@gafter
Copy link
Member

gafter commented Sep 28, 2017

  • Proposal added
  • Discussed in LDM
  • Decision in LDM
  • Finalized (done, rejected, inactive)
  • Spec'ed

Based on a suggestion by the LDM, we propose to add the following overload resolution tiebreaker in some future version of C#.

Two applicable methods can have identical parameter types, but differ in argument ref kind. This can occur when the ref kind of one is by value, but the other is in. For example, for the methods

void M(in int x) {}
void M(int y) {}

These would be ambiguous at the call site

M(expr);

Proposal (was implemented in 15.6):

We propose a new tie-breaker: if there is an ambiguity and there is no in at the call-site, then the byval overload wins.
This tie-breaker comes after existing tie-breakers.

Alternative proposal:

We propose a new tie-breaker as follows:

  • If the expression is an lvalue, we bind to the overload with an in parameter.
  • Otherwise, we bind to the overload with a value parameter.

Because it is not clear how valuable this tie-breaker would be, we are deferring it to some future time when we can judge the impact of ambiguities of this sort.

In the meantime, users can explicitly select the desired overload by using a named argument, an explicit in at the callsite, or by changing the declarations of the methods to be sufficiently different.

LDM

@HaloFour
Copy link
Contributor

I don't believe that it's mentioned elsewhere on this repo (e.g. #38) or that the LDM notes have been published yet, but the in parameter modifier replaces the previous readonly ref modifiers, correct?

@bbarry
Copy link
Contributor

bbarry commented Sep 29, 2017

Why not:

  • If the expression is an lvalue, we bind to the overload with an in parameter.
  • If the expression is not an lvalue, we bind to the overload with a value parameter.

Since all lvalues are rvalues, expressing the rule as a combination of the two is potentially ambiguous.

@gafter
Copy link
Member Author

gafter commented Sep 29, 2017

@HaloFour Yes
@bbarry Yes

@4creators
Copy link

4creators commented Sep 29, 2017

@4creators Yes

@4creators with regard to in keyword use No

IMO in keyword used instead of ref readonly is an unfortunate choice

@alrz
Copy link
Member

alrz commented Sep 30, 2017

IMO in keyword used instead of ref readonly is an unfortunate choice

Since in is now allowed at the call-site I think it's the least harmful choice -- passing in arg to readonly ref param would be unexpected. allowing both is always possible though, perhaps it could be reconsidered when we have readonly parameters.

@tannergooding
Copy link
Member

I commented this here: dotnet/roslyn#22387 (comment)

I am explicitly talking about the case where:

  • Today a user is consuming MyMethod(MyStruct val) from v1 of MyLibrary.
  • I want to take advantage of ref readonly, but do not want to break back-compat in v2 of my library

I cannot add MyMethod(in MyStruct val) because this forces users to upgrade and explicitly specify in at all callsites. They have no mechanism of staying on the previous byval overload if they want to upgrade/recompile (possibly to take advantage of other new features) but are not ready to switch over to the new overloads.

So, in order to take advantage of the feature I have to:

  • Use extension methods
  • Use an API with a different name (ex: MyMethodRef(in MyStruct val)), etc.
  • Something else

This hurts discoverability of my APIs, increases the number of items displayed in intellisense, makes quick-fixes from the IDE more difficult, etc.

Basically, I feel like not providing this in v1 is going to be a significant pain point for existing library authors who are wanting to upgrade to this feature. Authors shouldn't have to be tweaking the order arguments are passed (which may have been "optimized" for an underlying calling convention), the names of arguments (which may cause more changes required on their end or the consumers end),
etc.

@alrz
Copy link
Member

alrz commented Sep 30, 2017

@tannergooding So in that case I think in arguments wouldn't be needed?

And I think that's not a good idea that they are optional either, for example, you might specify in for just one of arguments and get the desired overload (at the time) that happens to have other in parameters. the single in makes it like other arguments are passed by value.

void M(in int i, in int j) {}
void M(int i, int j) {}

M(in i, j); // not ambiguous!

Perhaps we should require "all or none" of ref readonly arguments to be passed via in or we just have the tiebreaker from the beginning.

PS: this relates to #476 which propose the same for return types.

@tannergooding
Copy link
Member

@alrz, I believe the current stance is (but I may be wrong):

  • void MyMethod(MyStruct val) passes val by value
  • void MyMethod(in MyStruct val) passes val by reference and ensures it is readonly

If you only declare void MyMethod(MyStruct val), then you just call it as MyMethod(val)

If you only declare void MyMethod(in MyStruct val), then you can call it as either:

  • MyMethod(val)
  • MyMethod(in val)

If you declare both void MyMethod(MyStruct val) and void MyMethod(in MyStruct val) then calling

  • MyMethod(val) is "illegal" and the compiler will throw an ambiguity error
  • MyMethod(in val) is "legal" and the compiler will call the ref readonly overload

This means that if you declare both, there is no way to call the "pass by value" overload. This means that adding the new overload is a breaking change and it forces users to modify their code in order to recompile.

The known "workarounds" are:

  • Reorder the method parameters so they are no longer ambiguous
  • Give the method a different name

I don't think this is a good plan and believe it will largely prevent adoption of ref readonly in existing libraries (possibly including CoreFX, although I don't know the plans there, if any).

I don't think that this is something that is easily fixed post initial release either.

For example, imagine you have a new library and you only expose void MyMethod(in MyStruct val). This means users are allowed to do MyMethod(val) and it will still call the ref readonly overload.

The compiler then decides that not being able to call the "pass by value" overload (if one exists) is a problem and they make it so that, on ambiguous matches the "pass by value" overload is called.

Now say you have some customers who cannot call the ref readonly overload because they are using a language that doesn't support it. So you decide to add a "pass by value" overload for them. However, due to the decision to change the semantics post initial release, adding this new overload will cause any downstream consumers who recompile and that were using the ref readonly overload to now consume the "pass by value" overload (depending on how you wrote the overload, this could potentially cause a drop in performance or more work for the JIT to inline the call).

As such, I think the only "safe" thing to do (so that it is not a breaking change to add a ref readonly overload when their is an existing "pass by value" overload, or vice-versa) is to require in to always be specified at the callsite. The downside is that it is more difficult to use the feature.

@alrz
Copy link
Member

alrz commented Sep 30, 2017

This means that if you declare both, there is no way to call the "pass by value" overload. This means that adding the new overload is a breaking change

Here, even though we have an explicit way to "select" the overload via in. the fact that it's an "error" otherwise makes it a lot less useful.

@MadsTorgersen MadsTorgersen added this to the 7.3 Candidate milestone Oct 2, 2017
@VisualMelon
Copy link

VisualMelon commented Oct 28, 2017

I must confess complete confusion why - especially given that in has been tentatively selected as the keyword of choice - there is any support for not requiring the keyword at the call site. So very much misery can be mitigated by making this a requirement (which would be in concert with other features already present in the language which are of undisputed value).

At best I could only find a suggestion that ref readonly is rather long to type (and so people may be disinclined to use the feature); but even so, this is a feature presented for the purpose of performance, and typing a little more (or even requiring a tad more understanding) can surely be accepted. My apologies if I'm missing something, but I can see no practical value in not requiring the keyword at the call site, and nothing but misery for (particularly) those that have to use and maintain existing APIs which wish to employ it in the future.

I agree with just about everything @tannergooding has just said, and I think it only scratches the surface of how miserable life could be if this ambiguity is allowed. It is (genuinely, I lost sleep over this the last time I reviewed it) unthinkable to me this proposal should go through without requirement of the modifier at the call site. The only thing I am unsure I understand in the previous post by @tannergooding is the remark that "The downside is that it is more difficult to use the feature." - perhaps I am reading too much into this, but I would (really) love to hear what is considered difficult (as opposed to requiring typing, which I consider effort, but by no means difficult) about requiring the modifier at the call site.

I apologise for again being late to the party and not having a clue what is going on, but I do find myself completely bewildered by the proposal, and any polite pointers would be greatly appreciated. I believe that readonly refs will be a valuable addition, but it will only be used by a few people, and these are not people that I can imagine complaining about having to type readonly ref (let alone in) in the interest of maintainability and confidence in their own code. It has been recognised that this feature is solving an old problem, and as such (just as with the non-nullable reference type proposals) the 'introduction' into existing APIs must be a forefront consideration, and no happiness will come from silently breaking and changing these APIs.

Having 'complex' rules for tiebreakers (which do not tally with existing methods for selecting type-based overloading) is just going to create misery; I don't understand the proposed tiebreakers, they are non-obvious. Optional modifiers at the call site is an abhorrent concept: now there are 2 ways to write the same piece of code (except when it is different) which look different.

I cannot comprehend using parameter names for disambiguation, that won't always work, and even when it does it remains completely ambiguous to anyone reading the code... which is REALLY bad (an explicit cast (which with recent addition to VS is even more obvious) screams that something is afoot). This is not a solution on any grounds, and the mere suggestion that it could be employed frightens me.

This post is now far too long... but again, I must express my absolute disbelief that a non-required (let alone 'optional'!) call site modifier could persist in this proposal.

P.S. I find myself unable to determine the best place to rant about discuss this with regard to the 7.2 proposal, if anyone would be kind enough to direct me to such a place I would again be most grateful; or is Champion "Readonly ref" (C# 7.2) #38 indeed the right place to discuss this?

@gafter
Copy link
Member Author

gafter commented Oct 29, 2017

I must confess complete confusion why - especially given that in has been tentatively selected as the keyword of choice - there is any support for not requiring the keyword at the call site. So very much misery can be mitigated by making this a requirement (which would be in concert with other features already present in the language which are of undisputed value).

@VisualMelon C# 7.2, which includes in parameters, is pretty much done, unless we discover show-stopper fatal issues. This proposal is about a change in C# 7.3.

Where, exactly, in the syntax would you suggest the receiver of a call to an extension method should have the in keyword? Where would it be used by the caller when invoking a user-defined operator in which one or both operands are declared in?

@Joe4evr
Copy link
Contributor

Joe4evr commented Oct 29, 2017

My 2 cents: in in a parameter list is only a syntactic alias for ref readonly. Since without the readonly, the caller is required to put ref on its argument since C# 1.0, allowing the removal of that requirement when the readonly part is in play (and subsequently making an ambiguity in overload resolution) seems like a step backwards.

@gafter gafter removed their assignment Oct 29, 2017
@VisualMelon
Copy link

VisualMelon commented Oct 29, 2017

@gafter thanks for the response, I think I've finally got the measure of the milestone things... Respectfully, I do not see a trail of discussion to motivate the 7.2 decision, and it bothers me.

Regarding operators, so long as they aren't overloadable (which I believe they necessarily can't be, do correct me if I'm wrong) and can't return a ref readonly themselves then there is no issue with that (horrifying as I would suggest it is to cover over the fact). I do not think it would be unreasonable to simply omit ref readonly operators, or indeed for them to be different so long as the semantics can't go wrong. If they can return a ref readonly, then the semantics are changed and it would be distressing to not indicate this at the call site; however, the more 'visible' concern of overloading won't be present. The fact that we can' disambiguate means we can't disambiguate... it's a self-solving problem.

(Non-extention) methods, however, can go wrong in so very many ways (overloading, mysteriously changing API, non-identical semantics between ref readonly and 'ByVal'.

I also have no real answer for your question regarding extension methods, short of using them in the 'static' configuration (i.e. MyExtentions.Method(in target, args)). However, I would argue that the matter of these being impossible to disambiguate (short of using parameter names) suggest they need special treatment, independent of everything else. If they can be ambiguous, then they should be throwing errors (i.e. force you to use the unambiguous 'static' style). Same self-solving problem as operator overloading.

It doesn't seem reasonable to inflict the misery of ambiguity on 'typical' method calls (which are presumably the predominate use-case, being the most versatile) because we can't (perhaps for good reason) 'nicely' solve it for Extention methods and Operator overloads. I would suggest that where we can't (tidily) disambiguate, then we have to accept something less than 'ideal', but there is a perfectly tidy solution for standard method calls, whilst the current proposal (in 7.2 and furthered here) is terrifying, because it fails to address the ambiguity that can (and almost certainly will) appear.

There is nothing about the horror of extension methods and operator overloading that should necessarily be reflected in 'standard' method calls, so I am still left wondering why this decision seems to have gone rather unchallenged (so far as I can tell). The notes for the meeting over this merely states "Do we need it for ref readonly parameters, where there are no side effects by definition? We believe not.". I do not necessarily expect a thesis on the matter... but this does leave me feeling unfulfilled, because "there are no side effects by definition" is sadly not the whole story (the semantics of readonly ref are not (to my understanding, this was discussed elsewhere) equivalent to ByVal).

I believe that there should be a need for a very good reason to not require an explicit modifier at the call site; by default we should require explicit...ness. It feels like we are looking for reasons why we 'should' require it (of which there are plenty!)

C# is a relatively tight language, and introducing any manner of 'implicit'...ness is undermining the confidence the programmer can have in their code. It seems more than likely that disambiguation will become necessary, and if - when reviewing code - I have to double check which method is actually being called when I (by default!!!!!) omit the in modifier, then confidence in my code will suffer, as will my confidence in understanding other people's code. I genuinely can't think of a worse solution than making 'in' optional. What is so objectionable about typing in when it is what we want? I want the static checker to help me: it can't help me if I don't tell it what I want.

I beg you to reconsider this for 7.2. I will gladly find the time to write up a more tidy document outlining why I am vehemently against the proposal and repost to the other thread if it would be welcomed.

@VisualMelon
Copy link

VisualMelon commented Oct 29, 2017

@Joe4evr the suggestion is that the semantics of ref readonly and a plain-old ByVal call at sufficiently similar that it doesn't need to be made clear at the call site which is happening, because it 'probably wont' matter.

ref, as everyone agrees, absolutely must be recognised at the call-site, but it is only in complex situations (i.e. hard to debug ones) where the differences in semantics between ref readonly and ByVal matter at the call site (excepting threading, where the call-site would surely like to know that this thing it's passing in which it knows could be changed by another thread ought really to be given special treatment. I agree this would be massive step backwards, but it isn't as obvious why; excepting, of course, that this is what C# 'looks like' already, so arguably it is what people will expect, and all the issues with overloading.

@gafter
Copy link
Member Author

gafter commented Oct 29, 2017

@VisualMelon Operators can indeed be overloaded, and can use in on their parameters without any accommodation being required at the use site.

Part of the reason not to require anything at the use site is that we would like people to be able to improve the performance of their code bases by changing some value parameters to in parameters without requiring changes to every invocation site (the semantics do not differ for most purposes except for code bases that traffic in a dangerous combination of mutation and concurrency).

@VisualMelon
Copy link

VisualMelon commented Oct 29, 2017

Operators can indeed be overloaded, and can use in on their parameters without any accommodation being required at the use site.

@gafter My apologies, I was not aware of this specific; still, I maintain that issues with operator overloading should not necessarily influence 'typical' method calls. (This does of course suggest a discussion such as this proposal for 7.3, even if in were to be required in 'typical' method calls.)

Part of the reason not to require anything at the use site is that we would like people to be able to improve the performance of their code bases by changing some value parameters to in parameters without requiring changes to every invocation site (the semantics do not differ for most purposes except for code bases that traffic in a dangerous combination of mutation and concurrency).

That is a fair point. But I would argue a dangerous one. If an API is to change, I would rather like to be informed (and the API is changing, and it will change in places where there is a dangerous combination of mutation and concurrency: exactly the sort of code that this proposal sets out to make faster; one of the original objects to the proposal was that it isn't useful enough to most people). tannergooding has suggested why disambiguation may be necessary for interop purposes (either with language support, or by making breaking API changes, which would be less than ideal).

I feel compelled to present my example again as to why the semantics can be different even without concurrency (though they are a serious concern in their own right, enough at least in my book to present a 'show-stopping' case) when combined with ref readonly return. Behold:

void DoStuff()
{
    int myInt = 42;
    int myOtherInt = 7;

    readonly ref int result = Process(/*ref readonly ????*/ myInt, ref myOtherInt);

    Console.WriteLine(result);

    myInt = -1;

    Console.WriteLine(result); // this value may have changed
}

Even though the semantics within the method are readonly (ignoring concurrency) if returns this reference, then it is making a dodgy promise that the returned reference itself will not change; however, the call-site itself may change it. This is by no means clear, and the programmer should have to opt-in to this possibility. The prospect of having to review the (default) code which doesn't have an in to check that I'm not passing in a reference is, to me, unthinkable.

Regarding concurrency, I presume you are aware of the multitude of ways that this can go wrong, but in the interest of others who may be reading with amusement as I fail to clearly express myself, here is one example of how this may sneakily break existing code:

void MyMethod()
{
    int counter = 0;
    Action go = () => { while (condition) { DoWork(); counter++; } };
    go.BeginInvoke(null, null);

    while (condition)
    {
        // some logic
        UpdateProgress(counter); // this method _probably_ expects the value not to change
                                    // when my predecessor wrote the call 6 years ago, it couldn't
                                    // but now the API has (silently) changed, and I don't know if it will still work (or even that it might not)
    }
}

Regardless of how horrendous this code is, it could well exist (I've certainly bodged stuff like this in the past, and 'forgotten' to get back to it), and now it may be broken, and the compiler can't warn me. Indeed, in the real world, code like this will be much less 'obvious', the threads won't be visible at the call site: debugging this could be nightmarish.

@Joe4evr
Copy link
Contributor

Joe4evr commented Oct 29, 2017

the semantics do not differ for most purposes except for code bases that traffic in a dangerous combination of mutation and concurrency

Which is exactly something I'd be worried about (despite knowing full-well that mutation+concurrency is a recipe for disaster - it's when other (mostly new) devs don't know that they're in some hot water that's worrying). You're talking about pushing a potentially API breaking change through a language update. I don't see that going over very well.

@jcouv
Copy link
Member

jcouv commented Nov 8, 2017

From discussion with @VSadov, @OmarTawfik and @jaredpar, we'd like to propose a more straightforward tie-breaker: if there is an ambiguity and there is no in at the call-site, then the byval overload wins.

The benefit of this approach are that: (1) it's simpler than explaining concepts of lvalue and rvalue, (2) there is no worry for the client that the implementation changed implicitly.
The downside is that: (3) there is no perf gain until the client adds in to their call-sites.

@sharwell
Copy link
Member

sharwell commented Nov 9, 2017

@jcouv we can mitigate 3 with an IDE analyzer 👍

@udlose
Copy link

udlose commented Nov 20, 2017

Hopefully, I'm stating the obvious here....I think that writing code that is explicit and clearly states its intention is of the highest importance. Ambiguity in intention is dangerous.

If it was designed to be optional to avoid it being a breaking change, I'd argue that is the wrong reason. Being clear in intention, especially when a new language feature is added, should be considered more important. I'd prefer it to be breaking so that if my devs go sprinkling in everywhere in method declarations, they have to add it to the callsites as well. Again, making it clear.

IMO, in should be required - not optional.

@jcouv
Copy link
Member

jcouv commented Nov 20, 2017

@udlose Filed feature request for analyzer: dotnet/roslyn#23299

@gafter
Copy link
Member Author

gafter commented Nov 20, 2017

@udlose We try to optimize the usability of the C# language as it appears after a change has been made, not during the transition. We try not to have special rules about the way feature should behave on the basis of them being new. New features are not new forever. See also http://gafter.blogspot.com/2017/06/making-new-language-features-stand-out.html

@udlose
Copy link

udlose commented Nov 21, 2017

@jcouv - thank you for filing the feature request!

@gafter - I understand - my issue is the reasoning why it was not made required. Let me be clear - I am not against change. So many times a language feature is added that finds a way to obscure the code and the intention. I am a strong believer in having language features that encourage clear coding while creating dev efficiency. When something is made optional to prefer convenience over readability/clarity on intention, I think it is encourages devs to be sloppy which makes code less maintainable.

I'd also argue this is not a "breaking change". The reason why is that I have to opt-in to use the in keyword in the method decl. If I choose to use a new language feature, it is quite reasonable to expect that I need to make changes to support the usage of the optional feature.

@jaredpar
Copy link
Member

I'd also argue this is not a "breaking change". The reason why is that I have to opt-in to use the in keyword in the method decl

Forcing in to be required at the call site causes coed that compiles today, with an RTM version of a compiler, to not compile with the next version. That is our definition of a "breaking change".

@udlose
Copy link

udlose commented Nov 21, 2017

@jaredpar - would the API's of BCL be changing? If so, I'd agree. However, if this is now a feature that is available to me, and I can choose to use it or not, I'd respectfully disagree. If I can upgrade to the latest version of C# and still compile (without changing my code), that is a breaking change. If I change my code to use a new feature, keyword, etc. and some of the code then doesn't compile - that is not a breaking change IMO. Just my $.02

@jaredpar
Copy link
Member

@udlose

I've stated our policy on determining whether or not a change is breaking. There is certainly always discussion about the severity of the break. But that doesn't change the definition of whether a change is considered breaking by us. The change you are proposing is considered breaking.

Given that it's both breaking and already following an established pattern there is very little chance we would consider doing this.

@udlose
Copy link

udlose commented Nov 21, 2017

@jaredpar - understood. I appreciate your discussion with me. Thanks.

@OmarTawfik
Copy link
Contributor

@jcouv @jaredpar PR is merged. Should we close this?

@jcouv
Copy link
Member

jcouv commented Dec 8, 2017

@OmarTawfik It should stay open until feature is in official spec. At least that's what we've been doing for other championed language changes so far.

@jcouv jcouv changed the title Champion: by value vs in overload tiebreaker Champion: by value vs in overload tiebreaker (15.6) Dec 8, 2017
@jcouv jcouv changed the title Champion: by value vs in overload tiebreaker (15.6) Champion: by value vs in overload tiebreaker (15.6, as a 7.2 bug fix) Dec 8, 2017
@yaakov-h
Copy link
Member

Should this be marked as Finalized (done, rejected, inactive), considered that it's in Visual Studio 2017 v15.6 Preview 2?

@benaadams
Copy link
Member

I'm confused by this; why if its a readonly struct doesn't in always win?

@jcouv
Copy link
Member

jcouv commented Feb 5, 2018

In case of ambiguity, you can always force the 'in' overload by using the keyword. Any other tie-breaker would mean that you can't invoke one of the overloads even when you want to.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Implemented Needs ECMA Spec This feature has been implemented in C#, but still needs to be merged into the ECMA specification Proposal champion Proposal
Projects
None yet
Development

No branches or pull requests