Proposal: Expression blocks #3086

Open
cston opened this issue Jan 7, 2020 · 288 comments

@cston
Member

cston commented Jan 7, 2020

Proposal

Allow a block of statements with a trailing expression as an expression.

Syntax

expression
    : non_assignment_expression
    | assignment
    ;

non_assignment_expression
    : conditional_expression
    | lambda_expression
    | query_expression
    | block_expression
    ;

block_expression
    : '{' statement+ expression '}'
    ;

Examples:

x = { ; 1 };  // expression block
x = { {} 2 }; // expression block

y = new MyCollection[]
  {
      { F(), 3 }, // collection initializer
      { F(); 4 }, // expression block
  };

f = () => { F(); G(); }; // block body
f = () => { F(); G() };  // expression body

Execution

An expression block is executed by transferring control to the first statement.
When and if control reaches the end of a statement, control is transferred to the next statement.
When and if control reaches the end of the last statement, the trailing expression is evaluated and the result left on the evaluation stack.

The evaluation stack may not be empty at the beginning of the expression block so control cannot enter the block other than at the first statement.
Control cannot leave the block other than after the trailing expression unless an exception is thrown executing the statements or the expression.

Restrictions

return, yield break, yield return are not allowed in the expression block statements.

break and continue may be used only in nested loops or switch statements.

goto may be used to jump to other statements within the expression block but not to statements outside the block.

out variable declarations in the statements or expression are scoped to the expression block.

using expr; may be used in the statements. The implicit try / finally surrounds the remaining statements and the trailing expression so Dispose() is invoked after evaluating the trailing expression.

Expression trees cannot contain block expressions.
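To make the execution and using rules above concrete, here is a rough sketch in current C# of what a block like `int x = { using var r = new Resource(log); r.Read() + 1 };` might be rewritten to by hand. The proposal does not specify a lowering, so this is only an illustration: `Resource`, `ExpressionBlockSketch`, and `Lowered` are hypothetical names, and a using declaration is assumed to behave like the `using expr;` form described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical disposable type that records the order of operations.
public sealed class Resource : IDisposable
{
    private readonly List<string> _log;
    public Resource(List<string> log) { _log = log; _log.Add("open"); }
    public int Read() { _log.Add("read"); return 41; }
    public void Dispose() => _log.Add("dispose");
}

public static class ExpressionBlockSketch
{
    // Hand-written equivalent of the proposed (not currently valid) form:
    //     int x = { using var r = new Resource(log); r.Read() + 1 };
    public static int Lowered(List<string> log)
    {
        int result;                       // holds the value of the block
        {                                 // scope of the expression block
            var r = new Resource(log);    // first statement
            try
            {
                result = r.Read() + 1;    // trailing expression
            }
            finally
            {
                r.Dispose();              // runs after the trailing expression
            }
        }
        return result;
    }
}
```

The nested braces keep `r` out of the enclosing scope, and the try / finally placement mirrors the rule that Dispose() is invoked only after the trailing expression has been evaluated.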

See also

Proposal: Sequence Expressions #377
LDM 2020-01-22
https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-09-26.md#discriminated-unions
https://github.com/dotnet/csharplang/blob/main/meetings/2024/LDM-2024-08-28.md#block-bodied-switch-expression-arms

@CyrusNajmabadi
Member

{ F(); 4 }, // expression block

In terms of impl, this will be a shockingly easy mistake to make (i do it all the time myself). We should def invest in catching this and giving a good message to let people know what the problem is and how to fix it, i.e. if we detect not enough expr args, going in and seeing if replacing a semicolon with a comma would fix things and pointing people to that as the problem.

@CyrusNajmabadi
Member

CyrusNajmabadi commented Jan 7, 2020

Control cannot leave the block other than after the trailing expression unless an exception is thrown executing the statements or the expression.

Is this for ease of impl, or is there a really important reason this doesn't work at the language level? for example, i don't really see any issues with continuing (to a containing loop) midway through one of these block-exprs.

@YairHalberstadt
Contributor

I also don't see the reasons for any of the restrictions TBH, other than expression trees.

@cston
Member Author

cston commented Jan 7, 2020

Control cannot leave the block other than after the trailing expression unless an exception is thrown executing the statements or the expression.

Is this for ease of impl, or is there a really important reason this doesn't work at the language level.

The evaluation stack may not be empty at the continue.

int sum = 0;
foreach (int item in items)
{
    sum = sum + { if (item < 3) continue; item };
}

@CyrusNajmabadi
Member

The evaluation stack may not be empty at the continue.

Right... but why would i care (as a user)? From a semantics perspective, it just means: throw away everything done so far and go back to the for-loop.

I can get that this could be complex in terms of impl. If so, that's fine as a reason. But in terms of the language/semantics for the user, i don't really see an issue.

@orthoxerox

@CyrusNajmabadi as a user I find the example by @cston hard to grok. Yanking the whole conditional statement out of the expression block makes everything MUCH clearer. Do you have a counterexample where return, break or continue work better inside an expression block?
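For reference, a minimal sketch in current C# of the loop with the guard pulled out of the expression. `LoopSketch` and `SumFromThree` are hypothetical names, and `items` is assumed to be any sequence of int.

```csharp
using System.Collections.Generic;

public static class LoopSketch
{
    // Same intent as: sum = sum + { if (item < 3) continue; item };
    public static int SumFromThree(IEnumerable<int> items)
    {
        int sum = 0;
        foreach (int item in items)
        {
            // The guard runs before `sum` is read, so no partially
            // evaluated expression is pending when the loop continues.
            if (item < 3) continue;
            sum = sum + item;
        }
        return sum;
    }
}
```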

@CyrusNajmabadi
Member

In terms of impl, we should look at the work done in TS here. In TS, { can start a block, or it can start an object-expr. Because of this, it's really easy to end up with bad parsing as users are in the middle of typing. It's important from an impl perspective to do the appropriate lookahead to understand if something should really be thought of as an expression versus a block.

@CyrusNajmabadi
Member

Consider the following:

{ a; b; } ;

A block which executes two statements inside, with an empty statement following.

{ a; b };

An expression-statement, whose expression is a block expression, with a statement, then the evaluation of 'b'.

Would we allow a block to be the expression of an expr-statement? Seems a bit wonky and unhelpful to me (since the value of the block expression would be thrown away).

Should we only allow block expressions in the case where the value will be used?

@gafter added this to the 9.0 candidate milestone Jan 7, 2020

@jcouv
Member

jcouv commented Jan 7, 2020

@cston To avoid the look-ahead issue, I would suggest an alternative change:

block
  : '{' statement* expr '}'
  ;

This means that we always parse { ... as a block, even if it has a trailing expression. Then we can disallow it in the semantic layer.

I think this would solve the look-ahead issue for the compiler, but not so much for humans. I'd still favor @{, ${ or ({ to indicate this is an expression-block.

@CyrusNajmabadi
Member

${

yes. I'm very on board with a different (lightweight) sigil to indicate clearly that we have an expr block

@HaloFour
Contributor

HaloFour commented Jan 7, 2020

How about ={ 😁

@Joe4evr
Contributor

Joe4evr commented Jan 7, 2020

I wonder if the ASP.NET team would lean their preference to @{ since that's already established for a statement block in Razor syntax. 🍝

@spydacarnage

spydacarnage commented Jan 7, 2020 via email

@mikernet

mikernet commented Jan 7, 2020

This is kinda neat but the syntax definitely bothers me as being too subtle of a difference for block vs expression. I think having $ as a prefix is more sensible and easier to recognize when reading.

@Trayani

Trayani commented Jan 7, 2020

I'm not bothered by the semicolon, but understand the potential confusion.

Also, if I understand correctly, it will not be possible to simply relax the syntax and let the compiler decide whether the block is statement / expression due to lambda type inference. Correct?

@MadsTorgersen
Contributor

I think this is really promising, and a good starting point.

We've been circling around the possibility of being able to add statements inside expressions for many years. I like the direction of this proposal, because:

  • the {...} is recognizable from statement blocks. I know that curly braces are already somewhat overloaded, and there will be ambiguous contexts, but from a cognitive perspective I think it doesn't make the situation significantly worse, and is preferable to adding some new syntax for statement grouping.
  • It provides natural and easy-to-understand scoping for any variables declared inside, including those declared in the trailing expression (e.g. through out variables).

Within that, I think there are several design discussions for us to have:

  • Should the result be produced by a single expression at the end (as proposed here), or via a result-producing statement (e.g. break expr; has been proposed in Proposal: Block-bodied switch expression arms #3037 and Proposal: Enhanced switch statements #3038)? In the latter case it would be syntactically equivalent to a block statement, and just have different semantic rules (just as the difference between a block used for a void-returning vs a result-returning method). The former may work best for shorter blocks, the latter for bigger ones. Which should we favor?

  • Is the proposed precedence right? This disallows any operators from being applied directly to the statement expression. That's probably good, but needs deliberation. It limits the granularity at which an expression can easily be replaced with a block (though of course you can always parenthesize it, like every other low-precedence expression).

  • Should a block expression be allowed as a statement expression? probably not!

  • The proposal requires there to be at least one statement. That's kind of ok if the statement block is used for prepending statements to your expression! But once it's in the language I can imagine wanting to use it just to scope variables declared in a single contained expression.

  • I don't like the proposals for prepending a character so that you can "tell the difference", but that's another discussion to have. I don't think anyone other than the compiler team wants to "tell the difference". 😁

  • There's a potential "slippery slope" argument to allow other statement forms as expressions somehow. I don't think that's very convincing, since such statements should just be put inside of a block expression! But I can see that coming up.

  • We should make sure we gather the important scenarios. I've heard two really convincing ones:

    • as the branches of switch expressions (and switch statements if we do Proposal: Enhanced switch statements #3038). Switch expressions are themselves so complex that reorganizing the code to get a statement in becomes intrusive.
    • as "let-expressions" where a temporary local variable (or function) is created just for the benefit of one expression.

    Proposal: Block-bodied switch expression arms #3037 has examples of the former. An example of the latter might be:

    var length = { var x = expr1; var y = expr2; Math.Sqrt(x*x + y*y); }

At the end of the day, this is the kind of feature that, even when we've done the best we can on designing it, it just doesn't feel right and we end up not doing it. Putting statements inside expressions may just fundamentally be too clunky to be useful.

@HaloFour
Contributor

HaloFour commented Jan 8, 2020

Allow a block of statements with a trailing expression as an expression.

I'd love it if this were possible without requiring a modified syntax. Sure, I understand that this would change the meaning of existing code, but most of the time that change would be that a value is harmlessly discarded. I am aware of at least one situation where this could affect overload resolution for lambda expressions, are there others?

@MgSam

MgSam commented Jan 8, 2020

If I'm understanding the proposal correctly this would feel very weird when used with expression-bodied members.

class A 
{
    int Foo() => 5; //Expression

    int Foo2() => { ; 5 } //Expression block?

    int Foo3() => { return 5; } //Not allowed
}

@333fred
Member

333fred commented Jan 8, 2020

@MgSam that's what Mads is pointing out with "Should a block expression be allowed as a statement expression? probably not!"

@YairHalberstadt
Contributor

If statement expressions are added, and "Control cannot leave the block other than after the trailing expression", there's increased incentive to make conditionals more user friendly, so that it's easier for the result of a block expression to depend on a test.

I find deeply nested conditional expressions highly unreadable. This suggests that we should allow if-else expressions.

This also cuts the other way. With sequence expressions it's much easier to turn an if-else with multiple statements into an expression. All you have to do is remove the final semicolon.

In scala and rust it's common for the entirety of a method to consist of a single expression consisting of multiple nested if-else expressions. I find this to be a really nice style.

@0x000000EF

If I understand correctly, the main motivation of this proposal is only switch statements #3038.

I really don't see other valuable benefits from this; something like a with operator would be much more desirable to me.

Consider a slightly changed version of @MadsTorgersen's example:

var length =
{
    var (x, y) = (GetX(), GetY());
    Math.Sqrt(x*x + y*y);
}

The following is much clearer and more obvious to me:

var (x, y) = (GetX(), GetY());
var length = Math.Sqrt(x*x + y*y);

or hide the variables in a function scope:

double CalculateDistance(double x, double y) => Math.Sqrt(x*x + y*y);
var length = CalculateDistance(GetX(), GetY());

So, from this point of view, expression blocks look to me like a local function body without a signature or parameters that is called immediately:

double CalculateDistance()
{
    var (x, y) = (GetX(), GetY());
    return Math.Sqrt(x*x + y*y);
}
var length = CalculateDistance();

var length =
{
    var (x, y) = (GetX(), GetY());
    return Math.Sqrt(x*x + y*y); // it should contain an explicit 'return'
}

But I am not sure that this is really an important and valuable feature...

@YairHalberstadt
Contributor

YairHalberstadt commented Jan 8, 2020

@0x000000EF

The expression block can take place in a deeply nested expression, where converting it to a set of statements would require significant refactoring.

@ronnygunawan

ronnygunawan commented Jan 8, 2020

I think this would solve the look-ahead issue for the compiler, but not so much for humans. I'd still favor @{, ${ or ({ to indicate this is an expression-block.

I think { is good enough. We can always parenthesize it as ({ when needed.

If I understand correctly, the main motivation of this proposal is only switch statements #3038.

I really don't see other valuable benefits from this; something like a with operator would be much more desirable to me.

Ternary operator and object initialization will benefit from this too.

var grid = new Grid {
    Children = {
        ({
            var b = new Button { Text = "Click me" };
            Grid.SetRow(b, 1);
            b
        })
    }
};

@0x000000EF

@YairHalberstadt, can you provide an example?

@ronnygunawan, this seems clearer...

Button CreateClickMeButton()
{
    var b = new Button { Text = "Click me" };
    Grid.SetRow(b, 1);
    return b;
}

var grid = new Grid {
    Children = {
        CreateClickMeButton()
    }
};

@mikernet

mikernet commented Jan 8, 2020

@0x000000EF When building deeply nested UIs using code it is often desirable to have the elements declared right where they are in the tree, not split off somewhere else. It mirrors the equivalent XAML/HTML/etc more closely and it's easier to reason about the structure of the UI.

@MadsTorgersen

I don't think anyone other than the compiler team wants to "tell the difference"

I'm not sure what you mean by that. I think it's useful to be able to reason about the difference in behavior between...

f = () => { F(); G(); }; // block body
f = () => { F(); G() };  // expression body

...with something less subtle than just the absence of the semicolon, particularly if the proposal to implicitly type lambdas to Action/Func in the absence of other indicators gains traction. I guess the stylistic nature of the second example just feels a bit odd to me in the context of C# but maybe with time I'd get over that. A keyword before the trailing expression would solve that minor gripe as well but I'm not overly invested either way, just a suggestion to consider.

@0x000000EF

@mikernet, it is not a big problem if we have something like a with operator:

static T With<T>(this T b, Action<T> with)
{
    with(b);
    return b;
}

var grid = new Grid {
    Children = {
        new Button { Text = "Click me" }.With(b => Grid.SetRow(b, 1))
    }
};

@wrexbe

wrexbe commented Aug 19, 2023

Special case lambdas?

var a = () => {
return 1;
}();

// Prints 11
a.
{(int numa) => numa+4}.
{(int yay) => {
var numa = 3;
numa += numa+yay;
return numa;
}.
{(int z) => Console.WriteLine(z)};

@piju3

piju3 commented Jan 20, 2024

As someone who had to switch from F# to C#, I think this feature should be high priority.

"Everything is an expression" is probably the most essential trait of functional programming. It seems like a minor distinction at first, but when you get used to conditional blocks having a return value, you start structuring all your code in terms of what you want the final value to be rather than the process. Something like:

var userAccess = {
    if(credentials != null)
        if(check(credentials))
            AuthenticationOk;
        else{
            Logger.Warning("Invalid access attempt");
            InvalidCredentials;
        }
    else Anonymous;
}

Which keeps all the logic neatly contained.

And with C# having already implemented most of the other parts of FP (like immutable objects), it hurts to see it so close to allowing functional-style code but still missing the ability to do this.

@dersia

dersia commented Feb 1, 2024

I still think it is a bad idea if there wouldn't be a keyword for "this is what comes out of the expression block" and it would be just by convention that the last part of an expression block has to be an expression that is then returned.

I still have big sympathies for an expression with the out keyword. and I also like the idea of using $ before the expression block to clarify that this is a "returning" expression block. From a code reviewer point of view, this makes it very easy to spot and understand what is going on.

return ${ f(); out e; }; 

@mrwensveen

@dersia I fully agree. I'd love it if ${ /*something*/ } was syntactic sugar for (() => { /*something*/ })(), i.e. an immediately invoked parameterless delegate/lambda.
I know return seems to be less popular than out because of conflicting syntax inside of other code blocks/scopes, but I think the dollar-sign block would solve this because we can easily detect the special context we're in.

Examples:

var userAccess = ${
    if(credentials != null)
        if(check(credentials))
            return AuthenticationOk;
        else{
            Logger.Warning("Invalid access attempt");
            return InvalidCredentials;
        }
    else return Anonymous;
};

var r = ${
    Span<int> sp = stackalloc int[1];
    ref int r0 = ref sp[0];
    return r0 + x;
};

just translates to

var userAccess = (() => {
    if(credentials != null)
        if(check(credentials))
            return AuthenticationOk;
        else{
            Logger.Warning("Invalid access attempt");
            return InvalidCredentials;
        }
    else return Anonymous;
})();

var r = (() => {
    Span<int> sp = stackalloc int[1];
    ref int r0 = ref sp[0];
    return r0 + x;
})();

@ayende

ayende commented Feb 2, 2024

@mrwensveen I would really not like this. Immediately invoked lambda means that you will be allocating.

Consider:

 var isAdmin = $(
             var parts = credentials.Split(':'); 
             out parts[0] == "admin" && parts[1] == "bad-pass";
);

Compare two possible lowerings, a plain scoped block (M) and an immediately invoked lambda (N):

public int M(string credentials) {
    bool isAdmin;
    {
        var parts = credentials.Split(':');
        isAdmin = parts[0] == "admin" && parts[1] == "bad-pass";
    }
    return isAdmin ? 1 : 2;
}

public int N(string credentials) {
    bool isAdmin = ((Func<bool>)(() =>
    {
        var parts = credentials.Split(':');
        return parts[0] == "admin" && parts[1] == "bad-pass";
    }))();
    return isAdmin ? 1 : 2;
}

This is how each of them decompiles; look at the second implementation:

    public int M(string credentials)
    {
        string[] array = credentials.Split(':');
        if (!(array[0] == "admin") || !(array[1] == "bad-pass"))
        {
            return 2;
        }
        return 1;
    }

    public int N(string credentials)
    {
        <>c__DisplayClass1_0 <>c__DisplayClass1_ = new <>c__DisplayClass1_0();
        <>c__DisplayClass1_.credentials = credentials;
        if (!new Func<bool>(<>c__DisplayClass1_.<N>b__0)())
        {
            return 2;
        }
        return 1;
    }

Those are two allocations to pay for this feature (one for the state, one for the delegate).
It is also going to be a trap because of the "one closure type per method" which may cause very unexpected behavior.

It should be effectively just lexical scope added, nothing more.
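For comparison with N above, here is a sketch of how current C# can keep helper logic local to a method without a closure. A static local function (C# 8) cannot capture locals, so there is no state to hoist into a display class, and calling it directly does not create a delegate. `LocalFunctionSketch` and `IsAdmin` are hypothetical names, and the length check is an addition that the original snippet does not have.

```csharp
public static class LocalFunctionSketch
{
    public static int M(string credentials)
    {
        // Direct call to a static local function: no closure object, no delegate.
        bool isAdmin = IsAdmin(credentials);
        return isAdmin ? 1 : 2;

        // `static` forbids capturing `credentials`, so it is passed explicitly.
        static bool IsAdmin(string value)
        {
            var parts = value.Split(':');
            return parts.Length == 2 && parts[0] == "admin" && parts[1] == "bad-pass";
        }
    }
}
```

This still moves the logic out of line, which is the readability cost an expression block would avoid; the array returned by Split is allocated in every variant.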

@dersia

dersia commented Feb 2, 2024

@mrwensveen I do a lot of code reviews and from that point of view I see problems with using return over out.
You are right, that the $ before a block might make a difference in how that return is handled by the language, but often in reviews you might have blocks that are very long and might span over multiple screens. This would mean, that I have to scroll all the way up, just to see if there is a $ before the block.
When using out on the other hand, I see directly that this is the "return" value of the expression block, since out is today only allowed in parameters/arguments.

There is also another nice trick that we get with out over return. Consider the following:

public int Total { get; set; }

public void FooBar(int number1, int? number2)
{
    int foo = ${
        if (number2 is null)
        {
            return;
        }
        int result = number1 + number2;
        out result;
    }

    Total = foo;
}

This would give me an escape path if I don't want to finish the block expression: I can just use return, and semantically return would keep the meaning it has today. The same applies to break and continue.

@mrwensveen

@dersia Fair point. I stand corrected.
@ayende I didn't consider allocations and I agree that it wouldn't make sense to implement it this way. The analogy with a delegate helps me mentally to think about this. Sorry for the confusion.

@GeirGrusom

I don't really see the point of the $ sigil, and it hurts the possibility of introducing this functionality in other contexts such as if-else, try-catch, or classic switch statements. It should be clear from context whether a block is an expression block or not, and the compiler can easily warn/err if there are any issues.

At least for try blocks I think that this feature would be a natural fit, but it would look wonky with the sigil.

@timcassell

@GeirGrusom I'm having trouble trying to visualize what you mean. How would you use a block expression in a try block? Why would you need to use block expressions in a switch statement? And why does the presence or absence of $ affect it?

@HaloFour
Contributor

I think the desire is that all statements eventually can become expressions in C#, which is something that the language team has expressed not being interested in pursuing. Most of the reason why a method of differentiating expression blocks is needed comes from the fact that it's not possible to implement them as such without breaking changes.

@GeirGrusom

GeirGrusom commented May 28, 2024

@GeirGrusom I'm having trouble trying to visualize what you mean. How would you use a block expression in a try block? Why would you need to use block expressions in a switch statement? And why does the presence or absence of $ affect it?

Well, you'd end up with two syntaxes for the same thing.

One example here:

var catfact = try // Should there be a sigil here?....
{
  if(cfg is CatFactFromDb)
    out Db.GetCatFact();
  else
    out Net.GetCatFact();
}
catch(OperationCanceledException)
{
  out Db.GetCatFact();
}

Here there is no sigil, and why would there be.

However if you have an expression block, then for seemingly no good reason you need a sigil:

var catfact = ${
  if(cfg is CatFactFromDb)
    out Db.GetCatFact();
  else
    out Net.GetCatFact();
};

In my opinion the sigil does absolutely nothing. It's nothing more than pointless ceremony for the sake of pointless ceremony. It does nothing for the compiler that the compiler can't figure out without it, it does not actually add anything of importance to the reader, since it's very clear from the usage that this is going to be an expression block, and if there is any confusion the compiler will produce warnings and errors.

And if the compiler team wants to make more statements into expressions, then it's just going to add confusion, and in my opinion, a wart in the language.

Edit: What I want to ask is, is out only allowed in blocks annotated with the sigil? If not, then what's the point? And even then, it doesn't actually add anything.

I don't think the sigil will noticeably improve readability, but it adds a syntactic notation for something where you'll just end up annoyed that you have to go back to the beginning and add a ceremonial symbol just because that's what the syntax demands.

Edit2: Removed a bit that might have unintentionally seemed a bit hostile.

@GeirGrusom

Maybe that's a word salad, but in my opinion it's just natural that syntax are expressions, and the point of this feature is more that the language now allows blocks to be expressions (because that would not have compiled in earlier versions of the language), and add a statement to return a value in those blocks. If there's a sigil then it's a separate thing, and it adds syntax to the language that I don't think actually adds any value. It's just "remember to put this symbol here if you want to use this existing language construct in that context!"

@jamesmh

jamesmh commented Jul 23, 2024

Blocks

My thinking starts with "how does a code block work right now"?:

public int SomeMethod()
{
    // Do some stuff

    var c = 0;
    // For some reason, we don't want context in the block to leak out.
    {
        var a = 5;
        var b = 4;
        c = a + b;
    }
    
    // Some more stuff

    return c;
}

From this starting point, what would feel intuitive to me is using a return syntax:

public int SomeMethod()
{
    // Do some stuff

    var c = 
    {
        var a = 5;
        var b = 4;
        return a + b;
    }
    
    // Some more stuff

    return c;
}

Syntactically, this would be close to what has been mentioned by others (an immediately invoked function):

public int SomeMethod()
{
    // Do some stuff

    var c = new Func<int>(() =>
    {
        var a = 5;
        var b = 4;
        return a + b;
    })();
    
    // Some more stuff

    return c;
}

This wouldn't have to be a function under the covers, but it just "feels" like what many developers might already do.

Terseness / If statements

With the following example, I think I do prefer not having return:

public int SomeMethod()
{
    return { var a = 5; var b = 4; a + b; }
}

public int SomeMethod(int val)
{
     if ({ var val2 = val + 2; val2 + 2 } == 10)
     {
         // do stuff
     }
}

Vs.

public int SomeMethod()
{
    return { var a = 5; var b = 4; return a + b; }
}

public int SomeMethod(int val)
{
     if ({ var val2 = val + 2; return val2 + 2 } == 10)
     {
         // do stuff
     }
}

I think I'm at a tie here - sometimes I like having the return and sometimes I don't 🤷

@HaloFour
Contributor

@jamesmh

Expanding on code blocks in this manner has been explored before. The problem is that code blocks are already legal syntax, making them into expressions would fundamentally change a lot of existing code in C#. For example, a block in a lambda body indicates that it is an Action, not a Func<T>, and for places that accept overloads that would silently change the meaning of that code.

The other problem is that return within a block is also legal, and breaks out of the entire method. There would need to be a completely different syntax within a block.
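A small sketch of the overload point in current C#, under the assumption of two hypothetical `Run` overloads (all names here are illustrative). A block-bodied lambda that returns nothing converts only to Action, while an expression-bodied lambda that yields an int converts to both delegate types and overload resolution prefers Func<int>. If existing blocks were reinterpreted as yielding the value of their last expression, calls like the first one could start binding to the other overload.

```csharp
using System;

public static class OverloadSketch
{
    // Two overloads that differ only in the delegate type they accept.
    public static string Run(Action action) { action(); return "Action"; }
    public static string Run(Func<int> func) { func(); return "Func<int>"; }

    private static void F() { }
    private static int G() => 1;

    // Block body with no return value: only convertible to Action.
    public static string BlockBody() => Run(() => { F(); G(); });

    // Expression body producing an int: convertible to both delegate types,
    // and the compiler picks the Func<int> overload.
    public static string ExpressionBody() => Run(() => G());
}
```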

@GeirGrusom

Return would definitely change the behavior of existing code, which is why out was proposed. I think that actually solves everything since out is a reserved keyword.

int a = { out 123; }; // No go. Expression blocks cannot be used as a value.
m({out 123}); // Also nope. Cannot be used as a value.
int b = { out (123); }; // Nope. You can't use out as a name of a method or variable since it's reserved so no sneakiness allowed.
int[] c = { out 123; }; // Nay, but remove the `out` and semicolon then it becomes legal, but standalone values are not legal statements in C# so it's not ambiguous with expression blocks. An expression block has to have statements in them, and at least one `out` statement.
var f = () { out 123; }; // Not legal today due to out not being a valid statement. If the out would then act like return when it's on the lowest level, then this can still be compiled as a lambda expression because `() 123` isn't legal C# today, and I don't think that should change.

The compiler can infer that expressions are code blocks in three ways: they're used as a value, they contain one or more statements and they contain at least one out statement, so array assignments aren't ambiguous. I don't think you can write existing code that this would change the meaning of without some mad hatter macro stuff. Lambdas would still work as they do.

The compiler can issue a warning/error for expression blocks that are not used as a value, and a compilation error for using a non-expression block as a value as it does today. I think this could work without sigils or completely different syntax within a block.

@JoaoVictorVP

JoaoVictorVP commented Aug 15, 2024

@jamesmh

Expanding on code blocks in this manner has been explored before. The problem is that code blocks are already legal syntax, making them into expressions would fundamentally change a lot of existing code in C#. For example, a block in a lambda body indicates that it is an Action, not a Func<T>, and for places that accept overloads that would silently change the meaning of that code.

The other problem is that return within a block is also legal, and breaks out of the entire method. There would need to be a completely different syntax within a block.

Is this really all of it? For example, you can infer that if we don't end the block expression with a return value (i.e. there is no final value without a semicolon at the end) then this block expression is of a void type, and it is just natural that a void block expression would fall back to being an Action. And it is worth asking whether all blocks need to be expressions: should the scope block of a method be an expression, and should something like this be allowed in C#?

int X()
{
    10
}

If not, then it would not be much of a problem to make expressions only of those blocks that are not tied to a scope that already has a definite meaning. Kotlin does something along these lines as well: its block expressions only apply when used with an if or when (switch-like) expression or similar, and the meaning of '{}' as a function is different from '{}' used as a scope delimiter (with a Rust-like approach this would be a problem, but not with something like what Kotlin does).

@AderitoSilva

AderitoSilva commented Sep 8, 2024

Yesterday, I added a proposal that is very close to this one. Shame on me for not having found that this one already existed, and even with an identical name. That discussion is fairly closed, due to it being a duplicate. However, I wrote some details on my take on expression blocks, and maybe those can add something useful to this ongoing discussion. For those interested, you can find it here. I must admit that I don't consider my proposal to be ideal, but maybe it can trigger some ideas.

I would also like to add that, independently of the syntax you choose to implement such functionality, I would really love to have statements in expressions, or at least a way to not break an expression chain (for example, for the fluent pattern). I know you've been considering ways to make that happen for quite a few years, and I have genuine hope that it could come soon (in the foreseeable future). So, I just want to let you know that you have my +1 for this.

@ltagliaro

ltagliaro commented Sep 24, 2024

The use of curly braces for expression blocks would break the current grammar in several cases. I propose the following syntax, which to my knowledge would extend the current grammar conservatively. Moreover, it seems somehow more natural to me.

Expression_block : '(' statement_list ';' expression ')';

Examples:

var x = (var temp = f(); temp * temp + temp);
var x = (Console.WriteLine("Logging..."); var temp = f(); temp * temp + temp);

Typing Rules

In this syntax, the type of the expression block is determined by the type of the last expression within the block.

Scoping Rules

An expression block introduces a scope. The scope of variables declared within the expression block must be limited to the block itself, similar to how variables in a traditional block {} are scoped.

Evaluation Rules

The final value of the expression block is the result of the last expression inside the parentheses, allowing the block to act like a traditional expression.

Consistency with C and C++ Semantics

The proposed use of the semicolon within an expression block in C# is reminiscent of the comma operator in C and C++, where the operands are evaluated in order and the value of the whole expression is the value of the last operand. Unlike the comma operator, the syntax proposed here also allows statements, such as declarations, before the final expression.

Consistency with Functional Languages

This proposal also aligns with the patterns found in functional languages, such as Haskell. In Haskell, the let ... in construct is used to declare variables in a local scope and then evaluate a final expression based on those bindings.

For example:

let a = 2
    b = 3
in a * b + a

With my proposal, this would become:

var x = (var a = 2; var b = 3; a * b + a);

Conclusion

The proposed syntax integrates well with conditional expressions and switch expressions, making it a natural extension of the language. Further improvements could include:

Return Statements in Expression Blocks:
Allowing return statements within expression blocks would enable a function to exit early. This return would not define the value of the expression block but would interrupt its evaluation and return from the enclosing function. This behavior would be similar to how throw expressions currently work in conditional, coalesce, and switch expressions.

Try-Catch Expression:
Introducing a try-catch expression, which would resemble the structure of a switch expression, would allow us to handle exceptions within an expression context. This addition would enable nearly any piece of code to be written as an expression, resulting in more concise and expressive code in certain situations.
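
For comparison, here is a rough sketch of how the first example above can be approximated in current C# with an immediately invoked lambda. The class name BlockWorkaroundDemo and the helper F (a stand-in for the f() used in the examples) are hypothetical, and this is only an approximation: it allocates a delegate and a return inside it leaves the lambda, not the enclosing method.

```csharp
using System;

static class BlockWorkaroundDemo
{
    // Hypothetical stand-in for the f() used in the examples above.
    static int F() => 3;

    public static int Compute()
    {
        // Immediately invoked lambda: temp is scoped to the lambda body,
        // and the returned value plays the role of the trailing expression.
        var x = ((Func<int>)(() =>
        {
            var temp = F();
            return temp * temp + temp;
        }))();
        return x;
    }

    public static void Main() => Console.WriteLine(Compute()); // prints 12
}
```

The extra cast, parentheses, and return keyword are the noise that an expression block syntax would remove.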

@CyrusNajmabadi
Member

The use of curly braces for expression blocks would break the current grammar in several cases

What cases are these?

@ltagliaro

ltagliaro commented Sep 24, 2024

Maybe it will not break the grammar, but sometimes it would just look odd.

int i = 42;
int[] a = { i };

Why should '{ i }' not be parsed as an expression block with an empty statement list?
OK, because in the rule '{' statement+ expression '}' there is a +, not a *.
But it looks quite odd to me!

    internal static class Class1
    {
        private static object a() => throw new NotImplementedException();
        private static Func<int> b => throw new NotImplementedException();
        internal static void test()
        {
            var x = () => { a(); b(); }; // this is a valid expression. x is Action.
            var y = () => { a(); b() }; // this is currently invalid due to the missing semicolon. were it valid, y would be Func<int>.
        }
    }

A semicolon would make a big difference in the inferred type. This, too, would look quite odd to me.
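
To illustrate why the inferred delegate type matters, here is a small sketch in current C# of how overload resolution already distinguishes the two lambda shapes. The class OverloadDemo and its Run overloads are hypothetical names used only for illustration.

```csharp
using System;

static class OverloadDemo
{
    // Hypothetical overloads; each reports which one was chosen.
    public static string Run(Action action) { action(); return "Action"; }
    public static string Run(Func<int> func) { func(); return "Func<int>"; }

    static int G() => 42;

    public static void Main()
    {
        // Block body with no return value: only the Action overload applies.
        Console.WriteLine(Run(() => { G(); })); // prints Action

        // Expression body returning int: both overloads apply, and the
        // compiler prefers the delegate type that has a return value.
        Console.WriteLine(Run(() => G()));      // prints Func<int>
    }
}
```

If `() => { F(); G() }` became a valid expression body, a call site like this would presumably bind to the Func<int> overload, so adding or removing one semicolon would switch overloads.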

@333fred
Member

333fred commented Sep 24, 2024

@ltagliaro I don't see how that would break the grammar. It would simply be an error.

@vladd

vladd commented Sep 24, 2024

The use of curly braces for expression blocks would break the current grammar in several cases

What cases are these?

This one looks suspicious:

$$$"""{{{{C c = new(); c}}}}"""

@ikegami

ikegami commented Sep 24, 2024 via email

@ltagliaro

On Tue, Sep 24, 2024 at 12:14 PM ltagliaro wrote:

The use of curly braces for expression blocks would break the current grammar in several cases. I propose the following syntax, which to my knowledge would extend conservatively the current grammar. Moreover, it seems to me somehow more natural.

Expression_block : '(' statement_list ';' expression ')';

Examples:

var x = (var temp = f(); temp * temp + temp);
var x = (Console.WriteLine("Logging..."); var temp = f(); temp * temp + temp);

That's way too much lookahead. It's too much lookahead for a whole class of parsers, LR parsers, the one usually used for programming language parsing. It's too much lookahead for a human. And too unobvious. It should be easy to tell whether something is a code block or not. Having to spot a semi-colon buried in code that could be many lines long is a horrible idea.

Isn't it the same for the original proposal?

block_expression
    : '{' statement+ expression '}'
    ;

To tell an expression block apart from a regular one, it is necessary to look at the last chunk: if there is a semicolon, it's a regular block; if there is no semicolon, it's a block_expression.

@333fred
Member

333fred commented Sep 24, 2024

The use of curly braces for expression blocks would break the current grammar in several cases

What cases are these?

This one looks suspicious:

$$$"""{{{{C c = new(); c}}}}"""

There are multiple things that are already disallowed directly inside an interpolation hole: ternaries, for example, must be wrapped in parentheses. This would be no different. Errors are not the same as ambiguities; what your comment suggests to me and Cyrus is that you found some ambiguity, where there could be multiple different valid parses of an expression, or some expression that, while it can technically be parsed, would be extremely difficult to parse.

@vladd

vladd commented Sep 24, 2024

@333fred
As of now, $$$"""{{{1}}}""" produces "1", $$$"""{{{{1}}}}""" produces "{1}", and I expected $$$"""{{{{C c = new(); c}}}}""" to produce the same result as $$$"""{{{new C()}}}""" (akin to the first example), because {C c = new(); c} is the same as new C().
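
The current behavior described here can be checked with a short sketch like the following (requires C# 11 or later for raw string literals). The class and method names are hypothetical.

```csharp
using System;

static class RawStringDemo
{
    // With a $$$ prefix, exactly three braces delimit an interpolation hole.
    public static string ThreeBraces() => $$$"""{{{1}}}""";

    // With four braces, the outermost brace on each side is literal text
    // and the inner three delimit the hole.
    public static string FourBraces() => $$$"""{{{{1}}}}""";

    public static void Main()
    {
        Console.WriteLine(ThreeBraces()); // prints 1
        Console.WriteLine(FourBraces());  // prints {1}
    }
}
```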

@vladd

vladd commented Sep 24, 2024

@333fred
Ok, here is another attempt:

using System.Collections.Generic;

var c = new C { X = { null } };

class C
{
    public List<object?> X = new();
}

Is it a direct assignment of the expression {null} to X, or a collection property initializer?

@333fred
Member

333fred commented Sep 24, 2024

That's not necessarily an ambiguity; the existing meaning continues to be what it means. However, I do agree that is potentially confusing for a human reader.
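
As a sketch of that existing meaning: in current C#, `X = { null }` inside an object initializer is a collection initializer that calls Add on the list already stored in X. The wrapper class InitializerDemo and its method name are hypothetical; C is the class from the example above.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

class C
{
    public List<object?> X = new();
}

static class InitializerDemo
{
    public static int CountAfterInit()
    {
        // Collection initializer: equivalent to calling c.X.Add(null)
        // on the existing list, not to assigning a new value to X.
        var c = new C { X = { null } };
        return c.X.Count;
    }

    public static void Main() => Console.WriteLine(CountAfterInit()); // prints 1
}
```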

@alrz
Contributor

alrz commented Sep 26, 2024

Is it a direct assignment of the expression {null} to X

It's sort of the same ambiguity with indexer initializers:

using System.Collections.Generic;
public class C {
    public void M() {
        _ = new List<int[]> { [1,2,3] };    // error; expected `[..] = expr`
        _ = new List<int[]> { ([1,2,3]) };  // ok
    }
}

Block-expressions could be disambiguated the same way, using new C { X = ({ null }) }.

@BN-KS

BN-KS commented Oct 1, 2024

I created this which is closely related to this topic: #8471

But I think the syntax is cleaner and more universal, since it allows the usage of object.{/expressions/} universally.
