Proposed changes for Pattern Matching in C# 9.0 - Draft Specification #2850
Comments
GitHub should let me add 🎉 and ❤️ more than once. I'm especially excited by the prospect of variable patterns being declared in multiple mutually exclusive patterns. F# allows it and it's a powerful feature. |
Would we support these patterns for user-defined types that are implicitly convertible to any of those types? |
A few points:
And there's already some code that could benefit from mutually exclusive pattern variables: avoiding code duplication. |
I would like to voice my disagreement on this. |
if ((e1, e2) is (0, int i) or (int i, 0))
{
    M(i);
}
If we would allow this case. Would it be possible for Such as
switch ((e1, e2))
{
    case (0, int i):
    case (int i, 0):
        M(i);
        break;
}
or maybe #2703 |
This is unexpected. I expect that we shouldn't need to care about the type of the object we put into the comparison operator:
byte b = 99;
if (b >= 0 && b <= 100D)
{
    // should be here
}
if (b is >= 0 and <= 100D)
{
    // does not come here??
} |
Is this a typo, or a sneak peek at a new keyword? 😄
IMO, no. |
Must be a typo. The last code snippet is written using a |
Whilst likely beyond the scope of C# 9, could I please ask that the ideas in the user-defined positional/active patterns threads (#1047 and #277) and support of |
if (o is int x and >= 0 and < 10) ...
// or
if (o is int x >= 0 and < 10) ...
I think the intent of this expression is unclear. Will |
True. I skimmed over that without thinking about it too much.
if (o is int x && x >= 0 and < 10)
Clearer, but puts |
Possibly playing devil's advocate here, but does it matter? |
Given:
The answer appears to be yes: if the datatype needs to be the same between the thing being matched and the type of the pattern, then there is a difference between |
In my understanding, above is evaluated against This
is evaluated against I think |
Not if we view < 10 as implicitly defining an int check. Then I believe there's no difference and it should be an implementation detail to allow performance optimizations. |
@YairHalberstadt What do you think about this?
float x = 9.9f;
if (x is > 0 and < 10)
{
    // is not matched ?
} |
I do think that:
float x = 9.9f;
if (x is > 0 and < 10)
{
}
Needs to have exactly the same behaviour (both in terms of any compiler warnings/errors, and runtime behaviour) as:
float x = 9.9f;
if (x > 0 && x < 10)
{
} |
Note: there is precedence (no pun intended) for the language shipping features that don't behave the same as one would intuit. For example, without using operator-overloading, it's possible to have the following Yes, that's surprising. But it's also NBD in practice. Users rarely (if ever) hit the corners that make this stuff appear. It tends to worry people who think about the whole language (including myself) and how it all fits together. But it's often overblown as an actual problem for the language and the near total majority of users of it. |
Please explain. |
Did everyone miss this sentence in the draft spec?
|
Sure!
int? a = null;
int? b = null;
Console.WriteLine(a >= b);
Console.WriteLine(a > b || a == b);
As defined in the language itself, This is a core, built-in, inconsistency with the language and how people might generally intuit things to behave. |
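For reference, a minimal sketch of the inconsistency the snippet above points at. The class and method names are hypothetical and only wrap the two expressions so they can be called side by side. The lifted relational operators yield false when either operand is null, while lifted `==` treats two nulls as equal:

```csharp
using System;

public static class LiftedComparisonDemo
{
    // Hypothetical helper: the single lifted operator.
    // Lifted >= is false whenever either operand is null.
    public static bool GreaterOrEqual(int? a, int? b) => a >= b;

    // Hypothetical helper: the "obvious" expansion of >=.
    // Lifted > is false for nulls, but lifted == is true for two nulls.
    public static bool GreaterOrEquals(int? a, int? b) => a > b || a == b;
}
```

With both arguments null, `GreaterOrEqual(null, null)` evaluates to false and `GreaterOrEquals(null, null)` evaluates to true, even though one might expect the two forms to agree.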
When and why would I prefer |
By itself you probably wouldn't prefer to use the pattern, but combined with recursive patterns it can be quite powerful: if (person is Student { Gpa: >= 3.0 } student) { ... } |
@gafter Admittedly, this sentence was too hard to understand, and it was hard to picture all the scenarios related to it (at least for me). Only after seeing the example did we realize that it breaks our common-sense expectations. |
Added
To the list of new pattern forms in scope for C# 9.0. |
Fixes #40533 Relates to dotnet/csharplang#2850 Relates to #40727 (test plan for C# 8 patterns)
Will patterns be a 'type' so you can reuse them in different switch statements etc.? Like this (contrived example):
pattern IsLetter(char c) => c is >= 'a' and <= 'z' or >= 'A' and <= 'Z';
pattern IsPersonId(int i) => i is >= 1 and <= 100;
var result = obj switch
{
    IsLetter x and x == 'x' => "X marks the spot",
    IsLetter => "Not an X",
    IsPersonId => "A person id",
    _ => "Argle blargle glob glif"
}; |
That would have been a lifesaver, I think this is the only workaround for now:
case (0, int x):
case (int x0, 0) when (x = x0) is var _:
    Console.WriteLine(x);
Ugly but codegen is as efficient as built-in support. |
This doesn't work. In a pattern such as |
Changed
to
|
Moving to a checked-in document via #3361 |
This looks great.
if (foo.Bar is > 10 and < 10)
they will see this just as nicer syntax for making a conditional check without repeating the variable name and will try to write this:
var min = 10;
var max = 100;
if (foo.Bar is > min and < max)
And they will get an error. I see it was briefly discussed (in @HaloFour's response to a removed user) but I didn't see it mentioned anywhere else. |
There is nothing planned for that. Sorry! :( |
@CyrusNajmabadi Can I ask if there is some fundamental "problem" with it? My only substantial experience with pattern matching is in F#, and there it is not possible (you need to use My guess the reason is that pattern matching in these languages is used mainly with dedicated match/switch/case syntax, and there you can use when/is guards to compare against a variable. You can also do it in C#:
var foo = 11m;
var bar = 22m;
switch (foo)
{
    case var _ when foo == bar:
        Console.WriteLine("foo is bar");
        break;
}
var r = foo switch
{
    _ when foo == bar => "foo is bar",
    _ => ""
};
Console.WriteLine(r);
So there is no real benefit to an explicit "match value of variable" pattern there. But in C# you can use patterns in C# luckily requires |
I know of no fundamental problem with it. We've simply never designed it to do the above. One benefit of the current approach is that it's strictly declarative and isolated from things like order of operations or side-effects. At the point that arbitrary values are allowed, it would be necessary to design what that means. |
It doesn't do the same thing that pattern-matching was designed to do. Specifically, it was designed to work with constants specifically so that the compiler can analyze, diagnose, and optimize the totality of the set of pattern-matching operations. If the compiler doesn't know the value it is matching against, it cannot do any of that. That's not to say it is necessarily a bad idea, but it is a pretty different animal. Do ordinary expressions not satisfy that need? |
thanks for answers guys
I don't see having a "match against value of variable" pattern as a must-have killer feature. Its use will probably be limited to just small patterns in I just think there will be many people who will see using |
Closing this issue as it has been moved to a checked-in document via #3361 |
Do we have a tracking issue for that? Possibly a candidate for 10.0 working set? @gafter |
We are considering a small handful of enhancements to pattern-matching for C# 9.0 that have natural synergy and work well to address a number of common programming problems:
- `and` patterns that require both of two different patterns to match;
- `or` patterns that require either of two different patterns to match; and
- `not` patterns that require a given pattern not to match.
Parenthesized Patterns
Parenthesized patterns permit the programmer to put parentheses around any pattern. This is not so useful with the existing patterns in C# 8.0, however the new pattern combinators introduce a precedence that the programmer may want to override.
Relational Patterns
Relational patterns permit the programmer to express that an input value must satisfy a relational constraint when compared to a constant value:
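A minimal sketch of what such a constraint could look like in a switch expression. The `Describe` method, its thresholds, and the labels are hypothetical and chosen only for illustration; the syntax assumes a compiler that implements the relational patterns proposed here.

```csharp
using System;

public static class RelationalPatternDemo
{
    // Hypothetical example: classify a temperature given in degrees Celsius.
    // Arms are tested in order, so each arm only sees values that the
    // earlier arms did not match.
    public static string Describe(int celsius) => celsius switch
    {
        < 0 => "freezing",
        < 15 => "cold",
        < 25 => "mild",
        >= 25 => "hot",
    };
}
```

Each arm compares the input against a constant, which is what lets the compiler reason about overlap and coverage across the arms.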
We imagine supporting `<`, `<=`, `>`, and `>=` patterns on all of the built-in types that support such binary relational operators with two operands of the same type in an expression. Specifically, we support all of these relational patterns for `sbyte`, `byte`, `short`, `ushort`, `int`, `uint`, `long`, `ulong`, `char`, `float`, `double`, and `decimal`.
The expression is required to evaluate to a constant value. It is an error if that constant value is `double.NaN` or `float.NaN`. It is an error if the expression is a null constant and the relational operator is `<`, `<=`, `>`, or `>=`.
When the input is a type for which a suitable built-in binary relational operator is defined that is applicable with the input as its left operand and the given constant as its right operand, the evaluation of that operator is taken as the meaning of the relational pattern. Otherwise we convert the input to the type of the expression using an explicit nullable or unboxing conversion. It is a compile-time error if no such conversion exists. The pattern is considered not to match if the conversion fails. If the conversion succeeds then the result of the pattern-matching operation is the result of evaluating the expression `e OP v` where `e` is the converted input, `OP` is the relational operator, and `v` is the constant expression.
Open Issue: Should the expression be shift_expression to have precedence corresponding to a relational expression?
Pattern Combinators
Pattern combinators permit matching both of two different patterns using `and` (this can be extended to any number of patterns by the repeated use of `and`), either of two different patterns using `or` (ditto), or the negation of a pattern using `not`.
I expect the most common use of a combinator will be the idiom `e is not null`. More readable than the current idiom `e is object`, this pattern clearly expresses that one is checking for a non-null value.
The `and` and `or` combinators will be useful for testing ranges of values. We expect that `and` will have a higher parsing priority (i.e. will bind more closely) than `or`. The programmer can use the parenthesized pattern to make the precedence explicit.
Like all patterns, these combinators can be used in any context in which a pattern is expected, including nested patterns, the is-pattern-expression, the switch-expression, and the pattern of a switch statement's case label.
Open Issues with Proposed Changes
Syntax for relational operators
Are `and`, `or`, and `not` some kind of contextual keyword? If so, is there a breaking change (e.g. compared to their use as a designator in a declaration-pattern)?
Should we support some combination of declaration pattern along with a relational pattern, or will the `and` combinator be sufficient?
Semantics (e.g. type) for relational operators
We expect to support all of the primitive types that can be compared in an expression using a relational operator. The meaning in simple cases is clear.
But when the input is not such a primitive type, what type do we attempt to convert it to?
We have proposed that when the input type is already a comparable primitive, that is the type of the comparison. However, when the input is not a comparable primitive, we treat the relational as including an implicit type test to the type of the constant on the right-hand-side of the relational. If the programmer intends to support more than one input type, that must be done explicitly:
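A sketch of what handling more than one input type explicitly could look like. The method name `IsValidPercentage` is an assumption made for illustration; the point is that each relational pattern carries an implicit type test for the type of its constant, so an input of any other type matches none of the alternatives.

```csharp
using System;

public static class PercentageDemo
{
    // Hypothetical helper: accepts int, float, and double inputs only.
    // Each group of relational patterns implies a type test for the
    // type of its constants (int, float, double respectively).
    public static bool IsValidPercentage(object x) => x is
        >= 0 and <= 100 or      // int tests
        >= 0F and <= 100F or    // float tests
        >= 0D and <= 100D;      // double tests
}
```

For example, a boxed `int` 50 and a boxed `double` 50.5 would match, while a boxed `long` 50 would not, because no alternative tests for `long`.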
Flowing type information from the left to the right of `and`
It has been suggested that when you write an `and` combinator, type information learned on the left about the top-level type could flow to the right. In that case, the input type to the second pattern is narrowed by the type narrowing requirements of the left of the `and`. We would define type narrowing semantics for all patterns as follows. The narrowed type of a pattern `P` is defined as follows:
- If `P` is a type pattern, the narrowed type is the type of the type pattern's type.
- If `P` is a declaration pattern, the narrowed type is the type of the declaration pattern's type.
- If `P` is a recursive pattern that gives an explicit type, the narrowed type is that type.
- If `P` is a constant pattern where the constant is not the null constant and where the expression has no constant expression conversion to the input type, the narrowed type is the type of the constant.
- If `P` is a relational pattern where the constant expression has no constant expression conversion to the input type, the narrowed type is the type of the constant.
- If `P` is an `or` pattern, the narrowed type is the common type of the narrowed type of the subpatterns if such a common type exists. For this purpose, the common type algorithm considers only identity and implicit reference conversions, and it considers all subpatterns of a sequence of `or` patterns (ignoring parenthesized patterns).
- If `P` is an `and` pattern, the narrowed type is the narrowed type of the right pattern. Moreover, the narrowed type of the left pattern is the input type of the right pattern.
- Otherwise the narrowed type of `P` is `P`'s input type.
Variable definitions and definite assignment
The addition of `or` and `not` patterns creates some interesting new problems around pattern variables and definite assignment. Since variables can normally be declared at most once, it would seem any pattern variable declared on one side of an `or` pattern would not be definitely assigned when the pattern matches. Similarly, a variable declared inside a `not` pattern would not be expected to be definitely assigned when the pattern matches. The simplest way to address this is to forbid declaring pattern variables in these contexts. However, this may be too restrictive. There are other approaches to consider.
One scenario that is worth considering is a pattern variable declared under `not` in an is-pattern-expression and then used where that expression is false.
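A sketch of that scenario, with a hypothetical method name. It compiles only with a compiler that treats the variable as definitely assigned when the is-pattern-expression is false; as far as I know, the C# 9.0 compiler as eventually shipped does so, but treat that as an assumption rather than something this draft guarantees.

```csharp
using System;

public static class NegatedPatternDemo
{
    // Hypothetical helper: returns the string's length, or -1 for non-strings.
    public static int LengthOrMinusOne(object e)
    {
        // Early return when e is NOT a string.
        if (e is not string s) return -1;

        // Reaching this point means the negated pattern failed,
        // so s must have been assigned.
        return s.Length;
    }
}
```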
This does not work today because, for an is-pattern-expression, the pattern variables are considered definitely assigned only where the is-pattern-expression is true ("definitely assigned when true").
Supporting this would be simpler (from the programmer's perspective) than also adding support for a negated-condition `if` statement. Even if we add such support, programmers would wonder why the above snippet does not work. On the other hand, the same scenario in a `switch` makes less sense, as there is no corresponding point in the program where definitely assigned when false would be meaningful. Would we permit this in an is-pattern-expression but not in other contexts where patterns are permitted? That seems irregular.
Related to this is the problem of definite assignment in a disjunctive-pattern.
We would only expect `i` to be definitely assigned when the input is not zero. But since we don't know whether the input is zero or not inside the block, `i` is not definitely assigned. However, what if we permit `i` to be declared in different mutually exclusive patterns?
Here, the variable `i` is definitely assigned inside the block, and takes its value from the other element of the tuple when a zero element is found.
It has also been suggested to permit variables to be (multiply) defined in every case of a case block:
To make any of this work, we would have to carefully define where such multiple definitions are permitted and under what conditions such a variable is considered definitely assigned.
Should we elect to defer such work until later (which I advise), we could say in C# 9 that beneath a `not` or `or`, pattern variables may not be declared.
Then, we would have time to develop some experience that would provide insight into the possible value of relaxing that later.
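If that restriction is adopted, the tuple scenario can still be written with one variable per case, at the cost of repeating the body. A minimal sketch, with hypothetical names:

```csharp
using System;

public static class TupleZeroDemo
{
    // Hypothetical helper: when one element of the pair is zero,
    // return the other element; otherwise return null.
    public static int? OtherElementIfOneIsZero(int e1, int e2)
    {
        switch ((e1, e2))
        {
            case (0, int i):
                return i;   // first element is zero
            case (int j, 0):
                return j;   // second element is zero
            default:
                return null;
        }
    }
}
```

Each pattern variable is declared exactly once and is definitely assigned in its own section, so no new definite-assignment rules are needed; the duplicated body is what mutually exclusive declarations would remove.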
Diagnostics, subsumption, and exhaustiveness
These new pattern forms introduce many new opportunities for diagnosable programmer error. We will need to decide what kinds of errors we will diagnose, and how to do so. Here are some examples:
This case can never match (because the input cannot be both an `int` and a `double`). We already have an error when we detect a case that can never match, but its wording ("The switch case has already been handled by a previous case" and "The pattern has already been handled by a previous arm of the switch expression") may be misleading in new scenarios. We may have to modify the wording to just say that the pattern will never match the input.
Similarly, this would be an error because a value cannot be both `1` and `2`.
This case is possible to match, but the `or 1` at the end adds no meaning to the pattern. I suggest we should aim to produce an error whenever some conjunct or disjunct of a compound pattern does not either define a pattern variable or affect the set of matched values.
Here, `0 or 1 or` adds nothing to the second case, as those values would have been handled by the first case. This too deserves an error.
A switch expression such as this should be considered exhaustive (it handles all possible input values).
In C# 8.0, a switch expression with an input of type `byte` is only considered exhaustive if it contains a final arm whose pattern matches everything (a discard-pattern or var-pattern). Even a switch expression that has an arm for every distinct `byte` value is not considered exhaustive in C# 8. In order to properly handle exhaustiveness of relational patterns, we will have to handle this case too. This will technically be a breaking change, but one no user is likely to notice.
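A sketch of the kind of switch expression the exhaustiveness discussion has in mind. The `Bucket` method and its thresholds are hypothetical; the four arms together cover every `byte` value without a discard arm, so under the rules proposed here the compiler would be expected to treat the expression as exhaustive.

```csharp
using System;

public static class ExhaustiveByteDemo
{
    // Hypothetical helper: no discard (_) or var arm, yet every
    // byte value from 0 through 255 is matched by exactly one arm.
    public static int Bucket(byte b) => b switch
    {
        < 100 => 0,
        100 => 1,
        101 => 2,
        > 101 => 3,
    };
}
```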