`is` operator for pattern-matching and binding #3573
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,299 @@ | ||
- Feature Name: `is` | ||
- Start Date: 2024-02-16 | ||
- RFC PR: [rust-lang/rfcs#3573](https://github.com/rust-lang/rfcs/pull/3573) | ||
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) | ||
|
||
# Summary | ||
|
||
Introduce an `is` operator in Rust 2024, to test if an expression matches a | ||
pattern and bind the variables in the pattern. | ||
|
||
# Motivation | ||
|
||
This RFC introduces an `is` operator that tests if an expression matches a | ||
pattern, and if so, binds the variables bound by the pattern and evaluates to | ||
true. This operator can be used as part of any boolean expression, and combined | ||
with boolean operators. | ||
|
||
Previous discussions around `let`-chains have treated the `is` operator as an | ||
alternative on the basis that they serve similar functions, rather than | ||
proposing that they can and should coexist. This RFC proposes that we allow | ||
`let`-chaining *and* add the `is` operator. | ||
|
||
`if`-`let` provides a natural extension of `let` for fallible bindings, and | ||
highlights the binding by putting it on the left, like a `let` statement. | ||
Allowing developers to chain multiple `let` operations and other expressions in | ||
the `if` condition provides a natural extension that simplifies what would | ||
otherwise require complex nested conditionals. As the `let`-chains RFC notes, | ||
this is a feature people already expect to work. | ||
|
||
The `is` operator similarly allows developers to chain multiple match-and-bind | ||
operations and simplify what would otherwise require complex nested | ||
conditionals. However, the `is` operator allows writing and reading a pattern | ||
match from left-to-right, which reads more naturally in many circumstances. For | ||
instance, consider an expression like `x is Some(y) && y > 5`; that boolean | ||
expression reads more naturally from left-to-right than | ||
`let Some(y) = x && y > 5`. | ||
|
||
This is even more true at the end of a longer expression chain, such as | ||
`x.method()?.another_method().await? is Some(y)`. Rust method chaining and `?` | ||
and `.await` all encourage writing code that reads in operation order from left | ||
to right, and `is` fits naturally at the end of such a sequence. | ||
|
||
Having an `is` operator would also help to reduce the proliferation of methods | ||
on types such as `Option` and `Result`, by allowing prospective users of those | ||
methods to write a condition using `is` instead. While any such condition could | ||
equivalently be expressed using `let`-chains, the binding would then move | ||
further away from the condition expression referencing the binding, which would | ||
result in a less natural reading order for the expression. | ||
|
||
Consider the following examples: | ||
|
||
```rust | ||
if expr_producing_option().is_some_and(|v| condition(v)) | ||
|
||
if let Some(v) = expr_producing_option() && condition(v) | ||
|
||
if expr_producing_option() is Some(v) && condition(v) | ||
``` | ||
|
||
The condition using `is` is a natural translation from the `is_some_and` | ||
method, whereas the if-let construction requires reversing the binding of `v` | ||
and the expression producing the option. This seems sufficiently cumbersome in | ||
some cases that the absence of `is` would motivate continued use and | ||
development of helper methods. | ||
|
||
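For comparison, the helper-method form and a nested equivalent of the `if let` form already compile on stable Rust (`Option::is_some_and` is available since Rust 1.70). The following is only a sketch: `condition`, `via_method`, and `via_if_let` are hypothetical names chosen for illustration, and the proposed `is` form appears only in a comment because it is not valid Rust today.

```rust
// Hypothetical predicate standing in for `condition(v)` in the text above.
fn condition(v: u32) -> bool {
    v > 5
}

// Helper-method form: the closure receives the payload of `Some`.
fn via_method(opt: Option<u32>) -> bool {
    opt.is_some_and(|v| condition(v))
}

// `if let` form without let-chains: the extra condition needs a nested `if`.
fn via_if_let(opt: Option<u32>) -> bool {
    if let Some(v) = opt {
        if condition(v) {
            return true;
        }
    }
    false
}

// Proposed form (not valid Rust today):
//     opt is Some(v) && condition(v)
```

Both functions return the same result for every input; the difference is only in where the binding `v` sits relative to the expression that produces it.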
# Guide-level explanation | ||
|
||
Rust provides an `is` operator, which can be used in any expression: | ||
`EXPR is PATTERN` | ||
|
||
This operator tests if the value of `EXPR` matches the specified `PATTERN`; see | ||
<https://doc.rust-lang.org/reference/patterns.html> for details on patterns. | ||
|
||
If the `EXPR` matches the `PATTERN`, the `is` expression evaluates to `true`, | ||
and additionally binds any bindings specified in `PATTERN` in the current scope | ||
for code subsequently executed along the path where the `is` expression | ||
is known to be `true`. | ||
In all examples, it's obvious and I'm sure the compiler can figure it out, but I can trivially write code examples where it's non-obvious whether the value is true:

    let is_true = x is Some(y);
    let is_true = identity(is_true);
    if is_true { y; }

What happens here? What about just having a local variable? The compiler needs to draw a line somewhere (as Rust is Turing-complete), and this doesn't say anything about what that is. I think no matter where it lies, it can be pretty confusing to users.

I get the impression from the RFC text that probably it's not intended for the binding

Yeah, I gathered that too and didn't realise that it wasn't stated explicitly. To put it more clearly,

Agreed; like the That said, so long as it can a MIR desugaring, we can then depend on the existing MIR checking to ensure that things are only used once initialized. (Which, yes, will not allow things like nils's example, same as how

While I do not have a solution it does feel like the binding should last longer than just the expression it is within. Else this Rust equivalent of C# would not be possible:

    if(foo is {Values: [var value1, var value2]}) {
        return (value1, value2);
    }
|
||
|
||
For example: | ||
|
||
```rust | ||
if an_option is Some(x) && x > 3 { | ||
println!("{x}"); | ||
} | ||
``` | ||
|
||
The bindings in the pattern are not bound along any code path potentially | ||
reachable where the expression did not match: | ||
|
||
```rust | ||
if (an_option is Some(x) && x > 3) || (more_conditions /* x is not bound here*/) { | ||
// x is not bound here | ||
} else { | ||
// x is not bound here | ||
} | ||
// x is not bound here | ||
``` | ||
Comment on lines +87 to +97

    /*1*/ let x = 9999;
    /*2*/ let y = Some(4);
    /*3*/ if y is Some(x) && x > 0 {
    /*4*/     println!("x1 = {x}"); // x1 = 4
    /*5*/ }
    /*6*/ if y is Some(x) && x > 0 || cheat_code_enabled() {
    /*7*/     println!("x2 = {x}"); // x2 = 9999 ⁉️
    /*8*/ }

Just by adding that For I think that, even if

@kennytm Extremely valid; I'll mention that case and add open questions about appropriate lints.
|
||
|
||
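The shadowing hazard raised in the review thread above has an analogue on stable Rust: a binding introduced inside `matches!` never escapes the macro, so a body that mentions `x` silently refers to an outer variable of the same name. A minimal sketch, assuming a hypothetical function `report` and passing `cheat_code_enabled` as a plain parameter rather than a function call:

```rust
fn report(y: Option<i32>, cheat_code_enabled: bool) -> i32 {
    let x = 9999;
    // The `x` bound by the pattern is visible only inside the guard.
    if matches!(y, Some(x) if x > 0) || cheat_code_enabled {
        // This `x` is the outer variable, whichever side of `||` succeeded.
        return x;
    }
    -1
}
```

Here the result is the outer `x` even when `y` is `Some(4)`, which is the kind of surprise a lint for `is` would aim to flag.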
The pattern may use alternation (within parentheses), but must have the same | ||
bindings in every alternative: | ||
|
||
```rust | ||
if color is (RGB(r, g, b) | RGBA(r, g, b, _)) && r == b && g < 10 { | ||
println!("condition met") | ||
} | ||
|
||
// ERROR: `a` is not bound in all alternatives of the pattern | ||
if color is (RGB(r, g, b) | RGBA(r, g, b, a)) && r == b && g < 10 { | ||
println!("condition met") | ||
} | ||
``` | ||
|
||
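The same-bindings rule for alternation already applies to `match` arms and `matches!` today, so the accepted example can be approximated on stable Rust. This is a sketch under assumptions: the `Color` enum, its variant names, and `condition_met` are hypothetical and not part of the RFC.

```rust
// Hypothetical enum for illustration only.
#[derive(Clone, Copy)]
enum Color {
    Rgb(u8, u8, u8),
    Rgba(u8, u8, u8, u8),
}

fn condition_met(color: Color) -> bool {
    // Every alternative binds the same names (`r`, `g`, `b`),
    // so the guard can use them regardless of which one matched.
    matches!(
        color,
        Color::Rgb(r, g, b) | Color::Rgba(r, g, b, _) if r == b && g < 10
    )
}
```

Binding the alpha channel in only one alternative would be rejected by the compiler here too, for the same reason as in the second `is` example.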
`is` may appear anywhere a boolean expression is accepted: | ||
|
||
```rust | ||
func(x is Some(y) && y > 3); | ||
``` | ||
|
||
The `is` operator may not appear as a statement, because the bindings won't be | ||
usable after the end of the statement, and because the boolean return value is | ||
ignored. The compiler will issue a deny-by-default lint in that case, and | ||
suggest using `let` if the user wants to bind a pattern for the rest of the | ||
scope. | ||
|
||
```rust | ||
// deny-by-default lint: the binding `x` isn't usable after the statement, and the value of `is` is ignored. | ||
// Suggestion: use `let` to introduce a binding in this scope: `let x = an_expression();`. | ||
an_expression() is x; | ||
|
||
``` | ||
|
||
# Reference-level explanation | ||
We should cover the drop order in this RFC. My suggestion would be that
|
||
|
||
Add a new [operator | ||
expression](https://doc.rust-lang.org/reference/expressions/operator-expr.html), | ||
`IsExpression`: | ||
|
||
> **<sup>Syntax</sup>**\ | ||
> _IsExpression_ :\ | ||
> _Expression_ `is` _PatternNoTopAlt_ | ||
|
||
|
||
Add `is` to the [operator | ||
precedence](https://doc.rust-lang.org/reference/expressions.html#expression-precedence) | ||
table, at the same precedence level as `==`, and likewise non-associative | ||
(requiring parentheses). | ||
I think that it should recommend parentheses, but have a higher precedence than To me,

There's no valid expression with If One way to deal with expressions like this is a lint removing

This is also discussed in the Rationale and alternatives section here:
|
||
|
||
Detect `is` appearing as a top-level statement and produce an error (the deny-by-default lint described in the guide-level explanation), with a | ||
rustfix suggestion to use `let` instead. | ||
Comment on lines +131 to +147

This Reference-Level Explanation is too short to explain how I would like to see how that and additionally binds any bindings specified in

    match x {
        Some(y) if y is Some(z) => (z is Some(w)).then(|| w + 1),
        _ => ..,
    }

@kennytm Valid. So, you're suggesting a systematic look at every existing statement type and which parts of the statement the binding is valid in? That seems entirely reasonable.

Yes. Expressions too (well basically all statements except
|
||
|
||
# Drawbacks | ||
|
||
Introducing both the `is` operator and `let`-chains would provide two different | ||
ways to do a pattern match as part of a condition. Having more than one way to | ||
do something could lead people to wonder if there's a difference; we would need | ||
to clearly communicate that they serve similar purposes. | ||
|
||
An `is` operator will produce a name conflict with [the `is` method on | ||
`dyn Any`](https://doc.rust-lang.org/std/any/trait.Any.html#method.is) in the | ||
standard library, and with the (relatively few) methods named `is` in the | ||
ecosystem. This will not break any existing Rust code, as the operator will | ||
only exist in the Rust 2024 edition and newer. The Rust standard library and | ||
any other library that wants to avoid requiring the use of `r#is` in Rust 2024 | ||
and newer could provide aliases of these methods under a new name; for | ||
instance, the standard library could additionally provide `Any::is` under a new | ||
name `is_type`. | ||
|
||
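To make the name conflict concrete: `is` is currently an ordinary method name on `dyn Any`, and the raw-identifier spelling `r#is` is already accepted for the same call. A small sketch with hypothetical wrapper functions `is_i32` and `is_i32_raw`:

```rust
use std::any::Any;

// Today `is` is an ordinary method name on `dyn Any`.
fn is_i32(value: &dyn Any) -> bool {
    value.is::<i32>()
}

// Raw-identifier spelling of the same call. It is accepted today, and it is
// the spelling call sites would need in an edition where `is` is a keyword.
fn is_i32_raw(value: &dyn Any) -> bool {
    value.r#is::<i32>()
}
```

An alias such as the `is_type` name floated above would let callers avoid the `r#` prefix; that name is only a suggestion in this RFC, not an existing API.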
|
||
# Rationale and alternatives | ||
Something that I think is worth exploring here, even though I agree it's worse than the proposal, is the idea of just promoting let patterns to expressions. For example, allowing

I don't think it's even good for the language to promote patterns such as

    f(x is Some(y) && y > 5)

That promotes very obscure code which is really hard for people new or even intermediate to the language to even understand what is going on. I'd much rather see patterns such as

    if x is Some(y) && y > 5 {
        f(true);
    } else {
        f(false);
    }

which while more verbose is less arcane. I agree that the first one looks prettier but there is a lot of information to unpack in one line, especially if you are new.

Honestly, what you're describing to me is quite a stylistic choice and I don't think it's something that the language itself should have a say in, and maybe something that should be left in clippy lints. What you've described to me is extremely similar to the common case of Like, to be clear, this isn't me saying you're wrong here-- it's a real problem and ignoring it is not a real solution. But in that regard, while failing to dig deep into why people prefer this more expanded version is ignoring it, it's also ignoring it to just say that the expanded version is better and not question it. This is kind of why I think that the solution probably lies somewhere in clippy-- things such as

If let were to be promoted to return a boolean on a successful bind it would both solve let chaining as well as the main problem with let chaining as it is proposed today. I would hate for that to be accepted instead of
|
||
|
||
As noted in the [motivation](#motivation) section, adding the `is` operator | ||
allows writing pattern matches from left-to-right, which reads more naturally | ||
in some conditionals and fits well with method chains and similar constructs. As noted | ||
under [prior art](#prior-art), other languages such as C# already have this | ||
exact operator for this exact purpose. | ||
|
||
We could choose not to add this operator, and have *only* `let`-chains. This | ||
would provide equivalent functionality, semantically; however, it would force | ||
pattern-matches to be written with the pattern on the left, which won't read as | ||
naturally in some expressions. Notably, this seems unlikely to do as effective | ||
a job of reducing the desire for `is_variant()` methods and helpers like | ||
`is_some_and(...)`. | ||
|
||
We could choose to add the `is` operator and *not* add `let`-chains. However, | ||
many people *already* expect `let`-chains to work as an obvious extrapolation | ||
from seeing `if let`/`while let` syntax. | ||
|
||
We could add this operator using punctuation instead (e.g. `~`). However, there | ||
is no "natural" operator that conveys "pattern match" to people (the way that | ||
`+` is well-known as addition). Using punctuation also seems likely to make the | ||
language more punctuation-heavy, less obvious, and less readable. | ||

I disagree that there's no natural operator, even though I agree that it would be less clear. For example, we could use tildes as an additional equality operator (

Something else to point to,

For some more prior art, Raku uses

I'd like to note that Rust used to use tildes ( As an example, consider a Polish keyboard layout. I would recommend avoiding

@matthieu-m The keyboard layout you've linked to is an obsolete typewriter layout. Polish computers use a QWERTY-based layout called "Polish programmer's" layout, which despite the name, is the default used by everyone.

another argument that i have yet to see is that the
|
||
We could add this operator using a different name. Most other names, however, | ||
seem likely to be longer and to fit people's expectations less well, given the | ||
widespread precedent of `is_xyz()` methods. The most obvious choice would be | ||
something like `matches`. | ||
|
||
We could permit top-level alternation in the pattern. However, this seems | ||
likely to produce visual and semantic ambiguity. This is technically a one-way | ||
door, in that `x is true|false` would parse differently depending on our | ||
decision here; however, the use of a pattern-match for a boolean here seems | ||
unlikely, redundant, and in poor style. In any case, the compiler could easily | ||
detect most attempts at top-level alternation and suggest adding parentheses. | ||
|
||
# Prior art | ||
|
||
`let`-chains provide prior art for having this functionality in the language. | ||
|
||
The `matches!` macro similarly provides precedent for having pattern matches in | ||
boolean expressions. `is` would likely be a natural replacement for most uses | ||
of `matches!`. | ||
|
||
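As a point of reference, this is how a pattern test with a condition reads with `matches!` on stable Rust. The function name `large_some` is hypothetical; the proposed `is` spelling is shown only as a comment.

```rust
fn large_some(x: Option<i32>) -> bool {
    // Pattern plus guard; `y` is visible only inside the macro invocation.
    matches!(x, Some(y) if y > 5)
}

// Proposed spelling under this RFC (not valid Rust today):
//     x is Some(y) && y > 5
```

Unlike `matches!`, the proposed operator would keep `y` bound for later conjuncts and for the body of an enclosing `if`.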
|
||
Many Rust enums provide `is_variant()` functions: | ||
- `is_some()` and `is_none()` for `Option` | ||
- `is_ok()` and `is_err()` for `Result` | ||
- `is_eq()` and `is_lt()` and `is_gt()` for `Ordering` | ||
- `is_ipv4()` and `is_ipv6()` for `SocketAddr` | ||
- `is_break()` and `is_continue()` for `ControlFlow` | ||
- `is_borrowed()` and `is_owned()` for `Cow` | ||
- `is_pending()` and `is_ready()` for `Poll` | ||
|
||
|
||
These functions serve as precedent for using the word `is` for this purpose. | ||
(Note that this RFC does *not* propose to encourage people to switch away from | ||
these methods. Among other things, users may wish to use e.g. `Option::is_some` | ||
as a function pointer rather than having to write a closure using `is`.) | ||
|
||
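The function-pointer use mentioned above looks like this on stable Rust; `presence_flags` is a hypothetical helper written for illustration.

```rust
fn presence_flags(items: &[Option<u32>]) -> Vec<bool> {
    // The method is passed by path; no closure is needed.
    items.iter().map(Option::is_some).collect()
}
```

With only an operator available, the same call would need a closure wrapping the test, which is one reason the `is_variant()` methods remain useful.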
There's extensive prior art in Rust for having more than one way to accomplish | ||
the same thing. You can write a `for` loop or you can write iterator code. You | ||
can use combinators or write a `match` or write an `if let`. You can write | ||
`let`-`else` or use a `match`. You can write `x > 3` or `3 < x`. You can write | ||
`x + 3` or `3 + x`. Rust does not normatively require one alternative. We do, | ||
in general, avoid adding constructs that are entirely redundant with each | ||
other. However, this RFC proposes that the constructs are *not* redundant: some | ||
code will be more readable with `let`-chains, and some code will be more | ||
readable with `is`. | ||
|
||
[Kotlin has a similar `is` | ||
operator](https://kotlinlang.org/docs/typecasts.html#smart-casts) for casts to | ||
a type, which are similarly flow-sensitive: in the code path where the `is` | ||
test has succeeded, subsequent code can use the tested value as that type. | ||
|
||
[C# has an `is` | ||
operator](https://learn.microsoft.com/en-US/dotnet/csharp/language-reference/operators/is) | ||
for type-matching and pattern matching, which supports the same style of | ||
chaining as the proposed `is` operator for Rust. For instance, the following | ||
snippets are valid C# code: | ||
|
||
```csharp | ||
if (expr is int x && other_expr is int y) | ||
{ | ||
func(x - y); | ||
} | ||
|
||
if (bounding_box is { P1.X: 0 } or { P2.Y: 0 }) | ||
{ | ||
check(bounding_box); | ||
} | ||
|
||
if (GetData() is var data | ||
&& data.Field == value | ||
&& data.OtherField is [2, 4, 6]) | ||
{ | ||
show(data); | ||
} | ||
``` | ||
|
||
# Unresolved questions | ||
|
||
Can we make `x is 10..=20` work without requiring the user to parenthesize the | ||
pattern, or would that not be possible with our precedence? We could | ||
potentially make this work over an edition boundary, but would it be worth the | ||
churn? | ||
|
||
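For reference, the range test in question can already be written with `matches!`, where the macro's delimiters sidestep the precedence question. A sketch with a hypothetical function name `in_range`:

```rust
fn in_range(x: u32) -> bool {
    // Inclusive range pattern: true for 10 through 20.
    matches!(x, 10..=20)
}
```

Whether `x is 10..=20` can be parsed without parentheses is exactly what this unresolved question leaves open.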
Pattern types propose using `is` for a different purpose, in types rather than | ||
in expressions: `u32 is 1..` would be a `u32` that can never be `0`, and | ||
`Result<T, E> is Err(_)` would be a `Result<T, E>` that can never be the `Ok` | ||
variant. Can we introduce the `is` operator in expressions without conflicting | ||
with its potential use in pattern types? We could require the use of | ||
parentheses in `v as u32 is 1..`, to force it to be parsed as either `(v as | ||
u32) is 1..` or `v as (u32 is 1..)` (assuming that pattern types can be used in | ||
`as` in the first place). | ||
|
||
What new method name should we add to `dyn Any` as an alias for `is`? (This | ||
does not need to be settled in this RFC, but should be handled | ||
contemporaneously with the implementation, and tracked in the tracking issue.) | ||
|
||
Should we add a rustc lint or clippy lint (with rustfix suggestion) to turn | ||
`matches!` into `is`? | ||
|
||
# Future possibilities | ||
|
||
As with `let`-chains, we *could* potentially allow cases involving `||` to bind | ||
the same patterns, such as `expr1 is V1(value) || expr2 is V2(value)`. This RFC | ||
does *not* propose allowing that syntax to bind variables, to avoid confusing | ||
code. | ||
|
||
*If* in a future edition we decide to allow this for `let` chains, we should | ||
similarly allow it for `is`. This RFC does not make or recommend such a future | ||
proposal. | ||
|
||
We could choose to offer a clippy lint or even a rustc lint, with a rustfix | ||
suggestion, to turn `matches!` into `is`. |
I'm a bit concerned about this. This would introduce the possibility of doing the same thing in 2 different ways on a language level. IMHO this is a bad idea, as it opens the door for mixed-style code bases, that just get harder to read.

For tooling, this is also a problem: Clippy will most likely get (restriction) lint requests for not allowing `is` OR not allowing `let`-chains.

Another problem I see here is: What should Clippy do when producing suggestions? If we have the policy to always suggest `is` over `let`-chains, that might pollute code bases where `let`-chains are preferred (and vice versa). We also can't really check things like "is this a `let`-chain code base" or "are we in an `is`-chain expression" when producing suggestions. One lint suggesting `is` and another suggesting `let` will make this problem even worse, and that is almost impossible to avoid with changing contributors and team members.

We recently had the situation described above with suggesting the new-ish `_ =` binding over `let _ =`. We decided to suggest `let _ =` as we don't have to check the MSRV before producing the suggestion that way.

Rust already has many different ways to do the same thing. You can write a `for` loop or you can write iterator code. You can use combinators or write a `match` or write an `if let`. You can write `let`-`else` or use a `match`. You can write `x > 3` or `3 < x`. You can write `x + 3` or `3 + x`.

In this RFC, I'm proposing that both of them have value, and that it's entirely valid for a codebase to use both, for different purposes.

`if let PAT = EXPR && ...` emphasizes the pattern and its binding. It seems appropriate for clear division into cases based primarily on the pattern, by writing `if let ... else`.

`if EXPR is PAT && ...` leads with the expression, then the pattern, then the next condition. It feels more appropriate for cases where you expect the reader to find it easiest to process in order of the sequence of operations from left to right: "run this EXPR, see if it matches PAT, check the next condition ..."

I personally expect to find myself writing both, in different cases.

For me those are not really comparable:
- `for` loop, the standard library gives you the option/power to do this with iterator method chains.
- `match` is rather if you want to match one expression to multiple variants, while `if let` is for checking if the expression is that exact variant (there's a Clippy `style` lint for this).
- `let`-`else` was introduced for a specific use case to save some lines of code over using a `match`

The proposed `is` language construct doesn't do the same:
- `let`-chains and `is` are provided as a language construct.
- `is` vs `let`. It's a pure style choice IMO.

The second point is the biggest problem for tooling: It is impossible to determine what to suggest. With the other examples it's usually clear, because the alternative is more concise/readable/idiomatic/....

The focus on expression vs pattern I can see and think is a valid point. But to that, I want to point out the `equatable_if_let` Clippy lint, that tried to address something similar, but never got out of `nursery` as we (mainly I) couldn't agree when `expr == pat` is preferable over `pat == expr`/`let pat = expr`. rust-lang/rust-clippy#7777

So I see the addition of the `is` as giving the user a choice between two styles and not much more. IMO this is not worth the downsides that come with this. But that is my opinion and mileage may vary obviously.

I want to also link and quote one of my comments below: #3573 (comment)

If `let`-chain is to be scrapped, this RFC should really have a section to refute the counterarguments made in RFC 2497.