From 96e709575e464c3cf8d285cebf80ae2e7bc0c67e Mon Sep 17 00:00:00 2001 From: varkor Date: Sat, 15 Sep 2018 00:08:16 +0100 Subject: [PATCH 01/16] Initial description of undisambiguated_generics --- text/0000-undisambiguated-generics.md | 208 ++++++++++++++++++++++++++ 1 file changed, 208 insertions(+) create mode 100644 text/0000-undisambiguated-generics.md diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md new file mode 100644 index 00000000000..0fc89a72487 --- /dev/null +++ b/text/0000-undisambiguated-generics.md @@ -0,0 +1,208 @@ +- Feature Name: `undisambiguated_generics` +- Start Date: 2018-09-14 +- RFC PR: +- Rust Issue: + +# Summary +[summary]: #summary + +Make disambiguating generic arguments in expressions with `::` optional, allowing generic arguments +to be specified without `::` (making the "turbofish" notation no longer necessary). +This makes the following valid syntax: + +```rust +struct Nooper<T>(T); + +impl<T> Nooper<T> { + fn noop<U>(&self, _: U) {} +} + +fn id<T>(t: T) -> T { + t +} + +fn main() { + id<u32>(0u32); // ok + let n = Nooper<&str>(":)"); // ok + n.noop<()>(()); // ok +} +``` + +# Motivation +[motivation]: #motivation + +The requirement to write `::` before generic arguments in expressions is an unexpected corner case +in the language, violating the principle of least surprise. There were historical reasons for its +necessity in the past, acting as a disambiguator for other uses of `<` and `>` in expressions. +However, now the ambiguity between generic arguments and comparison operators has been reduced to a +single edge case that is very unlikely to appear in Rust code (and has been demonstrated to occur in +[none of the existing crates](https://github.com/rust-lang/rust/pull/53578#issuecomment-421475443) +in the Rust ecosystem as of 2018-09-14). Making `::` optional in expressions takes a step towards +eliminating an oddity in the Rust syntax, making it more uniform and less confusing (e.g.
+[1](https://users.rust-lang.org/t/why-cant-i-specify-type-parameters-directly-after-the-type/2365), +[2](https://users.rust-lang.org/t/type-parameter-syntax-when-defining-vs-calling-functions/15037), +[3](https://github.com/rust-lang/book/issues/385), +[4](https://www.reddit.com/r/rust/comments/73pm5e/whats_the_rationale_behind_for_type_parameters/), +[5](https://matematikaadit.github.io/posts/rust-turbofish.html)) to beginners. + +There have been two historical reasons to require `::` before generic arguments in expressions. + +## Syntax ambiguity +Originally, providing generic arguments without `::` meant that some expressions were ambiguous in +meaning. + +```rust +// Take the following: +a < b > ( c ); +// Is this a generic function call..? +a<b>(c); +// Or a chained comparison? +(a < b) > (c); +``` + +However, chained comparisons are [now banned in Rust](https://github.com/rust-lang/rfcs/pull/558): +the previous example results in an error. + +```rust +a < b > ( c ); // error: chained comparison operators require parentheses +``` + +This syntax is therefore no longer ambiguous and we can determine whether `<` is a comparison +operator or the start of a generic argument list during parsing. + +There is, however, one case in which the syntax is currently ambiguous. + +```rust +// The following: +(a < b, c > (d)); +// Could be a generic function call... +( a<b, c>(d) ); +// Or a pair of comparisons... +(a < b, c > (d)); +``` + +Ultimately, this case does not seem occur naturally in Rust code. A +[Crater run on over 20,000 crates](https://github.com/rust-lang/rust/pull/53578#issuecomment-421475443) +determined that no crates regress if the ambiguity is resolved in favour of a generic expression +rather than tuples of comparisons of this form. We propose that resolving this ambiguity in favour +of generic expressions to eliminate `::` is worth this small alteration to the existing parse.
+ +## Performance +Apart from parsing ambiguity, the main concern regarding allowing `::` to be omitted was the +potential performance implications. Although by the time we reach the closing angle bracket `>` we +know whether we're parsing a comparison or a generic argument list, when we initially encounter `<`, +we are not guaranteed to know which case we're parsing. To solve this problem, we need to +first start parsing a generic argument list and then backtrack if this fails (or use a parser that +can deal with ambiguous grammars). We generally prefer to avoid backtracking, as it can be slow. +However, up until now, the concern with using backtracking for `<`-disambiguation was purely +theoretical, without any empirical testing to validate it. + +[A recent experiment](https://github.com/rust-lang/rust/pull/53511) to allow generic arguments +without `::`-disambiguation [showed no performance regressions](https://github.com/rust-lang/rust/pull/53511#issuecomment-414172984) +using the backtracking technique. This indicates that in existing codebases, allowing `::` to be +omitted is unlikely to lead to any performance regressions. + +Similarly, the performance implications of deleting all occurrences of `::` (and simply using +generic arguments directly) +[also showed no performance regressions](https://github.com/rust-lang/rust/pull/53511#issuecomment-414360849). +This is likely to be due to the relative uncommonness of providing explicit generic arguments and +using comparison operators in the cases of ambiguous prefixes, relative to typical codebases. + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +To explicitly pass generic arguments to a type, value or method, you may write the lifetime, type +and const arguments in angle brackets (`<` and `>`) directly after the expression. (Note that the +"turbofish" notation is no longer necessary.) 
+ +```rust +struct Nooper<T>(T); + +impl<T> Nooper<T> { + fn noop<U>(&self, _: U) {} +} + +fn id<T>(t: T) -> T { + t +} + +fn main() { + id<u32>(0u32); + let n = Nooper<&str>(":)"); + n.noop<()>(()); +} +``` + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +An initial implementation is present in https://github.com/rust-lang/rust/pull/53578, upon which the +implementation may be based. The parser will now attempt to parse generic argument lists without +`::`, falling back on attempting to parse a comparison if that fails. + +The feature will initially be gated (e.g. `#![feature(undisambiguated_generics)]`). However, +note that the parser changes will be present regardless of whether the feature is enabled or not, +because feature detection occurs after parsing. However, because it has been shown that there are +little-to-no performance regressions when modifying the parser and without taking advantage of `::` +optionality, this should not be a problem. + +When `undisambiguated_generics` is not enabled, the parser modifications will allow us to +provide better diagnostics: specifically, we'll be able to correctly suggest (in a +machine-applicable manner) using `::` whenever the user has actually typed undisambiguated generic +arguments. The current diagnostic suggestions suggesting the use of `::` trigger whenever there are +chained comparisons, which has false positives and does not provide a fix suggestion. + +An allow-by-default lint `disambiguated_generics` will be added to suggest removing `::` when +the feature is enabled. This is undesirable in most existing codebases, as the number of +linted expressions is likely to be large, but could be useful for new codebases and in the future. + +Note that, apart from for those users who explicitly increase the level of the lint, no steps are +taken to discourage the use of `::` at this stage (including in tools, such as rustfmt).
+ +# Drawbacks +[drawbacks]: #drawbacks + +The primary drawback is that resolving ambiguities in favour of generics means changing the +interpretation of `(a<b, c>(d))` from a pair of tuples to a generic function call. However, in +practice, this has been demonstrated +([1](https://github.com/rust-lang/rust/pull/53578#issuecomment-421475443)) not to cause issues in +practice (the syntax is unnatural for Rust and is actively warned against by the compiler). + +Additionally, there is potential for performance regressions due to backtracking. However, +empirical evidence ([1](https://github.com/rust-lang/rust/pull/53511#issuecomment-414172984) and +[2](https://github.com/rust-lang/rust/pull/53511#issuecomment-414360849)) suggests this should not +be a problem. Although it is probable that a pathological example could be constructed that does +result in poorer performance, such an example would not be representative of typical Rust code and +therefore is not helpful to seriously consider. Backtracking is already used for some cases in the +parser. + +The other potential drawback is that other parsers for Rust's syntax (for example in external tools) +would also have to implement some form of backtracking (or similar) to handle this case. However, +backtracking is straightforward to implement in many forms of parser (such as recursive decent) and +it is likely this will not cause significant problems. + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +If we want to allow `::` to be omitted, there are two solutions: +- Backtracking, as suggested here. +- Using a parser for nondeterministic grammars, such as GLL. + +Although using a more sophisticated parser would come with its own advantages, it's an overly +complex solution to this particular problem. Backtracking seems to work well in typical codebases +and provides an immediate solution to the problem. + +Alternatively we could continue to require `::`.
This would ensure there would be no performance +implications, but would leave the nonconformal and surprising syntax in place. We could potentially +use backtracking to provide the improved diagnostic suggestions to use `::`, while still preventing +`::` from being omitted. + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +- Should we warn against the ambiguous case initially? This would be more conservative, but +considering that this pattern has not been encountered in the wild, this is probably unnecessary. +- Should `(a < b, c > d)` parse as a pair of comparisons? In the aforementioned Crater run, this +syntax was also resolved as a generic expression followed by `d` (also causing no regressions), but +we could hypothetically parse this unambiguously as a pair (though this would probably require more +complex backtracking). From 576da47487a67c4ee1fe0b7b26e6b594a3bbaaa4 Mon Sep 17 00:00:00 2001 From: varkor Date: Sat, 15 Sep 2018 12:13:28 +0100 Subject: [PATCH 02/16] Add note about effect of generalised type ascription --- text/0000-undisambiguated-generics.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md index 0fc89a72487..921276e21df 100644 --- a/text/0000-undisambiguated-generics.md +++ b/text/0000-undisambiguated-generics.md @@ -197,6 +197,15 @@ implications, but would leave the nonconformal and surprising syntax in place. W use backtracking to provide the improved diagnostic suggestions to use `::`, while still preventing `::` from being omitted. +## Future frequency of disambiguated generic expressions +It is likely that should the +[generalised type ascription](https://github.com/rust-lang/rfcs/pull/2522) RFC be accepted and +implemented, the number of cases where generic type arguments have to be provided is reduced, making +users less likely to encounter the `::` construction. 
However, the +[const generics](https://github.com/rust-lang/rfcs/pull/2000) feature, currently in implementation, +is conversely likely to increase the number of cases (specifically where const generic arguments +are not used as parameters in types). + # Unresolved questions [unresolved-questions]: #unresolved-questions From 42ab5a710c0f68529a1d59b93bc282587577da49 Mon Sep 17 00:00:00 2001 From: varkor Date: Sat, 15 Sep 2018 12:14:04 +0100 Subject: [PATCH 03/16] Add note about module ambiguity --- text/0000-undisambiguated-generics.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md index 921276e21df..256edb91bc3 100644 --- a/text/0000-undisambiguated-generics.md +++ b/text/0000-undisambiguated-generics.md @@ -215,3 +215,6 @@ considering that this pattern has not been encountered in the wild, this is prob syntax was also resolved as a generic expression followed by `d` (also causing no regressions), but we could hypothetically parse this unambiguously as a pair (though this would probably require more complex backtracking). +- `[a << B as C > ::D, E < S >> (1)];` is likewise ambiguous in the Rust 2015 Edition. Should this +feature be gated on Rust 2018 Edition and above? Note that this case too was not encountered in the +Crater run. 
From f89a7994fc499b738b82dedff2431524d2911e6d Mon Sep 17 00:00:00 2001 From: varkor Date: Sat, 15 Sep 2018 12:26:25 +0100 Subject: [PATCH 04/16] Make << ambiguity more obvious --- text/0000-undisambiguated-generics.md | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md index 256edb91bc3..1832a4dcdd7 100644 --- a/text/0000-undisambiguated-generics.md +++ b/text/0000-undisambiguated-generics.md @@ -70,7 +70,8 @@ a < b > ( c ); // error: chained comparison operators require parentheses This syntax is therefore no longer ambiguous and we can determine whether `<` is a comparison operator or the start of a generic argument list during parsing. -There is, however, one case in which the syntax is currently ambiguous. +There are, however, two cases in which the syntax is currently ambiguous (correspoding to the same +ambiguity with `<` and `<<`). ```rust // The following: @@ -79,14 +80,24 @@ There is, however, one case in which the syntax is currently ambiguous. ( a<b, c>(d) ); // Or a pair of comparisons... (a < b, c > (d)); + +// The following: +`(a << B as C > ::D, E < F >> (g));` +// Could be a generic function call (with two arguments)... +( a<<B as C>::D, E<F>>(g) ); +// Or a pair of bit-shifted comparisons... +(a << B as C > ::D, E < F >> (g)); ``` -Ultimately, this case does not seem occur naturally in Rust code. A +Ultimately, these cases do not seem occur naturally in Rust code. A [Crater run on over 20,000 crates](https://github.com/rust-lang/rust/pull/53578#issuecomment-421475443) determined that no crates regress if the ambiguity is resolved in favour of a generic expression rather than tuples of comparisons of this form. We propose that resolving this ambiguity in favour of generic expressions to eliminate `::` is worth this small alteration to the existing parse.
+In addition, the latter case is forbidden in the Rust 2018 edition and so is only an ambiguity in +the Rust 2015 Edition. + ## Performance Apart from parsing ambiguity, the main concern regarding allowing `::` to be omitted was the potential performance implications. Although by the time we reach the closing angle bracket `>` we @@ -215,6 +226,6 @@ considering that this pattern has not been encountered in the wild, this is prob syntax was also resolved as a generic expression followed by `d` (also causing no regressions), but we could hypothetically parse this unambiguously as a pair (though this would probably require more complex backtracking). -- `[a << B as C > ::D, E < S >> (1)];` is likewise ambiguous in the Rust 2015 Edition. Should this -feature be gated on Rust 2018 Edition and above? Note that this case too was not encountered in the +- Should we gate this feature on the Rust 2018 Edition to avoid the second syntactic ambiguity +(namely `(a << B as C > ::D, E < F >> (g));`)? Note that this case too was not encountered in the Crater run. From 4f0ce8d125992b967149e08e645ec007c3f3c725 Mon Sep 17 00:00:00 2001 From: varkor Date: Sat, 15 Sep 2018 12:27:52 +0100 Subject: [PATCH 05/16] Address some minor typos --- text/0000-undisambiguated-generics.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md index 1832a4dcdd7..67a835aafed 100644 --- a/text/0000-undisambiguated-generics.md +++ b/text/0000-undisambiguated-generics.md @@ -1,5 +1,5 @@ - Feature Name: `undisambiguated_generics` -- Start Date: 2018-09-14 +- Start Date: 2018-09-15 - RFC PR: - Rust Issue: @@ -209,12 +209,12 @@ use backtracking to provide the improved diagnostic suggestions to use `::`, whi `::` from being omitted. 
## Future frequency of disambiguated generic expressions -It is likely that should the +It is likely that, should the [generalised type ascription](https://github.com/rust-lang/rfcs/pull/2522) RFC be accepted and implemented, the number of cases where generic type arguments have to be provided is reduced, making users less likely to encounter the `::` construction. However, the [const generics](https://github.com/rust-lang/rfcs/pull/2000) feature, currently in implementation, -is conversely likely to increase the number of cases (specifically where const generic arguments +is conversely likely to *increase* the number of cases (especially where const generic arguments are not used as parameters in types). # Unresolved questions From 66650e7ac88a12d58dabb793546997e046b6a542 Mon Sep 17 00:00:00 2001 From: varkor Date: Sat, 15 Sep 2018 12:36:40 +0100 Subject: [PATCH 06/16] Add note on interaction with other features --- text/0000-undisambiguated-generics.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md index 67a835aafed..3249c81fe16 100644 --- a/text/0000-undisambiguated-generics.md +++ b/text/0000-undisambiguated-generics.md @@ -217,6 +217,12 @@ users less likely to encounter the `::` construction. However, the is conversely likely to *increase* the number of cases (especially where const generic arguments are not used as parameters in types). +## Interaction with future features +Note that this proposal does not conflict with: +- Intuitive chained comparisons: i.e. `a < b < c` as being shorthand for `a < b && b < c` (and +similar). +- Specifying const generic arguments in expressions without `::`. 
+ # Unresolved questions [unresolved-questions]: #unresolved-questions From 6efca686ecff175e2cc2f1d02a8f3671af99052f Mon Sep 17 00:00:00 2001 From: varkor Date: Sat, 15 Sep 2018 12:36:46 +0100 Subject: [PATCH 07/16] Add prior art section --- text/0000-undisambiguated-generics.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md index 3249c81fe16..f010fa9cde8 100644 --- a/text/0000-undisambiguated-generics.md +++ b/text/0000-undisambiguated-generics.md @@ -223,6 +223,12 @@ Note that this proposal does not conflict with: similar). - Specifying const generic arguments in expressions without `::`. +# Prior art +[prior-art]: #prior-art + +Kotlin has a similar ambiguity with generic arguments in expressions and chooses to resolve this +ambiguity in favour of generic arguments, using a similar technique to that proposed here. + # Unresolved questions [unresolved-questions]: #unresolved-questions From b52b8241b7aa5769c92080746e6a199413224de6 Mon Sep 17 00:00:00 2001 From: varkor Date: Sat, 15 Sep 2018 12:54:31 +0100 Subject: [PATCH 08/16] Fix awkward sentence --- text/0000-undisambiguated-generics.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md index f010fa9cde8..fe55fd8ea3f 100644 --- a/text/0000-undisambiguated-generics.md +++ b/text/0000-undisambiguated-generics.md @@ -174,10 +174,10 @@ taken to discourage the use of `::` at this stage (including in tools, such as r [drawbacks]: #drawbacks The primary drawback is that resolving ambiguities in favour of generics means changing the -interpretation of `(a<b, c>(d))` from a pair of tuples to a generic function call.
However, in -practice, this has been demonstrated -([1](https://github.com/rust-lang/rust/pull/53578#issuecomment-421475443)) not to cause issues in -practice (the syntax is unnatural for Rust and is actively warned against by the compiler). +interpretation of `(a<b, c>(d))` from a pair of tuples to a generic function call. However this has +been demonstrated ([1](https://github.com/rust-lang/rust/pull/53578#issuecomment-421475443)) not to +cause issues in practice (the syntax is unnatural for Rust and is actively warned against by the +compiler). Additionally, there is potential for performance regressions due to backtracking. However, empirical evidence ([1](https://github.com/rust-lang/rust/pull/53511#issuecomment-414172984) and From 9e0b658d2f1a8b3ae6f227c1adfdb66d2c6d2fad Mon Sep 17 00:00:00 2001 From: varkor Date: Sat, 15 Sep 2018 12:56:26 +0100 Subject: [PATCH 09/16] Add reference to C# --- text/0000-undisambiguated-generics.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md index fe55fd8ea3f..e131c14b77e 100644 --- a/text/0000-undisambiguated-generics.md +++ b/text/0000-undisambiguated-generics.md @@ -226,8 +226,9 @@ similar). # Prior art [prior-art]: #prior-art -Kotlin has a similar ambiguity with generic arguments in expressions and chooses to resolve this -ambiguity in favour of generic arguments, using a similar technique to that proposed here. +Kotlin and C# 7.0 both have similar ambiguities with generic arguments in expressions and choose to +resolve this ambiguity in favour of generic arguments, using a similar technique to that proposed +here.
# Unresolved questions [unresolved-questions]: #unresolved-questions From 775de3fbef2b2d26e0235d80b6b1687c9969a360 Mon Sep 17 00:00:00 2001 From: varkor Date: Sat, 15 Sep 2018 16:22:54 +0100 Subject: [PATCH 10/16] Remove reference to 2018 Edition --- text/0000-undisambiguated-generics.md | 6 ------ 1 file changed, 6 deletions(-) diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md index e131c14b77e..acb494b3e09 100644 --- a/text/0000-undisambiguated-generics.md +++ b/text/0000-undisambiguated-generics.md @@ -95,9 +95,6 @@ determined that no crates regress if the ambiguity is resolved in favour of a ge rather than tuples of comparisons of this form. We propose that resolving this ambiguity in favour of generic expressions to eliminate `::` is worth this small alteration to the existing parse. -In addition, the latter case is forbidden in the Rust 2018 edition and so is only an ambiguity in -the Rust 2015 Edition. - ## Performance Apart from parsing ambiguity, the main concern regarding allowing `::` to be omitted was the potential performance implications. Although by the time we reach the closing angle bracket `>` we @@ -239,6 +236,3 @@ considering that this pattern has not been encountered in the wild, this is prob syntax was also resolved as a generic expression followed by `d` (also causing no regressions), but we could hypothetically parse this unambiguously as a pair (though this would probably require more complex backtracking). -- Should we gate this feature on the Rust 2018 Edition to avoid the second syntactic ambiguity -(namely `(a << B as C > ::D, E < F >> (g));`)? Note that this case too was not encountered in the -Crater run. 
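The technique these patches keep referring to (first try to parse a generic argument list after `<`, then backtrack and parse a comparison if that fails) can be sketched on a toy grammar. This is a minimal illustration under stated assumptions, not the rustc implementation from the linked pull request: the grammar covers only identifiers, `<`, `>`, `,` and parentheses, and every name here (`Tok`, `Expr`, `Parser`, `lex`, `parse`) is hypothetical.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Tok { Ident(String), Lt, Gt, Comma, LParen, RParen }

#[derive(Debug, PartialEq)]
enum Expr {
    Var(String),
    Cmp(Box<Expr>, char, Box<Expr>),
    GenericCall { callee: String, generics: Vec<String>, arg: Box<Expr> },
}

/// Splits the input into tokens; whitespace and unknown characters are skipped.
fn lex(src: &str) -> Vec<Tok> {
    let mut out = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        match c {
            '<' => { out.push(Tok::Lt); chars.next(); }
            '>' => { out.push(Tok::Gt); chars.next(); }
            ',' => { out.push(Tok::Comma); chars.next(); }
            '(' => { out.push(Tok::LParen); chars.next(); }
            ')' => { out.push(Tok::RParen); chars.next(); }
            c if c.is_alphanumeric() || c == '_' => {
                let mut s = String::new();
                while let Some(&d) = chars.peek() {
                    if d.is_alphanumeric() || d == '_' { s.push(d); chars.next(); } else { break; }
                }
                out.push(Tok::Ident(s));
            }
            _ => { chars.next(); }
        }
    }
    out
}

struct Parser { toks: Vec<Tok>, pos: usize }

impl Parser {
    fn peek(&self) -> Option<&Tok> { self.toks.get(self.pos) }

    /// Consumes the next token if it equals `t`.
    fn eat(&mut self, t: &Tok) -> bool {
        if self.peek() == Some(t) { self.pos += 1; true } else { false }
    }

    fn ident(&mut self) -> Option<String> {
        match self.peek().cloned() {
            Some(Tok::Ident(s)) => { self.pos += 1; Some(s) }
            _ => None,
        }
    }

    /// generic_args := '<' ident (',' ident)* '>'
    fn generic_args(&mut self) -> Option<Vec<String>> {
        if !self.eat(&Tok::Lt) { return None; }
        let mut args = vec![self.ident()?];
        while self.eat(&Tok::Comma) { args.push(self.ident()?); }
        if self.eat(&Tok::Gt) { Some(args) } else { None }
    }

    /// Speculative parse of `<generics>(arg)` after a callee name.
    fn generic_call(&mut self, name: &str) -> Option<Expr> {
        let generics = self.generic_args()?;
        if !self.eat(&Tok::LParen) { return None; }
        let arg = self.expr()?;
        if !self.eat(&Tok::RParen) { return None; }
        Some(Expr::GenericCall { callee: name.to_string(), generics, arg: Box::new(arg) })
    }

    /// primary := ident [generic_args '(' expr ')'] | '(' expr ')'
    fn primary(&mut self) -> Option<Expr> {
        if self.eat(&Tok::LParen) {
            let e = self.expr()?;
            return if self.eat(&Tok::RParen) { Some(e) } else { None };
        }
        let name = self.ident()?;
        let checkpoint = self.pos;
        if let Some(call) = self.generic_call(&name) { return Some(call); }
        self.pos = checkpoint; // backtrack: the `<` was not a generic argument list
        Some(Expr::Var(name))
    }

    /// expr := primary [('<' | '>') primary]   (a single, unchained comparison)
    fn expr(&mut self) -> Option<Expr> {
        let lhs = self.primary()?;
        let op = match self.peek() {
            Some(Tok::Lt) => '<',
            Some(Tok::Gt) => '>',
            _ => return Some(lhs),
        };
        self.pos += 1;
        let rhs = self.primary()?;
        Some(Expr::Cmp(Box::new(lhs), op, Box::new(rhs)))
    }
}

/// Parses a whole input string; returns `None` on any leftover tokens.
fn parse(src: &str) -> Option<Expr> {
    let mut p = Parser { toks: lex(src), pos: 0 };
    let e = p.expr()?;
    if p.pos == p.toks.len() { Some(e) } else { None }
}
```

In this sketch `parse("a<b>(c)")` succeeds on the speculative path and yields a `GenericCall`, `parse("a < b")` fails the speculative path, rewinds to the checkpoint and yields a `Cmp`, and `parse("a < b > c")` is rejected, mirroring the ban on chained comparisons. The only state that has to be saved for backtracking is the token position, which is why the approach is cheap to add to a recursive-descent parser.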
From ad3ea6598c582fbf688068afd5d4f8abee427c84 Mon Sep 17 00:00:00 2001 From: varkor Date: Sat, 15 Sep 2018 18:36:27 +0100 Subject: [PATCH 11/16] Tweak sentence structure --- text/0000-undisambiguated-generics.md | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md index acb494b3e09..dcc250ce232 100644 --- a/text/0000-undisambiguated-generics.md +++ b/text/0000-undisambiguated-generics.md @@ -67,11 +67,9 @@ the previous example results in an error. a < b > ( c ); // error: chained comparison operators require parentheses ``` -This syntax is therefore no longer ambiguous and we can determine whether `<` is a comparison -operator or the start of a generic argument list during parsing. - -There are, however, two cases in which the syntax is currently ambiguous (correspoding to the same -ambiguity with `<` and `<<`). +This chained comparison syntax is therefore no longer ambiguous. There are, however, two cases in +which the syntax is currently ambiguous (arguably these are a single case, correspoding to the same +ambiguity with `<` and `<<` respectively). ```rust // The following: @@ -217,8 +215,8 @@ are not used as parameters in types). ## Interaction with future features Note that this proposal does not conflict with: - Intuitive chained comparisons: i.e. `a < b < c` as being shorthand for `a < b && b < c` (and -similar). -- Specifying const generic arguments in expressions without `::`. +similar), should this syntax be proposed in the future. +- Specifying const generic arguments in expressions without `::`, once they have been implemented. # Prior art [prior-art]: #prior-art @@ -233,6 +231,6 @@ here. - Should we warn against the ambiguous case initially? This would be more conservative, but considering that this pattern has not been encountered in the wild, this is probably unnecessary. - Should `(a < b, c > d)` parse as a pair of comparisons? 
In the aforementioned Crater run, this -syntax was also resolved as a generic expression followed by `d` (also causing no regressions), but +syntax was resolved as a generic expression followed by `d` (also causing no regressions), but we could hypothetically parse this unambiguously as a pair (though this would probably require more complex backtracking). From fd3fb2e5e38f6a3aff6eb01a22fddfb26dc4fdf6 Mon Sep 17 00:00:00 2001 From: varkor Date: Sun, 16 Sep 2018 12:06:37 +0100 Subject: [PATCH 12/16] Address some comments --- text/0000-undisambiguated-generics.md | 28 ++++++++++++++++----------- 1 file changed, 17 insertions(+), 11 deletions(-) diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md index dcc250ce232..1accf9481e4 100644 --- a/text/0000-undisambiguated-generics.md +++ b/text/0000-undisambiguated-generics.md @@ -23,6 +23,7 @@ fn id<T>(t: T) -> T { fn main() { id<u32>(0u32); // ok + let _: fn(u8) -> u8 = id<u8>; // ok let n = Nooper<&str>(":)"); // ok n.noop<()>(()); // ok } @@ -80,7 +81,7 @@ ambiguity with `<` and `<<` respectively). (a < b, c > (d)); // The following: -`(a << B as C > ::D, E < F >> (g));` +(a << B as C > ::D, E < F >> (g)); // Could be a generic function call (with two arguments)... ( a<<B as C>::D, E<F>>(g) ); // Or a pair of bit-shifted comparisons... @@ -99,7 +100,7 @@ potential performance implications. Although by the time we reach the closing an know whether we're parsing a comparison or a generic argument list, when we initially encounter `<`, we are not guaranteed to know which case we're parsing. To solve this problem, we need to first start parsing a generic argument list and then backtrack if this fails (or use a parser that -can deal with ambiguous grammars). We generally prefer to avoid backtracking, as it can be slow. +can deal with ambiguous grammars). We generally prefer to avoid [backtracking](https://en.wikipedia.org/wiki/Backtracking), as it can be slow.
However, up until now, the concern with using backtracking for `<`-disambiguation was purely theoretical, without any empirical testing to validate it. @@ -134,6 +135,7 @@ fn id<T>(t: T) -> T { fn main() { id<u32>(0u32); + let _: fn(u8) -> u8 = id<u8>; let n = Nooper<&str>(":)"); n.noop<()>(()); } @@ -163,7 +165,8 @@ the feature is enabled. This is undesirable in most existing codebases, as the n linted expressions is likely to be large, but could be useful for new codebases and in the future. Note that, apart from for those users who explicitly increase the level of the lint, no steps are -taken to discourage the use of `::` at this stage (including in tools, such as rustfmt). +taken to discourage the use of `::` at this stage (including in tools, such as rustfmt). (In the +future we could consider raising the level to warn-by-default.) # Drawbacks [drawbacks]: #drawbacks @@ -174,25 +177,26 @@ been demonstrated ([1](https://github.com/rust-lang/rust/pull/53578#issuecomment cause issues in practice (the syntax is unnatural for Rust and is actively warned against by the compiler). -Additionally, there is potential for performance regressions due to backtracking. However, -empirical evidence ([1](https://github.com/rust-lang/rust/pull/53511#issuecomment-414172984) and +Additionally, there is potential for performance regressions due to backtracking (this change means +that in theory parsing Rust requires unlimited lookahead, because ambiguous sequences of tokens +could potentially be unlimited in length). However, empirical evidence +([1](https://github.com/rust-lang/rust/pull/53511#issuecomment-414172984) and [2](https://github.com/rust-lang/rust/pull/53511#issuecomment-414360849)) suggests this should not be a problem. Although it is probable that a pathological example could be constructed that does result in poorer performance, such an example would not be representative of typical Rust code and -therefore is not helpful to seriously consider.
Backtracking is already used for some cases in the -parser. +therefore is not helpful to seriously consider. The other potential drawback is that other parsers for Rust's syntax (for example in external tools) would also have to implement some form of backtracking (or similar) to handle this case. However, -backtracking is straightforward to implement in many forms of parser (such as recursive decent) and -it is likely this will not cause significant problems. +backtracking is straightforward to implement in many forms of parser (such as recursive decent or +combinatory parsers) and it is likely this will not cause significant problems. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives If we want to allow `::` to be omitted, there are two solutions: - Backtracking, as suggested here. -- Using a parser for nondeterministic grammars, such as GLL. +- Using a parser for nondeterministic grammars, such as [GLL](http://dotat.at/tmp/gll.pdf). Although using a more sophisticated parser would come with its own advantages, it's an overly complex solution to this particular problem. Backtracking seems to work well in typical codebases @@ -207,7 +211,9 @@ use backtracking to provide the improved diagnostic suggestions to use `::`, whi It is likely that, should the [generalised type ascription](https://github.com/rust-lang/rfcs/pull/2522) RFC be accepted and implemented, the number of cases where generic type arguments have to be provided is reduced, making -users less likely to encounter the `::` construction. However, the +users less likely to encounter the `::` construction. However, type ascription can still be more +verbose than explicitly specifying type arguments when the respective type parameters appear in +nested type constructors. 
On top of that, the [const generics](https://github.com/rust-lang/rfcs/pull/2000) feature, currently in implementation, is conversely likely to *increase* the number of cases (especially where const generic arguments are not used as parameters in types). From 39577104fd451ea117dc9d0d3ba3cc0ef01431f5 Mon Sep 17 00:00:00 2001 From: varkor Date: Sun, 16 Sep 2018 12:14:38 +0100 Subject: [PATCH 13/16] Add bit-shift comparison example --- text/0000-undisambiguated-generics.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md index 1accf9481e4..0a9ddcbbdde 100644 --- a/text/0000-undisambiguated-generics.md +++ b/text/0000-undisambiguated-generics.md @@ -234,9 +234,11 @@ here. # Unresolved questions [unresolved-questions]: #unresolved-questions -- Should we warn against the ambiguous case initially? This would be more conservative, but +- Should we warn against the ambiguous case to begin with? This would be more conservative, but considering that this pattern has not been encountered in the wild, this is probably unnecessary. - Should `(a < b, c > d)` parse as a pair of comparisons? In the aforementioned Crater run, this syntax was resolved as a generic expression followed by `d` (also causing no regressions), but we could hypothetically parse this unambiguously as a pair (though this would probably require more -complex backtracking). +complex backtracking). A similar example is `a < b >> c`, which currently parses as a bit-shift +followed by a comparison, but which the reference implementation attempts to parse as a generic +expression followed by a comparison. 
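The two ambiguous forms discussed in this series both have a well-defined meaning in current Rust, which is what the proposed change would alter. The sketch below only records that current behaviour; the function names are hypothetical and the code is an illustration, not part of the patches.

```rust
/// Today `a < b >> c` is a bit-shift followed by a comparison,
/// because `>>` binds more tightly than `<`.
fn shift_then_compare(a: u32, b: u32, c: u32) -> bool {
    a < b >> c
}

/// The explicitly parenthesised spelling of the same expression.
fn shift_then_compare_explicit(a: u32, b: u32, c: u32) -> bool {
    a < (b >> c)
}

/// Today `(a < b, c > (d))` is a tuple holding two comparison results.
fn pair_of_comparisons(a: i32, b: i32, c: i32, d: i32) -> (bool, bool) {
    (a < b, c > (d))
}
```

Writing the parentheses as in `shift_then_compare_explicit` keeps the current meaning regardless of how the ambiguity is resolved, so code of this shape can be made robust ahead of any change to the parser.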
From e784085f6800ab2fd3d4cb0709339d6be7d88166 Mon Sep 17 00:00:00 2001 From: varkor Date: Sun, 16 Sep 2018 12:38:50 +0100 Subject: [PATCH 14/16] Warn against `a > c` --- text/0000-undisambiguated-generics.md | 51 ++++++++++++++++++--------- 1 file changed, 34 insertions(+), 17 deletions(-) diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md index 0a9ddcbbdde..0455bea2254 100644 --- a/text/0000-undisambiguated-generics.md +++ b/text/0000-undisambiguated-generics.md @@ -69,29 +69,43 @@ a < b > ( c ); // error: chained comparison operators require parentheses ``` This chained comparison syntax is therefore no longer ambiguous. There are, however, two cases in -which the syntax is currently ambiguous (arguably these are a single case, correspoding to the same -ambiguity with `<` and `<<` respectively). +which the syntax is currently ambiguous. +First: ```rust // The following: -(a < b, c > (d)); +(a < b, c > (d)) // Could be a generic function call... -( a<b, c>(d) ); +( a<b, c>(d) ) // Or a pair of comparisons... -(a < b, c > (d)); +(a < b, c > (d)) -// The following: -(a << B as C > ::D, E < F >> (g)); +// This is true with both `<` and `<<`: +(a << B as C > ::D, E < F >> (g)) // Could be a generic function call (with two arguments)... -( a<<B as C>::D, E<F>>(g) ); +( a<<B as C>::D, E<F>>(g) ) // Or a pair of bit-shifted comparisons... -(a << B as C > ::D, E < F >> (g)); +(a << B as C > ::D, E < F >> (g)) +``` + +Second: +```rust +// The following: +a < b >> c +// Could be a comparison of a generic expression... +a<b> > c +// Or a bit-shift followed by a comparison... +a < b >> c ``` Ultimately, these cases do not seem occur naturally in Rust code. A [Crater run on over 20,000 crates](https://github.com/rust-lang/rust/pull/53578#issuecomment-421475443) determined that no crates regress if the ambiguity is resolved in favour of a generic expression -rather than tuples of comparisons of this form.
We propose that resolving this ambiguity in favour +rather than tuples of comparisons of this form. However, there are some occurrences of syntax +similar to the second ambiguity ([1](https://sourcegraph.com/github.com/dropbox/rust-brotli/-/blob/src/enc/backward_references.rs#L1257:32), +[2](https://sourcegraph.com/github.com/dropbox/rust-brotli/-/blob/src/enc/encode.rs#L1905:46)). +These ambiguities may always be resolved by adding parentheses if ambiguities are resolved in favour +of generic expressions. We propose that resolving this ambiguity in favour of generic expressions to eliminate `::` is worth this small alteration to the existing parse. ## Performance @@ -148,6 +162,11 @@ An initial implementation is present in https://github.com/rust-lang/rust/pull/5 implementation may be based. The parser will now attempt to parse generic argument lists without `::`, falling back on attempting to parse a comparison if that fails. +The ambiguous case `a < b >> c` will be warn-by-default linted against (suggesting the form +`a < (b >> c)`). Note that we can restrict this lint to the `>>` token, so the standard formatting +of the generic expression `a<b> > c` will not be warned against. This syntax was not encountered in +the Crater run, so this is a safe change to make. + The feature will initially be gated (e.g. `#![feature(undisambiguated_generics)]`). However, note that the parser changes will be present regardless of whether the feature is enabled or not, because feature detection occurs after parsing. However, because it has been shown that there are @@ -172,10 +191,10 @@ future we could consider raising the level to warn-by-default.) [drawbacks]: #drawbacks The primary drawback is that resolving ambiguities in favour of generics means changing the -interpretation of `(a<b, c>(d))` from a pair of tuples to a generic function call.
However this has -been demonstrated ([1](https://github.com/rust-lang/rust/pull/53578#issuecomment-421475443)) not to -cause issues in practice (the syntax is unnatural for Rust and is actively warned against by the -compiler). +interpretation of the two cases described above. However this has been demonstrated +([1](https://github.com/rust-lang/rust/pull/53578#issuecomment-421475443)) not to +cause issues in practice (in the former case particularly the syntax is unnatural and is actively +warned against by the compiler). Additionally, there is potential for performance regressions due to backtracking (this change means that in theory parsing Rust requires unlimited lookahead, because ambiguous sequences of tokens @@ -239,6 +258,4 @@ considering that this pattern has not been encountered in the wild, this is prob - Should `(a < b, c > d)` parse as a pair of comparisons? In the aforementioned Crater run, this syntax was resolved as a generic expression followed by `d` (also causing no regressions), but we could hypothetically parse this unambiguously as a pair (though this would probably require more -complex backtracking). A similar example is `a < b >> c`, which currently parses as a bit-shift -followed by a comparison, but which the reference implementation attempts to parse as a generic -expression followed by a comparison. +complex backtracking). 
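For reference, the pair case discussed in the unresolved question is valid Rust today and produces a tuple of two booleans. The sketch below is illustrative only and not taken from the RFC; the function name is hypothetical. The `unused_parens` lint is silenced because, as the drawbacks section notes, the compiler already warns about the redundant parentheses around `d`.

```rust
/// Builds the tuple `(a < b, c > (d))` as current Rust parses it:
/// two independent comparisons, not a generic call `a<b, c>(d)`.
/// (Hypothetical helper, for illustration only.)
#[allow(unused_parens)]
pub fn comparison_pair(a: i32, b: i32, c: i32, d: i32) -> (bool, bool) {
    (a < b, c > (d))
}
```

If ambiguities were resolved in favour of generic expressions, this body would no longer be read as two comparisons, so code in this shape would need its parentheses rearranged, for example `(a < b, (c > d))`.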
From 048982e0324fd76301ad68fee9b68c49064d15e4 Mon Sep 17 00:00:00 2001 From: varkor Date: Sun, 16 Sep 2018 23:40:44 +0100 Subject: [PATCH 15/16] Give an example of where type ascription falls short --- text/0000-undisambiguated-generics.md | 22 +++++++++++++++++----- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md index 0455bea2254..065a16983f3 100644 --- a/text/0000-undisambiguated-generics.md +++ b/text/0000-undisambiguated-generics.md @@ -33,8 +33,9 @@ fn main() { [motivation]: #motivation The requirement to write `::` before generic arguments in expressions is an unexpected corner case -in the language, violating the principle of least surprise. There were historical reasons for its -necessity in the past, acting as a disambiguator for other uses of `<` and `>` in expressions. +in the language, violating the [principle of least surprise](https://en.wikipedia.org/wiki/Principle_of_least_astonishment). +There were historical reasons for its necessity in the past, acting as a disambiguator for other +uses of `<` and `>` in expressions. However, now the ambiguity between generic arguments and comparison operators has been reduced to a single edge case that is very unlikely to appear in Rust code (and has been demonstrated to occur in [none of the existing crates](https://github.com/rust-lang/rust/pull/53578#issuecomment-421475443) @@ -102,8 +103,8 @@ Ultimately, these cases do not seem occur naturally in Rust code. A [Crater run on over 20,000 crates](https://github.com/rust-lang/rust/pull/53578#issuecomment-421475443) determined that no crates regress if the ambiguity is resolved in favour of a generic expression rather than tuples of comparisons of this form. 
However, there are some occurrences of syntax -similar to the second ambiguity ([1](https://sourcegraph.com/github.com/dropbox/rust-brotli/-/blob/src/enc/backward_references.rs#L1257:32), -[2](https://sourcegraph.com/github.com/dropbox/rust-brotli/-/blob/src/enc/encode.rs#L1905:46)). +similar to the second ambiguity ([1](https://github.com/dropbox/rust-brotli/blob/ce7b3618f9df942e9340bf3e767b2d3e3caea4b3/src/enc/backward_references.rs#L1257), +[2](https://github.com/dropbox/rust-brotli/blob/ce7b3618f9df942e9340bf3e767b2d3e3caea4b3/src/enc/encode.rs#L2842)). These ambiguities may always be resolved by adding parentheses if ambiguities are resolved in favour of generic expresions. We propose that resolving this ambiguity in favour of generic expressions to eliminate `::` is worth this small alteration to the existing parse. @@ -232,7 +233,18 @@ It is likely that, should the implemented, the number of cases where generic type arguments have to be provided is reduced, making users less likely to encounter the `::` construction. However, type ascription can still be more verbose than explicitly specifying type arguments when the respective type parameters appear in -nested type constructors. On top of that, the +nested type constructors. For example: + +```rust +// Given the following... +fn foo() -> Vec> { /* ... */ } +// With type ascription we have: +let x = foo(): Vec>; +// Whereas using generic arguments we have: +let x = foo(); +``` + +On top of that, the [const generics](https://github.com/rust-lang/rfcs/pull/2000) feature, currently in implementation, is conversely likely to *increase* the number of cases (especially where const generic arguments are not used as parameters in types). 
From e3e8d9f7f9c7e406603558e6494b2adaadebf9d6 Mon Sep 17 00:00:00 2001 From: varkor Date: Mon, 17 Sep 2018 15:21:33 +0100 Subject: [PATCH 16/16] Make some final adjustments --- text/0000-undisambiguated-generics.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/0000-undisambiguated-generics.md b/text/0000-undisambiguated-generics.md index 065a16983f3..6c6f2873362 100644 --- a/text/0000-undisambiguated-generics.md +++ b/text/0000-undisambiguated-generics.md @@ -1,5 +1,5 @@ - Feature Name: `undisambiguated_generics` -- Start Date: 2018-09-15 +- Start Date: 2018-09-17 - RFC PR: - Rust Issue: @@ -36,8 +36,8 @@ The requirement to write `::` before generic arguments in expressions is an unex in the language, violating the [principle of least surprise](https://en.wikipedia.org/wiki/Principle_of_least_astonishment). There were historical reasons for its necessity in the past, acting as a disambiguator for other uses of `<` and `>` in expressions. -However, now the ambiguity between generic arguments and comparison operators has been reduced to a -single edge case that is very unlikely to appear in Rust code (and has been demonstrated to occur in +However, now the ambiguity between generic arguments and comparison operators has been reduced to +two edge cases that are very unlikely to appear in Rust code (and have been demonstrated to occur in [none of the existing crates](https://github.com/rust-lang/rust/pull/53578#issuecomment-421475443) in the Rust ecosystem as of 2018-09-14). Making `::` optional in expressions takes a step towards eliminating an oddity in the Rust syntax, making it more uniform and less confusing (e.g. @@ -265,7 +265,7 @@ here. # Unresolved questions [unresolved-questions]: #unresolved-questions -- Should we warn against the ambiguous case to begin with? This would be more conservative, but +- Should we warn against the ambiguous pair case initially? 
This would be more conservative, but considering that this pattern has not been encountered in the wild, this is probably unnecessary. - Should `(a < b, c > d)` parse as a pair of comparisons? In the aforementioned Crater run, this syntax was resolved as a generic expression followed by `d` (also causing no regressions), but