-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make the turbofish syntax redundant #2544
Closed
Closed
Changes from all commits
Commits
Show all changes
16 commits
Select commit
Hold shift + click to select a range
96e7095
Initial description of undisambiguated_generics
varkor 576da47
Add note about effect of generalised type ascription
varkor 42ab5a7
Add note about module ambiguity
varkor f89a799
Make << ambiguity more obvious
varkor 4f0ce8d
Address some minor typos
varkor 66650e7
Add note on interaction with other features
varkor 6efca68
Add prior art section
varkor b52b824
Fix awkward sentence
varkor 9e0b658
Add reference to C#
varkor 775de3f
Remove reference to 2018 Edition
varkor ad3ea65
Tweak sentence structure
varkor fd3fb2e
Address some comments
varkor 3957710
Add bit-shift comparison example
varkor e784085
Warn against `a<b> > c`
varkor 048982e
Give an example of where type ascription falls short
varkor e3e8d9f
Make some final adjustments
varkor File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,273 @@ | ||
- Feature Name: `undisambiguated_generics` | ||
- Start Date: 2018-09-17 | ||
- RFC PR: | ||
- Rust Issue: | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Make disambiguating generic arguments in expressions with `::` optional, allowing generic arguments | ||
to be specified without `::` (making the "turbofish" notation no longer necessary). | ||
This makes the following valid syntax: | ||
|
||
```rust | ||
struct Nooper<T>(T); | ||
|
||
impl<T> Nooper<T> { | ||
fn noop<U>(&self, _: U) {} | ||
} | ||
|
||
fn id<T>(t: T) -> T { | ||
t | ||
} | ||
|
||
fn main() { | ||
id<u32>(0u32); // ok | ||
let _: fn(u8) -> u8 = id<u8>; // ok | ||
let n = Nooper<&str>(":)"); // ok | ||
n.noop<()>(()); // ok | ||
} | ||
``` | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
The requirement to write `::` before generic arguments in expressions is an unexpected corner case | ||
in the language, violating the [principle of least surprise](https://en.wikipedia.org/wiki/Principle_of_least_astonishment). | ||
There were historical reasons for its necessity in the past, acting as a disambiguator for other | ||
uses of `<` and `>` in expressions. | ||
However, now the ambiguity between generic arguments and comparison operators has been reduced to | ||
two edge cases that are very unlikely to appear in Rust code (and have been demonstrated to occur in | ||
[none of the existing crates](https://github.com/rust-lang/rust/pull/53578#issuecomment-421475443) | ||
in the Rust ecosystem as of 2018-09-14). Making `::` optional in expressions takes a step towards | ||
eliminating an oddity in the Rust syntax, making it more uniform and less confusing (e.g. | ||
[1](https://users.rust-lang.org/t/why-cant-i-specify-type-parameters-directly-after-the-type/2365), | ||
[2](https://users.rust-lang.org/t/type-parameter-syntax-when-defining-vs-calling-functions/15037), | ||
[3](https://github.com/rust-lang/book/issues/385), | ||
[4](https://www.reddit.com/r/rust/comments/73pm5e/whats_the_rationale_behind_for_type_parameters/), | ||
[5](https://matematikaadit.github.io/posts/rust-turbofish.html)) to beginners. | ||
|
||
There have been two historical reasons to require `::` before generic arguments in expressions. | ||
|
||
## Syntax ambiguity | ||
Originally, providing generic arguments without `::` meant that some expressions were ambiguous in | ||
meaning. | ||
|
||
```rust | ||
// Take the following: | ||
a < b > ( c ); | ||
// Is this a generic function call..? | ||
a<b>(c); | ||
// Or a chained comparison? | ||
(a < b) > (c); | ||
``` | ||
|
||
However, chained comparisons are [now banned in Rust](https://github.com/rust-lang/rfcs/pull/558): | ||
the previous example results in an error. | ||
|
||
```rust | ||
a < b > ( c ); // error: chained comparison operators require parentheses | ||
``` | ||
|
||
This chained comparison syntax is therefore no longer ambiguous. There are, however, two cases in | ||
which the syntax is currently ambiguous. | ||
|
||
First: | ||
```rust | ||
// The following: | ||
(a < b, c > (d)) | ||
// Could be a generic function call... | ||
( a<b, c>(d) ) | ||
// Or a pair of comparisons... | ||
(a < b, c > (d)) | ||
|
||
// This is true with both `<` and `<<`: | ||
(a << B as C > ::D, E < F >> (g)) | ||
// Could be a generic function call (with two arguments)... | ||
( a<<B as C>::D, E<F>>(g) ) | ||
// Or a pair of bit-shifted comparisons... | ||
(a << B as C > ::D, E < F >> (g)) | ||
``` | ||
|
||
Second: | ||
```rust | ||
// The following: | ||
a < b >> c | ||
// Could be a comparison of a generic expression... | ||
a<b> > c | ||
// Or a bit-shift followed by a comparison... | ||
a < b >> c | ||
``` | ||
|
||
Ultimately, these cases do not seem occur naturally in Rust code. A | ||
[Crater run on over 20,000 crates](https://github.com/rust-lang/rust/pull/53578#issuecomment-421475443) | ||
determined that no crates regress if the ambiguity is resolved in favour of a generic expression | ||
rather than tuples of comparisons of this form. However, there are some occurrences of syntax | ||
similar to the second ambiguity ([1](https://github.com/dropbox/rust-brotli/blob/ce7b3618f9df942e9340bf3e767b2d3e3caea4b3/src/enc/backward_references.rs#L1257), | ||
[2](https://github.com/dropbox/rust-brotli/blob/ce7b3618f9df942e9340bf3e767b2d3e3caea4b3/src/enc/encode.rs#L2842)). | ||
These ambiguities may always be resolved by adding parentheses if ambiguities are resolved in favour | ||
of generic expresions. We propose that resolving this ambiguity in favour | ||
of generic expressions to eliminate `::` is worth this small alteration to the existing parse. | ||
|
||
## Performance | ||
Apart from parsing ambiguity, the main concern regarding allowing `::` to be omitted was the | ||
potential performance implications. Although by the time we reach the closing angle bracket `>` we | ||
know whether we're parsing a comparison or a generic argument list, when we initially encounter `<`, | ||
we are not guaranteed to know which case we're parsing. To solve this problem, we need to | ||
first start parsing a generic argument list and then backtrack if this fails (or use a parser that | ||
can deal with ambiguous grammars). We generally prefer to avoid [backtracking](https://en.wikipedia.org/wiki/Backtracking), as it can be slow. | ||
However, up until now, the concern with using backtracking for `<`-disambiguation was purely | ||
theoretical, without any empirical testing to validate it. | ||
|
||
[A recent experiment](https://github.com/rust-lang/rust/pull/53511) to allow generic arguments | ||
without `::`-disambiguation [showed no performance regressions](https://github.com/rust-lang/rust/pull/53511#issuecomment-414172984) | ||
using the backtracking technique. This indicates that in existing codebases, allowing `::` to be | ||
omitted is unlikely to lead to any performance regressions. | ||
|
||
Similarly, the performance implications of deleting all occurrences of `::` (and simply using | ||
generic arguments directly) | ||
[also showed no performance regressions](https://github.com/rust-lang/rust/pull/53511#issuecomment-414360849). | ||
This is likely to be due to the relative uncommonness of providing explicit generic arguments and | ||
using comparison operators in the cases of ambiguous prefixes, relative to typical codebases. | ||
|
||
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
To explicitly pass generic arguments to a type, value or method, you may write the lifetime, type | ||
and const arguments in angle brackets (`<` and `>`) directly after the expression. (Note that the | ||
"turbofish" notation is no longer necessary.) | ||
|
||
```rust | ||
struct Nooper<T>(T); | ||
|
||
impl<T> Nooper<T> { | ||
fn noop<U>(&self, _: U) {} | ||
} | ||
|
||
fn id<T>(t: T) -> T { | ||
t | ||
} | ||
|
||
fn main() { | ||
id<u32>(0u32); | ||
let _: fn(u8) -> u8 = id<u8>; | ||
let n = Nooper<&str>(":)"); | ||
n.noop<()>(()); | ||
} | ||
``` | ||
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
An initial implementation is present in https://github.com/rust-lang/rust/pull/53578, upon which the | ||
implementation may be based. The parser will now attempt to parse generic argument lists without | ||
`::`, falling back on attempting to parse a comparison if that fails. | ||
|
||
The ambiguous case `a < b >> c` will be warn-by-default linted against (suggesting the form | ||
`a < (b >> c)`). Note that we can restrict this lint to the `>>` token, so the standard formatting | ||
of the generic expression `a<b> > c` will not be warned against. This syntax was not encountered in | ||
the Crater run, so this is a safe change to make. | ||
|
||
The feature will initially be gated (e.g. `#![feature(undisambiguated_generics)]`). However, | ||
note that the parser changes will be present regardless of whether the feature is enabled or not, | ||
because feature detection occurs after parsing. However, because it has been shown that there are | ||
little-to-no performance regressions when modifying the parser and without taking advantage of `::` | ||
optionality, this should not be a problem. | ||
|
||
When `undisambiguated_generics` is not enabled, the parser modifications will allow us to | ||
provide better diagnostics: specifically, we'll be able to correctly suggest (in a | ||
machine-applicable manner) using `::` whenever the user has actually typed undisambiguated generic | ||
arguments. The current diagnostic suggestions suggesting the use of `::` trigger whenever there are | ||
chained comparisons, which has false positives and does not provide a fix suggestion. | ||
|
||
An allow-by-default lint `disambiguated_generics` will be added to suggest removing `::` when | ||
the feature is enabled. This is undesirable in most existing codebases, as the number of | ||
linted expressions is likely to be large, but could be useful for new codebases and in the future. | ||
|
||
Note that, apart from for those users who explicitly increase the level of the lint, no steps are | ||
taken to discourage the use of `::` at this stage (including in tools, such as rustfmt). (In the | ||
future we could consider raising the level to warn-by-default.) | ||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
The primary drawback is that resolving ambiguities in favour of generics means changing the | ||
interpretation of the two cases described above. However this has been demonstrated | ||
([1](https://github.com/rust-lang/rust/pull/53578#issuecomment-421475443)) not to | ||
cause issues in practice (in the former case particularly the syntax is unnatural and is actively | ||
warned against by the compiler). | ||
|
||
Additionally, there is potential for performance regressions due to backtracking (this change means | ||
that in theory parsing Rust requires unlimited lookahead, because ambiguous sequences of tokens | ||
could potentially be unlimited in length). However, empirical evidence | ||
([1](https://github.com/rust-lang/rust/pull/53511#issuecomment-414172984) and | ||
[2](https://github.com/rust-lang/rust/pull/53511#issuecomment-414360849)) suggests this should not | ||
be a problem. Although it is probable that a pathological example could be constructed that does | ||
result in poorer performance, such an example would not be representative of typical Rust code and | ||
therefore is not helpful to seriously consider. | ||
|
||
The other potential drawback is that other parsers for Rust's syntax (for example in external tools) | ||
would also have to implement some form of backtracking (or similar) to handle this case. However, | ||
backtracking is straightforward to implement in many forms of parser (such as recursive decent or | ||
combinatory parsers) and it is likely this will not cause significant problems. | ||
|
||
# Rationale and alternatives | ||
[rationale-and-alternatives]: #rationale-and-alternatives | ||
|
||
If we want to allow `::` to be omitted, there are two solutions: | ||
- Backtracking, as suggested here. | ||
- Using a parser for nondeterministic grammars, such as [GLL](http://dotat.at/tmp/gll.pdf). | ||
|
||
Although using a more sophisticated parser would come with its own advantages, it's an overly | ||
complex solution to this particular problem. Backtracking seems to work well in typical codebases | ||
and provides an immediate solution to the problem. | ||
|
||
Alternatively we could continue to require `::`. This would ensure there would be no performance | ||
implications, but would leave the nonconformal and surprising syntax in place. We could potentially | ||
use backtracking to provide the improved diagnostic suggestions to use `::`, while still preventing | ||
`::` from being omitted. | ||
|
||
## Future frequency of disambiguated generic expressions | ||
It is likely that, should the | ||
[generalised type ascription](https://github.com/rust-lang/rfcs/pull/2522) RFC be accepted and | ||
implemented, the number of cases where generic type arguments have to be provided is reduced, making | ||
users less likely to encounter the `::` construction. However, type ascription can still be more | ||
verbose than explicitly specifying type arguments when the respective type parameters appear in | ||
nested type constructors. For example: | ||
|
||
```rust | ||
// Given the following... | ||
fn foo<T>() -> Vec<HashSet<T>> { /* ... */ } | ||
// With type ascription we have: | ||
let x = foo(): Vec<HashSet<u8>>; | ||
// Whereas using generic arguments we have: | ||
let x = foo<u8>(); | ||
``` | ||
|
||
On top of that, the | ||
[const generics](https://github.com/rust-lang/rfcs/pull/2000) feature, currently in implementation, | ||
is conversely likely to *increase* the number of cases (especially where const generic arguments | ||
are not used as parameters in types). | ||
|
||
## Interaction with future features | ||
Note that this proposal does not conflict with: | ||
- Intuitive chained comparisons: i.e. `a < b < c` as being shorthand for `a < b && b < c` (and | ||
similar), should this syntax be proposed in the future. | ||
- Specifying const generic arguments in expressions without `::`, once they have been implemented. | ||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
Kotlin and C# 7.0 both have similar ambiguities with generic arguments in expressions and choose to | ||
resolve this ambiguity in favour of generic arguments, using a similar technique to that proposed | ||
here. | ||
|
||
# Unresolved questions | ||
[unresolved-questions]: #unresolved-questions | ||
|
||
- Should we warn against the ambiguous pair case initially? This would be more conservative, but | ||
considering that this pattern has not been encountered in the wild, this is probably unnecessary. | ||
- Should `(a < b, c > d)` parse as a pair of comparisons? In the aforementioned Crater run, this | ||
syntax was resolved as a generic expression followed by `d` (also causing no regressions), but | ||
we could hypothetically parse this unambiguously as a pair (though this would probably require more | ||
complex backtracking). |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a language argument, but you are underestimating the work required to do this correctly.
On encountering
<
parser needs to enter some kind of new speculative mode suppressing all side effects, including non-fatal diagnostics.We can ignore these details if the rollback is done only as a part of best-effort error recovery, but not if it's a part of the language, so the feature has some global effect on parser implementation rather than just calling
self.clone()
in one place.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, parser can accept pretty strange things and then report "semantic" errors later, but we won't be aware of these errors when disambiguating with backtracking.
This problem already exists in the formulation "what exactly is accepted under
cfg(false)
?", but in that case it's probably easier to fix e.g. by doing AST validation during expansion.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed this was necessary with rust-lang/rust@307ea60, though I imagined this wasn't sufficient for a final implementation. The language in the RFC wasn't intended to insinuate how difficult the task would be: the proof of concept is definitely only that!