Supertrait item shadowing v2 #3624

131 changes: 131 additions & 0 deletions text/0000-supertrait-item-shadowing-v2.md
@@ -0,0 +1,131 @@
- Feature Name: `supertrait_item_shadowing`
- Start Date: 2024-05-04
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

When name resolution encounters an ambiguity between two trait methods whose traits are both in scope, and one trait is a sub-trait of the other, select the sub-trait's method instead of reporting an ambiguity error.

Review comment (Member):

I would prefer if we call this "method selection" or "method resolution", here and all other instances below. There's a pretty big difference between name resolution (which is performed pre-type-check) and method resolution (which is performed during typeck).


# Motivation
[motivation]: #motivation


The libs-api team would like to stabilize `Iterator::intersperse` but has a problem. The `itertools` crate already has:

```rust
// itertools
trait Itertools: Iterator {
    fn intersperse(self, element: Self::Item) -> Intersperse<Self>;
}
```

This method is used in crates with code similar to the following:

```rust
use core::iter::Iterator; // Implicit import from prelude

use itertools::Itertools as _;

fn foo() -> impl Iterator<Item = &'static str> {
    "1,2,3".split(",").intersperse("|")
    //                 ^ This is ambiguous: it could refer to Iterator::intersperse or Itertools::intersperse
}
```

This code compiles today only because `Iterator::intersperse` is still unstable: the compiler already has [logic](https://github.com/rust-lang/rust/pull/48552) to prefer stable methods over unstable methods when an ambiguity occurs.

Attempts to stabilize `intersperse` have failed with a large number of regressions [reported by crater](https://github.com/rust-lang/rust/issues/88967) which affect many popular crates. Even if these were to be manually corrected (since ambiguity is considered allowed breakage) we would have to go through this whole process again every time a method from `itertools` is uplifted to the standard library.

# Proposed solution
[proposed-solution]: #proposed-solution

This RFC proposes to change name resolution to resolve the ambiguity in the following specific circumstances:
- All method candidates are trait methods. (Inherent methods are already prioritized over trait methods)
- One trait is transitively a sub-trait of all other traits in the candidate list.

When this happens, the sub-trait method is selected instead of reporting an ambiguity error.

Note that this only happens when *both* traits are in scope since this is required for the ambiguity to occur in the first place.
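
As a rough illustration of the rule above, the sketch below uses two hypothetical traits, `Super` and `Sub` (neither appears in the RFC), that both define a method called `name`. Without the proposed rule, a plain `t.name()` call is rejected as ambiguous when both traits are in scope, so the sketch spells out the two candidates with path syntax. Under this RFC, `t.name()` would be expected to pick the `Sub::name` candidate.

```rust
// Hypothetical traits for illustration only; they are not part of the RFC.
trait Super {
    fn name(&self) -> &'static str {
        "Super::name"
    }
}

// `Sub` is a sub-trait of `Super` and declares a method with the same name.
trait Sub: Super {
    fn name(&self) -> &'static str {
        "Sub::name"
    }
}

struct Thing;

impl Super for Thing {}
impl Sub for Thing {}

/// Returns the result of each candidate, selected explicitly.
///
/// Writing `t.name()` here is an ambiguity error without the proposed rule,
/// because both `Super::name` and `Sub::name` apply. Path syntax always
/// works and names exactly one candidate.
fn resolve_explicitly(t: &Thing) -> (&'static str, &'static str) {
    let from_sub = Sub::name(t); // the candidate this RFC would select
    let from_super = <Thing as Super>::name(t); // the supertrait candidate
    (from_sub, from_super)
}
```

Calling `resolve_explicitly(&Thing)` yields the sub-trait result first and the supertrait result second. The explicit forms remain available under the proposal for callers who want the supertrait method.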

# Drawbacks
[drawbacks]: #drawbacks

This behavior can be surprising: adding a method to a sub-trait can change which function is called in unrelated code. A lint could be emitted to warn users about the potential ambiguity.
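
For comparison, a similar kind of silent selection already exists: as noted in the proposed solution, inherent methods are prioritized over trait methods. The sketch below uses a hypothetical `Widget` type and `Greet` trait (assumptions made for illustration, not taken from the RFC) to show that method-call syntax picks the inherent method without any diagnostic, while the trait method stays reachable through path syntax.

```rust
// Hypothetical trait and type for illustration only.
trait Greet {
    fn greet(&self) -> &'static str {
        "trait"
    }
}

struct Widget;

impl Widget {
    // An inherent method with the same name as the trait method.
    fn greet(&self) -> &'static str {
        "inherent"
    }
}

impl Greet for Widget {}

/// Calls `greet` with method syntax and with an explicit trait path.
fn call_both(w: &Widget) -> (&'static str, &'static str) {
    // Method syntax selects the inherent method; no ambiguity is reported.
    let via_method_syntax = w.greet();
    // Path syntax names the trait method explicitly.
    let via_trait_path = Greet::greet(w);
    (via_method_syntax, via_trait_path)
}
```

Adding the inherent `greet` to an existing type changes what `w.greet()` calls in the same quiet way the drawback describes for sub-trait methods, which is why a lint is suggested for the new case.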

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

If we choose not to accept this RFC then there doesn't seem to be a reasonable path for adding new methods to the `Iterator` trait if such methods are already provided by `itertools` without a lot of ecosystem churn.

Review comment:

Why is picking a different name not a reasonable path? The unavoidable bikeshedding around a new name is annoying, but it seems to me like a small, one-time cost compared to the permanent additional language complexity of this feature.

I also wonder how many times we anticipate to run into this problem in the future. Are there more examples aside from Itertools::intersperse? If we only ran into this problem once within 9 years of Rust being stable, the benefit of this feature seems very limited.

Review comment from @ripytide (May 4, 2024):

Picking a new name also has disadvantages other than the time taken to pick it: users then have to learn the new method name and that it is semantically identical to `Itertools::intersperse()` despite the different name. This is not a one-time cost, as it will affect all future users when, for example, searching docs for this method. This would be the case for all methods stabilized from itertools to std which might be quite a few if we had a reliable way to do such a migration.

Review comment:

> users then have to learn the new method name

I don't see the problem. People who are using intersperse from itertools shouldn't expect the standard library to use the exact same name. And I assume few people rely on the unstable version in std for production code, it would be unwise to opt into breaking changes for what amounts to a small ergonomics improvement (which a library can also provide).

For regular Rust users, the function doesn't exist right now. In the future it may - its name is an open question.

> This would be the case for all methods stabilized from itertools to std which might be quite a few

I agree that this is something to consider, we don't want to have to pick the "second best" name for many functions in std. But first we should think concretely about which methods from itertools might actually make the jump into std. itertools has been around for a long time and I'm not aware of a big push to upstream many of its methods. Which indicates to me, there isn't that much need. But I'm happy to be convinced otherwise. Examples from other libraries besides itertools count as well.

Review comment from @digama0 (Contributor, May 4, 2024):

The issue is that because itertools is used widely, std cannot upstream methods from itertools without causing lots of breakage in all crates currently using itertools. This is a perverse incentive which forces the "second best" issue you mentioned, and it's not limited to itertools, it happens whenever std lacks a function, someone implements an extension trait to add it in a crate (as they should), and everyone picks it up because it is really useful (as they should). The exact sequence of events which leads to strong evidence that something should be in std is also the sequence of events that blocks it from being added to std.

Review comment:

I understand that. The reason I'm pushing back is that to me, the feature doesn't seem to align with Rust's design principles. This new implicit default doesn't seem obviously correct to me. If I call a method that has two implementations, I would generally prefer the compiler yell at me rather than pick one without telling me. That's why I think we should be certain that we'll make good use of this feature before adding it.

But I'll admit this is a theoretical objection. In practice, the problem may never show up and then it's fine to add the feature. Pragmatism comes first. I guess I just agree with this comment. We should think about edge cases where this could go wrong.

Review comment from @cynecx (May 5, 2024):

> I would generally prefer the compiler yell at me rather than pick one without telling me

The RFC mentions this:

> This behavior can be surprising: adding a method to a sub-trait can change which function is called in unrelated code. A lint could be emitted to warn users about the potential ambiguity.

Review comment:

Yeah, a lint would make sense if the migration-situation with intersperse is the only use case we expect this feature to be used in practice. But maybe some users will use the shadowing to implement a form of specialization? Or any other unrelated use case I can't think of right now. If that happens, it might lead to a discussion about whether the lint should be enabled by default or not. And if it ends up allow-by-default, it loses much of its value.


## Only doing this for specific traits

One possible alternative to a general change to the name resolution rules would be to only do so on a case-by-case basis for specific methods in standard library traits. This could be done by using a perma-unstable `#[shadowable]` attribute specifically on methods like `Iterator::intersperse`.

There are both advantages and disadvantages to this approach. While it allows most Rust users to avoid having to think about this issue for most traits, it does make the `Iterator` trait more "magical" in that it doesn't follow the same rules as the rest of the language. Having a consistent rule for how name resolution works is easier to teach people.

## Preferring the supertrait method instead

In cases of ambiguity between a subtrait method and a supertrait method, there are 2 ways of resolving the ambiguity. This RFC proposes to resolve in favor of the subtrait since this is most likely to avoid breaking changes in practice.

Consider this situation:

- library A has trait `Foo`
- crate B, depending on A, has trait `FooExt` with `Foo` as a supertrait
- A adds a new method to `Foo`, but it has a default implementation so it's not breaking. B has a pre-existing method with the same name.

In this general case, the reason this cannot be resolved in favor of the supertrait is that the method signatures are not necessarily compatible.

[In code](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=b3919f7a8480c445d40b18a240936a07):

```rust
#![allow(unused)]

mod a {
    pub trait Int {
        // fn call(&self) -> u32 {
        //     0
        // }
    }
    impl Int for () {}
}

mod b {
    pub trait Int: super::a::Int {
        fn call(&self) -> u8 {
            0
        }
    }
    impl Int for () {}
}

use a::Int as _;
use b::Int as _;

fn main() {
    let val = ().call();
    println!("{}", std::any::type_name_of_val(&val));
}
```

Resolving in favor of `a` is a breaking change; resolving in favor of `b` is not. The only other option is the status quo: a compile error. Resolving in favor of `a` is therefore ruled out, since it would violate backwards compatibility, and the status quo is not ideal.

# Prior art
[prior-art]: #prior-art

### RFC 2845

RFC 2845 was a previous attempt to address this problem, but it has several drawbacks:
- It doesn't fully address the problem since it only changes name resolution when trait methods are resolved due to generic bounds. In practice, most of the ambiguity from stabilizing `intersperse` comes from non-generic code.
- It adds a lot of complexity because name resolution depends on the specific trait bounds that have been brought into scope.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

None

# Future possibilities
[future-possibilities]: #future-possibilities

None