Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Edition Based Method Disambiguation: Preventing inference ambiguity breakages with extension trait methods #3240

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
250 changes: 250 additions & 0 deletions text/0000-edition-based-method-disambiguation.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,250 @@
- Feature Name: `edition_based_method_disambiguation`
- Start Date: 2022-03-04
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

*Note*: **The changes proposed in this RFC do not tie methods to editions**. It
only turns currently allowed breakages between editions into a breakage at an
edition boundary, if there is no ambiguity the method is still callable
immediately upon stabilization as things work today. This RFC only changes
behavior when there is an ambiguity to make things that are currently errors
into warnings until the next edition boundary.

This RFC proposes a way to introduce new trait methods that conflict with
pre-edition[^1] trait methods in downstream crates in a backwards compatible fashion.
We do so by annotating new methods with the edition they're introduced in. Then
yaahc marked this conversation as resolved.
Show resolved Hide resolved
when ambigutity is detected between a new method in the standard library and an
pre-edition downstream method the compiler will check if the crate edition matches
the edition that the method was introduced in. If it does we pick the
pre-edition method and output a warning that there was
an ambigutity with a newly introduced std method and that this warning will be
promoted to a hard error in the next edition.

# Motivation
[motivation]: #motivation

Rust has had a long standing issue with breaking changes caused by introducing
new methods that conflict with pre-edition downstream methods. This issue is best
exemplified with the recent attempt to move `Itertools::intersperse` into the
`Iterator` trait which [broke a large number of
crates](https://github.com/rust-lang/rust/issues/88967). Continuing as we have
been and managing these breakages on a case by case level is not aligned with
our strict stability guarantees. The libs-api team needs a robust solution to
introduce methods like these without causing any breakage.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

*written as though this feature is already implemented and stable*

The rust standard library recently added support for adding new methods to
traits that conflict with pre-edition[^1] methods with the same name on other
traits without causing breakage due to ambiguity. Since adding this feature you
may start running into errors that look like this

```
warning[E0034]: multiple applicable items in scope
--> src/lib.rs:23:10
|
23 | it.intersperse(sep)
| ^^^^^^^^^^^ multiple `intersperse` found
|
note: candidate #1 is defined in an impl of the trait `Iterator` for the type `MyIter`
--> src/lib.rs:7:1
|
7 | impl Iterator for MyIter {
| ^^^^^^^^^^^^^^^^^^^^^^^^
= note: this was introduced in the current edition and has been deprioritized to prevent breakage
= warning: in the next edition this warning will become an error
note: candidate #2 is defined in an impl of the trait `Itertools` for the type `MyIter`
--> src/lib.rs:16:5
|
16 | fn intersperse(self, separator: Self::Item) -> Intersperse<Self> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
= note: to avoid the ambiguity, this candidate was selected to prevent breakage in edition < 20XX
help: disambiguate the associated function for candidate #1
|
23 | Iterator::intersperse(it, sep)
|
help: disambiguate the associated function for candidate #2
|
23 | Itertools::intersperse(it, sep)
|
```

These errors are an expected stage of the standard library development
lifecycle, where methods are experimented within 3rd party crates and then
moved into the standard library once they've been thorougly tested. A classic
example of this is the `Itertools` crate for experimenting with extentions to
the `Iterator` trait. However this problem isn't restricted to extension traits
of pre-edition standard library traits, and can indeed become a problem whenever
any two methods have the same name.

You can fix issues like this by manually editing the code to select the
specific version of the method you wish to use or, in certain common cases, you
can use cargo fix. cargo fix will make assumptions about how the methods relate
depending on if you're using cargo fix for an edition upgrade or not.

* **Within same edition** cargo fix will assume that the new method is a drop
in replacement of the pre-edition downstream one and will disambiguate by
selecting the upstream method defined in `std`.
Comment on lines +91 to +93
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we investigated how often this is actually a correct assumption? If I remember correctly, quite a few times we've stabilized something that's slightly different from something that already exists in the ecosystem. For example, std's new version might return things by reference where the version from the ecosystem might return by value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not but that seems like a good thing to check. I wasn't super confident about suggesting this behavior for rustfix so if it turns out we should only have the cross edition one and have users disambiguate manually for these cases that seems like a reasonable outcome as well. I'll try to find as many examples of these breakages as I can and start keeping a list.

(tossing in this comment for now, will move later)

* **As part of an edition upgrade** cargo fix will prioritize maintaining the
same behavior, and will disambiguate by selecting the pre-edition method that
was being used previously.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, this is somewhat counterintuitive to me. I understand where it's coming from, but the workflow seems surprising.


To run cargo fix within the same edition run:

```
cargo fix
```

In the example above this would replace the ambiguous code with
`Iterator::intersperse(it, sep)`, selecting the new implementation.

To run cargo fix as part of an edition upgrade run:

```
cargo fix --edition
```

In the example above this would replace the ambiguous code with
`Itertools::intersperse(it, sep)`, maintaining the pre-edition behavior.

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

This feature will be implemented by modifying the `rustc_stable` attribute to
support an additional optional `edition` field.
Comment on lines +119 to +120
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need a separate edition and since field? Would we ever have e.g. a since = "1.56.0" that's not editon = "2021"? Or a since = "1.55.0" that's not edition = "2018"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had initially considered not even having an edition field and having it be derived from the compiler version, but decided against it based on some advice from @Manishearth. His thinking if I recall correctly was that we want the ability to leave out the edition field for APIs that otherwise couldn't have caused breakage, because otherwise we might add a new type with some new methods and then people could add conflicting methods and get warnings, when really that conflicting method should have been an error from the start. We may very well want to enforce correctness by cross referencing the edition and since fields to ensure consistency, but I think we do want both of them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think that makes sense. Then the only question left is whether we'll actually be able to judge properly if a change can cause breakage. And would use this only for known cases of breakage, or for all things that can theoretically cause breakage?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think that makes sense. Then the only question left is whether we'll actually be able to judge properly if a change can cause breakage. And would use this only for known cases of breakage, or for all things that can theoretically cause breakage?

My expectation is that it's probably fine if we leave off the edition field when we're unsure, and if we run into reports of breakage or see them on crater we can always go back and add the edition field to the offending API to resolve that breakage immediately. At a minimum though I imagine we'd want to always put an edition field on stability attributes for new methods on existing types and traits, and on any new Display impls for types that already have Debug impls.


During method resolution, when we detect an ambiguity we should then check if
one of the methods in question is a standard library method with an `edition`
field. When the edition field exists in the stability attribute and the edition
field of that method matches the current crate's edition we ignore that method
Copy link
Member

@pnkfelix pnkfelix Jan 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The text should clarify what the word "matches" means above, or use different phrasing. (When I first read it, I thought "matches" meant "equals" and thought that the text was only covering the cases of {stdlib-method < crate, stdlib-method = crate} and not stdlib-method > crate.

(At this point I think the intent is that "matches" is meant to be read as stdlib-method >= crate, as in "the edition field of the standard library method is greater than or equal to the current crate's edition". I'll post a separate comment about that supposed intent.)

and select the pre-edition method that conflicted with it and generate a
warning. If the edition field in the stability attribute is an earlier edition
than the crate's edition we continue as normal and emit an error for the
ambiguity.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems plausible, but as noted elsewhere this reasoning would like only apply to one level of autoderef. An alternative might be:

  • If a method with this "dependent" character exists and we are in the "compatibility" mode:
    • Run the method selection probe once without the method in the list.
    • If this gets a "no such method" error, run again WITH the method in the list.

This seems like it would be strictly non-breaking.

My hunch is that it makes sense -- at this RFC stage -- to leave this unspecified and instead to focus the RFC on enumerated interesting examples and how we would ideally like them to behave. I can come up with various patterns to think about. Then we worry about how to implement it.


This flag should be usable to resolve the following forms of breakage:

* A new method on an pre-edition trait ([e.g.
itertools::intersperse](https://github.com/rust-lang/rust/issues/88967))
* A new trait implementation of an pre-edition trait on an pre-edition type ([e.g.
ErrorKind Display](https://github.com/rust-lang/rust/issues/94507))
* A new inherent method on an pre-edition type (no recent examples)

# Drawbacks
[drawbacks]: #drawbacks

## Disambiguation can require invasive changes

In simple cases where you only call a single method in an expression switching
from the ambiguous method call syntax, `self.method(args...)` to the
unambiguous function call syntax `Trait::method(self, args...)` is an easy
change. In longer method call chains however there isn't a way to disambiguate
the trait a method is associated with when calling it without splitting up that
expression into multiple expressions, which can change drop behavior and
prevent temporaries for living as long as they need to.

This RFC intentionally avoids solving this problem or even proposing strawmen
versions of the syntax to avoid distracting from the core issue, but at the
same time it increases the need for a language syntax extension for quickly
disambiguating the trait a method call should come from.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

[This
comment](https://github.com/rust-lang/rust/issues/88967#issuecomment-938024847)
on the `Iterator::intersperse` issue details a few alternatives that were
discussed by the libs team when we encountered that specific issue. These
include both short term and long term suggestions.

**Short Term**

* use build scripts in crates where this breakage is expected to detect when
the breakage is introduced by the compiler and disable the conflicting APIs
internally using `#[cfg]` directives.
* use `#[cfg(accessible)]` in the crate where the breakage is expected to
automatically remove the conflicting API when the upstream API is introduced.

These solutions don't solve the problem generally, and instead address the
specific breakage that are known or expected. We can and do catch many such
issues via crater but we cannot test all crates via crater and we still end up
breaking people's crates on nightly before the crater runs have had a chance to
catch any breakage.

**Longer Term**

* `rust-version` based visibility filtering - make it a hard error to use APIs
that were introduced in later versions of Rust than your current Minimum
Supported Rust Version (MSRV) as specified in the `rust-version` field.
* [Supertrait item shadowing](https://github.com/rust-lang/rfcs/pull/2845)

These proposals are not actually alternatives, but rather complementary
features that help reduce breakages and which should be persued alongside this
RFC.

The `rust-version` approach would prevent many breakages by not ever resolving
ambiguous method calls to new methods when those new methods are introduced in
later versions than your MSRV, but it would not be a complete solution by
itself since otherwise it would turn bumping MSRV into a breaking change. This
is counter to our stability policy which promises to only introduce breaking
changes at edition boundaries.

The Supertrait item shadowing RFC would prevent breakages where traits have a
supertrait/subtrait relationship such as in the `Iterator`/`Itertools` case and
would give us a better fallback, where we can immediately resolve methods to
the supertrait instance within the same edition rather than producing the
warning, but it does not help with situations like the `Display`/`Debug`
breakage or with new inherent methods where a supertrait/subtrait relationship
does not exist.

# Prior art
[prior-art]: #prior-art

- [previous discussion on irlo](https://internals.rust-lang.org/t/idea-paths-in-method-names/6834/14?u=scottmcm)

# Unresolved questions
[unresolved-questions]: #unresolved-questions

## Interaction with autoderef

We already have logic for preferring stable methods to unstable methods but it
breaks[^2] if the auto deref for one requires one more level of indirection
than the other. We should be careful to consider how autoderef behavior can
affect edition based method disambiguation.
yaahc marked this conversation as resolved.
Show resolved Hide resolved

As a prior example, the addition of `into_iter` for arrays was done via special
case treatment in the compiler because of this exact sort of breakage in
autoderef precidence. If we can make this edition based disambiguation properly
handle autoderef precidence we maybe able to remove that special case handling
for array's `into_iter` impl and replace it with an `edition = "2018"` field in
its stability attribute.

# Future possibilities
yaahc marked this conversation as resolved.
Show resolved Hide resolved
[future-possibilities]: #future-possibilities

## Unambiguous method call syntax

As this RFC previously pointed out in the drawbacks section, introducing a new
syntax for unambiguous method calls for a specific trait would significantly
improve the experience of resolving these warnings.

## Extension to 3rd party crates ecosystem

The lang teams is already persuing the possibility of [stabilizing stability
attributes](https://github.com/rust-lang/lang-team/blob/master/design-meeting-minutes/2022-02-16-libs-most-wanted.md#make-stable-and-unstable-available-for-third-party-crates)
to allow 3rd party crates to mark APIs as `#[stable]` or `#[unstable]`. We
would likely need to consider how this disambiguation functionality would be
extended along with the stability attributes. How it would interact with semver
and editions, and whether we could better support crates that take a similar
perma-1.0 stability policy to that of `std`.

[^1]: Definition: Pre-edition methods are methods that could legally have been
introduced during the current crate's edition which do not conflict with any
methods that existed during the initial release of that edition.
[^2]: https://github.com/rust-lang/rust/issues/86682