uniquifying region constraints #30

Open
lcnr opened this issue Jun 12, 2023 · 1 comment

lcnr commented Jun 12, 2023

Given a goal like u32: Trait<'a, 'a>, should this be canonicalized to exists<'0> u32: Trait<'0, '0> or exists<'0, '1> u32: Trait<'0, '1>? Using the first variant causes issues in MIR typeck (#27). Going with the second approach will force us to use a semantic lookup for opaque types (#17).
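To make the two options concrete, here is a minimal sketch of the difference. It is a toy model and not the compiler's canonicalizer: regions are plain strings, and `canonicalize_regions` and its `uniquify` flag are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// Toy stand-in for the region arguments of a goal, e.g. `["'a", "'a"]`
/// for `u32: Trait<'a, 'a>`. Returns the index of the existential
/// region variable assigned to each occurrence.
fn canonicalize_regions(regions: &[&str], uniquify: bool) -> Vec<usize> {
    let mut seen: HashMap<&str, usize> = HashMap::new();
    let mut next_var = 0;
    let mut out = Vec::with_capacity(regions.len());
    for &region in regions {
        let var = if uniquify {
            // Every occurrence gets a fresh variable, so the canonical
            // goal no longer records that two regions were equal.
            let v = next_var;
            next_var += 1;
            v
        } else {
            // Equal regions share one variable.
            *seen.entry(region).or_insert_with(|| {
                let v = next_var;
                next_var += 1;
                v
            })
        };
        out.push(var);
    }
    out
}
```

In this model `canonicalize_regions(&["'a", "'a"], false)` corresponds to `exists<'0> u32: Trait<'0, '0>`, while passing `true` corresponds to `exists<'0, '1> u32: Trait<'0, '1>`.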

https://rust-lang.zulipchat.com/#narrow/stream/364551-t-types.2Ftrait-system-refactor/topic/mir.20typeck.20and.20relying.20on.20region.20equality

We ended up going back to uniquifying region constraints in rust-lang/rust#114117. This feels like the only way to avoid the ICEs in MIR typeck from #27.

lcnr commented Jul 24, 2024

Uniquifying region constraints also causes hangs: without it we could (mostly) cache in all type folders. We tried to re-add a type-size limit in rust-lang/rust#125507 to eagerly error in these cases instead of hanging, and later reverted that change as it caused breakage in the wild (rust-lang/rust#127670). We therefore potentially have to handle such large types in the trait solver to avoid hangs.

If we don't uniquify, we have to make sure that regions still do not impact behavior. For #27 this could be achieved by supporting OR region constraints. However, we would also be forced to stop preferring trivial candidates over others: https://github.com/rust-lang/rust/blob/08a9ca7c18a30a23a72a43b65be616c9a6a36a5a/compiler/rustc_next_trait_solver/src/solve/mod.rs#L237-L243

If we have one ambiguous candidate and one trivially true candidate which relates two regions, then this trivially true candidate can be preferred if the two regions are known to be the same, but we're unable to do so otherwise.

bors added a commit to rust-lang-ci/rust that referenced this issue Dec 4, 2024
rework winnowing to sensibly handle global where-bounds

There may be multiple ways to prove a given trait-bound. In case there are multiple such applicable candidates we need to somehow merge them or bail with ambiguity. When merging candidates we prefer some over others for multiple reasons:

- we want to guide inference during typeck, even if not strictly necessary
- to avoid ambiguity if there are at most lifetime differences
    - old solver needs exactly one candidate
    - new solver only needs to handle lifetime differences
- we disable normalization via impls if the goal is proven by using a where-bound

## The approach in this PR[^1]

- always prefer trivial builtin impls[^6]
- then prefer non-global[^global] where-bounds
    - if there exists exactly one where-bound, guide inference
    - if there are multiple where-bounds even if some of them are global, ambig
- then prefer alias bounds[^2] and builtin trait object candidates[^3][^2]
- merge everything ignoring global where-bounds
- if there are no other candidates, try using global where-bounds[^5]

**We disable normalization via impls when using non-global where-bounds or alias-bounds, even if we're unable to normalize by using the where-bound.**
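The priority order above can be sketched as a small decision function. This is a simplified model under stated assumptions, not the solver's implementation: `Source`, `Winnowed` and `winnow` are hypothetical names, candidates are reduced to their kind, and "merging" is left abstract. Picking an alias bound before an object candidate is an arbitrary choice of this sketch.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Source {
    TrivialBuiltin,
    NonGlobalWhereBound,
    GlobalWhereBound,
    AliasBound,
    ObjectBound,
    Impl,
}

#[derive(Debug, PartialEq, Eq)]
enum Winnowed {
    /// Use exactly this kind of candidate.
    Use(Source),
    /// Hand these candidates to the usual merging logic.
    Merge(Vec<Source>),
    Ambiguous,
    NoCandidates,
}

fn winnow(candidates: &[Source]) -> Winnowed {
    use Source::*;
    let has = |s: Source| candidates.contains(&s);
    let count = |s: Source| candidates.iter().filter(|&&c| c == s).count();

    // 1. Always prefer trivial builtin impls.
    if has(TrivialBuiltin) {
        return Winnowed::Use(TrivialBuiltin);
    }
    // 2. Then prefer non-global where-bounds, but only guide inference
    //    if it is the only where-bound; global ones count here as well.
    if has(NonGlobalWhereBound) {
        let where_bounds = count(NonGlobalWhereBound) + count(GlobalWhereBound);
        return if where_bounds == 1 {
            Winnowed::Use(NonGlobalWhereBound)
        } else {
            Winnowed::Ambiguous
        };
    }
    // 3. Then prefer alias bounds and builtin trait object candidates.
    for preferred in [AliasBound, ObjectBound] {
        if has(preferred) {
            return Winnowed::Use(preferred);
        }
    }
    // 4. Merge everything, ignoring global where-bounds.
    let rest: Vec<Source> = candidates
        .iter()
        .copied()
        .filter(|&c| c != GlobalWhereBound)
        .collect();
    if !rest.is_empty() {
        return Winnowed::Merge(rest);
    }
    // 5. Only global where-bounds (or nothing) are left.
    if candidates.is_empty() {
        Winnowed::NoCandidates
    } else {
        Winnowed::Merge(candidates.to_vec())
    }
}
```

In this model, a goal with one non-global where-bound, one global where-bound and an impl candidate yields `Winnowed::Ambiguous`, which is the case where the global where-bound weakens inference guidance.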

[^1]: see the source for more details
[^2]: [we arbitrarily select a single object and alias-bound candidate in case multiple apply and they don't impact inference](https://github.com/rust-lang/rust/blob/a4cedecc9ec76b46dcbb954750068c832cf2dd43/compiler/rustc_trait_selection/src/traits/select/mod.rs#L1906-L1911). This should be unnecessary in the new solver.
[^3]: Necessary for `dyn Any` and rust-lang#57893
[^global]: a where-bound is global if it is not higher-ranked and doesn't contain any generic parameters, `'static` is ok
[^5]: global where-bounds are only used if they are unsatisfiable, i.e. no impl candidate exists
[^6]: they don't constrain inference and don't add any lifetime constraints

## Why this behavior?

### inference guidance via where-bounds and alias-bounds

#### where-bounds

```rust
fn method_selection<T: Into<u64>>(x: T) -> Option<u32> {
    x.into().try_into().ok()
    // prove `T: Into<?0>` and then select a method on `?0`,
    // needs eager inference.
}
```

While the above pattern exists in the wild, I think that most inference guidance due to where-bounds is actually unintended. I believe we may want to restrict inference guidance in the future, e.g. limit it to where-bounds whose self-type is a param.

#### alias-bounds

```rust
pub trait Dyn {
    type Word: Into<u64>;
    fn d_tag(&self) -> Self::Word;
    fn tag32(&self) -> Option<u32> {
        self.d_tag().into().try_into().ok()
        // prove `Self::Word: Into<?0>` and then select a method
        // on `?0`, needs eager inference.
    }
}
```

### Disable normalization via impls when using where-bounds

cc rust-lang/trait-system-refactor-initiative#125

```rust
trait Trait<'a> {
    type Assoc;
}

impl<T> Trait<'static> for T {
    type Assoc = ();
}

// normalizing requires `'a == 'static`, the trait bound does not.
fn foo<'a, T: Trait<'a>>(_: T::Assoc) {}
```

If an impl adds constraints not required by a where-bound, using the impl may cause compilation failure avoided by treating the associated type as rigid.

This is also why we can always use trivial builtin impls, even for normalization. They are guaranteed to never add any requirements.

### Lower priority for global where-bounds

A where-bound is considered global if it does not refer to any generic parameters and is not higher-ranked. It may refer to `'static`.

This means global where-bounds are either always fully implied by an impl or unsatisfiable. We don't really care about the inference behavior of unsatisfiable where-bounds :3

If a where-bound is fully implied then using an applicable impl for normalization cannot result in additional constraints. As this is the - afaict only - reason why we disable normalization via impls in the first place, we don't have to disable normalization via impls when encountering global where-bounds.
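The definition of "global" used in this section can be written down as a small predicate. This is only a sketch over a toy representation: `Arg`, `WhereBound` and `is_global` are hypothetical names, not the compiler's types.

```rust
/// Toy model of the things a where-bound may mention.
enum Arg {
    /// A generic type parameter such as `T`.
    TypeParam(&'static str),
    /// A lifetime parameter such as `'a`.
    RegionParam(&'static str),
    /// A concrete type such as `u32`.
    Concrete(&'static str),
    /// The `'static` lifetime.
    StaticRegion,
}

/// A where-bound, reduced to what matters for the global check.
struct WhereBound {
    /// Whether the bound is of the form `for<'a> ...`.
    higher_ranked: bool,
    /// Self type and generic arguments of the bound.
    args: Vec<Arg>,
}

/// Global: not higher-ranked and no generic parameters; `'static` is ok.
fn is_global(bound: &WhereBound) -> bool {
    !bound.higher_ranked
        && bound
            .args
            .iter()
            .all(|arg| matches!(arg, Arg::Concrete(_) | Arg::StaticRegion))
}
```

Under this model `u32: Trait<'static>` is global, while `u32: Trait<'a>` and `T: Into<u64>` are not.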

### Consider global where-bounds at all

Given that we just use impls even if there exists a global where-bound, you may ask why we don't just ignore these global where-bounds entirely: we use them to weaken the inference guidance from non-global where-bounds.

Without a global where-bound, we currently prefer non-global where-bounds even though there would be an applicable impl as well. By adding a global where-bound, this *unnecessary* inference guidance is disabled, allowing the following to compile:
```rust
fn check<Color>(color: Color)
where
    Vec: Into<Color> + Into<f32>,
{
    let _: f32 = Vec.into();
    // Without the global `Vec: Into<f32>`  bound we'd
    // eagerly use the non-global `Vec: Into<Color>` bound
    // here, causing this to fail.
}

struct Vec;
impl From<Vec> for f32 {
    fn from(_: Vec) -> Self {
        loop {}
    }
}
```
[There exist multiple crates which rely on this behavior](rust-lang#124592 (comment)).

## Design considerations

We would like to be able to normalize via impls as much as possible. Disabling normalization simply because there exists a where-bound is undesirable.

For the sake of backwards compatibility I intend to mostly mirror the current inference guidance rules and then explore possible improvements once the new solver is done. I do believe, however, that removing unnecessary inference guidance where possible is desirable.

Whether a where-bound is global depends on whether used lifetimes are `'static`. The where-bound `u32: Trait<'static>` is either entirely implied by an impl or unsatisfiable, meaning that it does not have to disable normalization via impls, **while `u32: Trait<'a>` needs to disable normalization via impls as the impl may only hold for `'static`**. Considering all where-bounds to be non-global once they contain any region is unfortunately a breaking change.

## How does this differ from stable

The currently stable approach is order dependent:
- it prefers impls over global where-bounds: impl > global
- it prefers non-global where-bounds over impls: non-global > impl
- it treats all where-bounds equally: global = non-global

This means that whether we bail with ambiguity or simply use the non-global where-bound depends on the *order of where-clauses* and the *number of applicable impl candidates*. See the tests added in the first commit for more details. With this PR we now always bail with ambiguity.

I've previously tried to always use the non-global candidate, causing unnecessary inference guidance and undesirable breakage. This already went through an FCP in rust-lang#124592. However, I consider the new approach to be preferable as it exclusively removes incompleteness. It also doesn't cause any crater breakage.

## How to support this in the new solver :o

**This is separately implemented in rust-lang#133643 and not part of this FCP!**

To implement the global vs non-global where-bound distinction, we have to either keep `'static` in the `param_env` when canonicalizing, or eagerly distinguish global from non-global where-bounds and provide that information to the canonical query.

The old solver currently keeps `'static` only in the `param_env`, replacing it with an inference variable in the `value`.
https://github.com/rust-lang/rust/blob/a4cedecc9ec76b46dcbb954750068c832cf2dd43/compiler/rustc_infer/src/infer/canonical/canonicalizer.rs#L49-L64

I dislike that based on *vibes* and it may end up being a problem once we extend the environment inside of the solver as [we must not rely on `'static` in the `predicate` as it would get erased in MIR typeck](rust-lang/trait-system-refactor-initiative#30).

An alternative would be to eagerly detect trivial where-bounds when constructing the `ParamEnv`. We can't entirely drop them [as explained above](https://hackmd.io/qoesqyzVTe2v9cOgFXd2SQ#Consider-true-global-where-bounds-at-all), so we'd instead replace them with a new clause kind `TraitImpliedByImpl` which gets entirely ignored except when checking whether we should eagerly guide inference via a where-bound. This approach can be extended to where-bounds which are currently not considered global to stop disabling normalization for them as well.
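A rough sketch of how such a clause kind might behave, assuming a toy environment of string-labelled clauses. Only the name `TraitImpliedByImpl` comes from the text above; `Clause`, `proving_candidates` and `guidance_bound_count` are hypothetical and exist purely for illustration.

```rust
/// Toy clause kinds for a `ParamEnv`-like list.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Clause {
    /// An ordinary where-bound.
    Trait(String),
    /// A where-bound detected as trivial (global) when building the env.
    TraitImpliedByImpl(String),
}

/// Where-bounds that may be used as candidates when proving a goal.
/// `TraitImpliedByImpl` clauses are ignored here.
fn proving_candidates(env: &[Clause]) -> Vec<&Clause> {
    env.iter()
        .filter(|clause| matches!(clause, Clause::Trait(_)))
        .collect()
}

/// Number of where-bounds considered when deciding whether a single
/// where-bound should eagerly guide inference. Both kinds count.
fn guidance_bound_count(env: &[Clause]) -> usize {
    env.len()
}
```

For the `Vec: Into<Color> + Into<f32>` example, the environment would hold one `Trait` clause and one `TraitImpliedByImpl` clause: only the former is a proving candidate, yet the count of two is what stops it from guiding inference.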

Keeping `'static` in the `param_env` is the simpler solution here and we should be able to move to the second approach without any breakage. I therefore propose to keep `'static` in the environment for now.

---

r? `@compiler-errors`
bors added a commit to rust-lang-ci/rust that referenced this issue Dec 17, 2024
rework winnowing to sensibly handle global where-bounds
