Skip to content

Commit

Permalink
update new trait solver docs (#1802)
Browse files Browse the repository at this point in the history
* rewrite requirements/invariants

* add some more info about the trait solver

* CI

* review
  • Loading branch information
lcnr authored Oct 5, 2023
1 parent b7e20c4 commit b98af7d
Show file tree
Hide file tree
Showing 4 changed files with 212 additions and 78 deletions.
1 change: 1 addition & 0 deletions src/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -124,6 +124,7 @@
- [Goals and clauses](./traits/goals-and-clauses.md)
- [Canonical queries](./traits/canonical-queries.md)
- [Next-gen trait solving](./solve/trait-solving.md)
- [Invariants of the type system](./solve/invariants.md)
- [The solver](./solve/the-solver.md)
- [Canonicalization](./solve/canonicalization.md)
- [Coinduction](./solve/coinduction.md)
Expand Down
145 changes: 145 additions & 0 deletions src/solve/invariants.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,145 @@
# Invariants of the type system

FIXME: This file talks about invariants of the type system as a whole, not only the solver

There are a lot of invariants - things the type system guarantees to be true at all times -
which are desirable or expected from other languages and type systems. Unfortunately, quite
a few of them do not hold in Rust right now. This is either a fundamental to its design or
caused by bugs and something that may change in the future.

It is important to know about the things you can assume while working on - and with - the
type system, so here's an incomplete and inofficial list of invariants of
the core type system:

- ✅: this invariant mostly holds, with some weird exceptions, you can rely on it outside
of these cases
- ❌: this invariant does not hold, either due to bugs or by design, you must not rely on
it for soundness or have to be incredibly careful when doing so

### `wf(X)` implies `wf(normalize(X))`

If a type containing aliases is well-formed, it should also be
well-formed after normalizing said aliases. We rely on this as
otherwise we would have to re-check for well-formedness for these
types.

This is unfortunately broken for `<fndef as FnOnce<..>>::Output` due to implied bounds,
resulting in [#114936].

### applying inference results from a goal does not change its result ❌

TODO: this invariant is formulated in a weird way and needs to be elaborated.
Pretty much: I would like this check to only fail if there's a solver bug:
https://github.com/rust-lang/rust/blob/2ffeb4636b4ae376f716dc4378a7efb37632dc2d/compiler/rustc_trait_selection/src/solve/eval_ctxt.rs#L391-L407

If we prove some goal/equate types/whatever, apply the resulting inference constraints,
and then redo the original action, the result should be the same.

This unfortunately does not hold - at least in the new solver - due to a few annoying reasons.

### The trait solver has to be *locally sound*

This means that we must never return *success* for goals for which no `impl` exists. That would
mean we assume a trait is implemented even though it is not, which is very likely to result in
actual unsoundness. When using `where`-bounds to prove a goal, the `impl` will be provided by the
user of the item.

This invariant only holds if we check region constraints. As we do not check region constraints
during implicit negative overlap check in coherence, this invariant is broken there. As this check
relies on *completeness* of the trait solver, it is not able to use the current region constraints
check - `InferCtxt::resolve_regions` - as its handling of type outlives goals is incomplete.

### Normalization of semantically equal aliases in empty environments results in a unique type ✅

Normalization for alias types/consts has to have a unique result. Otherwise we can easily
implement transmute in safe code. Given the following function, we have to make sure that
the input and output types always get normalized to the same concrete type.

```rust
fn foo<T: Trait>(
x: <T as Trait>::Assoc
) -> <T as Trait>::Assoc {
x
}
```

Many of the currently known unsound issues end up relying on this invariant being broken.
It is however very difficult to imagine a sound type system without this invariant, so
the issue is that the invariant is broken, not that we incorrectly rely on it.

### Generic goals and their instantiations have the same result ✅

Pretty much: If we successfully typecheck a generic function concrete instantiations
of that function should also typeck. We should not get errors post-monomorphization.
We can however get overflow errors at that point.

TODO: example for overflow error post-monomorphization

This invariant is relied on to allow the normalization of generic aliases. Breaking
it can easily result in unsoundness, e.g. [#57893](https://github.com/rust-lang/rust/issues/57893)

### Trait goals in empty environments are proven by a unique impl ✅

If a trait goal holds with an empty environment, there should be a unique `impl`,
either user-defined or builtin, which is used to prove that goal. This is
necessary to select a unique method. It

We do however break this invariant in few cases, some of which are due to bugs,
some by design:
- *marker traits* are allowed to overlap as they do not have associated items
- *specialization* allows specializing impls to overlap with their parent
- the builtin trait object trait implementation can overlap with a user-defined impl:
[#57893]

### The type system is complete ❌

The type system is not complete, it often adds unnecessary inference constraints, and errors
even though the goal could hold.

- method selection
- opaque type inference
- handling type outlives constraints
- preferring `ParamEnv` candidates over `Impl` candidates during candidate selection
in the trait solver

#### The type system is complete during the implicit negative overlap check in coherence ✅

During the implicit negative overlap check in coherence we must never return *error* for
goals which can be proven. This would allow for overlapping impls with potentially different
associated items, breaking a bunch of other invariants.

This invariant is currently broken in many different ways while actually something we rely on.
We have to be careful as it is quite easy to break:
- generalization of aliases
- generalization during subtyping binders (luckily not exploitable in coherence)

### Trait solving must be (free) lifetime agnostic ✅

Trait solving during codegen should have the same result as during typeck. As we erase
all free regions during codegen we must not rely on them during typeck. A noteworthy example
is special behavior for `'static`.

We also have to be careful with relying on equality of regions in the trait solver.
This is fine for codegen, as we treat all erased regions as equal. We can however
lose equality information from HIR to MIR typeck.

The new solver "uniquifies regions" during canonicalization, canonicalizing `u32: Trait<'x, 'x>`
as `exists<'0, '1> u32: Trait<'0, '1>`, to make it harder to rely on this property.

### Removing ambiguity makes strictly more things compile ❌

Ideally we *should* not rely on ambiguity for things to compile.
Not doing that will cause future improvements to be breaking changes.

Due to *incompleteness* this is not the case and improving inference can result in inference
changes, breaking existing projects.

### Semantic equality implies structural equality ✅

Two types being equal in the type system must mean that they have the
same `TypeId` after instantiating their generic parameters with concrete
arguments. This currently does not hold: [#97156].

[#57893]: https://github.com/rust-lang/rust/issues/57893
[#97156]: https://github.com/rust-lang/rust/issues/97156
[#114936]: https://github.com/rust-lang/rust/issues/114936
73 changes: 66 additions & 7 deletions src/solve/the-solver.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,12 +6,71 @@ approach.

[chalk]: https://rust-lang.github.io/chalk/book/recursive.html

The basic structure of the solver is a pure function
`fn evaluate_goal(goal: Goal<'tcx>) -> Response`.
While the actual solver is not fully pure to deal with overflow and cycles, we are
going to defer that for now.
## A rough walkthrough

To deal with inference variables and to improve caching, we use
[canonicalization](./canonicalization.md).
The entry-point of the solver is `InferCtxtEvalExt::evaluate_root_goal`. This
function sets up the root `EvalCtxt` and then calls `EvalCtxt::evaluate_goal`,
to actually enter the trait solver.

TODO: write the remaining code for this as well.
`EvalCtxt::evaluate_goal` handles [canonicalization](./canonicalization.md), caching,
overflow, and solver cycles. Once that is done, it creates a nested `EvalCtxt` with a
separate local `InferCtxt` and calls `EvalCtxt::compute_goal`, which is responsible for the
'actual solver behavior'. We match on the `PredicateKind`, delegating to a separate function
for each one.

For trait goals, such a `Vec<T>: Clone`, `EvalCtxt::compute_trait_goal` has
to collect all the possible ways this goal can be proven via
`EvalCtxt::assemble_and_evaluate_candidates`. Each candidate is handled in
a separate "probe", to not leak inference constraints to the other candidates.
We then try to merge the assembled candidates via `EvalCtxt::merge_candidates`.


## Important concepts and design pattern

### `EvalCtxt::add_goal`

To prove nested goals, we don't directly call `EvalCtxt::compute_goal`, but instead
add the goal to the `EvalCtxt` with `EvalCtxt::all_goal`. We then prove all nested
goals together in either `EvalCtxt::try_evaluate_added_goals` or
`EvalCtxt::evaluate_added_goals_and_make_canonical_response`. This allows us to handle
inference constraints from later goals.

E.g. if we have both `?x: Debug` and `(): ConstrainToU8<?x>` as nested goals,
then proving `?x: Debug` is initially ambiguous, but after proving `(): ConstrainToU8<?x>`
we constrain `?x` to `u8` and proving `u8: Debug` succeeds.

### Matching on `TyKind`

We lazily normalize types in the solver, so we always have to assume that any types
and constants are potentially unnormalized. This means that matching on `TyKind` can easily
be incorrect.

We handle normalization in two different ways. When proving `Trait` goals when normalizing
associated types, we separately assemble candidates depending on whether they structurally
match the self type. Candidates which match on the self type are handled in
`EvalCtxt::assemble_candidates_via_self_ty` which recurses via
`EvalCtxt::assemble_candidates_after_normalizing_self_ty`, which normalizes the self type
by one level. In all other cases we have to match on a `TyKind` we first use
`EvalCtxt::try_normalize_ty` to normalize the type as much as possible.

### Higher ranked goals

In case the goal is higher-ranked, e.g. `for<'a> F: FnOnce(&'a ())`, `EvalCtxt::compute_goal`
eagerly instantiates `'a` with a placeholder and then recursively proves
`F: FnOnce(&'!a ())` as a nested goal.

### Dealing with choice

Some goals can be proven in multiple ways. In these cases we try each option in
a separate "probe" and then attempt to merge the resulting responses by using
`EvalCtxt::try_merge_responses`. If merging the responses fails, we use
`EvalCtxt::flounder` instead, returning ambiguity. For some goals, we try
incompletely prefer some choices over others in case `EvalCtxt::try_merge_responses`
fails.

## Learning more

The solver should be fairly self-contained. I hope that the above information provides a
good foundation when looking at the code itself. Please reach out on zulip if you get stuck
while doing so or there are some quirks and design decisions which were unclear and deserve
better comments or should be mentioned here.
71 changes: 0 additions & 71 deletions src/solve/trait-solving.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,77 +39,6 @@ which does not have any nested goals. Therefore `Vec<T>: Clone` holds.
The trait solver can either return success, ambiguity or an error as a [`CanonicalResponse`].
For success and ambiguity it also returns constraints inference and region constraints.

## Requirements

Before we dive into the new solver lets first take the time to go through all of our requirements
on the trait system. We can then use these to guide our design later on.

TODO: elaborate on these rules and get more precise about their meaning.
Also add issues where each of these rules have been broken in the past
(or still are).

### 1. The trait solver has to be *sound*

This means that we must never return *success* for goals for which no `impl` exists. That would
simply be unsound by assuming a trait is implemented even though it is not. When using predicates
from the `where`-bounds, the `impl` will be proved by the user of the item.

### 2. If type checker solves generic goal concrete instantiations of that goal have the same result

Pretty much: If we successfully typecheck a generic function concrete instantiations
of that function should also typeck. We should not get errors post-monomorphization.
We can however get overflow as in the following snippet:

```rust
fn foo<T: Trait>(x: )
```

### 3. Trait goals in empty environments are proven by a unique impl

If a trait goal holds with an empty environment, there is a unique `impl`,
either user-defined or builtin, which is used to prove that goal.

This is necessary for codegen to select a unique method.
An exception here are *marker traits* which are allowed to overlap.

### 4. Normalization in empty environments results in a unique type

Normalization for alias types/consts has a unique result. Otherwise we can easily implement
transmute in safe code. Given the following function, we have to make sure that the input and
output types always get normalized to the same concrete type.
```rust
fn foo<T: Trait>(
x: <T as Trait>::Assoc
) -> <T as Trait>::Assoc {
x
}
```

### 5. During coherence trait solving has to be complete

During coherence we never return *error* for goals which can be proven. This allows overlapping
impls which would break rule 3.

### 6. Trait solving must be (free) lifetime agnostic

Trait solving during codegen should have the same result as during typeck. As we erase
all free regions during codegen we must not rely on them during typeck. A noteworthy example
is special behavior for `'static`.

We also have to be careful with relying on equality of regions in the trait solver.
This is fine for codegen, as we treat all erased regions as equal. We can however
lose equality information from HIR to MIR typeck.

### 7. Removing ambiguity makes strictly more things compile

We *should* not rely on ambiguity for things to compile.
Not doing that will cause future improvements to be breaking changes.

### 8. semantic equality implies structural equality

Two types being equal in the type system must mean that they have the same `TypeId`.


[solve]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_trait_selection/solve/index.html
[`Goal`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_trait_selection/traits/solve/struct.Goal.html
[`Predicate`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Predicate.html
Expand Down

0 comments on commit b98af7d

Please sign in to comment.