Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle subtyping in inference through obligations #40570

Merged

Conversation

nikomatsakis
Copy link
Contributor

We currently store subtyping relations in the TypeVariables structure as a kind of special case. This branch uses normal obligations to propagate subtyping, thus converting our inference variables into normal fallback. It also does a few other things:

  • Removes the (unstable, outdated) support for custom type inference fallback.
    • It's not clear how we want this to work, but we know that we don't want it to work the way it currently does.
    • The existing support was also just getting in my way.
  • Fixes Trait selection caching does not handle subtyping properly #30225, which was caused by the trait caching code pretending type variables were normal unification variables, when indeed they were not (but now are).

There is one fishy part of these changes: when computing the LUB/GLB of a "bivariant" type parameter, I currently return the a value. Bivariant type parameters are only allowed in a very particular situation, where the type parameter is only used as an associated type output, like this:

pub struct Foo<A, B>
    where A: Fn() -> B
{
    data: A
}

In principle, if one had T=Foo<A, &'a u32> and U=Foo<A, &'b u32> and (e.g.) A: for<'a> Fn() -> &'a u32, then I think that computing the LUB of T and U might do the wrong thing. Probably the right behavior is just to create a fresh type variable. However, that particular example would not compile (because the where-clause is illegal; 'a does not appear in any input type). I was not able to make an example that would compile and demonstrate this shortcoming, and handling the LUB/GLB was mildly inconvenient, so I left it as is. I am considering whether to revisit this or what.

I have started a crater run to test the impact of these changes.

@rust-highfive
Copy link
Collaborator

r? @arielb1

(rust_highfive has picked a reviewer for you, use r? to override)

@arielb1
Copy link
Contributor

arielb1 commented Mar 16, 2017

A: for<'a> Fn() -> &'a u32 is an RFC 447 violation (#32330).

However, with LUB, I think LUB(Foo<fn() -> &'a u32, 'a>, Foo<fn() -> &'b u32, 'b>) (notice the covariant "little fn") should be Foo<fn() -> &'c u32, 'c> where 'c = lub('a, 'b), and not 'a or 'b by themselves.

@nikomatsakis
Copy link
Contributor Author

@arielb1

A: for<'a> Fn() -> &'a u32 is an RFC 447 violation

Yes, as I wrote.

However, with LUB, I think LUB(Foo<fn() -> &'a u32, 'a>, Foo<fn() -> &'b u32, 'b>) (notice the covariant "little fn") should be Foo<fn() -> &'c u32, 'c> where 'c = lub('a, 'b), and not 'a or 'b by themselves.

Yes, as I wrote.

I could not, however, find a test case to exercise that, which is partly why I didn't bother to make it work properly. Doesn't seem challenging, though, I think we just want to introduce a fresh, unconstrained variable.

@nikomatsakis
Copy link
Contributor Author

@arielb1 Sorry if I sounded testy in that last reply; my notes in the comment were a bit vague, and I agree with your clarifications. That said, I wanted to expand on one point:

However, with LUB, I think LUB(Foo<fn() -> &'a u32, 'a>, Foo<fn() -> &'b u32, 'b>) should be Foo<fn() -> &'c u32, 'c> where 'c = lub('a, 'b)

I think you are correct about the eventual result but I wouldn't expect it to be derived immediately from the LUB computation. Since the last parameter is considered "bivariant" (i.e., derived from constraints on the others), I'd expect that the last parameter simply gets a fresh variable, and that trait selection will constrain it (ultimately) to be 'c. In other words, the result from LUB itself would be Foo<fn() -> &'?0 u32, '?1> where 'a: '?0 and 'b: '?0, but enforcing the WF requirements would ultimately constraint '?1 = '?0.

However, it occurs to me that I'm not entirely sure where these WF requirements would get enforced. Much of the code assumes that the resulting types from LUB operations etc are well-formed, and hence I'm not sure when we would add such an obligation. (One possibility is to have the LUB/GLB operation itself produce an obligation, at least if bivariance is involved, given that this is now a possibility.)

Can you think of a test case that would exercise these paths? I wasn't able to. Given that I can't produce a bug, I'm tempted to leave this to a FIXME -- but I definitely think an issue is warranted, if nothing else.

@nikomatsakis
Copy link
Contributor Author

Ah, so I realized that maybe the actual example you gave would work for making such a test case. =) I'll try that out probably on Monday. I was more experimenting with types like F: Fn() rather than using function types.

@nikomatsakis
Copy link
Contributor Author

My continued attempts to craft such a test case have failed. Here is my latest iteration (it is accepted by both this branch and master):

#![allow(dead_code)]

use std::cell::Cell;
use std::marker::PhantomData;

struct Foo<F, A, R>
    where F: Fn(A) -> R
{
    f: F,
    d: PhantomData<fn(A)>,
}

macro_rules! foo_ty {
    ($a:ty, $b:ty) => {
        Foo<fn($a) -> $b, $a, $b>
    }
}

trait Extract {
    type Out;
    fn create(self) -> Self::Out;
}

impl<F, A, R> Extract for Foo<F, A, R>
    where F: Fn(A) -> R
{
    type Out = R;

    fn create(self) -> Self::Out { panic!() }
}

fn foo<'a, 'b, 'small, 'big>(a: foo_ty!(&'a (), &'a ()),
                             b: foo_ty!(&'b (), &'b ()))
                             -> Cell<&'small ()>
    where 'a: 'small, 'b: 'small, 'big: 'a, 'big: 'b,
{
    let x = match 22 { 0 => a, 1 => a, 2 => a, _ => b };
    let y = Cell::new(Extract::create(x));
    y
}

fn main() {
}

You can see my various attempts to squash coercion and introduce invariance. I have to comb through the debug logs some more, I was doing this the other day and hence the results are out of cache for me. (I had a bunch of other iterations of course.)

@@ -224,16 +230,15 @@ impl<'infcx, 'gcx, 'tcx> CombineFields<'infcx, 'gcx, 'tcx> {
// Generalize type if necessary.
let generalized_ty = match dir {
EqTo => self.generalize(a_ty, b_vid, false),
BiTo | SupertypeOf | SubtypeOf => self.generalize(a_ty, b_vid, true),
SupertypeOf | SubtypeOf => self.generalize(a_ty, b_vid, true),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a FIXME(#18653) to generalize?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is not needed anymore.

@arielb1
Copy link
Contributor

arielb1 commented Mar 30, 2017

The PR looks nice, except I'm also not that sure about the bivariant bit. Not that I think this should block the PR.

Also, there are a few TODOs in the PR that need to be fixed before it lands.

@bors
Copy link
Contributor

bors commented Mar 30, 2017

☔ The latest upstream changes (presumably #40224) made this pull request unmergeable. Please resolve the merge conflicts.

@@ -384,8 +384,8 @@ pub enum Sign {
/// It can be either `b""`, `b"+"` or `b"-"`.
fn determine_sign(sign: Sign, decoded: &FullDecoded, negative: bool) -> &'static [u8] {
match (*decoded, sign) {
(FullDecoded::Nan, _) => b"",
(FullDecoded::Zero, Sign::Minus) => b"",
(FullDecoded::Nan, _) => b"XXX",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

@nikomatsakis
Copy link
Contributor Author

nikomatsakis commented Apr 11, 2017

@arielb1 so this now fixes #18653 (and #40951). The main commit is "generalize type variables too"; the commit message there detail some of the non-obvious interactions. I did this work atop this branch because it made it easier to handle the fudge_regions_if_ok code; I'm not sure what to do about backporting, to be honest. The fudge_regions_if_ok code does not expect new type variables; it'd be annoying to accommodate them in the existing system, I think, because of the possibility of implicit "subvariable" edges. But maybe that's not so important, since this only affects the expected type, in which case I could probably backport some of this logic in a relatively straightforward fashion. Anyway, take a look. Before I cleaned up the history, ./x.py test was passing locally, but I've not attempted a crater run.

@nikomatsakis nikomatsakis force-pushed the inference-subtype-through-obligation branch from ac90cec to 874c7bc Compare April 11, 2017 21:18
This just limits ourselves to the "old school" defaults: diverging
variables and integer variables.
There is one fishy part of these changes: when computing the LUB/GLB of
a "bivariant" type parameter, I currently return the `a`
value. Bivariant type parameters are only allowed in a very particular
situation, where the type parameter is only used as an associated type
output, like this:

```rust
pub struct Foo<A, B>
    where A: Fn() -> B
    {
        data: A
        }
        ```

In principle, if one had `T=Foo<A, &'a u32>` and `U=Foo<A, &'b u32>`
and (e.g.) `A: for<'a> Fn() -> &'a u32`, then I think that computing the
LUB of `T` and `U` might do the wrong thing. Probably the right behavior
is just to create a fresh type variable. However, that particular
example would not compile (because the where-clause is illegal; `'a`
does not appear in any input type). I was not able to make an example
that *would* compile and demonstrate this shortcoming, and handling the
LUB/GLB was mildly inconvenient, so I left it as is. I am considering
whether to revisit this.
@nikomatsakis nikomatsakis force-pushed the inference-subtype-through-obligation branch from 874c7bc to 392c1e3 Compare April 12, 2017 00:28
The most interesting place is the hinting mechanism; once we start
having subtyping obligations, it's important to see those through.
In some cases, we give multiple primary spans, in which case we would
report one `//~` annotation per primary span. That was very confusing
because these things are reported to the user as a single error.

UI tests would be better here.
In some specific cases, the new scheme was failing to learn as much from
a LUB/GLB operaiton as the old code, which caused coercion to go awry. A
slight ordering hack fixes this.
When we are generalizing a super/sub-type, we have to replace type
variables with a fresh variable (and not just region variables).  So if
we know that `Box<?T> <: ?U`, for example, we instantiate `?U` with
`Box<?V>` and then relate `Box<?T>` to `Box<?V>` (and hence require that
`?T <: ?V`).

This change has some complex interactions, however:

First, the occurs check must be updated to detect constraints like `?T
<: ?U` and `?U <: Box<?T>`. If we're not careful, we'll create a
never-ending sequence of new variables. To address this, we add a second
unification set into `type_variables` that tracks type variables related
through **either** equality **or** subtyping, and use that during the
occurs-check.

Second, the "fudge regions if ok" code was expecting no new type
variables to be created. It must be updated to create new type variables
outside of the probe. This is relatively straight-forward under the new
scheme, since type variables are now independent from one another, and
any relations are moderated by pending subtype obliations and so forth.
This part would be tricky to backport though.

cc rust-lang#18653
cc rust-lang#40951
@nikomatsakis nikomatsakis force-pushed the inference-subtype-through-obligation branch from 392c1e3 to 7832db8 Compare April 12, 2017 00:33
Copy link
Contributor

@arielb1 arielb1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is basically ok mod. the stack nit. How important do you think is the instability thing?

@@ -182,6 +181,8 @@ impl<'infcx, 'gcx, 'tcx> CombineFields<'infcx, 'gcx, 'tcx> {
a_is_expected: bool)
-> RelateResult<'tcx, ()>
{
use self::RelationDir::*;

// We use SmallVector here instead of Vec because this code is hot and
// it's rare that the stack length exceeds 1.
let mut stack = SmallVector::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: do we even need the stack and the loop here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah, I think I meant to remove that! Thanks.

fn fold_ty(&mut self, ty: Ty<'tcx>) -> Ty<'tcx> {
match ty.sty {
ty::TyInfer(ty::InferTy::TyVar(vid)) => {
match self.type_variables.get(&vid) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function is already a total hack, but I'm afraid this makes behavior depend on the mood of the union-find algorithm: if a set of (possibly-preexisting) type variables is equated and returned, we'll return one of them "at random".

For example, in

use std::mem;

fn main() {
    let v = Vec::new();
    fn example<'a, U>(x: Option<&'a mut &[u8]>, u: U) -> &'a mut U { loop {} }
    let mut x: Option<&mut $0> = None;
    let mut y = None;
    mem::swap(&mut x, &mut y); // equate types, so `y: Option<&mut $0>`.
    *x.unwrap() = *example::<$1>(y, &v);
}

When this is checked, EIfEO unifies $0 and $1 within the snapshot (a unification that is then discarded), and checks u with either of the type variables as the expected var, depending on what union-find picks as the root. $0 is unified later on with &[u8], so if $0 is picked then the coercion from &Vec<u8> to &[u8] occurs before the call to example, and if $1 is picked the coercion occurs after the call.

This might lead to instability when union-find picks different roots. Not sure how much important is this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yes, I don't relish the thought of coercions happening or not happening based on very small changes. I was hoping that we could do away with this function and just sometimes unify the return type before the arguments, but that's not precisely equivalent to what it does today. (In particular, the would force the return type to be a subtype of the output type, whereas this generates an expected type based on the assumption that it ought to be so, but doesn't add a hard constraint.)

An obvious option is not to consider the "roots" of variables -- basically any appearance of some variable ?T that was introduced during the snapshot would be mapped to a fresh variable outside the snapshot. I'm not sure just why I didn't do this; it seems like it ought to be harmless.

At the time when I wrote this code, I was nervous about "losing" the connection, but having thought more about it I think it should be safe for us to play more-or-less arbitrary games with the expected type without (at least) compromising soundness. (This function mildly gives me the willies: e.g., it supplies an "expected-has-type" hint to the arguments, but that's stronger than is needed, etc.)

I think in my ideal world the "expected type" would be allowed to have holes (rather than fresh type variables per se), that have no effect during unification, and we would then translate all unbound type variables to these holes. But doing that would be a big refactor and of dubious (or maybe even negative) value anyway.

Anyway, so long story short, what I could do is to try just replacing all newly created, unbound type variables with newly created, unbound type variables (ignoring the union-find effects). This would probably be backportable too, I imagine. We would lose the "subtype connection" in some cases for these variables, but that shouldn't matter, since this just affects the expected type, and the subtypes should be re-established when the ultimate type is produced.

@@ -755,6 +755,9 @@ pub enum Predicate<'tcx> {
/// for some substitutions `...` and T being a closure type.
/// Satisfied (or refuted) once we know the closure's kind.
ClosureKind(DefId, ClosureKind),

/// `T1 <: T2`
Subtype(PolySubtypePredicate<'tcx>),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can subtype predicates ever be higher-ranked (aka for<'a> &'a u32 <: &'b u32) ? I don't think so after this patch. Are you trying to support that in the future?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, perhaps not. I was thinking that in the future a case like for<'a> fn(?T) <: for<'a> fn(?U) might cause such a predicate, but that's not really true (the ?T and ?U are bound outside the for). It would be possible to have higher-ranked subregion constraints, in contrast, but those are a separate mechanism.

@nikomatsakis
Copy link
Contributor Author

@arielb1 removed the stack, simplified fudge_regions_if_ok to be less grungy, but I left the binder for now.

pub root_vid: ty::TyVid,
pub root_origin: TypeVariableOrigin,
}
pub type TypeVariableMap = FxHashMap<ty::TyVid, TypeVariableOrigin>;
Copy link
Contributor

@arielb1 arielb1 Apr 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FxHashMap indexed by integers? This is actually OK.

@arielb1
Copy link
Contributor

arielb1 commented Apr 12, 2017

@bors r+

@bors
Copy link
Contributor

bors commented Apr 12, 2017

📌 Commit 1cc7621 has been approved by arielb1

@bors
Copy link
Contributor

bors commented Apr 13, 2017

⌛ Testing commit 1cc7621 with merge d2e2ad5...

bors added a commit that referenced this pull request Apr 13, 2017
…ion, r=arielb1

Handle subtyping in inference through obligations

We currently store subtyping relations in the `TypeVariables` structure as a kind of special case. This branch uses normal obligations to propagate subtyping, thus converting our inference variables into normal fallback. It also does a few other things:

- Removes the (unstable, outdated) support for custom type inference fallback.
    - It's not clear how we want this to work, but we know that we don't want it to work the way it currently does.
    - The existing support was also just getting in my way.
- Fixes #30225, which was caused by the trait caching code pretending type variables were normal unification variables, when indeed they were not (but now are).

There is one fishy part of these changes: when computing the LUB/GLB of a "bivariant" type parameter, I currently return the `a` value. Bivariant type parameters are only allowed in a very particular situation, where the type parameter is only used as an associated type output, like this:

```rust
pub struct Foo<A, B>
    where A: Fn() -> B
{
    data: A
}
```

In principle, if one had `T=Foo<A, &'a u32>` and `U=Foo<A, &'b u32>` and (e.g.) `A: for<'a> Fn() -> &'a u32`, then I think that computing the LUB of `T` and `U` might do the wrong thing. Probably the right behavior is just to create a fresh type variable. However, that particular example would not compile (because the where-clause is illegal; `'a` does not appear in any input type). I was not able to make an example that *would* compile and demonstrate this shortcoming, and handling the LUB/GLB was mildly inconvenient, so I left it as is. I am considering whether to revisit this or what.

I have started a crater run to test the impact of these changes.
@bors
Copy link
Contributor

bors commented Apr 13, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: arielb1
Pushing d2e2ad5 to master...

@bors bors merged commit 1cc7621 into rust-lang:master Apr 13, 2017
@arielb1 arielb1 added the relnotes Marks issues that should be documented in the release notes of the next release. label Apr 13, 2017
@arielb1
Copy link
Contributor

arielb1 commented Apr 13, 2017

This is moderately likely to cause regressions. Relnoting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes Marks issues that should be documented in the release notes of the next release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants