Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementing multiple interfaces with a single impl definition #4566

Open
josh11b opened this issue Nov 21, 2024 · 4 comments
Open

Implementing multiple interfaces with a single impl definition #4566

josh11b opened this issue Nov 21, 2024 · 4 comments
Labels
leads question A question for the leads team

Comments

@josh11b
Copy link
Contributor

josh11b commented Nov 21, 2024

Summary of issue:

An impl specifies how a type satisfies a facet type, which may include multiple interfaces due to using extend or &. Question: what happens when such an impl is parameterized, and the implementation of a subset of the interfaces is superseded by a more specialized impl?

There are three broad approaches that have been considered in discussion:

  • "All or nothing": An impl is either used entirely or not at all. If there is a specialization for any overlap that is used instead, no other part of the multiple-interface impl is used. I don't think this option works as well as other options under consideration. It creates a lot of uncertainty about when an implementation is applicable, particularly in generic code where it can't be determined if any specializations are applicable. This approach runs into many stumbling blocks since constraints in generic code are not sensitive to specialization questions.
  • "Independent impls": An impl of multiple interfaces is treated as a collection of impls of the individual interfaces. In particular, the definition of a member of one interface can assume that the other interfaces are implemented, but not that the associated types (or other non-function associated constants) have expected values. This is the approach that has currently been assumed.
  • "Constrained impls": An impl of multiple interfaces is treated as a collection of impls of the individual interfaces with the additional constraint that no specialization changes the values of any non-function associated constants of any of the interfaces. This is the approach I suggest we adopt.

Details:

Consider these interface definitions:

interface I {
  let T:! type;
}
interface K {
  fn G();
}
interface L {
  let X:! type;
  fn H() -> X;
}
interface J {
  extend I;
  extend K;
  // `T` here resolves to `Self.(I.T)`.
  extend L where .X = T;
  fn F() -> T;
}

Imagine we have a blanket implementation of J, and a specialization of I:

// Implementing `J` also implements `I`, `K` and `L`.
impl forall [U:! type] U as J where .T = i32 {
  // 1: `F` is a member of `J`, with return type `I.T`.
  fn F() -> i32 { return 0; }
  // 2: `G` is a member of `K`.
  fn G() {}
  // 3: `H` is a member of `L`, with return type
  // `L.X` constrained to equal `I.T`.
  fn H() -> i32 { return F(); }
}

// 4: Specialization that changes `I.T` from `i32`.
// to `(i32, i32)` for `bool`.
impl bool as I where .T = (i32, i32) {}
// 5: Specialization that doesn't change `I.T` from the blanket impl.
impl f64 as I where .T = i32 {}
impl f64 as K { fn G() { Core.Print("Hello World!"); } }

The questions are then: Is this code valid? If not, what part is diagnosed as invalid? If it is, are the interfaces J, K, and L implemented for bool? If so, what signatures do J.F and L.H have? Is the situation any different for f64?

We don't want to be in a situation where generic code thinks the blanket impl applies but has the wrong idea about the return type of F() or H() since it

  • The "all or nothing" approach says this code is valid, and J, K, and L are not implemented for bool and also not implemented for f64, due to the specializations 4 & 5 blocking the blanket impl from applying.
  • The "independent impls" approach says this code is invalid. The implementations of J.F and L.H in the blanket impl can't assume that I.T is i32 since that is an associated constant from another interface, so the functions 1 & 3 fail type checking. There isn't a problem, though, with the implementation (2) of K or K.G.
  • The "constrained impls" approach says this code is valid. J, K, and L are not implemented for bool, since the blanket implementation has a constraint that says it only applies when .T is i32 (like it says right there in the declaration). They are implemented for f64, since the specialized implementation (5) of I for f64 has .T equal to i32. When type checking the blanket implementation, declarations (1) and (3) can assume I.T and L.X are i32, and that I, J, K, and L are all implemented, but nothing about the body of any of the associated functions from other interfaces (which are generally treated as opaque anyway, unless those functions are executed at compile time).
@josh11b josh11b added the leads question A question for the leads team label Nov 21, 2024
@chandlerc
Copy link
Contributor

Agreement on the constrained impls approach. It seems really important to have access to the associated types with certainty, especially in generic code using the impl as you mention.

@zygoloid
Copy link
Contributor

I came here to write up something in favor of the "constrained impls" approach. But I've been exploring the consequences and I am worried.

What per-interface implementations would we have for impl forall [U:! type] U as J where .T = i32? I think it would be something like this:

let F:! type = J where .T = i32;
impl forall [U:! F - I] U as I where .T = i32 {}
impl forall [U:! F - K] U as K {
  fn G() {}
}
impl forall [U:! F - L] U as L {
  fn H() -> i32 { return U.F(); }
}
impl forall [U:! F - J] U as J {
  fn F() -> i32 { return 0; }
}

... where A - B means "facet A with the parts mentioning B removed". Or put another way, the constraint on the per-interface implementation is that the type that was written on the left of as implements the facet type that was written on the right of as, except that the current per-interface implementation is used to implement that interface and so isn't part of the constraint.

Problem is, this appears to be circular. How would we determine which of those impls is used? Each of them depends on us already knowing that U implements all the others, so naively attempting to look for an impl would not terminate. The above rewrite would be rejected by our termination criteria.

One way out of this would be to arrange our rules so that we have a total ordering for the implementations that we're going to pick -- for example, we select the impl T as I first, then the impl T as K, then the impl T as L, then the impl T as J. But that doesn't seem to be possible for a case like this:

interface X { let XT:! type; fn XF() -> XT; }
interface Y { let YT:! type; fn YF() -> YT; }
impl forall [T:! type] T as X where .XT = () { fn XF() {} }
impl forall [T:! type] T as Y where .YT = () { fn YF() {} }
impl forall [T:! type, U:! type] (T, U) as X & Y where .XT = i32 and .YT = i32 {
  fn XF() -> i32 { return YF(); }
  fn YF() -> i32 { return XF(); }
}

Here, we can't select which impl we will use to implement (i32, i32) as X until we've selected the impl for (i32, i32) as Y and vice versa. And in general picking a "worse" impl for one of those might allow us to pick a "better" impl for the other one, for example if we add this into the mix:

impl forall [T:! type] (T, T) as X where .XT = i64 {
  fn XF() -> i64 { return 0; }
}

I'm not sure there's a well-defined best answer if we use this rule: note that we could either have .XT = i32 and .YT = i32, using the (T, U) impl, or .XT = i64 and .YT = (), using the (T, T) and T impls, and both are local maxima. Moreover, I suspect that even finding any valid solution for this problem in general reduces to version SAT, which is NP-complete.


I think there's another case which may seem distinct but is at least somewhat coupled to this decision... what happens in the case where a parameterized impl decomposes into a non-parameterized impl for some interface? For example:

interface A {
  fn F();
}
interface B(T:! type) {
  extend A;
  fn G();
}
impl forall [T:! type] i32 as B(T) {
  fn F() {}
  fn G() {}
}

I think our options here are:

  • It's invalid for a generic interface to extend another interface unless it passes all of its generic arguments to that interface in deducible positions. B(T) cannot extend A.
  • It's invalid for a blanket impl to implement B(T) in this way -- it would decompose into impl forall [T:! type] i32 as A, which is not valid as T is not deducible. I think this would best match the "all or nothing" approach -- "all" is not possible, so you get nothing. It probably also matches the "contrained impls" approach, given that there is no valid constraint for the A impl.
  • This is valid but T is not in scope in F(). This is the "independent impls" approach.

I don't have a good intuition for what I want to happen here. It seems like the definition of B(T) is asking for there to be at most one T for which it's implemented, because any such implementation also implements A, and A can only be implemented once, which is almost like saying that T is an associated type rather than an argument. Which maybe argues that the extend declaration itself should be invalid, if we don't want interfaces to be able to do that?

@josh11b
Copy link
Contributor Author

josh11b commented Nov 27, 2024

Only addressing the last question ("what happens in the case where a parameterized impl decomposes into a non-parameterized impl for some interface?"):

I do think we want to allow the definition of B(T) that has extend A;. In many cases A is just going to be a marker type with no members (Sendable), so there is no question about how A is implemented, just whether A is implemented. Once we allow extend A;, it is pretty clear that the implementation of A can't depend on T, since T is either unknown or can have many values. And I think this matches the use cases: if you write extend Clonable, you should be able to write a Clone() function independent of T.

So I'd like one of the last two options. I think the last option is what developers would want to write, and better supports evolution of the interface B(T), which is a significant advantage. I do like the "constrained impls" model, though, if it were viable, which might suggest we go with the middle option.

@josh11b
Copy link
Contributor Author

josh11b commented Dec 6, 2024

@zygoloid and I discussed this on Tuesday 2024-12-03. We came up with some additional concerns with implementing multiple interfaces, particularly as the result of one interface extending another.

  • There are multiple possible semantics you might want, and having a single impl does not provide the affordances for choosing between those options, where one impl per interface would. For example, in impl forall [T:! type] C(T) as I & J where .(I.x) = i32 and .(J.y) = .(I.x), if there is a specialization of C(T) for I, will J.y have the value i32 or the I.x from the specialization? In practice, the semantics of rewrites mean that .(I.x) is replaced with i32 at an early stage in the compiler (to support things like .(J.y) = .(I.x).D), and so only the first option is consistent. This is a particular concern for the "Independent impls" option above. If this impl is split into two, then the different possible meanings have different spellings:

    • impl forall [T:! type] C(T) as J where .(J.y) = i32 means J.y will be i32 independent of any specialization of C(T) for I
    • impl forall [T:! type where C(T) impls I] C(T) as J where .(J.y) = .(I.x) means J.y matches I.x even if C(T) is specialized
    • impl forall [T:! type where C(T) impls (I where .x = i32)] C(T) as J where .(J.y) = .(I.x) means this impl won't be used unless I.x is i32. Note this last form approximates the "Constrained impls" approach above, but with an explicit ordering to determine the semantics, and the existing language rules preventing the code from declaring cycles that would make it ambiguous.
  • If an interface J extends I but they are defined in distinct libraries, there is no guarantee that the an implementation of J belongs in the same library as an implementation of I for the same type due to the orphan rule.

  • The current documented rule for which interfaces an impl implements is those whose members are defined in the interface definition. This rule is ambiguous for empty interfaces or interfaces where all the associated functions have defaults. It also requires a lot of context to answer that question (this was intentional, to allow options for refactoring an interface without having to update implementations of it). In addition to being the source of readability concerns, this muddies the meaning of impl declarations, and make the compiler implementation much trickier (the compiler can't say what an impl declaration provides at the point where it is written, making it hard to give that declaration a clear type in the SemIR).

  • Ideas we considered to determine the interfaces implemented from only the impl declaration ran into problems. Without being able to control which interfaces an impl is defining, then it isn't clear how to handle implementing two interfaces that have common interface they both extend unless you implement them both in a single impl definition (which may not even be possible due to the orphan rule). Another idea we had in this space was a way to say "this interface minus one of the interfaces it extends" (maybe J \ I?).

We considered some restrictions on extend to address some of these concerns:

  • Perhaps implementing an interface only implemented the interfaces it extends that have to be defined in this library by the orphan rule.
  • Perhaps an interface may only extend an interface defined in the same library.
  • Perhaps an interface may only be extended by a single interface (though this precluded motivating use cases for extends like the subtyping relationships between different kinds of graphs used in the Boost Graph library).

None of these restrictions were sufficient to address the concerns. We ultimately decided that we likely wanted simpler rules:

  • implementing an interface does not implement any of the interfaces it extends, and
  • an impl declaration should implement a single interface.

Note that we still wanted to allow facet type expressions to the right of impl...as, as long as that facet type corresponds to a single interface (ignoring interfaces it extends, which incidentally seems to match Rust supertraits). This supports the use case of Core.Add actually being a named constraint defined in terms of Core.AddWith, and matched the natural implementation in our current reference toolchain.

Also note that we were more open to relaxing the second rule (so you could implement multiple interfaces in a single impl declaration) in the future than the first rule. This would require first defining the (potentially complicated) logic to for disentangling the implementations of the interfaces from a single impl definition, and would follow the "Independent impls" approach.

In this proposal, it would be less convenient to implement an interface together with the interfaces it extends, and evolutionary changes to split an interface into two with an extend relationship would be harder and have more steps. On the plus side, there wouldn't be any interaction with the orphan rule, the semantics of an impl declaration are explicit, and the developer has direct control over which interfaces are implemented where.

The "subsumption" use case of "when an interface X is implemented, Y will be implemented without the user having to write a separate impl definition" will instead be handled using blanket implementations. There are a couple of variations:

  • One interface is a strict superset of the other. In this case it would be a lot less confusing/surprising if the interfaces use the same implementations of functions that overlap. This is accomplished using a final blanket implementation. For example, we will define final impl forall [T:! type, U:! ImplicitAs(T)] U as As(T) and have As(T).Convert forward to ImplicitAs(T).Convert. This way types will either implement ImplicitAs(T) or As(T), and will get an error if they try to implement both.
  • One interface can be implemented in terms of the other. For example, if Ordered implies an implementation of IsEqual, types might still want to provide an explicit definition of IsEqual when they can do so more efficiently. This would use a non-final blanket implementation.

Note that the orphan rule prevents this blanket impl from being written unless the two interfaces in a subsumption relationship are in the same library.

Future work:

  • A feature to copy the members of an interface into another to make the subsumption use case easier to write and involve less duplication.
  • A feature to define an implementation of an interface in terms of another by forwarding, for similar reasons.
  • A final version of the match_first/impl_priority feature to resolve conflicts when multiple interfaces want to subsume a common interface. We likely want a feature like this for function overloading as well.
  • Previously mentioned possible support for implementing multiple interfaces with a single impl definition, as the result of using an & operator or named constraint to the right of as.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
leads question A question for the leads team
Projects
None yet
Development

No branches or pull requests

3 participants