
Add generalized arity tuples #2702

Closed · wants to merge 1 commit

Conversation


@Woyten Woyten commented May 22, 2019

This is another proposal for a generalized tuples solution. Compared to similar RFCs, this RFC

  • only needs one key idea implemented in the compiler
  • does not add any new syntax
  • does not add any special traits
  • does not change the way Rust reasons about types

Rendered

@Centril Centril added T-lang Relevant to the language team, which will review and decide on the RFC. A-expressions Term language related proposals & ideas A-patterns Pattern matching related proposals & ideas A-tuples Proposals relating to tuples. A-typesystem Type system related proposals & ideas A-repr #[repr(...)] related proposals & ideas A-product-types Product type related proposals A-structural-typing Proposals relating to structural typing. labels May 23, 2019
@Centril (Contributor) left a comment:

Here's some food for thought. Once these discussions settle, please extend the text itself with their outcomes.


Unfortunately, it is not possible to express the generalization strategy in Rust's type system. Instead, a common practice is to generalize code using the macro system. This has two major drawbacks:

- The code is not really general since it can only support a limited number of arities. This is the same restriction as if it had been written down by hand. To make things worse, each library has its own understanding of what is considered a good limit.
Contributor:

For trait implementations, I think the typical number is around 10-12; do you really need more? -- please expand on this. :)

Functions like zip seem to be a different matter, however.
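
For reference, the macro pattern the quoted RFC text describes looks roughly like this; the trait and macro names are made up for the example, and each real library picks its own maximum arity for the hand-written invocations at the bottom:

```rust
// Illustrative only: how crates "generalize" over tuples with macros today.
trait Describe {
    fn arity() -> usize;
}

macro_rules! impl_describe_for_tuples {
    ($($T:ident),+) => {
        impl<$($T),+> Describe for ($($T,)+) {
            fn arity() -> usize {
                // One entry per type parameter.
                [$(stringify!($T)),+].len()
            }
        }
    };
}

impl_describe_for_tuples!(A);
impl_describe_for_tuples!(A, B);
impl_describe_for_tuples!(A, B, C);
// ...and so on, up to whatever limit the library author chose.
```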

- Focussed on variadic generics
- Introduces new syntax
- Includes new traits
- Uses special handling for references
Contributor:

It would be good to survey / see a discussion of variadics, type level lists, etc. in other languages, including:

  • Haskell
  • Idris
  • C++
  • Other?


This is more or less how tuples work in Ceylon.

- Introduces new syntax
- Includes new traits
- Uses special handling for references

Contributor:

Since you are taking the frunk approach to this, it would be a good idea to have a discussion of the library and the various traits and transformations in there. In particular, frunk should provide us with a good investigation of how this approach actually pans out in terms of what transformations can be written and not.

cc @lloydmeta @ExpHP

where `Tuple` is a new struct located in `std::ops` with the following definition:

```rust
struct Tuple<ELEM, TAIL> {
    pub elem: ELEM,
    pub tail: TAIL,
}
```
@Centril (Contributor), May 23, 2019:

So this is essentially a HList: https://docs.rs/frunk/0.3.0/frunk/hlist/index.html.

  • The good aspect of this is that you are essentially removing ty::TyKind::Tuple. The rules about unsized types in the last element should just fall out from structs. Overall, this is a substantial reduction in the amount of structural typing in the type system, which is a good thing. Instead, Tuple<H, T> is nominally typed and (T0, ..., Tn) is just sugar. You may want to extend the rationale with a note about the benefits of this simplification.

    • Also please make a note of Tuple becoming a #[lang = "tuple"] item by attaching the attribute here.
  • On the other hand, this also means that the compiler is no longer free to apply layout optimizations where fields are reordered. E.g. today, the compiler is free to lay (A, B, C, D) out as A D C B. After introducing struct Tuple<H, T>, the compiler can no longer do that because it is now possible to take a reference to tup.tail.
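
To make the second point concrete, here is a minimal sketch of why a referenceable tail constrains layout (the struct is written out explicitly; under the RFC a 3-tuple would desugar to this type):

```rust
// Because `tail` is an ordinary field, taking a reference to it is legal, so its
// elements must stay contiguous and cannot be interleaved with `elem` by a
// field-reordering optimization.
struct Tuple<ELEM, TAIL> {
    pub elem: ELEM,
    pub tail: TAIL,
}

fn borrow_tail<A, B, C>(t: &Tuple<A, Tuple<B, Tuple<C, ()>>>) -> &Tuple<B, Tuple<C, ()>> {
    &t.tail
}
```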

@Centril (Contributor), May 23, 2019:

In a discussion with @oli-obk, they noted that #[non_referenceable] pub tail: Tail poses a problem with respect to implementing PartialEq and similar traits when you want to do it recursively and generally for all tuples.

Contributor:

More details:

impl PartialEq for () {
    fn eq(&self, other: &Self) -> bool {
        true
    }
}
impl<ELEM: PartialEq, TAIL: PartialEq> PartialEq for Tuple<ELEM, TAIL> {
    fn eq(&self, other: &Self) -> bool {
        self.elem == other.elem && self.tail == other.tail
    }
}

The self.tail == other.tail is essentially PartialEq::eq(&self.tail, &other.tail), which would violate #[non_referenceable].

@Woyten (Author), May 23, 2019:

If I get this right, the compiler still has the flexibility to lay out Tuple<ELEM, TAIL> as either ELEM TAIL or TAIL ELEM. So (A, B, C, D) could become A B C D or B C D A or A C D B but not B A C D.

But, indeed, this is a hard restriction which might increase the memory footprint of every tuple.

This problem could be mitigated if the tuple representation were changed to a tree structure, e.g. Tuple<ELEM, LEFT, RIGHT>. In this way, the compiler could regain some control over the memory layout. In return, the compiler would need to match Tuple<Elem, Tail, ()> with Tuple<Elem, (), Tail>, or wouldn't it? My first feeling is that this solution is bad just because it is not simple enough.
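
Illustrative only: a rough sketch of the tree-shaped representation floated above, with a made-up struct name and the comment's parameter order. The element sits between two sub-tuples, so the compiler could choose how to split the remaining elements across the two sides:

```rust
struct TupleTree<ELEM, LEFT, RIGHT> {
    left: LEFT,
    elem: ELEM,
    right: RIGHT,
}

// (A, B) could then be represented either as TupleTree<A, (), TupleTree<B, (), ()>>
// or as TupleTree<B, TupleTree<A, (), ()>, ()>; treating those two as the same
// tuple type is exactly the extra equivalence the compiler would have to learn.
```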

@Ixrec (Contributor), May 23, 2019:

Maybe I misunderstood something in previous discussions, but it seemed like we already knew there's a fundamental choice we have to make here between:

  • the type lists used by variadic generics are identical to tuple types, and can compile away to nothing because tuples "don't exist at runtime" in a sense that rules out sub-tuples having addresses and being efficiently borrowable and so on
  • the type lists used by variadic generics are distinct from tuple types, so there is a certain amount of unfortunate duplication going on, but we get to make guarantees about tuple layout/addressability/etc

And any future variadic generics / generic tuples proposal would simply have to pick one of these and argue convincingly for it, but making up our minds on this was the main thing blocking progress on variadics. Is that no longer the case?

Contributor:

Well there's a third option:

  • Go ahead with this proposal either with #[non_referenceable] or without, and then never add "variadic generics".

Author:

In my understanding, solving the variadic tuple problem is equivalent to solving the variadic generics problem. I would even go so far as to say that you do not need variadic generics if you have variadic tuples.

@Ixrec I am not aware that the variadic generics problem has a final solution yet; does it?

Contributor:

Yeah, it's definitely not solved yet. The whole point of my last comment was that any proposal for variadic tuples effectively is a proposal to solve the variadic generics problem. In other words, even if the proposed solution is simply that variadic tuples are enough, this needs to be made explicit, and then it has to be argued that "full" variadic generics are unnecessary or not worth it (afaik no one's made that argument before; maybe I could be convinced).

Author:

I believe that variadic generics might not be strictly necessary since you could model every type Generic<A, ...> with variadic generics simply as Generic<T> where T: TraitThatIsValidForAllTuples. The transition to a real generic notation could be done using a syntactic sugar approach.
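
A sketch of that idea, with hand-written impls standing in for what would have to be compiler-provided ones (the trait name is the one used above; everything else is illustrative):

```rust
trait TraitThatIsValidForAllTuples {}
impl TraitThatIsValidForAllTuples for () {}
impl<A> TraitThatIsValidForAllTuples for (A,) {}
impl<A, B> TraitThatIsValidForAllTuples for (A, B) {}
// ...one impl per arity, which is exactly what the compiler would provide.

// Instead of a variadic `Generic<A, B, ...>`, a single parameter carries the
// whole argument list as one tuple:
struct Generic<T: TraitThatIsValidForAllTuples> {
    args: T,
}

fn main() {
    let _one = Generic { args: (1u8,) };
    let _two = Generic { args: (1u8, "two") };
}
```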


- The compiler needs to treat any type `(ELEM, TAIL.0, ..., TAIL.n-1)` as equivalent to `Tuple<ELEM, (TAIL.0, ..., TAIL.n-1)>`. This could work in the same way as `std::io::Result<T>` is considered equivalent to `core::result::Result<T, std::io::Error>`.
- Equivalently, every tuple value `(elem, tail.0, ..., tail.n-1)` must be considered structurally equal to `Tuple { elem: elem, tail: (tail.0, ..., tail.n-1) }`.
- Every tuple index access `tuple.n` must evaluate to `tuple{{.tail}^n}.elem`. In other words, `.tail` must be called `n` times before calling `.elem`.
Contributor:

So essentially, you are moving tuples to the HIR lowering phases of the compiler and out of later phases. That turns tuples into essentially syntactic sugar. This is a nice simplification. On the other hand, this may also inflate compile times by giving later phases, such as the type checker, larger HIR trees to work with.

My overall sense is that it is hard to answer both the run-time and compile-time perf questions without implementing this in a PR and testing it out. Thus, if we are going to accept this, it would be best to run some experiments and get data to inform these questions.


- The compiler needs to treat any type `(ELEM, TAIL.0, ..., TAIL.n-1)` as equivalent to `Tuple<ELEM, (TAIL.0, ..., TAIL.n-1)>`. This could work in the same way as `std::io::Result<T>` is considered equivalent to `core::result::Result<T, std::io::Error>`.
- Equivalently, every tuple value `(elem, tail.0, ..., tail.n-1)` must be considered structurally equal to `Tuple { elem: elem, tail: (tail.0, ..., tail.n-1) }`.
- Every tuple index access `tuple.n` must evaluate to `tuple{{.tail}^n}.elem`. In other words, `.tail` must be called `n` times before calling `.elem`.
Contributor:

Can you also please discuss pattern matching? Please consider at least these cases:

  • let (a, b) = tup;
  • let (ref mut? a, b) = tup;
  • let (a, ..) = tup;
  • let (a, b, ..) = tup;
  • let (a, b, c @ ..) = tup; -- this is not allowed today, but could potentially be. This ties into questions about &tup.tail
  • let (a, b, ref c @ ..) = tup; -- same here re. &tup.tail; also not allowed today.
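
A sketch of how the first two of these cases might desugar under the proposed `Tuple` struct (illustration only; the struct is written out explicitly here, whereas under the RFC `(u8, u16)` itself would already be this type):

```rust
struct Tuple<ELEM, TAIL> {
    elem: ELEM,
    tail: TAIL,
}

fn main() {
    // The value `(1u8, 2u16)`:
    let tup = Tuple { elem: 1u8, tail: Tuple { elem: 2u16, tail: () } };

    // `let (a, b) = tup;` would become:
    let Tuple { elem: a, tail: Tuple { elem: b, tail: () } } = tup;
    assert_eq!((a, b), (1, 2));

    // `let (a, ..) = tup;` would become `let Tuple { elem: a, .. } = tup;`,
    // while the `c @ ..` forms raise the same question as `&tup.tail` above.
}
```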

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

The selling point of the proposed solution is that it is completely based on existing concepts. The syntax and type system remain unaffected. Hence, the implementation effort should be predictable and the risk of compromising the overall quality of the language should be low. A second benefit is the possibility to define more advanced type mappings, e.g. `(A, B, C, ..., Z)` → `(B, A, C, ..., Z)`.
@Centril (Contributor), May 23, 2019:

The type system is absolutely affected, but as I noted before it is simplified... ;)

Those points are mainly decisions to be made before implementing the RFC:

- How should compiler messages or the documentation be rendered? The printed output for `Tuple<A, Tuple<B, Tuple<C, ()>>>` must probably be mapped back to `(A, B, C)` for readability. But what if this reverse mapping is impossible as is the case for the generalized tuple `impl`s?
- What should the compiler do with nonsensical tuples? A nonsensical tuple is a `Tuple` whose `TAIL` parameter is not a tuple (e.g. `Tuple<String, String>`). It feels like the easiest and most idiomatic answer is that the compiler should not care and let the code run into a type error as soon as the tuple is used. Nevertheless, nonsensical tuples could be discovered and reported by `clippy`.
Contributor:

I agree with not doing anything about "nonsensical" tuples; seems like banning them just brings unjustified complication to the type system and undoes the nice simplification benefits your proposal brings.


Rust's current type system is more than capable of specifying that a tuple cons-element can only have () or another tuple cons-element as its tail associated type, so a separate mechanism for checking the well-formedness of a tuple type list is not needed. (Custom diagnostics might be of help in that area, though.)

@varkor (Member), May 27, 2019:

Can't we just have a closed trait IsTuple that is only implemented by valid tuples? That would be a simple modification, and it would avoid the extra complexity of having to make sure Tuple itself is always well-formed.
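
A minimal sketch of what the closed trait could look like (illustrative; the impls would be provided by the compiler or the standard library, and user code could not add its own):

```rust
struct Tuple<ELEM, TAIL> {
    elem: ELEM,
    tail: TAIL,
}

trait IsTuple {}
impl IsTuple for () {}
impl<ELEM, TAIL: IsTuple> IsTuple for Tuple<ELEM, TAIL> {}

// Generic code that should only accept real tuples bounds on the trait:
fn takes_any_tuple<T: IsTuple>(_t: T) {}

// `takes_any_tuple(Tuple { elem: 1u8, tail: String::new() })` would not compile,
// because `Tuple<u8, String>` does not implement `IsTuple`.
```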


- How should compiler messages or the documentation be rendered? The printed output for `Tuple<A, Tuple<B, Tuple<C, ()>>>` must probably be mapped back to `(A, B, C)` for readability. But what if this reverse mapping is impossible as is the case for the generalized tuple `impl`s?
- What should the compiler do with nonsensical tuples? A nonsensical tuple is a `Tuple` whose `TAIL` parameter is not a tuple (e.g. `Tuple<String, String>`). It feels like the easiest and most idiomatic answer is that the compiler should not care and let the code run into a type error as soon as the tuple is used. Nevertheless, nonsensical tuples could be discovered and reported by `clippy`.
- What should the `Tuple` struct look like, precisely? Should it export globally visible symbols like `tuple.elem` or `tuple.elem()` or should they be hidden behind a namespace, e.g. `Tuple::elem(tuple)`?
Contributor:

I think the larger question here is whether &tup.tail should be possible or not. Please add that to the list.


- How should compiler messages or the documentation be rendered? The printed output for `Tuple<A, Tuple<B, Tuple<C, ()>>>` must probably be mapped back to `(A, B, C)` for readability. But what if this reverse mapping is impossible as is the case for the generalized tuple `impl`s?
- What should the compiler do with nonsensical tuples? A nonsensical tuple is a `Tuple` whose `TAIL` parameter is not a tuple (e.g. `Tuple<String, String>`). It feels like the easiest and most idiomatic answer is that the compiler should not care and let the code run into a type error as soon as the tuple is used. Nevertheless, nonsensical tuples could be discovered and reported by `clippy`.
- What should the `Tuple` struct look like, precisely? Should it export globally visible symbols like `tuple.elem` or `tuple.elem()` or should they be hidden behind a namespace, e.g. `Tuple::elem(tuple)`?
Contributor:

It seems to me that a big question here is how much of frunk we want to add to the standard library.

@Woyten (Author) commented May 23, 2019:

@Centril Thank you very much for your feedback. I will address your comments and update this RFC once I find the time for it. 😊

@eaglgenes101:

This was essentially what I was thinking about proposing for variadic tuples, plus or minus some names and implementation details and such. I'll put my support behind it.

That said, my incubated proposal avoids specifying anything about tuple layout (mostly to avoid edge cases where alignment concerns cause tuples to be up to an order of magnitude larger than a similarly defined struct). Instead, it specifies a heterogeneous iteration mechanism, and then augments tuples so that this mechanism can be used to iterate over and destructure a tuple's fields by reference, by mutable reference, or by move.

```rust
type () = (); // Not really an alias. Written down for completeness.
type (A,) = Tuple<A, ()>;
type (A, B) = Tuple<A, Tuple<B, ()>>;
```
Member:

Unfortunately this desugaring is incompatible with the current unsizing rule.

  1. We need to allow ([u8],).
  2. We also need to allow (u8, [u8]).

If (A,) is desugared as Tuple<A, ()>, this means ELEM must be relaxed as ELEM: ?Sized.

If (A, B) is desugared as Tuple<A, (B,)>, this means TAIL must be relaxed as TAIL: ?Sized.

But we cannot have two unsized fields in a structure (struct Tuple<E: ?Sized, T: ?Sized> { head: E, tail: T }).

Therefore, the tuple desugaring must terminate at (A,) and cannot be further desugared to Tuple<A, ()>.

Alternatively, you could reverse the expansion direction, so that only the final field needs to be unsized.

struct Tuple<Init, Last: ?Sized> {
    init: Init,
    last: Last,
}

type () = ();
type (A,) = Tuple<(), A>;
type (A, B) = Tuple<Tuple<(), A>, B>;
type (A, B, C) = Tuple<Tuple<Tuple<(), A>, B>, C>;

Author:

Good point! Easy solution 👍

@ahicks92:

Some thoughts:

When I implemented the initial version of size layout optimizations, we discovered that this can let some very small tuples and structs fit into registers when making function calls, i.e. (u8, u16, u8) becomes (u16, u8, u8). No one did a detailed analysis of how much this mattered because it was an emergent behavior; the reaction was "huh, that's cool and it might help." Given Rust's increased adoption in the embedded space, this may have practical implications on someone at this point.

Dropping layout optimizations also means that sequences of tuples take a real hit, if the program likes to build containers containing tuples. In practice (I don't have the reference handy) I think the largest gain we saw from field reordering was 72 bytes. I would expect the gains for reordering tuples to be much less since you usually use a struct for a large number of fields, but it can be a surprisingly significant drop.

I don't think these optimizations can be swept under the rug at this point. It's one thing for Rust to deoptimize memory usage on a PC, but quite another for it to deoptimize memory usage on a small microcontroller, where the line between "runs" and "doesn't run" can be small enough that a few extra bytes matters.

With variadics, you can for example pass 10 4-to-8 byte arguments in registers on some platforms. With tuples you can't, short of complicated optimizations that are hard to reason about as an end user of the language. For instance the compiler could elide tuples constructed at call sites, but probably couldn't if they're stored in variables. You can, however, arguably get forward compatibility with variadics using current tuples by making the compiler understand that a call myVariadic(my_tuple) should desugar to myVariadic(my_tuple.0, my_tuple.1, ...). In practice this behavior would probably need to be behind a macro or special syntax because there's no way to distinguish between calling a variadic function with one tuple argument, or wanting to expand a tuple, but it allows for much of the same effect.

Either way, I don't think that tuples can replace variadics in practice because variadics have a lot more potential around being faster. Perhaps there's a clever way to implement this in the compiler so that the gap closes, but if there's not I don't expect this will put that discussion to rest.

@Woyten (Author) commented May 24, 2019:

@camlorn @Centril What if the compiler would not use Tuple<ELEM, TAIL> or Tuple<INIT, ELEM> as a representation but Tuple<LEFT, RIGHT, ELEM> which would be equivalent to (LEFT.0, ..., LEFT.n-1, ELEM, RIGHT.0, ..., RIGHT.m-1)?

The compiler could reorder an arbitrary tuple (A, D, B, C, G, E, F) to the desired memory layout (A, B, C, D, E, F, G) using the following strategy (basically quick sort 😄):

(A, D, B, C, G, E, F)
  ELEM = G
  LEFT = (A, D, B, C)
    ELEM = D
    LEFT = (A,)
      ELEM = A
      LEFT = ()
      RIGHT = ()
    RIGHT = (B, C)
      ELEM = C
      LEFT = (B,)
        ELEM = B
        LEFT = ()
        RIGHT = ()
      RIGHT = ()
  RIGHT = (E, F)
    ELEM = F
    LEFT = (E,)
      ELEM = E
      LEFT = ()
      RIGHT = ()
    RIGHT = ()

As ELEM is in the last position, unsized types should work as well. However, this approach would require at least one new language feature that enables matching Tuple<(), SUB, ELEM> with Tuple<SUB, (), ELEM> and so on.

This means, in particular, that when writing a generalized mapping function from, let's say, (u32, u16, u8) to (u8, u8, u8), we have to deal with the fact that the Tuple trees differ from each other and need to be mapped at the value level. In any case, I strongly believe that this structural mapping needs to be implemented at some point in the future anyway, no matter what the solution for the variadic tuples/generics problem turns out to be.

@ExpHP commented May 24, 2019:

> What if the compiler would not use Tuple<ELEM, TAIL> or Tuple<INIT, ELEM> as a representation but Tuple<LEFT, RIGHT, ELEM> which would be equivalent to (LEFT.0, ..., LEFT.n-1, ELEM, RIGHT.0, ..., RIGHT.m-1)?

That sounds like it has nasty implications for parametricity.

fn func<A, B, C>(tup: (A, B, C)) {
    // Does this have type A, B, or C?
    let x = get_elem(tup);
}

fn get_elem<L, R, E>(tup: Tuple<L, R, E>) -> E {
    tup.elem
}

@Woyten (Author) commented May 24, 2019:

@ExpHP The compiler would not accept tup.2 since tup being Tuple<L, R, E> is not a well-formed tuple.

On the other hand, if tup was Tuple<Tuple<(), (), usize>, Tuple<(), (), char>, String>, then tup.2 would evaluate to char.

@ExpHP commented May 24, 2019:

I meant .elem, I edited the post.

@comex commented May 25, 2019:

You can accomplish most of the same tasks using a tuple trait. Using a trait (which the compiler could automatically implement for tuple types) would avoid the need to define tuples as having a particular structure, and also opens the door to more flexibility in the future, especially regarding references.

Here is a playground link where I've implemented the three examples in the RFC in today's Rust; I had to use a macro to define two base traits (HeadTail and Prepend) for different sizes of tuples, but Join, Last, and Halve themselves are implemented without any macros. However, Last does not actually compile, because rustc thinks the two impls could conflict even though they don't. That seems like it should be possible to improve.

With GATs, instead of having a HeadTail trait and a Prepend trait, you could pack everything into one trait:

trait Tuple {
    type Head;
    type Tail;
    fn head(self) -> Self::Head;
    fn tail(self) -> Self::Tail;

    type Prepend<N>;
    fn prepend<N>(self, n: N) -> Self::Prepend<N>;

    // more operations...
}

Edit: Actually, that's not great because it would imply defining a head and tail for (). What we really need is some way to express disjunction: a tuple either has a head and a tail, or is (). Ideally it would be possible to write a separate impl of a trait for each case, and then be able to convince the compiler that you've impled the trait for all T: Tuple.

Anyway, if you read my implementation, the obvious downside is the stuttering:

impl<T> Join for T where
    T: Copy,
    T: HeadTail,
    <T as HeadTail>::Head: Join,
    <T as HeadTail>::Tail: Join,
    <<T as HeadTail>::Tail as Join>::Joined: Prepend<<<T as HeadTail>::Head as Join>::Joined> {
    type Joined = <<<T as HeadTail>::Tail as Join>::Joined as Prepend<<<T as HeadTail>::Head as Join>::Joined>>::Prepended;

However, this could potentially be mitigated in a few different ways:

  • If the (already accepted) implied bounds RFC is ever implemented, especially if it's extended to type aliases, you could get most of those bounds introduced implicitly.
  • Even without that, use of type aliases could make the above snippet much shorter.
  • rustc can get smarter about inferring the appropriate trait for type projections. Currently, you can write T::Foo and have rustc guess which trait Foo belongs to based on impls in scope, but if you have a nested projection like T::Foo::Bar, rustc requires explicitly specifying the trait for Bar.
  • edit: With GATs, it would be enough to have a T: Tuple bound (which could be put on the associated type definition) and then be able to use operations like Prepend without needing to add a separate bound.

Edit: And regardless of what happens with tuples, it would be nice to make that sort of computation using associated types more ergonomic.

@RustyYato commented May 25, 2019:

> You can accomplish most of the same tasks using a tuple trait. Using a trait (which the compiler could automatically implement for tuple types) would avoid the need to define tuples as having a particular structure, and also opens the door to more flexibility in the future, especially regarding references.

This seems like a great idea. One issue that people may have with it is that it allows other types to masquerade as tuples, but I think that is fine.

> Edit: Actually, that's not great because it would imply defining a head and tail for (). What we really need is some way to express disjunction: a tuple either has a head and a tail, or is (). Ideally it would be possible to write a separate impl of a trait for each case, and then be able to convince the compiler that you've impled the trait for all T: Tuple.

This can be solved with more traits (yay)

trait Tuple {}

trait NonEmpty: Tuple {
    type Head;
    type Tail: Tuple;

    fn head(self) -> Self::Head;
    fn tail(self) -> Self::Tail;
    fn split_off(self) -> (Self::Head, Self::Tail);
}

> edit: With GATs, it would be enough to have a T: Tuple bound (which could be put on the associated type definition) and then be able to use operations like Prepend without needing to add a separate bound.

If we don't want to wait for GATs we can have

trait Prepend<T>: Tuple {
    type Joined: Tuple;
    
    fn prepend(self, value: T) -> Self::Joined;
}

> Edit: And regardless of what happens with tuples, it would be nice to make that sort of computation using associated types more ergonomic.

Yes, I think chalk will bring in most of those improvements. I would especially love to have some sort of delayed bounds.


The only problem is, how do we implement algorithms with just traits?

@Aaron1011 (Member):

> One issue that people may have with it is that it allows other types to masquerade as tuples, but I think that is fine.

The compiler could simply prevent any manual implementations of the Tuple trait. Such a restriction could always be relaxed in the future, without breaking backwards compatibility.

@RustyYato commented May 25, 2019:

Thinking about this some more, with @Aaron1011's idea of limiting who can implement the trait, and with the NonEmpty trait we can implement algorithms recursively.

trait Algo {
    fn do_work(self);
}

impl Algo for () {
    fn do_work(self) {}
}

impl<T: NonEmpty> Algo for T {
    fn do_work(self) {
        do_work_with_head(self.head());
        self.tail().do_work() // no need to have a bound `T::Tail: Algo` because we have `NonEmpty` and `()` implemented
    }
}

@ahicks92:

@Woyten and everyone else:

Unless something has significantly changed since I did my work on the compiler (it has been a very long time as these things go), making a tuple a tree is going to be just as hard to optimize as making it list-like (there are proper terms for these, but I haven't done serious functional programming in a long time). The compiler is written assuming that a type made of other types will in some way combine the layouts of the lower levels. There's been some movement away from that, for example the niche optimizations, but something this fundamental would be a pretty big change. I believe tuples are actually the same variant of ty::layout as structs, and that the rest of the pipeline after that gets to treat them as effectively the same. That is to say, either we can optimize both nested structs and this representation of tuples, or neither; and since we can't do this optimization on nested structs because of pretty bad performance implications and borrowing (specifically, if a.b is itself a struct, &a.b can't work if we optimize layout using a global analysis that looks at all fields of all nested types), we can't do it at all.

Someone more up to date on the compiler's internal workings should perhaps chime in here.

There's a lot of paths forward without these downsides:

  1. Many traits can be implemented by implementing the trait for one item of the tuple, then effectively saying "for all items do...". @comex's proposal can be made to do this, and it could be given a convenient syntax (i.e. a macro in std).
  2. The type system can be extended (@comex again). How is an interesting discussion that I don't have the background for, but most of the proposals around that bring other interesting benefits in addition to "we fixed tuples" and are worth it for that reason in my opinion.
  3. One of the most common uses for this that I've seen is being able to write generic future combinators, which are better served by variadics anyway (because then you don't have to wrap everything in an extra set of parens).

In general I favor approaches that don't deoptimize memory to make this work. Having to do some extra copies makes a program slower; having to use more memory makes a program run out of memory. In hindsight someone should probably have raised concerns about never being able to undo field reordering once it was stabilized, but we are nonetheless in a state where I would be hesitant to do that; CPU is a more bountiful resource than memory, in the sense that if you don't have enough CPU you run slower, whereas if you don't have enough memory you can't run at all. Also, reordering tuple fields to fix your memory issue is not nearly as easy as reordering struct fields, since the order of tuple fields is their names. That is to say, if we disabled this for structs we could fix affected programs by changing the struct definition, but for anything else that might suddenly break or balloon in memory usage, someone has to find all the places it's used too.

No one will pay for the extra copying, should it be necessary, on day one either. It'll only be in new APIs that use the new feature.

For borrowing the tail of tuples, I think that we should disallow it unless it's the last item. This may be technically infeasible. But if we don't, it's a weird asymmetry in a lot of these proposals, since in effect you can probably only borrow the tail in some places (one unique advantage of this specific RFC is that I think it would work everywhere).

@SimonSapin (Contributor):

I like the direction of a trait rather than a struct. However, would it require a special case in the impl coherence rules so that impl<T> SomeTrait for T where T: Tuple is not considered to conflict with impl SomeTrait for SomeUpstreamType? I think it currently would, since the dependency crate could later add impl Tuple for SomeUpstreamType.

To avoid the need to borrow the tail of a tuple (and the memory layout constraints that implies) we could:

  • Only provide decomposition into head and tail that move / take ownership of the input tuple rather than borrow it. This requires GATs with lifetime parameters.
  • Have projections from references to a tuple to a tuple of references, similar to Option::as_ref and Option::as_mut.

API sketch:

pub trait Tuple {
    type AsRef<'a>;
    type AsMut<'a>;
    fn as_ref<'a>(&'a self) -> Self::AsRef<'a>;
    fn as_mut<'a>(&mut 'a self) -> Self::AsMut<'a>;

    type First;
    type Rest;
    fn split_first(self) -> Option<(Self::First, Self::Rest)>;
}

// Compiler-generated for all tuple sizes:

impl Tuple for () {
    type AsRef<'a> = ();
    type AsMut<'a> = ();
    fn as_ref<'a>(&'a self) -> Self::AsRef<'a> { () }
    fn as_mut<'a>(&mut 'a self) -> Self::AsMut<'a> { () }

    type First = !;
    type Rest = !;
    fn split_first(self) -> Option<(Self::First, Self::Rest)> { None }
}
...
impl<A> Tuple for (A,) {
    type AsRef<'a> = (&'a A,);
    type AsMut<'a> = (&'a mut A,);
    fn as_ref<'a>(&'a self) -> Self::AsRef<'a> { (&self.0,) }
    fn as_mut<'a>(&mut 'a self) -> Self::AsMut<'a> { (&mut self.0,) }

    type First = A;
    type Rest = ();
    fn split_first(self) -> Option<(Self::First, Self::Rest)> { Some((self.0, ())) }
}

impl<A, B> Tuple for (A, B) {
    type AsRef<'a> = (&'a A, &'a B);
    type AsMut<'a> = (&'a mut A, &'a mut B);
    fn as_ref<'a>(&'a self) -> Self::AsRef<'a> { (&self.0, &self.1) }
    fn as_mut<'a>(&mut 'a self) -> Self::AsMut<'a> { (&mut self.0, &mut self.1) }

    type First = A;
    type Rest = (B,);
    fn split_first(self) -> Option<(Self::First, Self::Rest)> { Some((self.0, (self.1,))) }
}

// etc.

Alternative where () doesn’t implement the trait:

pub trait NonEmptyTuple {
    // …
    fn split_first(self) -> (Self::First, Self::Rest);
}

@RustyYato commented May 26, 2019:

I think that having one trait is better; using an Option is good enough, since this trait will be useless as a trait object, split_first will be inlined, and the Option will be optimized away.

@SimonSapin you put &mut 'a instead of &'a mut

@kennytm (Member) commented May 26, 2019:

Could this RFC (whether struct or trait approach) be applicable to function pointers?

  • fn(A, B, C, D) -> R
  • unsafe fn(A, B, C, D) -> R
  • extern fn(A, B, C, D) -> R
  • unsafe extern fn(A, B, C, D) -> R
  • extern fn(A, B, C, D, ...) -> R
  • unsafe extern fn(A, B, C, D, ...) -> R

What about the Fn/FnMut/FnOnce traits/objects?

@comex commented May 26, 2019:

> I like the direction of a trait rather than a struct. However, would it require a special case in the impl coherence rules so that impl<T> SomeTrait for T where T: Tuple is not considered to conflict with impl SomeTrait for SomeUpstreamType? I think it currently would, since the dependency crate could later add impl Tuple for SomeUpstreamType.

Good point. Something like this old proposal for "sealed traits" could work – a way to mark a trait as not being implementable outside of the crate it's defined in, which would then allow more relaxed coherence rules.

@comex commented May 26, 2019:

> What about the Fn/FnMut/FnOnce traits/objects?

We could change the Fn* traits to add Tuple bounds, e.g.

trait Fn<Args> where Args: Tuple

This would not be a breaking change, because you can't write Fn<Args> on stable; you have to use the Fn(Arg, Arg) sugar, which guarantees that the generic parameter is a tuple.

However, to be able to recurse on the Args parameter in a generic context would require some way to express disjunction, as I mentioned before. You need to be able to write separate impls of your trait for non-empty tuples and for (), and then somehow convince the compiler that your trait is necessarily impl'ed by Args given that Args: Tuple.

One potential approach could be based on associated constants. Something like:

enum TupleKind {
    Empty,
    NonEmpty
}
trait Tuple {
    const KIND: TupleKind;
}
impl<T> Foo for T where T: Tuple<KIND=TupleKind::Empty> { ... }
impl<T> Foo for T where T: Tuple<KIND=TupleKind::NonEmpty> { ... }

The compiler would have to add support for referencing associated constants with Trait<Foo=Bar> syntax, not just associated types, and additionally be able to tell that the impls together cover all possible variants of TupleKind.

Alternately, there have been some proposals for "mutually exclusive traits".

A third approach could be to add type inequality constraints, but I don't think those would be as easy to implement as they seem: they run into the same coherence issues as specialization.

@eaglgenes101 commented May 27, 2019:

> I like the direction of a trait rather than a struct. However, would it require a special case in the impl coherence rules so that impl<T> SomeTrait for T where T: Tuple is not considered to conflict with impl SomeTrait for SomeUpstreamType? I think it currently would, since the dependency crate could later add impl Tuple for SomeUpstreamType.

> Good point. Something like this old proposal for "sealed traits" could work – a way to mark a trait as not being implementable outside of the crate it's defined in, which would then allow more relaxed coherence rules.

Trait specialization is coming around as part of the 2019 roadmap; perhaps it might be the tool at hand that we use to make this work when it comes time for implementation?

(Chalk should be able to reason that either the sentinel unit type does implement the type list trait, and thus its implementation unambiguously specializes the type list trait's, or that it doesn't, and thus there is no overlap. In either case, the types as declared are okayed by the type checker, and the trait implementation written specifically for the sentinel unit type applies to that type.)

@comex commented May 27, 2019:

Specialization could work for some things, but as it stands today it has limitations. If you write <Foo as Trait>::Bar, the compiler will usually 'normalize' the projection by replacing it with whatever Bar is defined as in whichever Trait impl it finds for Foo. But if that impl only defines a default type, the compiler intentionally leaves it un-normalized. This is meant to preserve forwards compatibility with hypothetical specialized impls being added in the future; unfortunately, it also makes it impossible to do type-level computation with it, which severely limits what you can accomplish. In other words, if you ask "does a specialized impl apply", the compiler can only answer "yes" and "maybe", not "no". Perhaps this could be improved in the future, but I'm not exactly sure how.

...But in any case, the post you quoted wasn't about conflicts between nonempty tuples and (), but about conflicts in downstream crates between impls of the same trait for tuples and for unrelated types, because the compiler doesn't want to rule out that you could impl Tuple for UnrelatedType. Specialization could serve as a workaround there too, but it's annoying to force on downstream crates, and for some use cases you'd need the (proposed but not implemented) "lattice rule".

```rust
struct Tuple<ELEM, TAIL> {
    pub elem: ELEM,
    pub tail: TAIL,
}
```
@gnzlbg (Contributor), May 29, 2019:

Is this compatible with the layout specified in the unsafe code guidelines?

an anonymous tuple type (T1..Tn) of arity N is laid out "as if" there were a corresponding tuple struct declared in libcore:

#[repr(Rust)]
struct TupleN<P1..Pn:?Sized>(P1..Pn);

Note that this specifies that the layout of a tuple matches that of a tuple struct, not that of the struct this RFC proposes.

This allows the compiler to perform some optimizations, like field re-ordering, but this RFC does not mention anything about this particular trade-off. I'd recommend scanning the UCGs repo for trade-offs and extending the RFC with how this change would affect those.

@frehberg:

Nice! I must admit, I implemented a "sequence" crate https://crates.io/crates/seq, but never thought about it as a generalization of "tuples".

As this proposed idea of "tuple" is very close to the head/tail concept known from functional languages, would the proposed tuple concept in Rust also permit extending an N-tuple by prepending another element to it, forming an (N+1)-tuple consisting of a head and a tail, where the tail references the N-tuple?
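
Under the proposed representation this would indeed be possible: prepending just wraps the existing N-tuple as the tail of the new head. A minimal sketch, with the struct written out explicitly and a hypothetical `prepend` helper:

```rust
struct Tuple<ELEM, TAIL> {
    elem: ELEM,
    tail: TAIL,
}

// Hypothetical helper: the (N+1)-tuple's tail is the existing N-tuple, by value.
fn prepend<ELEM, TAIL>(elem: ELEM, tail: TAIL) -> Tuple<ELEM, TAIL> {
    Tuple { elem, tail }
}

fn main() {
    // "(1u16, 2u32)" prepended with 0u8 gives "(0u8, 1u16, 2u32)":
    let two = Tuple { elem: 1u16, tail: Tuple { elem: 2u32, tail: () } };
    let three = prepend(0u8, two);
    assert_eq!(three.elem, 0u8);
    assert_eq!(three.tail.elem, 1u16);
}
```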

@matthieu-m:

Back in the day, prior to C++11, Boost.Tuple was developed using a similar ConsList strategy.

It unlocked a number of usecases, however it also had a number of drawbacks:

  • Template instantiation recursion limit would pop up relatively frequently, as all algorithms had to be expressed as recursions with accumulator.
  • Said template instantiation recursion limit was a hint that compile times were suffering, and they did.
  • Said recursive algorithms with accumulator were not really straightforward to write, debug or read.

The addition of variadic templates to C++11 was a very welcome change. It did not clean up all the cruft, but it did speed up compilation and lifted quite a few limitations.


This modest proposal is interesting, from a hacker/minimalist POV, however I cannot help but think back to my C++ experience and wonder:

  • Is this proposal not going to suffer from the same drawbacks that Boost.Tuple did?
  • Is this proposal going to be sufficient, or will it have to be replaced by a better handling of variadic generics?
  • If this proposal will have to be replaced anyway, is the intermediate step necessary or should we just go with the next step immediately?

@eaglgenes101:

One could layer an inductive destructuring mechanism over such a base, but with that mechanism in place the inductive concrete tuple definition becomes redundant, and it can be replaced by methods that inductively destructure to references, mutable references, or moved values.

@spunit262 commented May 30, 2019:

I don't have any experience with C++11's variadic templates, but @matthieu-m's comment reminded me of D's variadic templates and static foreach.
So what do people think of this quick mock up?

impl<T @ (..: Debug, _: ?Sized + Debug,)> Debug for T {
    fn fmt(&self, f: &mut Formatter) -> Result<(), Error> {
        let mut builder = f.debug_tuple("");
        static for v in self {
            builder.field(&v);
        }
        builder.finish()
    }
}

@samsartor (Contributor):

It is possible to bridge the gap between the recursive and iterative approaches without new syntax. For example:

pub trait VariadicConsumer<T> {
    fn take(&mut self, arg: T);
}

pub trait VariadicProducer<C> {
    fn pass(self, to: C);
}

impl<T: Debug> VariadicConsumer<&T> for DebugTuple {
    fn take(&mut self, arg: &T) {
        self.field(arg);
    }
}

impl<T> Debug for T
    where &T: VariadicProducer<DebugTuple>
{
    fn fmt(&self, f: &mut Formatter) -> Result<(), Error> {
        let mut builder = f.debug_tuple("");
        self.pass(builder);
        builder.finish()
    }
}

The compiler could then have optimized type checking and inlining of VariadicProducer implementations, roughly equivalent to:

impl<C> VariadicProducer<C> for () {
    fn pass(self, to: C) { }
}

impl<C, T: Tuple> VariadicProducer<C> for T
    where C: VariadicConsumer<T::Head>, T::Tail: VariadicProducer<C>
{
    fn pass(self, to: C) {
        to.take(self.head());
        self.tail().pass(to);
    }
}

impl<C, T: Tuple> VariadicProducer<C> for &T
    where T::AsRef: VariadicProducer<C>
{
    fn pass(self, to: C) {
        self.as_ref().pass(to)
    }
}

impl<C, T: Tuple> VariadicProducer<C> for &mut T
    where T::AsMut: VariadicProducer<C>
{
    fn pass(self, to: C) {
        self.as_mut().pass(to)
    }
}

This Variadic(Producer|Consumer) API isn't enough on its own to cover more complex cases where different elements of the tuple have different trait bounds. It could serve the common case well, without suffering from the issues of Boost.Tuple. However, I don't think it would add value if the compiler could apply similar optimized type checking and inlining to any usage of the recursive API.

@jswrenn jswrenn mentioned this pull request Jul 30, 2019
@drdozer commented Sep 26, 2019:

Just as a data point, scala 3 (dotty) has generalised tuples. They have a thin layer of compiler magic so that they behave as if they were an hlist. The nil element is the empty tuple. Then all other tuples can be interacted with as hcons cells. This is a source-level fiction, so it's up to the dotty compiler to map this down to jvm objects with multiple fields instead of the inefficient singly-linked list representation.

Since rust owns the concrete representation, I don't see why something similar couldn't be done. Also, because rust owns the sharing, it should be possible to optimise adding and removing elements from both ends of a tuple so that it grows/shrinks in place for the cases where the grow/shrink ops take ownership of the prior tuple.

@ExpHP commented Sep 26, 2019:

> Since rust owns the concrete representation, I don't see why something similar couldn't be done

It's not that rust can't, it's that we probably don't want it to. A cons-able tuple layout is at odds with optimizing for space and alignment.

Modern rust optimizes (u16, u8, u32) to have a size of 8 bytes, whereas (u16, (u8, u32)) is required to occupy 12. Losing that optimization means we'll get a 50% increase in memory consumption when a program constructs a Vec<(u16, u8, u32)>.
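
These sizes can be checked directly (this reflects current rustc behaviour; the exact layout of `#[repr(Rust)]` tuples is not guaranteed):

```rust
use std::mem::size_of;

fn main() {
    println!("{}", size_of::<(u16, u8, u32)>());   // 8  -- the fields get reordered
    println!("{}", size_of::<(u16, (u8, u32))>()); // 12 -- nesting blocks the reordering
}
```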

> Since rust owns the concrete representation, I don't see why something similar couldn't be done. Also, because rust owns the sharing, it should be possible to optimise adding and removing elements from both ends of a tuple so that it grows/shrinks in place for the cases where the grow/shrink ops take ownership of the prior tuple.

This bit puzzles me because there's virtually nothing to optimize here. Growing or shrinking a tuple currently is a memcpy of each field in the output tuple. This would only change it so that maybe you can have one big memcpy of an entire tuple (plus or minus one field at the end), which I'd imagine should make a difference of peanuts.

@ExpHP ExpHP mentioned this pull request Oct 2, 2019
@nikomatsakis (Contributor):

Hello! We received this RFC in today's "backlog bonanza" meeting. We felt that while the idea of handling tuples of arbitrary arity is very appealing, this was overall not a good fit for our current priorities, and in particular not for the current roadmap, which is oriented toward finishing off the features we have in flight rather than adding new ones. This feature seems sufficiently complex that it would be a major undertaking, and we don't have the bandwidth for that. If/when we do consider it in the future, though, we felt it would make sense to discuss it in conjunction with variadic generics, as the two features seem to be quite similar.

@rfcbot fcp close

@rfcbot (Collaborator) commented Sep 2, 2020:

Team member @nikomatsakis has proposed to close this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-close This RFC is in PFCP or FCP with a disposition to close it. labels Sep 2, 2020
@rfcbot rfcbot added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substantial objections are raised. label Sep 16, 2020
@rfcbot (Collaborator) commented Sep 16, 2020:

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. label Sep 16, 2020
@rfcbot rfcbot added the finished-final-comment-period The final comment period is finished for this RFC. label Sep 26, 2020
@rfcbot (Collaborator) commented Sep 26, 2020:

The final comment period, with a disposition to close, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC is now closed.

@rfcbot rfcbot added to-announce closed This FCP has been closed (as opposed to postponed) and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substantial objections are raised. disposition-close This RFC is in PFCP or FCP with a disposition to close it. labels Sep 26, 2020
@rfcbot rfcbot closed this Sep 26, 2020