Add tuple cons cell syntax #1582
Closed
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,262 @@ | ||
- Feature Name: tuple_cons_cells | ||
- Start Date: 2016-05-17 | ||
- RFC PR: (leave this empty) | ||
- Rust Issue: (leave this empty) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Add a syntax for expressing tuples as a head and tail pair, similar to a Lisp | ||
cons cell. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
Currently, Rust doesn't give the user any way to talk about generically-sized
tuples. This means that it's not possible to, for example, implement a trait
for all tuples. Traits like `fmt::Debug` are only implemented for tuples of up
to some arbitrary number of elements. These impls are generated by macros, one
impl per tuple length, and they end up polluting the rustdoc documentation.
This RFC would alleviate these problems.
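
To make the problem concrete, here is a minimal sketch of the macro pattern described above, written in today's Rust. The `Arity` trait and the `impl_arity!` macro are hypothetical names invented for this illustration; they are not the actual code used in libcore.

```rust
// Hypothetical trait, used only for illustration.
trait Arity {
    fn arity(&self) -> usize;
}

// Generates one impl per tuple length. Every supported length has to be
// spelled out by hand in the invocation below.
macro_rules! impl_arity {
    ($($len:expr => ($($name:ident),*)),* $(,)?) => {
        $(
            impl<$($name),*> Arity for ($($name,)*) {
                fn arity(&self) -> usize { $len }
            }
        )*
    };
}

impl_arity! {
    0 => (),
    1 => (A),
    2 => (A, B),
    3 => (A, B, C),
}
```

With this sketch, `(1, "a", 2.0).arity()` is accepted, but calling `.arity()` on a 4-tuple fails to compile because no impl was generated for that length.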
|
||
This RFC is also a step towards more general variadic generics as proposed in | ||
draft RFC #376. While that RFC discusses using variadic lists of types in | ||
positions such as function argument lists, this RFC only covers the specific | ||
case of tuples of generic arity. Whatever "full" variadic generics eventually | ||
look like, when and if they get implemented, it is unlikely we would want the | ||
design for tuples to work any differently to what's proposed here. | ||
|
||
# Detailed design | ||
[design]: #detailed-design | ||
|
||
This RFC proposes introducing two new syntactic forms: one for tuple types and
one for tuple terms. With this new syntax a tuple can be expressed as `(head_0,
head_1, ... head_n; tail)` where `head_x` are the first `n + 1` elements of the
tuple and `tail` is a tuple containing the remaining elements. The table
below shows some of the different ways of expressing the same tuple by combining
current Rust syntax and the new syntax.
|
||
0 elements | ||
() | ||
(; ()) | ||
(; (; ())) | ||
|
||
1 element | ||
|
||
(a,) | ||
(a; ()) | ||
(a,; ()) | ||
(; (a,)) | ||
(; (a; ())) | ||
(; (a,; ())) | ||
|
||
2 elements | ||
|
||
(a, b) | ||
(a, b,) | ||
(a, b; ()) | ||
(a, b,; ()) | ||
(a; (b,)) | ||
(a; (b; ())) | ||
(a; (b,; ())) | ||
|
||
3 elements | ||
|
||
(a, b, c) | ||
(a, b, c,) | ||
(a, b, c; ()) | ||
(a, b, c,; ()) | ||
(a, b; (c,)) | ||
(a, b,; (c,)) | ||
(a, b; (c; ())) | ||
(a, b; (c,; ())) | ||
(a, b,; (c; ())) | ||
(a, b,; (c,; ())) | ||
(a; (b, c)) | ||
(a; (b, c,)) | ||
(a,; (b, c)) | ||
(a,; (b, c,)) | ||
(a; (b, c; ())) | ||
(a; (b, c,; ())) | ||
(a,; (b, c; ())) | ||
(a,; (b, c,; ())) | ||
(a; (b; (c,))) | ||
(a; (b,; (c,))) | ||
(a,; (b; (c,))) | ||
(a,; (b,; (c,))) | ||
(a; (b; (c; ()))) | ||
(a; (b; (c,; ()))) | ||
(a; (b,; (c; ()))) | ||
(a; (b,; (c,; ()))) | ||
(a,; (b; (c; ()))) | ||
(a,; (b; (c,; ()))) | ||
(a,; (b,; (c; ()))) | ||
(a,; (b,; (c,; ()))) | ||
|
||
and so forth... | ||
|
||
This RFC proposes equivalent syntax for tuple types. Formally, the syntax for | ||
tuple types could be described with the following grammar fragment: | ||
|
||
``` | ||
ty_tuple | ||
: "(" ")" | ||
| "(" ty "," ty_tuple_inner ")" | ||
| "(" ty_tuple_inner ";" ty ")" | ||
|
||
ty_tuple_inner | ||
: %empty | ||
| ty | ||
| ty "," ty_tuple_inner | ||
``` | ||
|
||
With this syntax, any tuple can be expressed as either `()` or `(head; tail)`. | ||
This makes it possible for generic code to handle all tuples by covering just | ||
these two cases. | ||
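
The two-case recursion can be approximated in today's Rust by using nested pairs in place of the proposed syntax. This is only a sketch: the ordinary pair `(H, T)` stands in for `(H; T)`, and the `Len` trait and `len_of` function are hypothetical names chosen for the example.

```rust
// Hypothetical trait: counts the elements of a cons-style list.
trait Len {
    const LEN: usize;
}

// Base case: the empty tuple.
impl Len for () {
    const LEN: usize = 0;
}

// Recursive case: a head plus a tail that is itself a list.
// The pair `(H, T)` emulates the proposed `(H; T)`.
impl<H, T: Len> Len for (H, T) {
    const LEN: usize = 1 + T::LEN;
}

// Convenience wrapper so the length can be queried from a value.
fn len_of<L: Len>(_list: &L) -> usize {
    L::LEN
}
```

Here `len_of(&(1u8, ("a", (2.0f64, ()))))` evaluates to 3. Under the RFC the same two impls would apply directly to the flat tuple `(1u8, "a", 2.0f64)`.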
|
||
In addition to syntax for expressions and types, this RFC also proposes syntax
for destructuring tuples into a head and tail. Here, the obvious syntax is
used, i.e. `let (head; tail) = (a; b);` results in `head == a` and
`tail == b`.
|
||
This RFC also proposes a new marker trait be added to the language. `Tuple` is | ||
a trait which is implemented for `()` and `(H; T) where T: Tuple`. In general, | ||
the type `(H; T)` is only valid when `T: Tuple`. | ||
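
As a rough emulation in current Rust, the marker trait and the `T: Tuple` restriction can be modelled with a struct. `Cons` and `is_tuple` are hypothetical names used only in this sketch; the RFC itself would apply the bound to the built-in `(H; T)` type.

```rust
// Marker trait, mirroring the proposed `Tuple`.
trait Tuple {}

// The empty tuple is a tuple.
impl Tuple for () {}

// `Cons<H, T>` emulates `(H; T)`. The bound on `T` makes a non-tuple
// tail, such as `Cons<u8, u32>`, a compile-time error.
#[allow(dead_code)]
struct Cons<H, T: Tuple>(H, T);

// A head followed by a tuple tail is itself a tuple.
impl<H, T: Tuple> Tuple for Cons<H, T> {}

// Only accepts values whose type implements `Tuple`.
fn is_tuple<T: Tuple>(_value: &T) -> bool {
    true
}
```

For example, `is_tuple(&Cons(1u8, Cons("a", ())))` compiles, whereas `Cons(1u8, 2u32)` is rejected because `u32` does not implement `Tuple`.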
|
||
### Representation | ||
|
||
The main problem with implementing this RFC is the question of representation.
At the memory level, in current Rust, there is no guarantee that the
representation of an `(a; b)` contains the representation of a `b`. The
solution proposed here is two-fold. First, we allow types to have separate
stride and size as per RFC issue #1397. Secondly, we lay out tuples in reverse
order. Under this scheme, the tuple `(A, B, C) : (u16, u16, u32)` would be
represented as
|
||
``` | ||
------------------------------------------------ | ||
| Byte | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | | ||
|------|-------------------|---------|---------| | ||
| Data | C (u32) | B (u16) | A (u16) | | ||
------------------------------------------------ | ||
``` | ||
|
||
And its tail `(B, C) : (u16, u32)` would be represented as
|
||
``` | ||
------------------------------------------------ | ||
| Byte | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | | ||
|------|-------------------|---------|---------| | ||
| Data | C (u32) | B (u16) | padding | | ||
------------------------------------------------ | ||
``` | ||
|
||
Crucially, the size of this type is only 6 bytes, even though its stride (the
distance between consecutive elements in an array) is still 8. This means that a
`&mut (u16, u32)` cannot be used to modify the tuple's two trailing "padding"
bytes, as any tuple (accessed through a reference) may be the tail of a larger
tuple.
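
The diagrams above can be reproduced in today's Rust with `#[repr(C)]` structs, which fix the field order. This is a sketch under stated assumptions: `Tail`, `Full` and the helper functions are hypothetical names, and the numbers assume a target where `u32` has 4-byte alignment. Note that today's `size_of` reports 8 bytes for the tail even though only 6 of them carry data.

```rust
use std::mem::size_of;

// Mimics the reversed layout of `(u16, u32)`: last element first.
#[allow(dead_code)]
#[repr(C)]
struct Tail {
    c: u32,
    b: u16,
    // two bytes of trailing padding follow
}

// Mimics the reversed layout of `(u16, u16, u32)`.
#[allow(dead_code)]
#[repr(C)]
struct Full {
    c: u32,
    b: u16,
    a: u16, // occupies the bytes that are padding in `Tail`
}

// Sizes of the tail and the full tuple, as reported by current Rust.
fn sizes() -> (usize, usize) {
    (size_of::<Tail>(), size_of::<Full>())
}

// Byte offset of `Full::a` from the start of the struct.
fn offset_of_a() -> usize {
    let full = Full { c: 0, b: 0, a: 0 };
    let base = &full as *const Full as usize;
    let field = &full.a as *const u16 as usize;
    field - base
}
```

Because `a` sits at offset 6, writing all 8 bytes of a `Tail` that is embedded at the start of a `Full` would clobber `a`, which is exactly the hazard the paragraph above describes.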
|
||
## Example | ||
|
||
This code shows how we could use the proposed syntax to `impl Debug` for all | ||
tuples. | ||
|
||
```
// libcore/fmt/mod.rs

// Private trait used to help impl Debug.
trait TupleExt: Tuple + Debug {
    fn debug_tail(&self, f: &mut Formatter) -> Result;
}

impl TupleExt for () {
    fn debug_tail(&self, f: &mut Formatter) -> Result {
        write!(f, ")")
    }
}

impl<H: Debug, T: TupleExt> TupleExt for (H; T) {
    fn debug_tail(&self, f: &mut Formatter) -> Result {
        let (ref head; ref tail) = *self;
        try!(write!(f, ", {:?}", *head));
        tail.debug_tail(f)
    }
}

// impl Debug for 0 elements
impl Debug for () {
    fn fmt(&self, f: &mut Formatter) -> Result {
        write!(f, "()")
    }
}

// impl Debug for 1 element
impl<H: Debug> Debug for (H,) {
    fn fmt(&self, f: &mut Formatter) -> Result {
        let (ref head,) = *self;
        write!(f, "({:?},)", *head)
    }
}

// impl Debug for 2 or more elements
impl<H0: Debug, H1: Debug, T: TupleExt> Debug for (H0, H1; T) {
    fn fmt(&self, f: &mut Formatter) -> Result {
        let (ref head; ref tail) = *self;
        try!(write!(f, "({:?}", *head));
        tail.debug_tail(f)
    }
}
```
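
Since the proposed syntax does not compile today, the following sketch shows the same recursive idea in current Rust, with nested pairs `(H, T)` standing in for `(H; T)`. The trait `DebugTail` and the function `debug_list` are hypothetical names, and the output goes to a `String` rather than a `Formatter` because `Debug` cannot be re-implemented for built-in pairs. Unlike the RFC code, this version does not special-case the trailing comma of one-element tuples.

```rust
use std::fmt::{self, Debug, Write};

// Hypothetical helper trait; nested pairs emulate the proposed `(H; T)`.
trait DebugTail {
    fn debug_tail(&self, out: &mut String, first: bool) -> fmt::Result;
}

// Base case: close the parenthesis.
impl DebugTail for () {
    fn debug_tail(&self, out: &mut String, _first: bool) -> fmt::Result {
        out.write_char(')')
    }
}

// Recursive case: print the head, then recurse into the tail.
impl<H: Debug, T: DebugTail> DebugTail for (H, T) {
    fn debug_tail(&self, out: &mut String, first: bool) -> fmt::Result {
        let (head, tail) = self;
        if !first {
            out.write_str(", ")?;
        }
        write!(out, "{:?}", head)?;
        tail.debug_tail(out, false)
    }
}

// Formats a cons-style list as if it were a flat tuple.
fn debug_list<L: DebugTail>(list: &L) -> String {
    let mut out = String::from("(");
    list.debug_tail(&mut out, true)
        .expect("writing to a String cannot fail");
    out
}
```

Calling `debug_list(&(1, ("a", (2.5, ()))))` produces `(1, "a", 2.5)`, using just the two impls regardless of how many elements the list has.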
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
* Adds more syntax and another concept that people will need to learn. | ||
* Code that modifies the head element of a tuple through a mutable reference - | ||
where the compiler cannot guarantee that the tuple is not part of a larger | ||
tuple - will sometimes be less efficient as the compiler will no longer be | ||
able to assume that it's safe to overwrite trailing padding bytes. | ||
|
||
# Alternatives | ||
[alternatives]: #alternatives | ||
|
||
* Not do this. | ||
* Consider using a different syntax. The `(H; T)` syntax was chosen to resemble | ||
the `[T; N]` syntax for arrays (on the theory that a tuple type is defined by | ||
the type of its element and the type of its tail whereas an array type is | ||
defined by the type of its elements and its length). Another possible syntax | ||
would be `(a, b, ...more_elems)` although this would conflict with the | ||
inclusive ranges RFC. Another is `(a, b, more_elems...)` although this looks | ||
very similar to range syntax and may be confusing. | ||
* Consider a different layout. By packing tuples less efficiently we could
obviate the need for the stride/size distinction and make updating the head
elements of tuples more efficient. Overall, though, I'm not sure this
would be a win. The efficiency hit associated with the proposed design only
happens when modifying a tuple through a mutable reference. Also, the reference
must be to the tuple itself, not to an element in the tuple like what one
would obtain by writing `let (ref mut a, ...) = some_tuple`. Also, the update
must happen to the head element of the tuple and the head element must be
small. Conversely, packing tuples less efficiently would often result in
significantly less efficient layout (e.g. `(u16, u16, u32)` taking 12 bytes
instead of 8). [More knowledgeable people than me disagree though](https://github.com/rust-lang/rfcs/issues/1397#issuecomment-213311508),
so it would be worth discussing this further and trying to obtain data to
inform a decision with.
* Sidestep the representation issue by disallowing references to the tail of a | ||
tuple. This would largely defeat the purpose of the RFC as, for example, the | ||
`Debug` implementation above would be impossible to write. | ||
* Sidestep the representation issue by making references to the tail of a tuple | ||
expand into a tuple of references. Here, `let (; ref x) = (0, 1, 2);` would | ||
yield `x == (&0, &1, &2)`. This would get extremely messy. Consider the | ||
`Debug` implementation above which recursively formats its tuple argument. On | ||
the first iteration it would be handling a tuple of values. On the second, a | ||
tuple of references-to-values. On the third, a tuple of | ||
references-to-references-to-values. And so forth. It would also be surprising | ||
and unintuitive that `let (; x) = (0, 1, 2)` gives `x == (0, 1, 2)` but | ||
`let (; ref x) = (0, 1, 2)` doesn't give `x == &(0, 1, 2)`. | ||
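
The 12-versus-8 byte figure quoted in the "different layout" alternative above can be checked in today's Rust. This is a sketch assuming a target where `u32` is 4-byte aligned; `Tail`, `Nested`, `Packed` and the two helper functions are hypothetical names. `Nested` models a layout in which the tail is stored as a complete, padded unit inside the larger tuple, while `Packed` models the reversed layout proposed in this RFC.

```rust
use std::mem::size_of;

// `(u16, u32)` stored as a self-contained, fully padded unit.
#[allow(dead_code)]
#[repr(C)]
struct Tail {
    b: u16,
    c: u32,
}

// `(u16, u16, u32)` as a head followed by a complete `Tail`.
#[allow(dead_code)]
#[repr(C)]
struct Nested {
    a: u16,
    tail: Tail,
}

// `(u16, u16, u32)` in the reversed, tightly packed layout.
#[allow(dead_code)]
#[repr(C)]
struct Packed {
    c: u32,
    b: u16,
    a: u16,
}

// Size of the less efficient, nested layout.
fn nested_size() -> usize {
    size_of::<Nested>()
}

// Size of the packed layout.
fn packed_size() -> usize {
    size_of::<Packed>()
}
```

In `Nested`, the head needs 2 bytes of padding before the 4-byte-aligned tail, and the tail carries its own 2 bytes of internal padding, which is where the extra 4 bytes come from.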
|
||
# Unresolved questions | ||
[unresolved]: #unresolved-questions | ||
|
||
The representation issue warrants further discussion. | ||
|
As enthusiastic as I am about this RFC, there are two potential issues I don't see addressed (or I'm not aware of their resolution):

* `&(bool, char, Display)`. (cc "all but the last field of a tuple must be Sized" #1592)
* `unsafe` code which assumes that all of `size_of::<T>()` can be overwritten?

(Personally, while the alternative of laying out tuples less efficiently is shaky on "don't sacrifice potential performance ever ever" grounds, I do think it could pass the "don't pay for what you don't use" test. Heretofore, tuples and structs have been semantically equivalent, so it makes perfect sense that they would use the same representation. With this change, tuples would now be more powerful than structs: they would not merely be "anonymous structs", as they had been before, but heterogeneously-typed lists which can be iterated over. Furthermore, struct fields are unordered, while tuple fields would have a strict ordering not just syntactically but now semantically as well. Just as unordered collections often get to use more efficient representations than ordered ones, it also makes sense that unordered structs would get to use more efficient representations than ordered tuples. But we should collect statistics and benchmarks about the impact on existing real-world Rust code if we choose to pursue this option. My suspicion is that the impact would be negligible - tuples with more than two elements and of different types are already an uncommon case, I think.)
Good point, that's a pretty strong argument for changing the syntax to something like `(head..., tail)` where the subtuple is collected into the head rather than the tail. I'll see what comes of the conversation on #1592 before updating the RFC.

Well, in general, having a size/stride distinction would break code like that. Unless we make `size_of` actually return the stride for the sake of backwards-compatibility.
Having to iterate backwards feels unnatural to me - everything else (of course) goes forwards. Just my personal opinion though...