Proposal: Named tuples #1673

Closed
wants to merge 4 commits into from
112 changes: 112 additions & 0 deletions proposals/named-tuples.md
@@ -0,0 +1,112 @@

# Tuples: Small to Big

When tuples were added in C# 7 they brought a few new things to the language
Contributor

@jnm2 jnm2 Jun 27, 2018

Missing : (and perhaps a comma before they)


1. A short, simple syntax for grouping pieces of data
2. A way to group simple data beyond method boundaries with automatic structural equality
3. First-class support for pattern matching and destructuring with positional semantics

In retrospect, it looks like tuples have done a very good job of addressing these pain points
for small sets of data across short pieces of a program. This is great, but it presents a
problem when you decide that your set of data has either grown too large to represent in a tuple,
or you decide that you want that tuple to be a core type in large parts of your program. Since
a tuple is not really its own type and instead is simply a composite of its members, this forces
you to repeat the type declaration at every reference point, which quickly becomes laborious if
you either have many tuple members or you use the tuple in many places. In addition, the more
the tuple type becomes central to the design of a piece of your program, the more the structural
typing becomes a problem and the desire for traditional C# nominal typing becomes prominent.
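As a rough illustration of the repetition described above, the sketch below passes one three-element tuple through a couple of methods. The `Inventory` class and its member names are hypothetical and only serve to show how the full tuple shape has to be restated at every reference point.

```C#
using System;
using System.Collections.Generic;

// Hypothetical example: the tuple shape (string, int, decimal) is repeated
// in every signature and every generic argument that mentions it.
public static class Inventory
{
    public static (string Name, int Quantity, decimal UnitPrice) Create(
        string name, int quantity, decimal unitPrice)
        => (name, quantity, unitPrice);

    public static decimal Total(
        IEnumerable<(string Name, int Quantity, decimal UnitPrice)> items)
    {
        decimal total = 0;
        foreach (var item in items)
        {
            total += item.Quantity * item.UnitPrice;
        }
        return total;
    }

    public static void Main()
    {
        // The shape is spelled out a third time just to declare the list.
        var items = new List<(string Name, int Quantity, decimal UnitPrice)>
        {
            Create("bolt", 10, 0.25m),
            Create("nut", 4, 0.5m),
        };
        Console.WriteLine(Total(items));
    }
}
```

Adding a fourth element to this tuple would mean touching every one of those declarations.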

This is generally the point where most users will want to evolve their tuple type into a proper
named type. Unfortunately, this is extremely laborious. Now the following must be defined manually:

1. `ItemX` properties for each positional member of the type
2. Optionally, custom named properties for each positional member of the type
3. A constructor to assign each of the input parameters
4. A deconstructor to support pattern matching
5. Memberwise equality, including `GetHashCode` and `IEquatable`
Member

Note: i would be clear that while an IDE can help out with all of that (and it's something i'm working on), it's still conceptually very laborious. It's just more code to see. It's places where bugs can creep in. It's something you need to maintain as you evolve your type.


While the associated `data` named types proposal makes some of this easier, it does not address
positionality or the other tuple characteristics, which are orthogonal to that proposal.
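For a sense of scale, here is a minimal sketch of the members listed above, written by hand in today's C# for a two-element type. The `Point` name matches the example used later in this proposal; the choice to make the class `sealed` and to back the named properties with the `ItemX` properties is an assumption made for brevity, not something the proposal specifies.

```C#
using System;

// Hand-written stand-in for a two-element named tuple (illustrative only).
public sealed class Point : IEquatable<Point>
{
    // ItemX properties for each positional member
    public int Item1 { get; set; }
    public int Item2 { get; set; }

    // Optional custom-named properties over the same storage
    public int X { get => Item1; set => Item1 = value; }
    public int Y { get => Item2; set => Item2 = value; }

    // Constructor assigning each input parameter
    public Point(int x, int y) { Item1 = x; Item2 = y; }

    // Deconstructor to support positional patterns and deconstruction
    public void Deconstruct(out int x, out int y) { x = Item1; y = Item2; }

    // Memberwise equality, including GetHashCode and IEquatable<Point>
    public bool Equals(Point other) =>
        !(other is null) && Item1 == other.Item1 && Item2 == other.Item2;

    public override bool Equals(object obj) => Equals(obj as Point);

    public override int GetHashCode() => (Item1, Item2).GetHashCode();
}

public static class Demo
{
    public static void Main()
    {
        var (x, y) = new Point(3, 4);
        Console.WriteLine(x + y);                                    // 7
        Console.WriteLine(new Point(3, 4).Equals(new Point(3, 4))); // True
    }
}
```

Every member here has to be kept in sync by hand whenever an element is added, removed, or reordered.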

# Proposal

To solve these problems, I propose what I'm calling "named tuples."

A named tuple has the following syntax for classes, with an analogous definition for structs:

```antlr
class_declaration
: attributes? class_modifier* 'partial'? 'class' identifier type_parameter_list?
parameter_list? class_base? type_parameter_constraints_clause* (class_body | ';')?
Member

i don't think you want: (class_body | ';')?, that seems to imply that you could leave off both.

Member

you also seem to allow class_base, even with "parameter_list". So it would be good (and maybe you do it below) to have strong semantics defined around inheritance.

;

parameter_list
: '(' parameters ')'
;

parameters
: attributes? parameter
| parameters ',' attributes? parameter
;

parameter
: 'readonly'? identifier identifier?
Member

i hope that identifier identifier? is there so you can name parameters/properties differently :)

Member Author

No, the left identifier is the type, the right is the name

Member

No, the left identifier is the type,

:-/ I'm feeling dumb. How is an identifier sufficient to represent all the types a parameter could be?

Member Author

It's probably not. The spec calls this a type but I can't find any definition for the type non-terminal.

Contributor

@agocke

The spec calls this a type but I can't find any definition for the type non-terminal.

It's in the Types section.

;
```

For example, a Cartesian "point" class could look like the following as a named tuple:
Member

Nit: for readability, I'd suggest swapping the example and the formal syntax (the example gets the "point" across more smoothly).


```C#
class Point(int, int);

void M()
{
var p = new Point(0, 0);
}
```

Like tuples, names of the elements are optional. If no names are provided, a consumer
can refer to automatically generated `ItemX` properties, as well as use the automatically
Member

@jcouv jcouv Jun 27, 2018

If you give names to an element, maybe we should not give an ItemN property for that element.
For tuples, we had to support ItemN because they exist in the underlying type, which you may be using in a C# 6 program.

Going further, maybe we should not have ItemN at all (you must name the elements).

Member

i'm tentatively in agreement with this.

Member

While I agree with the idea, it's important to point out that if a user decides to move a tuple to a named tuple and they had a few spots where they were using the ItemX syntax, this would break their code. I doubt it's a high percentage of our users though, so I'm in favor of the change.

Member Author

Compat was my main driver here. If you remove the ItemX property you also make it impossible to retrieve an unnamed item except through deconstruction.

Member

I think my preference is simply: you have to provide member names for a named-tuple. Once you are naming, just name all the things...

Member Author

I see no reason to prohibit it. It's a nice, short syntax for simple positional types. Even if there's only deconstruction, that's enough to make it work just fine.

Contributor

@agocke In what situations is it useful to name the type, but not its members?

Even if you're using a deconstructor, I think it's very useful to know what the deconstructed values mean, and member names do that well.

Member Author

@svick The best situation is in a discriminated union:

enum class Option<T>
{
    Some(T),
    None
}

Every ML language I've used has a simple syntax for declaring union elements with broader names, but not types. If you're deconstructing via pattern matching anyway, the name isn't terribly useful.

Contributor

@agocke To me, that sounds like something that should be considered as part of the design of discriminated unions, not in this independent feature.

If it turns out that discriminated unions would use named tuples with unnamed members, that can be added then. But I don't think it's a reason to add it before that.

generated Deconstruct method, e.g.

```C#
void M()
{
var (x, y) = new Point(0, 0);
Member

note: if we allow inheritance in this proposal, deconstruction gets... interesting :)

Member

How would it be any more interesting than it can be now? You can already define Deconstruct methods for your classes and use inheritance with them.

Member

Good question! i think i thought it might be a problem because you wouldn't necessarily know the relationship between the order of data-members in a derived type, vs those in the base type.

however, as i try to make an example, it looks like it may not be an issue. You would just use the parameter-list sig as provided in the derived type for generating things.

}
```

In the simplest form, the body of the class is also optional.

## Generated members
Member

Maybe we should also generate a conversion from named-tuple to regular-tuple.
Point(int x, int y) -> (int x, int y) (and possibly vice-versa)
That would make it easier to upgrade your code from tuple to named-tuple incrementally.

Member

I'm not so sure about that. There is some benefit to allowing incremental upgrade, but it seems like this would make it much easier for a user to get into a scenario where they accidentally convert from one data type to another without intending to.

Member

It would also be very likely to break @agocke's equality rules by making it very easy to accidentally coerce the Point(x, y) to a (x, y) tuple.

Member

If your concern is about accidentally converting, then an explicit method (.AsTuple()) and a constructor (Point((int x, int y) tuple)) would make the translation explicit, while retaining the benefits.



Member

I could get behind an explicit AsTuple(). My concern is about the accidental cases.
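
The explicit route discussed in this thread could look roughly like the sketch below, written by hand in today's C#. `AsTuple()` and the tuple-taking constructor are the names suggested in the comments above; nothing here is part of the proposal text, and no implicit conversion operator is defined, so every crossing between `Point` and a plain tuple is visible at the call site.

```C#
using System;

// Hand-written sketch of explicit tuple interop (illustrative only).
public sealed class Point
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y) { X = x; Y = y; }

    // Explicit construction from a plain tuple.
    public Point((int x, int y) tuple) : this(tuple.x, tuple.y) { }

    // Explicit conversion back to a plain tuple.
    public (int x, int y) AsTuple() => (X, Y);
}

public static class InteropDemo
{
    public static void Main()
    {
        (int x, int y) t = (1, 2);
        var p = new Point(t);          // tuple -> Point, spelled out
        var back = p.AsTuple();        // Point -> tuple, spelled out
        Console.WriteLine(back.Equals(t)); // True
    }
}
```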


From the definition of `data` named types, you can see that tuples already conform to many
of the same semantics. The same is true of named tuples. Equality is generated the same way as
for `data` named types, but the `data` keyword is not needed. One difference between equality
in named tuples and equality in anonymous tuples is that anonymous tuples are structurally
equal, while named tuples follow the C# standard of nominal equality for named types. This
means that one named tuple instance can only be equal to another named tuple instance if they
are the same type, not if they just have equal members.
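
The difference can be sketched in today's C# with hand-written stand-ins. `Point` and `Size` below are hypothetical classes that approximate what two named tuples with identical element types might generate; the exact generated equality code is an assumption.

```C#
using System;

public sealed class Point : IEquatable<Point>
{
    public int X { get; }
    public int Y { get; }
    public Point(int x, int y) { X = x; Y = y; }

    // Nominal equality: only another Point can be equal.
    public bool Equals(Point other) => !(other is null) && X == other.X && Y == other.Y;
    public override bool Equals(object obj) => Equals(obj as Point);
    public override int GetHashCode() => (X, Y).GetHashCode();
}

public sealed class Size : IEquatable<Size>
{
    public int Width { get; }
    public int Height { get; }
    public Size(int width, int height) { Width = width; Height = height; }

    public bool Equals(Size other) => !(other is null) && Width == other.Width && Height == other.Height;
    public override bool Equals(object obj) => Equals(obj as Size);
    public override int GetHashCode() => (Width, Height).GetHashCode();
}

public static class EqualityDemo
{
    public static void Main()
    {
        // Anonymous tuples: element names are not part of the type's identity,
        // so these two values are the same type and compare equal.
        (int X, int Y) pointTuple = (1, 2);
        (int Width, int Height) sizeTuple = (1, 2);
        Console.WriteLine(pointTuple.Equals(sizeTuple)); // True

        // Named types: equal members are not enough; the types must match.
        object p = new Point(1, 2);
        object s = new Size(1, 2);
        Console.WriteLine(p.Equals(s));               // False
        Console.WriteLine(p.Equals(new Point(1, 2))); // True
    }
}
```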

In addition to the equality members, named tuples also generate

1. A constructor corresponding to the parameter list in the type definition.
2. `ItemX` properties for each member
3. If a parameter is named in the parameter list, a public property with the same name
that gets/sets that member.
Member

in the past, there's been contention about naming properties the same name as parameters.

Member Author

I've decided to not care.

4. A Deconstruct method corresponding to the parameter list.

If the `readonly` modifier precedes any of the member types in the parameter list, all
autogenerated properties for that parameter are get-only properties.
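
As a sketch of what this could mean in practice, consider a hypothetical declaration `class Point(readonly int X, int Y);`. A hand-written approximation in today's C# might look like the following. How the generated properties share storage is an assumption here, and the equality members are omitted for brevity.

```C#
using System;

// Hand-written approximation of the hypothetical declaration
//     class Point(readonly int X, int Y);
// The proposed syntax itself is not valid C# today.
public class Point
{
    // First parameter is marked readonly: both generated properties are get-only.
    public int Item1 { get; }
    public int X => Item1;

    // Second parameter is not readonly: both generated properties get and set.
    public int Item2 { get; set; }
    public int Y { get => Item2; set => Item2 = value; }

    public Point(int x, int y) { Item1 = x; Item2 = y; }

    public void Deconstruct(out int x, out int y) { x = Item1; y = Item2; }
}

public static class ReadonlyDemo
{
    public static void Main()
    {
        var p = new Point(1, 2);
        p.Y = 5;                 // allowed: Y is settable
        // p.X = 5;              // would not compile: X is get-only
        Console.WriteLine(p.Item2); // 5
    }
}
```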

## Customization

If any of the members which would be automatically generated are manually specified in
the body of the class, those members skip auto-generation, including the constructor. It
is not illegal to define a constructor with the same signature. However, the parameter
names of the constructor are not considered when assigning names for automatically generating
member properties.

Note that, unlike in the `data` named type proposal, there is no special behavior for
object initializers, so an unspeakable initialization method will not be generated.