Proposal: Named tuples #1673

Closed
wants to merge 4 commits into from
Member

@agocke agocke commented Jun 27, 2018

This is part of a series of "working with data" proposals and references
the proposal in #1667.

In contrast to data named types, named tuples place an emphasis on positional semantics, short syntax, and tight integration with pattern matching.

@agocke agocke changed the title Add named tuple proposal Proposal: Named tuples Jun 27, 2018

# Tuples: Small to Big

When tuples were added in C# 7 they brought a few new things to the language
Contributor

@jnm2 jnm2 Jun 27, 2018

Missing : (and perhaps a comma before they)

@HaloFour
Contributor

Seems like something easily addressed by aliases.

I'd make the argument that there isn't/shouldn't be a lot of overlap between tuples and named types. The compiler makes some assumptions now that the container of a proper tuple is irrelevant, such as flattening assignment or equality of members, which would no longer be possible or could expose subtle differences in this "promotion". This feels like a proposal desperately in need of a problem.

@jnm2
Contributor

jnm2 commented Jun 27, 2018

Does this replace primary constructors?

```
;
```

For example, a Cartesian "point" class could look like the following as a named tuple:
Member

Nit: for readability, I'd suggest swapping the example and the formal syntax (the example gets the "point" across more smoothly).

2. Optionally, custom named properties for each positional member of the type
3. A constructor to assign each of the input parameters
4. A deconstructor to support pattern matching
5. Memberwise equality, including `GetHashCode` and `IEquatable`
Member

Note: i would be clear that while an IDE can help out with all of that (and it's something i'm working on), it's still conceptually very laborious. It's just more code to see. It's places where bugs can creep in. It's something you need to maintain as you evolve your type.
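
For a sense of scale, the members listed above can be sketched by hand in today's C# roughly as follows. This is only an illustration of the boilerplate under discussion, not the proposal's generated code; the names `Point`, `X`, and `Y` are assumptions chosen for the example.

```C#
using System;

// Hand-written equivalent of the listed members for an illustrative Point type.
public sealed class Point : IEquatable<Point>
{
    public int X { get; }
    public int Y { get; }

    // Constructor assigning each input parameter.
    public Point(int x, int y) { X = x; Y = y; }

    // Deconstructor: enables `var (x, y) = point;`.
    public void Deconstruct(out int x, out int y) { x = X; y = Y; }

    // Memberwise equality, including GetHashCode and IEquatable<Point>.
    public bool Equals(Point other) => other != null && X == other.X && Y == other.Y;
    public override bool Equals(object obj) => Equals(obj as Point);
    public override int GetHashCode() => (X, Y).GetHashCode();
}
```

Every line here has to be kept in sync by hand whenever a member is added or renamed, which is the maintenance cost the comment above describes.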

```antlr
class_declaration
    : attributes? class_modifier* 'partial'? 'class' identifier type_parameter_list?
      parameter_list? class_base? type_parameter_constraints_clause* (class_body | ';')?
```

Member

i don't think you want: (class_body | ';')?, that seems to imply that you could leave off both.

Member

you also seem to allow class_base, even with "parameter_list". So it would be good (and maybe you do it below) to have strong semantics defined around inheritance.

Like tuples, names of the elements are optional. If no names are provided, a consumer
can refer to automatically generated `ItemX` properties, as well as use the automatically
Member

@jcouv jcouv Jun 27, 2018

If you give names to an element, maybe we should not give an ItemN property for that element.
For tuples, we had to support ItemN because they exist in the underlying type, which you may be using in a C# 6 program.

Going further, maybe we should not have ItemN at all (you must name the elements).
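
As background for this point: with C# 7 tuples, element names are compiler-level aliases over the `Item1`, `Item2`, ... fields of `System.ValueTuple`, so both spellings reach the same storage. A small sketch (the class and method names are illustrative only):

```C#
using System;

public static class TupleNamesDemo
{
    public static bool SameStorage()
    {
        var point = (x: 1, y: 2);

        // `x` and `Item1` name the same field of ValueTuple<int, int>.
        point.Item1 = 10;

        return point.x == 10 && point.y == point.Item2;
    }
}
```

A named tuple that omitted `ItemN` for named elements would give up this dual access, which is the compatibility trade-off discussed in the replies.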

Member

i'm tentatively in agreement with this.

Member

While I agree with the idea, it's important to point out that if a user decides to move a tuple to a named tuple and they had a few spots where they were using the ItemX syntax, this would break their code. I doubt it's a high percentage of our users though, so I'm in favor of the change.

Member Author

Compat was my main driver here. If you remove the ItemX property you also make it impossible to retrieve an unnamed item except through deconstruction.

Member

I think my preference is simply: you have to provide member names for a named-tuple. Once you are naming, just name all the things...

Member Author

I see no reason to prohibit it. It's a nice, short syntax for simple positional types. Even if there's only deconstruction, that's enough to make it work just fine.

Contributor

@agocke In what situations is it useful to name the type, but not its members?

Even if you're using a deconstructor, I think it's very useful to know what the deconstructed values mean, and member names do that well.

Member Author

@svick The best situation is in a discriminated union:

enum class Option<T>
{
    Some(T),
    None
}

Every ML language I've used has a simple syntax for declaring union elements with broader names, but not types. If you're deconstructing via pattern matching anyway, the name isn't terribly useful.
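
For comparison, a rough emulation of that `Option<T>` union in C# as it exists today might look like the sketch below. The nested `Some`/`None` classes, the private field, and the `OptionDemo.Describe` helper are all assumptions made for illustration; the `enum class` syntax above is hypothetical and is not used here. The sketch calls `Deconstruct` directly because C# 7 has no single-element deconstruction syntax.

```C#
using System;

// The payload of Some is reachable only by position (via Deconstruct),
// never through a public member name.
public abstract class Option<T>
{
    private Option() { }

    public sealed class Some : Option<T>
    {
        private readonly T value;
        public Some(T value) { this.value = value; }
        public void Deconstruct(out T value) { value = this.value; }
    }

    public sealed class None : Option<T> { }
}

public static class OptionDemo
{
    public static string Describe(Option<int> option)
    {
        switch (option)
        {
            case Option<int>.Some some:
                some.Deconstruct(out var value);
                return "Some(" + value + ")";
            default:
                return "None";
        }
    }
}
```

The consumer never needs a name for the wrapped value, which is the situation the comment above has in mind.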

Contributor

@agocke To me, that sounds like something that should be considered as part of the design of discriminated unions, not in this independent feature.

If it turns out that discriminated unions would use named tuples with unnamed members, that can be added then. But I don't think it's a reason to add it before that.

;

parameter
: 'readonly'? identifier identifier?
Member

i hope that identifier identifier? is there so you can name parameters/properties differently :)

Member Author

No, the left identifier is the type, the right is the name

Member

No, the left identifier is the type,

:-/ I'm feeling dumb. How is an identifier sufficient to represent all the types a parameter could be?

Member Author

It's probably not. The spec calls this a type but I can't find any definition for the type non-terminal.

Contributor

@agocke

The spec calls this a type but I can't find any definition for the type non-terminal.

It's in the Types section.

```C#
void M()
{
    var (x, y) = new Point(0, 0);
```

Member

note: if we allow inheritance in this proposal, deconstruction gets... interesting :)

Member

How would it be any more interesting than it can be now? You can already define Deconstruct methods for your classes and use inheritance with them.

Member

Good question! i think i thought it might be a problem because you wouldn't necessarily know the relationship between the order of data-members in a derived type, vs those in the base type.

however, as i try to make an example, it looks like it may not be an issue. You would just use the parameter-list sig as provided in the derived type for generating things.
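
A minimal sketch of how this already behaves with hand-written `Deconstruct` methods and inheritance. The types `Point2D` and `Point3D` are illustrative assumptions, not part of the proposal:

```C#
using System;

public class Point2D
{
    public int X { get; }
    public int Y { get; }
    public Point2D(int x, int y) { X = x; Y = y; }
    public void Deconstruct(out int x, out int y) { x = X; y = Y; }
}

public class Point3D : Point2D
{
    public int Z { get; }
    public Point3D(int x, int y, int z) : base(x, y) { Z = z; }

    // The order follows the derived type's own parameter list.
    public void Deconstruct(out int x, out int y, out int z) { x = X; y = Y; z = Z; }
}

public static class DeconstructDemo
{
    public static int Sum()
    {
        var p = new Point3D(1, 2, 3);
        var (x, y) = p;      // two targets: binds to the inherited overload
        var (a, b, c) = p;   // three targets: binds to the derived overload
        return x + y + a + b + c;
    }
}
```

Overloads are selected by the number of deconstruction targets, so the derived type's positional shape does not conflict with the base type's.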

1. A constructor corresponding to the parameter list in the type definition.
2. `ItemX` properties for each member
3. If a parameter is named in the parameter list, a public property with the same name
that gets/sets that member.
Member

in the past, there's been contention about naming properties the same name as parameters.

Member Author

I've decided to not care.

@CyrusNajmabadi
Member

This is an interesting start. I've put in some thoughts about areas i think need to be hammered out, esp. inheritance: it either needs to be disallowed, or it needs a large amount of thought into how it would work across named-tuples.


In the simplest form, the body of the class is also optional.

## Generated members
Member

Maybe we should also generate a conversion from named-tuple to regular-tuple.
Point(int x, int y) -> (int x, int y) (and possibly vice-versa)
That would make it easier to upgrade your code from tuple to named-tuple incrementally.

Member

I'm not so sure about that. There is some benefit to allowing incremental upgrade, but it seems like this would make it much easier for a user to get into a scenario where they accidentally convert from one data type to another without intending to.

Member

It would also be very likely to break @agocke's equality rules by making it very easy to accidentally coerce the Point(x, y) to a (x, y) tuple.

Member

If your concern is about accidentally converting, then an explicit method (.AsTuple()) and a constructor (Point((int x, int y) tuple)) would make the translation explicit, while retaining the benefits.


Member

I could get behind an explicit AsTuple(). My concern is about the accidental cases.
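
The explicit variant discussed in this thread could be sketched by hand like this. The `AsTuple` method and the tuple-taking constructor come from the comments above; the rest of the `Point` class is an assumption for illustration:

```C#
using System;

public sealed class Point
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y) { X = x; Y = y; }

    // Explicit construction from a plain tuple.
    public Point((int x, int y) tuple) : this(tuple.x, tuple.y) { }

    // Explicit conversion back to a plain tuple; no implicit operator is
    // defined, so a Point never silently turns into an (int, int).
    public (int x, int y) AsTuple() => (X, Y);
}
```

Because both directions require a visible call, code such as `(int x, int y) t = somePoint;` would not compile, which addresses the accidental-coercion concern while still allowing incremental migration.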

@HaloFour
Contributor

@jnm2

Does this replace primary constructors?

Only in the way that it takes the same syntax and uses it in a way that is significantly less useful. Without primary constructors I think that ADTs/DUs also come into question. Unless the team thinks that they should be big tuples, too.

Being able to alias tuples would solve every problem mentioned above but with fewer compatibility concerns, both in the current project and across any dependent projects. Consumers wouldn't have to be recompiled and any/all reflection code would behave exactly as it did before.

@jnm2
Contributor

jnm2 commented Jun 27, 2018

I'm having the same concern, that this would block other uses of the primary constructor syntax.

@333fred
Member

333fred commented Jun 28, 2018

The compiler makes some assumptions now that the container of a proper tuple is irrelevant

One of the intentions of this proposal is that for "named tuples", this is not true. For example, this would ensure that a CartesianCoordinate is not comparable with PolarCoordinate. Aliasing would allow these to be comparable, which is not desirable for those two types. There might be a place for aliasing tuples, but that would remove a large portion of the benefit from this feature.
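
To make the comparability point concrete, here is a small sketch in current C#. The struct definitions and the `EqualityDemo` helper are assumptions for illustration; only the names `CartesianCoordinate` and `PolarCoordinate` come from the comment above.

```C#
using System;

public struct CartesianCoordinate
{
    public double X, Y;
    public CartesianCoordinate(double x, double y) { X = x; Y = y; }
}

public struct PolarCoordinate
{
    public double R, Theta;
    public PolarCoordinate(double r, double theta) { R = r; Theta = theta; }
}

public static class EqualityDemo
{
    public static bool TuplesCompareEqual()
    {
        (double x, double y) cartesian = (1.0, 2.0);
        (double r, double theta) polar = (1.0, 2.0);

        // Both are ValueTuple<double, double>; element names play no part.
        return cartesian.Equals(polar);
    }

    public static bool NominalTypesCompareEqual()
    {
        object cartesian = new CartesianCoordinate(1.0, 2.0);
        object polar = new PolarCoordinate(1.0, 2.0);

        // Different runtime types, so ValueType.Equals reports false.
        return cartesian.Equals(polar);
    }
}
```

With plain tuples (or a simple alias over them) the two coordinate systems are interchangeable; with distinct nominal types they are not.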

@HaloFour
Contributor

@333fred

One of the intentions of this proposal is that for "named tuples", this is not true.

But for any project migrating to this feature, that was true. So this feature would potentially break that code, possibly subtly.

And if C# would get proper strong type aliasing it would solve this problem in that each alias is a distinct "type", so two aliases of (int, int) wouldn't be considered the same type. This has further benefits in being able to alias other types like string, Guid or int that are common representations of data that can be accidentally passed into the wrong domain.

@ufcpp

ufcpp commented Jun 28, 2018

Can we use a combination of the named tuples and the data classes? (like formerly proposed features - records/primary constructors)

data class A(int X, int Y)
{
    public int Z { get; set; }
}

var a = new A(1, 2) { Z = 3 };

@agocke
Member Author

agocke commented Jun 28, 2018

For people looking at the primary constructor proposal, what specifically do you miss with this design?

I think the advantage is compatibility with existing tuple constructions.

@agocke
Member Author

agocke commented Jun 28, 2018

Also, I was anticipating you could use these as elements of discriminated unions. Why not?

@gulshan

gulshan commented Jun 28, 2018

Just mentioning, I like the Kotlin approach in this regard. I also think their approach is easy to understand and reason about-

  • Primary constructor is available to all classes/types. Properties are generated from primary constructor parameters and can be mutable or immutable. Nothing else is generated/added to the type. Existing classes/types can be easily refactored to use primary constructors.
  • A data modifier on a class/type generates necessary things for equality and with operations.

In the proposed C# approach with data classes and named tuples, I think-

  • There are similarities and slight differences among these proposals. Both data classes and named tuples generate members for simple operations. Yet because of slight differences in behavior (like use of constructors, object initializers, immutability, etc.), it will not be easy for a developer to decide between a regular POCO, data classes, or named tuples IMO.
  • Existing classes cannot easily move to these new features, because the changes will not be mere refactorings that retain the same behavior. It will be a decision problem.

I think facing this confusion, many (or most) developers will just keep using POCOs. So, I think there should be more simplicity in this regard and the proposals should be more modular.

@agocke
Member Author

agocke commented Jun 28, 2018

I see the point, but one thing I think is missing is that F# (and I think Kotlin) both allow statements/expressions directly inside the primary constructor body. Without that, doesn't it seem limited? Wouldn't you almost always want a "data" primary constructor?

@gulshan

gulshan commented Jun 28, 2018

F# primary constructors are like the one proposed in C# 6: no property is generated from primary constructor parameters. Properties have to be defined (and optionally initialized) again. In Kotlin, developers can decide to (or not to) generate properties from primary constructor parameters. All of the following lines are valid class definitions in Kotlin-

class Test1(x: Int)                // No property generated, an empty class
class Test2(val x: Int)            // Public immutable property generated
class Test3(var x: Int)            // Public mutable property generated
class Test4(private val x: Int)    // Private immutable property generated
data class Test5(val x: Int)       // Equality and with operation generated for properties

@agocke I'm not fully sure this was your concern though.

@gulshan

gulshan commented Jun 28, 2018

Corresponding behaviors in C# may look something like this IMO-

  1. Access modifiers indicate that properties should be generated from primary constructor parameters-
class Test1(int x);                       // No property generated, an empty class
class Test2(public readonly int X);       // Public immutable property generated
class Test3(public int X);                // Public mutable property generated
class Test4(private readonly int X);      // Private immutable property generated
data class Test5(public readonly int X);  // Equality and with operation generated for properties
  2. Or, it can be decided that properties will always be generated from primary constructor parameters. Then access modifiers are not needed to indicate property generation. Properties can then default to public access, but the previous no-property-generating Test1 class is not possible-
class Test2(readonly int X);          // Public immutable property generated
class Test3(int X);                   // Public mutable property generated
class Test4(private readonly int X);  // Private immutable property generated
data class Test5(readonly int X);     // Equality and with operation generated for properties
  3. Also, generated properties can always be getter-only, i.e. immutable. Then readonly can be removed, but the previous mutable-property-generating Test3 class is not possible-
class Test2(int X);          // Public immutable property generated
class Test4(private int X);  // Private immutable property generated
data class Test5(int X);     // Equality and with operation generated for properties

@HaloFour
Contributor

@agocke

I think the advantage is compatibility with existing tuple constructions.

That's also a disadvantage when you don't want tuple constructions, conventions and opinions. As brought up in other comments these "named tuples" are likely to require additional members and behaviors to help bridge the migration from a tuple to a nominal type. But if I'm writing a new type I don't need (and probably don't want) any of that.

Also, I was anticipating you could use these as elements of discriminated unions. Why not?

That would mean that elements of a discriminated union have to also be "named tuples" and any of the opinions described above are along for the ride regardless of whether or not the developer wants them to be.

Without that, doesn't it seem limited? Wouldn't you almost always want a "data" primary constructor?

I actually agree with this sentiment. I don't think that primary constructors should be another full-fledged constructor syntax. I think that they should be relative limited and opinionated, but not necessarily overlapping with the opinions of named tuples. If the "record" part of primary constructors is to be moved to its own proposal as "data classes" (which I think is a good thing) then in my opinion, primary constructors should serve to bring positionality and deconstruction, but nothing more. I want it to make sense to use primary constructors with "data classes", and to use them to define DUs or case classes.

@agocke
Member Author

agocke commented Jun 28, 2018

Sure. Let's try to break it all down, then build it back up.


  1. "Primary constructors" don't automatically generate members.

That means class Point(int x, int y); turns into

class Point
{
    public Point(int x, int y) { }
}

No generated members means no deconstruct. Equality could be defined or not. If it's not defined, this is just reference equality. If it is, all instances of Point are equal.

This is the least desirable state, for me. It turns the shortest, simplest syntax into a worthless definition that no one will ever want to use. It's almost the definition of a language pit of failure.


  2. Automatically generate members, no equality.

Result:

class Point
{
   public int x { get; }
   public int y { get; }
   public Point(int x, int y) { this.x = x; this.y = y; }
   public void Deconstruct(out int x, out int y) { x = this.x; y = this.y; }
}

This seems usable, but frustrating. Usually the point of declaring these small, simple data structures
is to be thin wrappers over data, as I described in my data classes post. This approach makes the
defaults work against you, and really forces you to use data class Point(int x, int y); for the most
common use case.


  3. Automatically generate members, with equality
class Point : IEquatable<Point>
{
   public int x { get; }
   public int y { get; }
   public Point(int x, int y) { this.x = x; this.y = y; }
   public void Deconstruct(out int x, out int y) { x = this.x; y = this.y; }
   Equals ...
   HashCode ...
}

This seems pretty useful. I could imagine a wide variety of types being perfectly represented by this
construction. It also makes flipping the defaults for the uncommon cases relatively easy. Say you
want to use reference equality. You can just define,

  public bool Equals(Point other) => object.ReferenceEquals(this, other);
  public override int GetHashCode() => base.GetHashCode();

The primary drawback is that if you're evolving an existing tuple, all calls to ItemX will be broken.

There is, however, a question of what "equality" means in this context. I see two possibilities, 3a and 3b.


3a. The primary constructor syntax implies data, so it has the same semantics as data classes.

This has no effect on class Point(int x, int y); but it would on something more complicated, like

class Point(int x, int y)
{
  public int Z { get; }
}

In the expanded example, Z would be included in the equality, as well as x and y (since they are implicitly public autoproperties). Presumably adding a data modifier to the declaration would be illegal because it's redundant.


3b. The primary constructor syntax has equality, but it only includes the constructor parameters, not
other members.

I could see some value in this. It makes for a relatively easy way to split out the members you consider
the "equality" API, and the stuff that's just "supporting". In addition, there's no reason why we couldn't
allow the optional data modifier that essentially transforms the semantics into (3a). The major downside
I see is that we're making autogenerated equality more complicated and there are more rules to remember.
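
As a rough illustration of what (3b) would mean for the expanded `Point` example, here is a hand-written sketch in current C#. It assumes `Z` gets a setter so that two instances can differ in it; the proposal text does not specify that, and the expansion shown is a guess at the shape, not the actual generated code.

```C#
using System;

public sealed class Point : IEquatable<Point>
{
    public int x { get; }
    public int y { get; }

    // "Supporting" member declared in the body: left out of equality under (3b).
    // The setter is an assumption added for this illustration.
    public int Z { get; set; }

    public Point(int x, int y) { this.x = x; this.y = y; }

    public void Deconstruct(out int x, out int y) { x = this.x; y = this.y; }

    // Equality covers only the constructor parameters.
    public bool Equals(Point other) => other != null && x == other.x && y == other.y;
    public override bool Equals(object obj) => Equals(obj as Point);
    public override int GetHashCode() => (x, y).GetHashCode();
}
```

Under (3a), `Z` would also take part in `Equals` and `GetHashCode`, so `new Point(1, 2) { Z = 3 }` and `new Point(1, 2) { Z = 4 }` would compare unequal instead of equal.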


  4. The original proposal

This is basically (3) with ItemX properties also generated. The upside is that it makes tuple evolution
easier. The downside is that the ItemX properties may be unwanted.


I think that pretty much covers everything everyone's talked about, right? I haven't touched on With-ers, but I imagine 3a/3b would also be affected, which could be somewhat complicated.

@HaloFour
Contributor

@agocke

I like 3b.

My issue with 4 is exactly those additional ItemX members. Sure, they can help with tuple migration specifically, but that doesn't appeal to me because I would never think to use tuples in places where I'd want domain objects so it feels like it's a feature to help people who accidentally misused tuples. I'm not claiming that this does not happen. If the ItemX properties were well hidden this could be more palatable to me, e.g. hidden from Intellisense but there just in case some code still accidentally calls it. But the challenge would be ensuring that such a thing doesn't still leak out in other places, like XML/JSON/whatever serialization.

TL;DR, I like Scala "case classes", and I'd like that:

case class Point(x: Int, y: Int)
val pt = Point(2, 3)
val pt2 = Point(2, 3)
pt == pt2 // true
val x = pt.x
val Point(x, y) = pt // Scala's version of a deconstruct, I like C#'s better
pt match {
    case Point(x, y) => "yay"
    case _ => "boo"
}

@agocke
Member Author

agocke commented Aug 4, 2018

I'm going to close this out for now. I think, after a bit of discussion, this is heading in a place where my proposal is really just a few changes to the existing records proposal. Rather than duplicate a bunch of info, I'll just propose to make a few changes to the existing records proposal.

@agocke agocke closed this Aug 4, 2018
@agocke agocke deleted the named-tuples branch August 4, 2018 21:24
9 participants