Types for enum variants #1450

Closed
wants to merge 3 commits into from

nrc
Member

@nrc nrc commented Jan 7, 2016

Edit: this RFC is now only about variant types; the unions stuff has been removed in favour of #1444.

This is something of a two-part RFC. It proposes:

  • making enum variants first-class types,
  • untagged enums (aka unions).

The latter is part of the motivation for the former and relies on the former to
be ergonomic.

In the service of making variant types work, there is some digression into
default type parameters for functions. However, that needs its own RFC to be
spec'ed properly.

@nrc added the T-lang label Jan 7, 2016
@nrc self-assigned this Jan 7, 2016
@nrc mentioned this pull request Jan 7, 2016

Importing an enum imports it into both the value and type namespace. Importing
a variant imports it only into the value namespace. To maintain backwards
compatibility, this will remain the default. In order to import an enum variant
Contributor

Is this actually required for backwards compatibility? I don't think there's currently a way you can have a variant in scope and have a type with the same name.

Member Author

Huh, looks like you're right, that is good news!

Contributor

I can't check the code right now, but it looks like variants are already imported into both namespaces (and I do remember that they are defined in both namespaces):

use E::V;

enum E {
    V { field: u8 }
}

type V = u8; // Conflicts as expected

fn main() {
    let e = V{field: 0}; // Compiles as expected 
    match e {
        V{..} => {} // Compiles as expected 
    }
}

Contributor

> I can't check the code right now, but it looks like variants are already imported into both namespaces (and I do remember that they are defined in both namespaces):

I was wondering about that, I know we tried to prepare for the possibility that variants would become types...

@mahkoh
Contributor

mahkoh commented Jan 7, 2016

#[repr(union)]
enum X {
    A {
        alpha: u8,
    },
    B(u8),
}

struct Y {
    x: X,
}

Given y: Y, how do I access alpha?

@nrc
Member Author

nrc commented Jan 7, 2016

@mahkoh

fn foo(y: Y) {
    let a = unsafe { y.x as X::A };
    println!("alpha: {}", a.alpha);
}

@mahkoh
Contributor

mahkoh commented Jan 7, 2016

@nrc This means that modifying any field in a union in a struct requires

  1. Copying the union field out of the struct,
  2. Making the modification,
  3. Copying the field back into the struct.

In the presence of nested unions, repeat. This seems to be extremely inefficient and unergonomic since unions are very often used as struct fields.

@mahkoh
Contributor

mahkoh commented Jan 7, 2016

Furthermore, what happens if the union doesn't implement Copy?

@nrc
Member Author

nrc commented Jan 7, 2016

Should work by reference too:

fn foo(mut y: Y) {
    let a = unsafe { &mut y.x as &mut X::A };
    a.alpha = 42;
}

I should add casts of the reference types to the RFC too.

@mahkoh
Contributor

mahkoh commented Jan 7, 2016

Also I believe that one of your arguments against the other RFC was that it requires unsafe code even if you already know the variant. This can be easily solved by taking the idea from this RFC and applying it to the other RFC:

union X {
    f1: T1,
    f2: T2,
}

let x: X = ...
let y: &T1 = unsafe { &x as &T1 };

@retep998
Member

retep998 commented Jan 7, 2016

Let's look at a simple example of a union I already deal with in winapi.

https://github.com/retep998/winapi-rs/blob/master/src/wincon.rs#L17-L25

The original in C looks like:

typedef struct _KEY_EVENT_RECORD {
    BOOL bKeyDown;
    WORD wRepeatCount;
    WORD wVirtualKeyCode;
    WORD wVirtualScanCode;
    union {
        WCHAR UnicodeChar;
        CHAR   AsciiChar;
    } uChar;
    DWORD dwControlKeyState;
} KEY_EVENT_RECORD;

Under this RFC in Rust it would look like:

STRUCT!{struct KEY_EVENT_RECORD {
    bKeyDown: ::BOOL,
    wRepeatCount: ::WORD,
    wVirtualKeyCode: ::WORD,
    wVirtualScanCode: ::WORD,
    uChar: KEY_EVENT_RECORD_uChar,
    dwControlKeyState: ::DWORD,
}}
#[repr(union)] enum KEY_EVENT_RECORD_uChar {
    UnicodeChar(::WCHAR),
    AsciiChar(::CHAR),
}

Since I cannot conveniently import UnicodeChar into the global namespace (union variants are not in the global namespace in C so there may be other identifiers with the same name) that means users would have to fully specify that type, which is kinda ugly:

let x: KEY_EVENT_RECORD = ...;
let y = x.uChar as KEY_EVENT_RECORD_uChar::UnicodeChar;

Meanwhile in C someone could just do x.uChar.UnicodeChar. They can even mutate the variant in place instead of having to do x.uChar = KEY_EVENT_RECORD_uChar::UnicodeChar(foo).

With my current macro solution the user can call inherent methods to get (mutable) references to the variant, which is almost as good as direct field access. Any RFC for unions needs to provide syntax that is better than what I currently have, which means direct field access is really important.
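A minimal sketch of what such inherent accessor methods could look like in plain Rust, assuming the union is stored as a single `u16`. The names `UChar`, `KeyEventRecord`, `set_unicode` and the accessor names are hypothetical and are not taken from winapi-rs; `WCHAR` and `CHAR` are stand-in aliases.

```rust
// Stand-ins for the WinAPI typedefs (assumed for this sketch).
pub type WCHAR = u16;
pub type CHAR = i8;

/// Untagged storage for `uChar`: one `u16` is large enough and aligned
/// enough for either member.
#[repr(C)]
#[derive(Clone, Copy, Default)]
pub struct UChar {
    storage: u16,
}

impl UChar {
    // The accessors are marked `unsafe` because, for a general union, the
    // caller must know which member was last written. (For these two
    // integer types every bit pattern happens to be valid.)

    /// Views the storage as the `UnicodeChar` member.
    pub unsafe fn unicode_char(&self) -> &WCHAR {
        unsafe { &*(self as *const Self as *const WCHAR) }
    }

    /// Mutable view of the `UnicodeChar` member.
    pub unsafe fn unicode_char_mut(&mut self) -> &mut WCHAR {
        unsafe { &mut *(self as *mut Self as *mut WCHAR) }
    }

    /// Views the first byte of the storage as the `AsciiChar` member.
    pub unsafe fn ascii_char(&self) -> &CHAR {
        unsafe { &*(self as *const Self as *const CHAR) }
    }

    /// Mutable view of the `AsciiChar` member.
    pub unsafe fn ascii_char_mut(&mut self) -> &mut CHAR {
        unsafe { &mut *(self as *mut Self as *mut CHAR) }
    }
}

/// Cut-down record holding only the union field.
#[derive(Default)]
pub struct KeyEventRecord {
    pub u_char: UChar,
}

/// Mutates a member in place, without copying the union out of the struct
/// and back in.
pub fn set_unicode(record: &mut KeyEventRecord, c: WCHAR) {
    unsafe {
        *record.u_char.unicode_char_mut() = c;
    }
}
```

The call site stays close to field access (`record.u_char.unicode_char_mut()`), which is the ergonomic baseline this comment asks any union syntax to beat.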

@nrc
Member Author

nrc commented Jan 7, 2016

@retep998 I don't think that direct field access is a very Rust-y solution - it's an unsafe operation which doesn't indicate why/in what way it is unsafe. I think the casting should extend to (mutable) references, and within a scope you can bring in enum variants as types, so you have a clear two-line solution instead of an unclear one-line solution, which seems not too bad:

fn foo(mut x: KEY_EVENT_RECORD) {
    use KEY_EVENT_RECORD_uChar::*;

    let y = unsafe { &mut x.uChar as &mut UnicodeChar };
    y.0 = 42; 
}

You could stick that all on one line if you like, but I admit that gets pretty ugly.

expression - both the variant type and the enum type. If there is no further
information to infer one or the other type, then the type checker uses the enum
type by default. This is analogous to the system we use for integer fallback or
default type parameters.
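For comparison, here is how integer fallback behaves in today's Rust. This is only a sketch of the existing mechanism that the text draws the analogy to, not of the proposed variant-type fallback; the function name `literal_sizes` is made up for illustration.

```rust
use std::mem::size_of_val;

/// Returns the byte sizes of the types inferred for two integer literals.
pub fn literal_sizes() -> (usize, usize) {
    // Nothing else constrains `a`, so the literal falls back to `i32`.
    let a = 1;

    // `b` is later used where a `u8` is required, so inference picks `u8`
    // and the fallback never applies.
    let b = 1;
    let _as_byte: u8 = b;

    (size_of_val(&a), size_of_val(&b))
}
```

Under the RFC text above, a variant expression would play the role of the literal: it takes the enum type unless something else in the program demands the variant type.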
Member

This is great, pretty much what I expected and @nikomatsakis also confirmed it's what he would do (although he and @aturon aren't sure it's enough, i.e. compared to full subtyping).

Member

Note that it gets a bit more complicated if we also support nested enums, though probably the fallback would just be to the root in that case.

Member

Yes, I wanted to add that with nested enums we would have a chain of potential types to infer to, but I realized that nested enums aren't in the scope of this RFC.

@mahkoh
Contributor

mahkoh commented Jan 7, 2016

> I don't think that direct field access is a very Rust-y solution - it's an unsafe operation which doesn't indicate why/in what way it is unsafe

The same applies to calling an unsafe method.

> You could stick that all on one line if you like

You have to do either that or introduce another scope due to lexical lifetimes. More realistically, the code would look like this:

fn f(mut x: Struct1) {
    {
        use KEY_EVENT_RECORD_uChar::*;

        let y = unsafe { &mut x.f as &mut Variant1 };
        y.0 = 42; 
    }
    g(&x);
}

And when you have nested enums then it gets really funny.

inheritance: if we allow nested enums, then there are many more possible types
for a variant, and generally more complexity. If we allow data bounds (c.f.,
trait bounds, e.g., a struct is a bound on any structs which inherit from it),
then perhaps enum types should be considered bounds on their variant types.
Member

Perhaps we can retain only trait bounds and use e.g. V: VariantOf<E> - which is what I've used in my version of the examples in @nikomatsakis' blog post introducing many of these ideas.

@mahkoh
Contributor

mahkoh commented Jan 7, 2016

Tbqh, if this is the accepted union implementation, then I'll just continue to do what I do right now: add two methods per variant (variant, variant_mut) and transmute the passed reference. Code like that already works and it will be shorter than this. The only thing gained by this implementation is that one can have a type with the correct size and alignment. But once CTFE becomes better, that will be possible without this implementation too.

@strega-nil

I love this proposal, thank you for coming up with it. It still needs a little work, but it is the best thing I've seen so far and works well in the fledgling memory model that will be released soon.

@LaylBongers

Here is an alternative proposal for syntax on unions, far from final, but as a rough idea:

struct KEY_EVENT_RECORD {
    bKeyDown: ::BOOL,
    wRepeatCount: ::WORD,
    wVirtualKeyCode: ::WORD,
    wVirtualScanCode: ::WORD,
    uChar: KEY_EVENT_RECORD_uChar,
    dwControlKeyState: ::DWORD,
}

c_union KEY_EVENT_RECORD_uChar {
    unicode_char: ::WCHAR,
    ascii_char: ::CHAR,
}

// Accessing:
let mut record = foo();
{
    // Immutable Ref
    let unicode_c = unsafe { record.uChar.as_unicode_char() };
}
{
    // Mutable Ref
    let ascii_c = unsafe { record.uChar.as_ascii_char_mut() };
}

Setting is still undetermined. My main issue with this syntax would be that adding methods feels too magic-y for rust, but I feel they're important to communicate what's unsafe about what's being done. I want to try to fit in a #[derive(_)] in there to mitigate that, but c_union wouldn't have any use without the derive so it feels wrong to add it.

@strega-nil

I could see this going a few ways:

// This is a simplified struct from WinAPI
#[repr(union)]
enum tagVARIANTvariant {
    llVal(LONGLONG),
    lVal(LONG),
    bVal(BYTE),
    iVal(SHORT),
}

#[repr(union)]
enum tagVARIANTtag {
    __tagVARIANT {
        vt: VARTYPE,
        variant: tagVARIANTvariant,
    },
    // other fields here I've taken out
}

struct tagVARIANT {
    tag: tagVARIANTtag,
}

fn test() {
    let x: tagVARIANT;
    // This is the safest bet. However, it's also a pain to use, and it's unlikely that FFI people
    // will go along with it
    (x.tag as tagVARIANTtag::__tagVARIANT).vt = LLVAL;
    (x.tag as tagVARIANTtag::__tagVARIANT).variant = tagVARIANTvariant::llVal(100);

    // We could also automatically import the types. This is backwards compatible in the `as` field,
    // although I'm less certain about the = field. This would also make normal enums nicer to use,
    // and I've heard talk of doing this with match statements.
    (x.tag as __tagVARIANT).vt = LLVAL;
    (x.tag as __tagVARIANT).variant = llVal(100);

    // This is the last suggestion, and my least favorite. However, if we can't find a solution, we
    // may have to go with it :/
    x.tag.__tagVARIANT.vt = LLVAL;
    x.tag.__tagVARIANT.variant = tagVARIANTvariant::llVal(100);
}

To clarify what I'm talking about with match statements, some have talked about:

enum E {
    Var0,
    Var1,
}

match E { // this would be okay, despite never `use`ing Var0 or Var1
    Var0 => {},
    Var1 => {},
}

@mahkoh
Contributor

mahkoh commented Jan 8, 2016

There is also a need for non-anonymous variants. E.g.

#[repr(C)]
struct TagVariant {
    vt: VARTYPE,
    variant: tagVARIANTvariant,
}

#[repr(union)]
enum tagVARIANTtag {
    __tagVARIANT: TagVariant,
    // other fields here I've taken out
}

@strega-nil

@mahkoh That's covered by tagVARIANTvariant.

@mahkoh
Contributor

mahkoh commented Jan 8, 2016

@ubsan So every access additionally requires a .0?

@strega-nil

@mahkoh If you're doing it that way, yes, I assume.

@mahkoh
Contributor

mahkoh commented Jan 8, 2016

This RFC deals with two orthogonal issues:

  • Allowing enum variants to be used as types
  • Allowing the discriminant to be removed from enums

Therefore it should be split into two RFCs.

@eddyb
Member

eddyb commented Jan 8, 2016

I agree with @mahkoh, as I believe that enum variants as types is less controversial of a feature, and it's an important step towards empowering ADT hierarchies.


## impls

`impl`s may exist for both enum and variant types. There is no explicit sharing
Member

Would this let us fix rust-lang/rust#5244? Example:

enum Option<T> {
    Some(T),
    None,
}

// we add this
impl<T> Copy for Option<T>::None {}

// then, either this just works
let x: [Option<String>; 10] = [None; 10];

// or this works (can this be written without the temporary `t`?)
let t: [Option<String>::None; 10] = [None; 10];
let x: [Option<String>; 10] = t;

Member

It would likely require that the variant types have the same size as the enum itself, so they'd have to have unused padding for things like the discriminant.

Contributor

That is the case according to the RFC.

Member

The correct solution here is not using Copy directly, i.e. by either allowing constants, or ADTs of Copy-able types.

While @japaric's proposal may be equivalent to the latter option, let x: [Option<String>; 10] = [None; 10]; would not work with this RFC as written because None would infer to Option<String> which is not Copy - and there is no conversion from [None<T>; N] to [Option<T>; N] (but there could be? not sure what we can do here).
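As a point of reference for the "allowing constants" option, here is a sketch of how an array of `None` values for a non-`Copy` payload can be built in current stable Rust. It uses only a `const` item as the repeat operand and `std::array::from_fn`; neither is part of this RFC, and the names `NO_STRING`, `ten_nones_via_const` and `ten_nones_via_from_fn` are made up for illustration.

```rust
use std::array;

/// A `const` item may be used as a repeat operand even though
/// `Option<String>` is not `Copy`: the constant is evaluated afresh
/// for each element.
const NO_STRING: Option<String> = None;

/// Builds the array from a constant repeat operand.
pub fn ten_nones_via_const() -> [Option<String>; 10] {
    [NO_STRING; 10]
}

/// Builds the same array by calling a closure once per index.
pub fn ten_nones_via_from_fn() -> [Option<String>; 10] {
    array::from_fn(|_| None)
}
```

Neither form needs `None` to have its own `Copy` type, which is the distinction drawn in the comment above.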

Contributor

On Fri, Jan 08, 2016 at 07:14:59AM -0800, Jorge Aparicio wrote:

> Would this let us fix rust-lang/rust#5244?

Yes, perhaps, that's one of the appealing things about having variants
be types.

Contributor

On Fri, Jan 08, 2016 at 10:06:19AM -0800, Eduard-Mihai Burtescu wrote:

> While @japaric's proposal may be equivalent to the latter option, let x: [Option<String>; 10] = [None; 10]; would not work with this RFC as written because None would infer to Option<String> which is not Copy - and there is no conversion from [None<T>; N] to [Option<T>; N] (but there could be? not sure what we can do here).

In the RFC as written, I think something like [None: None<String>; 32] would be required?

@nikomatsakis
Contributor

There is a lot here. I'll just try to write some comments as I digest (and catch up on the already lengthy comment thread).

First, I'm not a big fan of foo as Variant downcasting. It's good for unions but kind of "bad style" elsewhere and I'm not sure we need to add it; I'd prefer to encourage people to write if let or match, just as they do today.

Second, I would love to have variants be types, but the integration described in the RFC sounded a bit rougher than I would like. I guess I have to process it more, as the RFC has a number of particular mechanisms, and it's hard for me to judge if they are good choices or not. This may also be something we can expect to hammer on and experiment with in the implementation phase, but it'd be good to think hard about the rough edges that we expect to emerge.

Finally, on the topic of unions in particular. It's certainly the case that a #[repr(union)] annotation feels more natural on an enum than on a struct, but I also find something in #1444 very appealing. Certainly it's a much more targeted addition, which I like. I also like that you use field projection foo.bar to select which variant you want -- it feels like an "active selection", whereas matching does not. (as, admittedly, feels good too -- but then as seems to be sugar for match, and match feels wrong, since I expect that in a match the compiler will pick which arm, not me.)
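To make the field-projection style concrete, here is a minimal sketch using the `union` items available in stable Rust today. The record is cut down to two fields, and `WCHAR`, `CHAR`, `UChar`, `KeyEventRecord` and `set_and_read_unicode` are stand-in names chosen for this example rather than anything defined in the thread.

```rust
// Stand-ins for the WinAPI typedefs (assumed for this sketch).
pub type WCHAR = u16;
pub type CHAR = i8;

/// Untagged union: both fields share the same storage.
#[repr(C)]
#[derive(Clone, Copy)]
pub union UChar {
    pub unicode_char: WCHAR,
    pub ascii_char: CHAR,
}

/// Cut-down record containing the union as an ordinary struct field.
#[repr(C)]
pub struct KeyEventRecord {
    pub virtual_key_code: u16,
    pub u_char: UChar,
}

/// Writes one member in place through field projection, then reads it back.
pub fn set_and_read_unicode(record: &mut KeyEventRecord, c: WCHAR) -> WCHAR {
    // Writing a `Copy` field of a union is safe and happens in place.
    record.u_char.unicode_char = c;
    // Reading is unsafe: the compiler cannot know which field is active.
    unsafe { record.u_char.unicode_char }
}
```

The choice of member is made by the programmer at the access site (`record.u_char.unicode_char`), which is the "active selection" described above.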

@petrochenkov
Contributor

The "postponed" issue for this RFC was never created, so I've made one - #2347.

@petrochenkov mentioned this pull request Feb 23, 2018
@Centril added the postponed label Feb 26, 2018
@alexreg

alexreg commented May 15, 2018

Any chance of this being revisited soon?

@nielsle

nielsle commented May 15, 2018

For the sake of posterity: the following pattern, combined with crates such as https://github.com/JelteF/derive_more and https://github.com/DanielKeep/rust-custom-derive, allows you to do some of the stuff mentioned in the RFC.

struct Variant1( i32, String );
struct Variant2 { f1: i32, f2: String }

pub enum Foo {
    Variant1(Variant1),
    Variant2(Variant2),
}

So if the RFC is revived, then the pattern should probably be discussed under alternatives.
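A sketch of how the pattern plays out with hand-written conversions, assuming no derive crates at all. The `From` and `TryFrom` impls below stand in for what such crates might generate; their exact output is not shown in this thread, and the helper `describe` is hypothetical.

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
pub struct Variant1(pub i32, pub String);

#[derive(Debug, PartialEq)]
pub struct Variant2 {
    pub f1: i32,
    pub f2: String,
}

#[derive(Debug, PartialEq)]
pub enum Foo {
    Variant1(Variant1),
    Variant2(Variant2),
}

// "Upcasts": each wrapper struct converts into the enum.
impl From<Variant1> for Foo {
    fn from(v: Variant1) -> Self {
        Foo::Variant1(v)
    }
}

impl From<Variant2> for Foo {
    fn from(v: Variant2) -> Self {
        Foo::Variant2(v)
    }
}

// "Downcast": fallible, and hands the enum back on a mismatch.
impl TryFrom<Foo> for Variant2 {
    type Error = Foo;

    fn try_from(foo: Foo) -> Result<Self, Foo> {
        match foo {
            Foo::Variant2(v) => Ok(v),
            other => Err(other),
        }
    }
}

/// Accepts only the second variant; the restriction is checked statically.
pub fn describe(v: &Variant2) -> String {
    format!("{}: {}", v.f1, v.f2)
}
```

The cost relative to first-class variant types is the extra layer: every construction and pattern has to name both the enum variant and the wrapper struct.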

delapuente added a commit to delapuente/qasmsim that referenced this pull request Aug 8, 2018
The instructions of QASM are classified in a hierarchy, I've tried to
implement them in the form of several enums but extending enums is not
supported in Rust (yet).

https://stackoverflow.com/questions/25214064/can-i-extend-an-enum-with-additional-values

The hierarchy is mostly related to QuantumOperation and UnitaryOperation.
A UnitaryOperation is always a QuantumOperation but not the other way around.

I went for the wrapping alternative which will make getting the value a bit
tedious. I hope to be able to implement some traits to ease this task but
without rust-lang/rfcs#1450 it could be difficult.

In reality, this hierarchy is not strictly necessary; I could have
grouped all the instructions in the same enum and let the grammar accept
those programs that use the different operations correctly, but I wanted
to experiment with a proper separation of instruction types. Let's see.

Perhaps, a different approach would be following this Thin Traits proposal:
http://smallcultfollowing.com/babysteps/blog/2015/10/08/virtual-structs-part-4-extended-enums-and-thin-traits/
@alexreg

alexreg commented Aug 22, 2018

@nrc Any chance of getting this RFC reopened now, and potentially merged soon? I have a couple of things I'm working on now, but I'd be willing to take this on afterwards perhaps.

@eddyb
Member

eddyb commented Aug 22, 2018

@alexreg You'd probably need to ask @nikomatsakis about the status of what'd be needed to implement this.

@alexreg

alexreg commented Aug 22, 2018

Okay, hopefully @nikomatsakis will see this comment and reply here.

@Ericson2314
Contributor

"Safe unions", non-lexical &move liveness, and non-lexical enum variant type are all related. I'd love to have all 3!

@leonardo-m

leonardo-m commented Aug 22, 2018

I have a question regarding enum variants as types, is this going to be correct?

fn main() {
    use std::mem::size_of;
    assert_eq!(size_of::<Option<u32>>(), 8);
    assert_eq!(size_of::<Some<u32>>(), 4);
}

@eddyb
Member

eddyb commented Aug 22, 2018

@leonardo-m I don't think so, since we'd want to allow coercing &Some<u32> to &Option<u32>, which means it must already be the full size of the enum and have the tag set to 1.
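The layout point can be checked against today's Rust with a stand-in type. In this sketch `SomePayload` is a hypothetical struct holding only the payload, which is what a 4-byte `Some<u32>` would amount to; the variant type itself cannot be written in current Rust.

```rust
use std::mem::size_of;

/// Holds only the payload that a hypothetical `Some<u32>` would carry.
pub struct SomePayload(pub u32);

/// Returns (size of the whole enum, size of the bare payload).
pub fn sizes() -> (usize, usize) {
    (size_of::<Option<u32>>(), size_of::<SomePayload>())
}
```

Because `u32` has no spare bit pattern to encode `None`, the enum needs room for a tag and is strictly larger than the payload (typically 8 bytes versus 4). A reference to a bare 4-byte payload therefore could not be reinterpreted as `&Option<u32>`, which is why the variant type would need the enum's full size and a correctly set tag.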

@alexreg

alexreg commented Aug 22, 2018

@Ericson2314 Is there a proposal for "safe unions"? I don't know what that means. Unless perhaps it's similar to one of the ideas suggested here. No idea what the other two mean either. :-P

@jeffreydecker

I would love to see this get in at some point in the future. Allowing for variants to impl traits on their own would be a nice alternative to requiring a match to essentially do the same.

I'm beginning to use enum variants to help represent states in a state machine and this would go a long way in making the experience a bit better, mainly with regard to code organization.

The pattern referenced above by @nielsle is valid and I have tried it out. IMO it's not nearly as readable/understandable or even maintainable as supporting variants as first class types would be. I've actually stuck with using pure enum variants as opposed to that pattern because of this.

@alexreg

alexreg commented Sep 24, 2018

@jeffreydecker I think a bunch of us would like the feature! Fancy writing an RFC? ;-) You could easily get some help from folks here or on Discord, I reckon.

@nrc
Member Author

nrc commented Sep 24, 2018

The blocking thing for this feature is properly understanding default type parameters and the interaction with the various kinds of fallbacks to defaults (e.g., numeric fallbacks). I think once we understand that a bit better, then we could re-open this RFC (or something like it).

@alexreg

alexreg commented Sep 24, 2018

Ah, fair enough. We already have default type parameters though, so I don't see the problem.

@burdges

burdges commented Sep 25, 2018

There is another concern in that refinement types would ideally extend enum variant types and be useful for formal verification, so an RFC would ideally consider those possible future directions.

@alexreg

alexreg commented Sep 25, 2018

I think that would be coming a very long way off, if ever... probably not worth much thought at this point.

@varkor
Member

varkor commented Nov 10, 2018

I've submitted an RFC to follow up on this one and permit enum variants to be treated as types:
Enum variant types.

Labels: final-comment-period, postponed, T-lang