`repr(tag = ...)` for type aliases #3659
This seems like a straightforward change that we should support to improve type definitions for interfaces that depend on type aliases, particularly those that vary by target. @rfcbot merge That said, I think we should go with the mentioned alternative of allowing type aliases to shadow reprs (with a lint) in a future edition, to avoid forcing people to write @rfcbot concern allow-type-aliases-to-shadow-reprs (We may even consider allowing that in current editions on the basis of a crater run turning up no conflicts.) |
Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members: Concerns:
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns. |
@rfcbot concern ambiguity This makes things like type transparent = u64;
#[repr(transparent)] enum Foo { … } legal. That seems particularly bad for any other proc macros that want to look at the attribute as part of things like their own soundness checks, since they'd no longer know what's going on from what's in the attribute. Also, I'm always a bit sad when we have "real" stuff in an attribute, in the sense that it's something that we have a proper grammar construction for rather than just needing to deal in tokens. It makes me think of changing it to instead be enum Foo : c_int { … } or something instead. |
@rfcbot resolve ambiguity I'm an idiot and somehow missed the "you need That does solve the hard blocker, but it still makes me wish for a better thing instead so that |
Just an idea for another syntax that is not ambiguous: And can I use complex types such as
Hmm, making the existing Although, I'm not sure if "type = " is the right name, since we're really talking about the enum discriminant here, but "discriminant" is unfortunately a very, very long name. Perhaps In terms of allowing shadowing, the main reason why I'm totally against it is because there's no way to override the shadowing. For example, you can shadow |
"type" or "base" or several other possible names could work there. |
I also really like |
Decided to just update the RFC to use the |
This is not straightforward: we currently never have "real code" (types, expressions, patterns) in inert attributes. For the purpose of this feature we'll need to treat the (In other words, this is not what inert attributes are generally supposed to be used for.)
Discriminant might be long, but on the other hand it describes exactly what this is, and makes it very clear and obvious. It is also consistent with So if it was me, I'd go with It doesn't look bad at all: #[derive(Clone, Copy, Eq, PartialEq)]
#[repr(discriminant = u32)]
#[non_exhaustive]
enum Foo {
//...
} |
@petrochenkov How difficult is it to make |
@kennytm |
I'm a bit confused about the concept of inert attributes versus attribute macros, and also why particularly this would be a problem to implement. I assume you're totally correct about the specific issues with implementing this, but could use a bit more info to fully understand what we're going for. From my (relatively naïve) perspective, when the definition is expanded in the HIR, we just resolve the type alias when we put together the enum definition, since we should have the information at that point. Since we're literally just listing allowed types for the alias, it shouldn't need much extra logic. But this appears to be wrong, and I'm not sure why. |
text/0000-repr-type-aliases.md
Outdated
In addition to the primitive types themselves, you can also use the path to a type alias in the `repr` attribute instead, and it will resolve the primitive type of the type alias. However, to ensure compatibility as new potential representations are added, the path to the alias must contain a double-colon: you can access an alias `Alias` defined in the same module by using `self::Alias`.

For example, `#[repr(core::ffi::c_int)]` is valid because it contains a double-colon, but a `use core::ffi::c_int` followed by `#[repr(c_int)]` is not. If you wanted to `use core::ffi::c_int` first, then you could still do `#[repr(self::c_int)]` to reference the type.
To ensure compatibility, the `#[repr(type = ...)]` form is required if the type is not one of the known primitive types. Note that this form is not necessarily equivalent to using the primitive representations directly, since shadowing is possible; for example, if you did `type u32 = u8` and then `#[repr(type = u32)]`, this would be equivalent to `#[repr(u8)]`, not `#[repr(u32)]`.
I think `type u32 = u8` seems needlessly obfuscating; I think it'd be a more readable example to write `type C = u8` and then `#[repr(type = C)]`, which is equivalent to `#[repr(u8)]` rather than `#[repr(C)]`.
I guess that I was intentionally pointing out the obfuscation here because it feels more likely: `#[repr(type = C)]` is obviously going to mean whatever `C` type you have, but `#[repr(type = u32)]` meaning `#[repr(u8)]` is more likely to occur in something like proc macros if someone is doing something nefarious. So, genuinely, there is a reason to prefer `#[repr(u32)]` over `#[repr(type = u32)]` when you don't necessarily trust the parent scope.
I've since updated this section to elaborate, including what a type alias `C` might look like. Does this feel like it addresses your concerns?
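The shadowing concern in this review thread can be illustrated with today's stable Rust. This is a minimal sketch, not part of the RFC: the module `shadowed`, the enum `Flag`, and the helper functions are hypothetical names chosen for illustration. It relies on the fact that the existing `#[repr(u32)]` form is matched by name rather than resolved as a path, so a local alias called `u32` does not change it.

```rust
mod shadowed {
    use std::mem::size_of;

    // Hypothetical alias that shadows the primitive name inside this module.
    #[allow(non_camel_case_types)]
    pub type u32 = u8;

    // The built-in `repr` argument is not looked up as a path, so this still
    // selects the primitive 32-bit representation, not the alias above.
    #[repr(u32)]
    #[allow(dead_code)]
    pub enum Flag {
        Off = 0,
        On = 1,
    }

    // In type position, `u32` resolves to the alias, i.e. to `u8`.
    pub fn alias_size() -> usize {
        size_of::<u32>()
    }

    // The enum keeps the size of the primitive `u32`.
    pub fn enum_size() -> usize {
        size_of::<Flag>()
    }
}
```

Under the `type = ...` form discussed here, the alias would presumably be the thing that is resolved, which is why the plain primitive form remains preferable when the surrounding scope is not trusted.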
@rfcbot resolve allow-type-aliases-to-shadow-reprs Thank you to @ogoffart for the better |
@rfcbot reviewed This seems like a great idea to me. As a future extension, I would like it if we could support types like the following: #[repr(transparent)]
pub struct SecretInteger(u16); Since that would allow "delegating representations" without allowing code to rely on the specific value of a type alias (this is especially useful when it changes on rarely-tested platforms). If you agree that this is sensible, could you please add it as a future possibility? |
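As a rough sketch of what the `SecretInteger` suggestion builds on: `#[repr(transparent)]` already guarantees that the wrapper has the same layout as its single field. The helper names `new`, `get`, and `layout_matches_u16` below are assumptions added for illustration; using such a wrapper as an enum representation is only a proposed future possibility and is not shown as code.

```rust
use std::mem::{align_of, size_of};

// Newtype whose layout is guaranteed to match its single `u16` field.
#[repr(transparent)]
pub struct SecretInteger(u16);

impl SecretInteger {
    // Hypothetical constructor and accessor, added only for this example.
    pub fn new(value: u16) -> Self {
        SecretInteger(value)
    }

    pub fn get(&self) -> u16 {
        self.0
    }
}

// Checks that the wrapper and the wrapped primitive agree on size and alignment.
pub fn layout_matches_u16() -> bool {
    size_of::<SecretInteger>() == size_of::<u16>()
        && align_of::<SecretInteger>() == align_of::<u16>()
}
```

Downstream code sees only `SecretInteger`, so it cannot name the underlying primitive directly, which is the "delegating representation" property the comment asks for.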
…dowing, and alter the recommended list of lints
So, taking in additional suggestions:
Still need to go through more existing feedback though, so, there will probably be more changes. I don't expect to change the syntax again, though. |
Title changed from "`repr(type)` for type aliases" to "`repr(discriminant = ...)` for type aliases"
If
All these cases require having the So what are the possible ways to put
enum E: TYPE { ... } // E.g. something like this (the specific syntax is taken from C++).
// Before expansion
#[repr_discriminant(TYPE)]
enum E { ... }
// After expansion, you may see this with `-Zunpretty=expanded`.
enum E builtin#repr TYPE { ... }
// Before expansion
#[repr(discriminant = TYPE, packed)]
enum E { ... }
// After expansion, you may see this with `-Zunpretty=expanded`.
#[repr(packed)]
enum E builtin#repr TYPE { ... } |
Now that this is in FCP, I decided to go through the last of the feedback and add more info to the RFC so folks that get notified via TWIR or otherwise can get a reasonable picture of what this feature is proposing. Let me know if you have any additional feedback on the contents of the RFC itself. (Note: these aren't any more design changes, just clarifications on alternatives, prior art, and drawbacks. In particular, I mentioned the desire for syntax and how that can still be added after this RFC is merged.) Also, I still would like to express my preference for the attribute over the syntax regardless. To me, |
text/0000-repr-type-aliases.md
Outdated
## `type`

The second proposal for this RFC used `type = ...` instead of `discriminant = ...`. Initially, this decision was chosen because `discriminant` was long to type, but it was ultimately decided that `discriminant` is more clear and that the extra typing is worth it. Additionally, RFC [#3607] proposes using `discriminant` in its proposed syntax, so, this would be further in line with that as well.
Since we intend for `repr(discriminant = u8)` to be equivalent to `repr(u8)`, I'd like to propose that we perhaps shouldn't use `repr(discriminant = ...)`, since it might complicate a future possibility for enums.
That future possibility, alluded to in this RFC, is that we someday might want to expand the accepted set of discriminant types of enums (e.g., to include things implementing `StructuralEq`). And, if we had such functionality, it might also be nice to declare up front what the discriminant type is. A natural syntax for this could be `#[repr(discriminant = ...)]`.
However, `repr(discriminant = ...)`, as proposed by this RFC, does not only set the discriminant type of an enum, but also affects its memory layout. This is not desirable for all enums. If this RFC is adopted as-is, we'd need to find an alternative way to say "the discriminant type of this enum is X, but I don't care what the enum's in-memory representation is".
Given this, I'd like to suggest that this RFC consider `repr(tag = ...)`. Doing so has three advantages:
- it makes it clearer that the annotation has an effect on the enum's in-memory representation, not just its logical representation
- it keeps `repr(discriminant = ...)` syntactically reserved for the aforementioned future possibility in which the user explicitly provides the type of the discriminant
- it's significantly shorter than "discriminant"
If not `repr(tag = ...)`, then perhaps something else.
I'm confused how labeling this a tag makes it less confusing. It's a shorter word, but that's about it.
We've already established that we're changing the memory layout by using `repr`.
Considering that these are often called "tagged unions" in contexts where memory layout matters, I do think the name is clearer. It's also going to be consistent with other in-flight changes to Rust's documentation: rust-lang/reference#1454 (comment)
What are the advantages of `discriminant`?
They are often called this, but we also explicitly use the term discriminant in places where it has an API meaning. See: https://doc.rust-lang.org/nightly/std/mem/fn.discriminant.html
In the lang meeting today, we decided on `tag`. We felt that the existing `mem::discriminant` API was unfortunate, and unloved in enough other ways, that we didn't need to weigh that heavily as precedent. See also here.
Very fair. I think that it would be worth proposing to deprecate & rename that API if this is the route we're going, to ensure uniformity.
(Also, while the decision on the rename is resolved, I'm going to unresolve this until I update the RFC to rename the attribute, just so it's easier to track. Plus, it would be helpful to clarify all the finer details on how they should be written into the RFC before this gets marked as resolved.)
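For readers unfamiliar with the `mem::discriminant` API cited as precedent in this thread, here is a minimal sketch of how it is used today. The `Shape` enum and the `same_variant` helper are hypothetical names for illustration only.

```rust
use std::mem;

// Example enum with payloads; no explicit `repr` is needed for this API.
pub enum Shape {
    Circle(f32),
    Square(f32),
}

// `mem::discriminant` returns an opaque `Discriminant<Shape>` that can be
// compared for equality, regardless of the payload stored in each variant.
pub fn same_variant(a: &Shape, b: &Shape) -> bool {
    mem::discriminant(a) == mem::discriminant(b)
}
```

The returned value identifies the variant logically and says nothing about how, or whether, that information is stored in memory, which is the distinction the naming discussion turns on.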
I didn't want to have to elaborate more on the context of syntax, but since the FCP is not happening any more, I might as well. First, I'll start with the obvious. A colon is a terrible syntax. There's a reason why, when given the chance to design a new language from scratch instead of bolting on additional syntax like C++ does, Java and many other languages chose to use a keyword to describe the relationship between the name of the type and whatever occurs after the colon. Sure, Java also has distinctions like With Second, the ship has already sailed as far as Like, I personally have no issues with adding a dedicated syntax for this in the future. I just don't think that I think that an attribute fits this feature because, as mentioned before, any syntax will likely be less descriptive and more confusing. Elevating |
I went to mark this waiting-on-team, but it looks like that label doesn't exist here. Please treat this RFC as waiting on us to give better feedback. |
I should add: I was having a sour day before I saw this, so, that is definitely reflected in my response. I think that it's absolutely reasonable to want to pause the FCP because you have additional concerns without fully fleshing out those concerns, since you are on a time limit. I just wish that were the reason stated for pausing the FCP, instead of explicitly providing not-fleshed-out concerns as an excuse instead. Because as far as I'm concerned, the reasons mentioned were addressed before the FCP started, and no additional reasons were cited. |
(We were actually talking in the lang design meeting about marking this waiting-on-team before I even saw your response. Under the usual "no new rationale", we always need to put detailed rationale and concerns in the thread.) |
I would like to second @clarfonthey’s point about clarity. What I really appreciate about Rust is being able to write more “self-documenting” code, where types and variable names (and ? short-circuiting) make many more comments superfluous. A big part of that is writing code that looks like it explains itself. Yes, attributes are always a bit more magic than I would like, but otherwise In contrast, the If Rust gains dedicated syntax for this feature at some point (and perhaps a stable trait to reason about the discriminant type in the type system), that would be nice. But it should be as easy to read as the proposed attribute syntax. As this syntax is also not a new attribute but merely extends an existing one, I feel like not having it would at some point feel increasingly like a lack in feature with the repr attribute. |
@rustbot labels +S-waiting-on-team We decided in our design meeting today that we're going with So if this RFC were to be accepted, we'd want it updated to |
Title changed from "`repr(discriminant = ...)` for type aliases" to "`repr(tag = ...)` for type aliases"
"tag" might not be a good choice of terminology. As I already wrote over there: The use of "tag" as apparently a synonym for "discriminant" is unfortunate insofar as "tag" exists as a term in the compiler and it is not equivalent to "discriminant" there. It refers to how the discriminant is encoded in memory. For instance, for an Granted, with this largely being internal compiler terminology, it can be changed. But it will certainly be confusing to people that have worked with enums in the compiler in the past. We have also occasionally used this terminology in opsem and other language discussions, to my knowledge. And we would need a new word for what is currently called "tag". |
"Tag" could make sense here since |
@RalfJung Here, tag is appropriate. This RFC can be thought of as an extension of RFC2195: Really Tagged Unions, allowing the primitive repr type to be a type alias. Variants of enums with explicit primitive reprs are defined to always have tags. By contrast, the feature proposed by #3607 only concerns discriminants. I agree entirely with your comment there that the use of "tag" in that case is inappropriate. It's extremely useful to draw a distinction between discriminants (part of a variant's logical representation, and something all variants have) and tags (part of the physical representation, and only had by the variants of some enums). Using these terms synonymously will make talking about enum representation much more challenging. @traviscross I hope there is still time and procedural flexibility for the lang team to reconsider using the same terminology for both of these RFCs. |
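To make the "tag as physical representation" reading concrete, here is a minimal sketch using today's stable `#[repr(u8)]`, the primitive form this RFC extends. It relies on the layout guarantee from RFC 2195 that such an enum starts with its tag. The `tag` method and `both_size` function are illustrative names, not an existing API.

```rust
use std::mem::size_of;

#[repr(u8)]
#[derive(Clone, Copy)]
pub enum Even {
    Zero = 0,
    Two = 2,
}

#[repr(u8)]
#[derive(Clone, Copy)]
pub enum Odd {
    One = 1,
    Three = 3,
}

// With a primitive repr, every variant is laid out as a `repr(C)` struct
// whose first field is the `u8` tag, followed by the variant's fields.
#[repr(u8)]
pub enum Both {
    Odd(Odd),
    Even(Even),
}

impl Both {
    // Reads the in-memory tag directly from the start of the value.
    pub fn tag(&self) -> u8 {
        // SAFETY: `Both` is `#[repr(u8)]`, so its first byte is the `u8` tag.
        unsafe { *(self as *const Self).cast::<u8>() }
    }
}

// One byte of tag plus one byte of payload.
pub fn both_size() -> usize {
    size_of::<Both>()
}
```

Without the `#[repr(u8)]` on `Both`, the compiler would be free to encode the variant some other way, for example in unused values of the payload, and no such pointer read would be valid.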
We'll take the required time and consider all the feedback. |
@jswrenn that's fair, if this here is indeed about controlling the representation, and maybe even saying that With |
I think unless we change https://doc.rust-lang.org/reference/type-layout.html#primitive-representation-of-enums-with-fields-less-enums, a |
Hmm, that distinction is not one we were thinking of. We were thinking of "tag" as a synonym for "discriminant" -- one that may or may not be represented with actual bytes in memory.
|
Yeah, I'm also of the opinion that discriminant is a much better term in general. The way we tell variants apart is using a discriminant. That discriminant may be a dedicated tag field, but it could be niche values. However, I think that calling this tag is acceptable because, well, it's an explicit tag representation. Take this example: enum Even {
Zero = 0,
Two = 2,
}
enum Odd {
One = 1,
Three = 3,
}
enum Both {
Odd(Odd),
Even(Even),
} There is a sense where all of these could have a The "discriminant" here would fit into a So, calling this representation flag |
if this gets accepted, this discriminant/tag definition should be added to the glossary of the rust reference |
I agree that the decisions made, if final, should be solidified in the reference and similar documentation. Personally, I'm fine with deciding that this particular representation can be called a "tag" in all cases and in the reference, and thus using that name in the attribute makes sense. However, whether enum discriminants in general should be called tags is something I particularly disagree with, and which I'm not sure is actually finalised given the current discussion. That said, it is the teams responsible that have the final say in these discussions, and while I do hope they listen to the feedback presented to them, I ultimately can't force them to decide one way or another. As far as I'm concerned, the discussion left to be had is largely external to this particular RFC, and I've also already made the name changes in the RFC text itself so that we can restart the FCP. I'm not sure what the best place to continue the discussion is, but considering how I don't have anything particular to add to it, I'll wait for others to decide and share links and such. |
I think the concern for naming-and-syntax is not just "tag" vs "discriminant" but because of implementation constraint of rustc 🤷 the compiler currently simply cannot support |
I'm fine with commenting that down, although based upon @petrochenkov's comments (#3659 (comment) and #3659 (comment)) I was under the impression that this was less a design concern and more of an implementation detail. Like, I agree that implementing this now would require a few invasive changes to the way To me, approving this RFC means that:
If, later down the road, someone proposes a better syntax for this feature and decides to make a new RFC amending this one to ditch the attribute-based syntax, I'm fine with that. I just think that inheriting C++'s messy To me, attributes are just a predefined syntax we can use instead of sprinkling other sigils in the language, and are a way to easily stabilise features without giving them a permanently stable notation. The fact that the compiler is unable to keep up with this perception is something that would be better to change, IMHO, or maybe it's just I really shouldn't go off my own personal interpretation of the opinion of one person when that interpretation could be wrong, or that person's opinion doesn't reflect the entire team, but at least without any other input on this, that's what I'm doing for now. As stated, if this is actually a serious problem to consider, or if the team(s) would prefer to postpone this RFC instead of merging it, I can update the RFC text to reflect that. |
@clarfonthey: We've just been a bit busy recently, with the edition coming up and whatnot. We'll circle back to these design questions when we can. |
My entire point here was that I'm against rushing out a syntax before it's fully designed, so, it would be hypocritical to try and rush this out while folks have other, more important work to do. Thank you all for the 2024 efforts. |
Primitive representations on enums now accept type aliases, meaning that in addition to primitives like `#[repr(u32)]`, `#[repr(tag = core::ffi::c_int)]` and `#[repr(tag = my_type)]` are now accepted.

Internals discussion: https://internals.rust-lang.org/t/pre-rfc-type-aliases-in-repr/20956

Last comment on RFC under first version (`self::`): #3659 (comment)
Last comment on RFC under second version (`type = ...`): #3659 (comment)
Last comment on RFC under third version (`discriminant = …`): #3659 (comment)

Rendered