Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: #[derive_no_bound(..)] and #[derive_field_bound(..)] #2353

Closed
wants to merge 10 commits into from

Conversation

Centril
Copy link
Contributor

@Centril Centril commented Feb 27, 2018

🖼️ Rendered

📝 Summary

This RFC gives users a way to control trait bounds on derived implementations by allowing them to omit default bounds on type parameters or add bounds for field types. This is achieved with the two attributes #[derive_no_bound(Trait)] and #[derive_field_bound(Trait)].

The semantics of #[derive_no_bound(Trait)] for a type parameter P are:
The type parameter P does not need to satisfy Trait for any field referencing it to be Trait

The semantics of #[derive_field_bound(Trait)] on a field are that the type of the field is added to the where-clause of the referenced Trait as: FieldType: Trait.

💖 Thanks

Thanks to @nox for collaborating on this RFC with me as well as @kennytm for providing some great input.

@Centril Centril added the T-lang Relevant to the language team, which will review and decide on the RFC. label Feb 27, 2018
+ Replace bounds on impl of `Clone` and `PartialEq` with `T: Sync`

```rust
#[derive_bound(Clone, PartialEq, T: Sync)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attributes currently doesn't support this syntax (T: Sync).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, this is what was meant by:

Changing this would require a language change.

I'll clarify =)

Copy link
Member

@kennytm kennytm Feb 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be quite a large language change 🙂 And I wonder if this will leads to things like

#[derive_bound(T: PartialEq<#[thing] fn(i32)>)]

Copy link
Contributor Author

@Centril Centril Feb 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand I think steps to at least provide a sensible subset of things such as bounds, expressions, types in the attribute grammar is good as people already do this kinds of things in serde, derivative, (my derive crate for proptest also does this, but it is still WIP)... It is just done in string quotes which is somewhat hacky :(

That is, afaik, people can already write:

#[serde(bound = "T: PartialEq<#[thing] fn(i32)>")]

I think we should provide custom derive macro authors and users ways to do these things in a less hacky way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to say that I clarified this =)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't attributes allow arbitrary token trees now? This should make this acceptable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@clarcharr I don't think so, but I'm not sure - I'd love to be proven wrong =)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attributes do allow arbitrary token trees between their brackets now.

@stbuehler
Copy link

Did anyone think about automatically detecting whether the bounds on the type parameters are required?

I think this should at least be listed as a theoretical alternative. The downside would be that it's less visible (especially if the fields are private) and easier to accidentally break APIs.

The proposed syntax seems to be the "safe" option; but it's also quite heavy to use imho.

Another alternative would be allowing things like:

#[derive]
impl<T> Clone for MyArc<T>;

Or in more crazy scenarios:

#[derive]
impl<T: UnrelatedTrait> Clone for MyArc<T>;

This probably requires more lines of code, but is imho better readable, and allows more powerful constructs.

@nox
Copy link
Contributor

nox commented Feb 28, 2018

Did anyone think about automatically detecting whether the bounds on the type parameters are required?

That cannot be done, that's why this is not listed as an alternative, deriving is a syntactical thing and syntactical things don't have any type knowledge.

Another alternative would be allowing things like:

What is impl<T> Clone for MyArc<T>; supposed to mean? Why is the derive attribute empty?

This probably requires more lines of code, but is imho better readable, and allows more powerful constructs.

Not being able to know the bounds just from reading the code isn't more readable IMO.

@main--
Copy link

main-- commented Feb 28, 2018

Since all those derives (clone, copy, eq, cmp, ...) rely entirely on forwarding the implementation in some way to the struct's fields, wouldn't it always work if we never add bounds and instead always bind on field types? So basically the equivalent of derive_no_bound+derive_field_bound, always, everywhere.

@nox
Copy link
Contributor

nox commented Feb 28, 2018

Since all those derives (clone, copy, eq, cmp, ...) rely entirely on forwarding the implementation in some way to the struct's fields, wouldn't it always work if we never add bounds and instead always bind on field types?

No, because of privacy. You can't have a where clause involving a private type in a public one. And changing that now would be a breaking change anyway.

@stbuehler
Copy link

What is impl<T> Clone for MyArc<T>; supposed to mean? Why is the derive attribute empty?

And there I thought that part would be obvious...

#[derive]
impl<T: Clone> Clone for MyArc<T>;

would be the equivalent to what

#[derive(Clone)]
struct MyArc<T>(...);

is today: it should automatically derive the trait implementation. The derive attribute is empty, because the trait to be derived is already given.

[...] deriving is a syntactical thing [...]

Hm right; this means my idea only works if the compiler manages to find the type definition in that stage. Maybe restricting this to only be allowed in the same file after the type definition would make this work?

This probably requires more lines of code, but is imho better readable, and allows more powerful constructs.

Not being able to know the bounds just from reading the code isn't more readable IMO.

This second idea is unrelated to my first about automatically finding the bounds; this one requires writing the bounds just like in a normal implementation, but allows deriving the implementation even if the bounds are more complex.

You're trying to solve the more complex cases with derive_field_bound, but this only works if the field type is public, as you said yourself.

Sorry about the confusion.

@nox
Copy link
Contributor

nox commented Feb 28, 2018

And there I thought that part would be obvious...

Things need to be exhaustively described in a RFC, we don't have room for confusion.

Deriving the trait implementation from an impl block without a body looks like a cute idea, but this seems unrelated to the RFC at hand, and would result in more verbose code than the attributes mentioned here, given it requires writing the type name again, its where clause if any (that's alleviated by implied bounds but that's not stable yet), and for use cases needing #[derive_field_bound], you would end up having to rewrite the field types too. Finally, that idea does not let you control the bounds for multiple derived traits at once.

I think this idea should be the way to go if someone needs to go beyond what's provided by the attributes in this RFC, but it doesn't ultimately supersede it.

Hm right; this means my idea only works if the compiler manages to find the type definition in that stage. Maybe restricting this to only be allowed in the same file after the type definition would make this work?

Even in the same file, a derive macro doesn't have access to the definitions of types used in the item it's deriving things for.

@petrochenkov
Copy link
Contributor

@nox

No, because of privacy. You can't have a where clause involving a private type in a public one.

The privacy issue is fixable, in theory it's already fixed by #2145, it just needs implementation.
It turns out there's a harder problem - coinductive reasoning.

@stbuehler
Copy link

Deriving the trait implementation from an impl block without a body looks like a cute idea, but this seems unrelated to the RFC at hand,

I disagree. I think the proposed syntax is real ugly; I don't look forward to using it, although the feature itself is quite important imho.

So if there are better ways doing it, those are imho related to this RFC.

and would result in more verbose code than the attributes mentioned here, given it requires writing the type name again,

Yes. I'm fine with that compared to the syntax currently proposed.

its where clause if any (that's alleviated by implied bounds but that's not stable yet),

Doesn't look like an important argument.

and for use cases needing #[derive_field_bound], you would end up having to rewrite the field types too.

But either derive_field_bound won't work for private types, or we can just always use it by default.

Finally, that idea does not let you control the bounds for multiple derived traits at once.

Yes, and that's why I like it. Because it's way easier to see what bounds are required for a certain trait compared to reading a big mess of attributes.

I think this idea should be the way to go if someone needs to go beyond what's provided by the attributes in this RFC, but it doesn't ultimately supersede it.

I disagree again; I'd rather not see the attributes getting implemented in the first place if a better solution can be found.

Even in the same file, a derive macro doesn't have access to the definitions of types used in the item it's deriving things for.

I'm not familiar with the implementation here, and not sure where this is in the range of "theoretically possible, but not gonna happen", "would be a lot of work" or "trivial to change". But it's certainly not impossible to implement.

I admit I have no idea how to combine my idea with proc_macros (although I'm pretty sure I wouldn't want to parse the proposed attributes either).

@nox
Copy link
Contributor

nox commented Feb 28, 2018

I disagree. I think the proposed syntax is real ugly; I don't look forward to using it, although the feature itself is quite important imho.

I look forward to using it because I actually need it or my code doesn't compile. As for the syntax being ugly, the syntax in that RFC lets me have less code duplication than your suggestion, so I don't really see what you want me to tell you.

Yes. I'm fine with that compared to the syntax currently proposed.

And I'm not because I will have around 60 types needing this.

Doesn't look like an important argument.

Well, that's your opinion. I will have around 60 types needing this.

But either derive_field_bound won't work for private types, or we can just always use it by default.

It won't work, but my types are public, and I will not have to write them once in the type and once in each derived impl, which is better than your proposal.

Yes, and that's why I like it. Because it's way easier to see what bounds are required for a certain trait compared to reading a big mess of attributes.

I have types that derive more than 10 traits. I have around 60 types like that.

I disagree again; I'd rather not see the attributes getting implemented in the first place if a better solution can be found.

Then show me a solution that doesn't lead to inordinate amounts of code duplication.

Edit: I forgot a verb.

@stbuehler
Copy link

My feeling is that in Rust it is preferred to have more code over having compact and unreadable code.

Maybe you could use macros to organize your types in a more compact way.

It seems you're also motivated by some time constraints, but it is not obvious why those are relevant for the Rust community.

All in all I'd still like to see the feature itself; so if the requirement is that customizing the bounds for derive needs to be done in attributes in the type definition, you're proposal seems overall fine to me.

@nox
Copy link
Contributor

nox commented Feb 28, 2018

I'm not really motivated by some time constraint, given my project is blocked on at least 2 other unstable features. I'm motivated by not duplicating code. Your idea leads to duplicating code immensely. There are even cases where #[derive(Clone)] struct Foo<...>(...); wouldn't be translatable to #[derive] impl<...> Clone for Foo<...> with current code, because the latter wouldn't know about where clauses in the definition of Foo<...>.

I'm already using macros extensively, that's why I can derive more than 10 traits on them. There is not much you can do when your work is to encode more than 300 different CSS properties with a huge variety of types and intricacies.

Finally, I still don't see what's supposed to be unreadable about the attributes suggested in this RFC.

@Centril
Copy link
Contributor Author

Centril commented Feb 28, 2018

@stbuehler

#[derive]
impl<T: UnrelatedTrait> MyTrait for MyType<T>;

Of the top of my head that is an interesting extension, but I agree with @nox that it is more verbose, less local and should be used when deriving still can't figure it out, but for specific problem this RFC attempts to solve, I think it is not the better solution.

The problem with #[derive] on impls is that you have even less information about the type you are trying to derive the trait for as you have zero of the type's definition. No fields, no variant, no information of other traits being derived, no other attributes, no nothing. The benefit of deriving on the type definition site is that you can infer from the fields present the behavior that the impl will have.

I disagree. I think the proposed syntax is real ugly;

By all means, if you have some other attribute names in mind or a method that is equally terse as the one proposed in the RFC, I'm all ears. Tho, I don't understand why you think it is ugly... It builds upon what we really have, attributes, so I think it is consistent.

@petrochenkov

The privacy issue is fixable, in theory it's already fixed by #2145, it just needs implementation.

I'm not sure it is morally right to fix it... seems somewhat strange to me that you should be able to refer to private types in public types's impl where clauses..?

@clarfonthey
Copy link
Contributor

I like this a lot but I wish there were a way to make the names less... unwieldy.

@nox
Copy link
Contributor

nox commented Mar 1, 2018

They are just a bit lengthy, but does that really matter?

text/0000-derive-bound-control.md Outdated Show resolved Hide resolved
text/0000-derive-bound-control.md Outdated Show resolved Hide resolved
@Centril
Copy link
Contributor Author

Centril commented Mar 1, 2018

@clarcharr

I like this a lot but I wish there were a way to make the names less... unwieldy.

We originally went without derive_ prefixes but added them to make things more legible... If you have some suggestions on alternative naming we're all ears =)

@hadronized
Copy link
Contributor

Oh this, with phantom typing. DO WANT!

Have you considered to talk about type roles?

@nox
Copy link
Contributor

nox commented Mar 2, 2018 via email

@scottmcm scottmcm mentioned this pull request Mar 4, 2018
@clarfonthey
Copy link
Contributor

clarfonthey commented Mar 7, 2018

I think that always_derive is better of a name than derive_no_bound. To me, always_derive(Default) very clearly states "derive default so that I can always use it, regardless of T"

@nox
Copy link
Contributor

nox commented Mar 7, 2018

The derive_ part of the names are only there because derive is a reserved prefix for attributes (AFAIK), so it would be derive_always.

@mzabaluev
Copy link
Contributor

derive_no_bounds could be a helper attribute

Ah thanks, I did not think this works on the item level attributes alongside the macro's own attribute, too.
(Besides, could this lead to strangeness when an attribute is both a helper and also has its own macro in scope to process it?)

@nox
Copy link
Contributor

nox commented Nov 12, 2019

@mzabaluev

The way this could be made to work without compiler hacks is that derive_field_bound(...) is backed by an attribute macro that looks for a yet-unpeeled derive attribute in the item AST passed to it. If it finds one, the macro re-injects its attribute below the derive attribute so that the derive macro will pick it up as a helper attribute and eat it, implementing the directives. If there is no derive attribute below, it probably means a derive macro above is unaware of derive_field_bound, so it has been regurgitated as an inert attribute, and so a semi-informative error can be raised saying that the feature is not supported by the combination of attributes present on the item.

See the RFC:

#[derive_field_bound(Trait)] is specified on a type definition and Trait is registered for deriving by a custom macro which specifies #[proc_macro_derive(Trait, attributes(<attr_list>))] where <attr_list> does not mention derive_field_bound. If #[derive_field_bound] is specified instead, then this applies to all traits derived. This also applies to #[derive_no_bound].

@Lythenas
Copy link

Like mentioned in #2811 (comment) I think this is mostly a shortcoming of how Clone and other derives are implemented. E.g. for

#[derive(Clone)]
struct X<T> {
  x: PhantomData<T>
}

the generated bound is T: Clone where it should actually be PhantomData<T>: Clone because that's where the actual .clone() method is called.

@CAD97
Copy link

CAD97 commented Nov 12, 2019

The problem with "just making derive smarter" (which has been discussed before but I didn't find the comments with a super quick overview) is privacy/encapsulation.

#[derive(Clone)]
pub struct PubType<T> {
    inner: PrivType<T>,
}

#[derive(Clone)]
pub(self) struct PrivType<T>(T);

If the generated Clone impl for PubType uses where PrivType<T>: Clone, then it both leaks the fact that PubType is implemented by holding a PrivType and violates the private-in-public rules.

The "most correct" "smart" version would recursively expand these bounds until reaching public types to bound, but this starts to be very magic and hard to control what implementation details are publicized (and thus what changes are breaking). Not to mention it then adds ordering constraints on derive expansion and requires much more type information than current macros which are pure over their input.

@mzabaluev
Copy link
Contributor

@Lythenas

the generated bound is T: Clone where it should actually be PhantomData<T>: Clone

T should be unbounded. The impl should not leak the fact that the struct has a PhantomData inside.
But as @CAD97 says above, figuring out the loosest possible bound of T that only refers to publicly visible types and parameters is not something that proc macros can currently do, and even if they could, this may be more magic than the macro user can reason about.

@Lythenas
Copy link

Lythenas commented Nov 12, 2019

Yes that makes sense. I didn't think about private types. I still think for some simple cases the derive could be made smarter (e.g. Arc and PhantomData implement Clone for every T). But that maybe has to be done in the compiler.

I also noticed that in the Prior arts section of Haskell it says:

it is not possible to configure deriving mechanisms in Haskell

while Haskell doesn't have attributes like Rust it does have the StandaloneDeriving extension. Which (like this RFC) allows to specify the bounds of a derived instance:

data Foo a = Bar a | Baz String
deriving instance Eq a => Eq (Foo a)

Unlike a deriving declaration attached to a data declaration, GHC does not restrict the form of the data type. Instead, GHC simply generates the appropriate boilerplate code for the specified class, and typechecks it. If there is a type error, it is your problem. (GHC will show you the offending code if it has a type error.)

Source: https://downloads.haskell.org/~ghc/7.8.4/docs/html/users_guide/deriving.html#stand-alone-deriving

This is basically what this RFC is trying to accomplish, right? Although probably a bit more verbose.

Syntax like that would also allow to derive a trait for a more specific instance. E.g.

deriving instance Eq a => Eq (Foo [a])

This would maybe translate to rust syntax that looks like this:

#[derive]
impl<T> Clone for Struct1<T>;
#[derive]
impl<T: Clone, S> Clone for Struct2<T, S>;
#[derive]
impl<T, S> Clone for Struct2<T, S> where T: Clone;
#[derive]
impl Clone for Struct1<String> where T: Clone;

which is probably a lot more repetitive than what this RFC suggests but IMHO it is also a lot more readable and basically mirrors what you would find in the generated documentation.

Edit: Also like the Haskell docs mention this would allow the proc macros responsible for deriving the trait work like they do now (generate the boilerplate) and the compiler would replace the generated bounds with the bounds given by the user.

@mzabaluev
Copy link
Contributor

This would maybe translate to rust syntax that looks like this:

#[derive]
impl<T> Clone for Struct1<T>;
#[derive]
impl<T: Clone, S> Clone for Struct2<T, S>;
#[derive]
impl<T, S> Clone for Struct2<T, S> where T: Clone;
#[derive]
impl Clone for Struct1<String> where T: Clone;

This has the same expressive strengths and weaknesses as #2811 in its current shape (as of eaaa257), but will require a change in Rust syntax. I like the idea of there being a greppable impl keyword for each generated impl, though.

@mzabaluev
Copy link
Contributor

mzabaluev commented Nov 12, 2019

@nox

#[derive_field_bound(Trait)] is specified on a type definition and Trait is registered for deriving by a custom macro which specifies #[proc_macro_derive(Trait, attributes(<attr_list>))] where <attr_list> does not mention derive_field_bound. If #[derive_field_bound] is specified instead, then this applies to all traits derived. This also applies to #[derive_no_bound].

Should the compiler treat specially the situation when derive_field_bound or derive_no_bound is present, but no derive macro applicable on the item has listed it as a helper attribute? Currently that would result in an "unknown attribute" error, which is not very helpful.

Edit: Re-reading the rule above, it does seem to call for special treatment that does not automatically fall out of the current behavior over helper attributes.

@nox
Copy link
Contributor

nox commented Nov 13, 2019

This would maybe translate to rust syntax that looks like this:

#[derive]
impl<T> Clone for Struct1<T>;
#[derive]
impl<T: Clone, S> Clone for Struct2<T, S>;
#[derive]
impl<T, S> Clone for Struct2<T, S> where T: Clone;
#[derive]
impl Clone for Struct1<String> where T: Clone;

which is probably a lot more repetitive than what this RFC suggests but IMHO it is also a lot more readable and basically mirrors what you would find in the generated documentation.

This has been suggested before and requires repeating the field types if that's what you want to put trait bounds on.

@nikomatsakis

This comment has been minimized.

@nikomatsakis
Copy link
Contributor

@rfcbot fcp close

We discussed this RFC in our "Backlog Bonanza". The consensus was that we will close it, even though we are interested in solving this general problem. The solution here doesn't seem to be 100% on target. The right next step to carry this work forward would be to create a lang team project proposal, because we would want to charter an effort to explore and write-up the design space. Thanks all.

@rfcbot
Copy link
Collaborator

rfcbot commented Mar 30, 2021

Team member @nikomatsakis has proposed to close this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-close This RFC is in PFCP or FCP with a disposition to close it. labels Mar 30, 2021
@nikomatsakis
Copy link
Contributor

@rfcbot reviewed

(I checked the names of folks who were present in the meeting.)

@rfcbot rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Mar 30, 2021
@rfcbot
Copy link
Collaborator

rfcbot commented Mar 30, 2021

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Apr 9, 2021
@rfcbot
Copy link
Collaborator

rfcbot commented Apr 9, 2021

The final comment period, with a disposition to close, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC is now closed.

@rfcbot rfcbot added to-announce closed This FCP has been closed (as opposed to postponed) and removed disposition-close This RFC is in PFCP or FCP with a disposition to close it. labels Apr 9, 2021
@rfcbot rfcbot closed this Apr 9, 2021
@shlevy
Copy link

shlevy commented Apr 23, 2021

@nikomatsakis Is there somewhere that interested parties can watch for the lang team project proposal?

@Lokathor
Copy link
Contributor

https://github.com/rust-lang/lang-team would be the place

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-derive Deriving related proposals & ideas A-traits Trait system related proposals & ideas closed This FCP has been closed (as opposed to postponed) finished-final-comment-period The final comment period is finished for this RFC. T-lang Relevant to the language team, which will review and decide on the RFC. to-announce
Projects
None yet
Development

Successfully merging this pull request may close these issues.