From 38d358d9aad795a07980c84b0b52725bfa7d94d2 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 1 Nov 2023 18:05:35 +0000 Subject: [PATCH 01/47] Arbitrary self types v2. This PR suggests small changes to the existing unstable "arbitrary self types" feature to make it more flexible. In particular, it suggests driving this feature from a new (ish) "Receiver" trait instead of from Deref, while maintaining compatibility through a blanket implementation for all Deref types. This is a squashed commit of much work by various folks including Johann Hemmann, Lukas Wirth, Mads Marquart and myself. Thanks also to David Hewitt and Manish Goregaokar for feedback. Co-authored-by: Johann Hemmann --- text/0000-arbitrary-self-types-v2.md | 403 +++++++++++++++++++++++++++ 1 file changed, 403 insertions(+) create mode 100644 text/0000-arbitrary-self-types-v2.md diff --git a/text/0000-arbitrary-self-types-v2.md b/text/0000-arbitrary-self-types-v2.md new file mode 100644 index 00000000000..7c6147237f8 --- /dev/null +++ b/text/0000-arbitrary-self-types-v2.md @@ -0,0 +1,403 @@ +- Feature Name: Arbitrary Self Types 2.0 +- Start Date: 2023-05-04 +- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) + +# Summary +[summary]: #summary + +Allow types that implement the new `trait Receiver` to be the receiver of a method. + +# Motivation +[motivation]: #motivation + +Today, methods can only be received by value, by reference, or by one of a few blessed smart pointer types from `core`, `alloc` and `std` (`Arc<T>`, `Box<T>`, `Pin<P>` and `Rc<T>`). + +It's been assumed that this will eventually be generalized to support any smart pointer, such as a `CustomPtr<T>`. Since late 2017, it has been available on nightly under the `arbitrary_self_types` feature for types that implement `Deref` and for raw pointers. + +This RFC proposes some changes to the existing nightly feature based on the experience gained, with a view towards stabilizing the feature in the relatively near future. + +## Motivation for the arbitrary self types feature overall + +One use-case is cross-language interop (JavaScript, Python, C++), where other languages' references can’t guarantee the aliasing and exclusivity semantics required of a Rust reference. For example, the C++ `this` pointer can't be practically or safely represented as a Rust reference because C++ may retain other pointers to the data and it might mutate at any time. Yet, calling C++ methods intrinsically requires a `this` reference. + +Another case is when the existence of a reference is, itself, semantically important: for example, reference counting, or if relayout of a UI should occur each time a mutable reference ceases to exist. In these cases it's not OK to allow a regular Rust reference to exist, and yet sometimes we still want to be able to call methods on a reference-like thing. + +In theory, users can define their own smart pointers. In practice, they're second-class citizens compared to the smart pointers in Rust's standard library. User-defined smart pointers to `T` can accept method calls only if the receiver (`self`) type is `&T` or `&mut T`, which isn't acceptable if we can't safely create native Rust references to a `T`. + +This RFC proposes to loosen this restriction to allow custom smart pointer types to be accepted as a `self` type just like for the standard library types. + +See also [this blog post](https://medium.com/@adetaylor/the-case-for-stabilizing-arbitrary-self-types-b07bab22bb45), especially for a list of more specific use-cases. 
+ +## Motivation for the v2 changes + +Unstable Rust contains an implementation of arbitrary self types based around the `Deref` trait. Naturally, that trait also provides a means to create a `&T`. + +However, if it's OK to create a reference `&T`, you _probably don't need this feature_. You can probably simply use `&self` as your receiver type. + +This feature is fundamentally aimed at smart pointer types `P` where it's not safe to create a reference `&T`. As noted above, that's most commonly because of semantic differences to pointers in other languages, but it might be because references have special meaning or behavior in some pure Rust domain. Either way, it's not OK to create a Rust reference `&T` or `&mut T`, yet we may want to allow methods to be called on some reference-like thing. + +For this reason, implementing `Deref::deref` is problematic for _nearly everyone who wants to use arbitrary self types_. + +If you're implementing a smart pointer `P` yet you can't allow a reference `&T` to exist, any option for implementing `Deref::deref` has drawbacks: + +* Specify `Deref::Target=T` and panic in `Deref::deref`. Not good. +* Specify `Deref::Target=*const T`. This works with the current arbitrary self types feature, but is only possible if your smart pointer type contains a `*const T` which you can reference - this isn't the case for (for instance) weak pointers or types containing `NonNull`. + +Therefore, the current Arbitrary Self Types v2 provides a separate `Receiver` trait. + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +When declaring a method, users can declare the type of the `self` receiver to be any type `T` where `T: Receiver` or `Self`. 
+ +The `Receiver` trait is simple and only requires specifying the `Target` type: + +```rust +trait Receiver { + type Target: ?Sized; +} +``` + +The `Receiver` trait is already implemented for many standard library types: +- smart pointers in the standard library: `Rc`, `Arc`, `Box`, and `Pin>` (and in fact, any type which implements `Deref`) +- references: `&Self` and `&mut Self` +- pointers: `*const Self` and `*mut Self` + +Shorthand exists for references, so that `self` with no ascription is of type `Self`, `&self` is of type `&Self` and `&mut self` is of type `&mut Self`. + +All of the following self types are valid: + +```rust +impl Foo { + fn by_value(self /* self: Self */); + fn by_ref(&self /* self: &Self */); + fn by_ref_mut(&mut self /* self: &mut Self */); + fn by_ptr(*const Self); + fn by_mut_ptr(*mut Self); + fn by_box(self: Box); + fn by_rc(self: Rc); + fn by_custom_ptr(self: CustomPtr); +} + +struct CustomPtr(*const T); + +impl Receiver for CustomPtr { + type Target = T; +} +``` + +## Recursive arbitrary receivers + +Receivers are recursive and therefore allowed to be nested. If type `T` implements `Receiver<Target=U>`, and type `U` implements `Receiver<Target=Self>`, `T` is a valid receiver (and so on outward). + +For example, this self type is valid: + +```rust +impl MyType { + fn by_rc_to_box(self: Rc<Box<Self>>) { ... } +} +``` + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +## `core` libs changes + +The `Receiver` trait is made public (removing its `#[doc(hidden)]` attribute), exposing it under `core::ops`. It gains a `Target` associated type. + +This trait marks types that can be used as receivers other than the `Self` type of an impl or trait definition. 
+ +```rust +pub trait Receiver { + type Target: ?Sized; +} +``` + +A blanket implementation is provided for any type that implements `Deref`: + +```rust +impl Receiver for P +where + P: Deref, +{ + type Target = T; +} +``` + +(See [alternatives](#no-blanket-implementation) for discussion of the tradeoffs here.) + +It is also implemented for `&T`, `&mut T`, `*const T` and `*mut T`. + +## Compiler changes + +The existing Rust [reference section for method calls describes the algorithm for assembling method call candidates](https://doc.rust-lang.org/reference/expressions/method-call-expr.html). This algorithm changes in one simple way: instead of dereferencing types (using the `Deref`) trait, we use the new `Receiver` trait to determine the next step. + +Because a blanket implementation is provided for users of the `Deref` trait and for `&T`/`&mut T`, the net behavior is similar. But this provides the opportunity for types which can't implement `Deref` to act as method receivers. + +Dereferencing a raw pointer usually needs `unsafe` (for good reason!) but in this case, no actual dereferencing occurs. This is used only to determine a list of method candidates; no memory access is performed and thus no `unsafe` is needed. + +## Object safety + +Receivers are object safe if they implement the (unstable) `core::ops::DispatchFromDyn` trait. + +As not all receivers might want to permit object safety or are unable to support it, object safety should remain being encoded in a different trait than the here proposed `Receiver` trait, likely `DispatchFromDyn`. + +This RFC does not propose any changes to `DispatchFromDyn`. Since `DispatchFromDyn` is unstable at the moment, object-safe receivers might be delayed until `DispatchFromDyn` is stabilized. `Receiver` is not blocked on further `DispatchFromDyn` work, since non-object-safe receivers already cover a big chunk of the use-cases. 
+ +## Lifetime elision + +As discussed in the [motivation](#motivation), this new facility is _most likely_ to be used in cases where a standard reference can't normally be used. But in other cases a smart pointer self type might wrap a standard Rust reference, and thus might be parameterized by a lifetime. + +Lifetime elision works in the expected fashion: + +```rust +struct SmartPtr<'a, T: ?Sized>(&'a T); + +impl<'a, T: ?Sized> Receiver for SmartPtr<'a, T> { + type Target = T; +} + +struct MyType; + +impl MyType { + fn m(self: SmartPtr<Self>) {} + fn n(self: SmartPtr<'_, Self>) {} + fn o<'a>(self: SmartPtr<'a, Self>) {} +} +``` + +## Diagnostics + +The existing branches in the compiler for "arbitrary self types" already emit excellent diagnostics. We will largely re-use them, with the following improvements: + +- In the case where a self type is invalid because it doesn't implement `Receiver`, the existing excellent error message will be updated. +- An easy mistake is to implement `Receiver` for `P<T>`, forgetting to specify `T: ?Sized`. `P<Self>` then only works as a `self` parameter in traits `where Self: Sized`, an unusual stipulation. It's not obvious that `Sized`ness is the problem here, so we will identify this case specifically and produce an error giving that hint. +- There are certain types which feel like they "should" implement `Receiver` but do not: `Weak` and `NonNull`. If these are encountered as a self type, we should produce a specific diagnostic explaining that they do not implement `Receiver` and suggesting that they could be wrapped in a newtype wrapper if method calls are important. This will require `Weak` and `NonNull` be marked as lang items so that the compiler is aware of the special nature of these types. (The authors of this RFC feel that these extra lang-items _are_ worthwhile to produce these improved diagnostics - if the reader disagrees, please let us know.) 
+- Under some circumstances, the compiler identifies method candidates but then discovers that the self type doesn't match. This results currently in a simple "mismatched types" error; we can provide a more specific error message here. The only known case is where a method is generic over `Receiver`, and the caller explicitly specifies the wrong type: + ```rust + #![feature(receiver_trait)] + + use std::ops::Receiver; + + struct SmartPtr<'a, T: ?Sized>(&'a T); + + impl<'a, T: ?Sized> Receiver for SmartPtr<'a, T> { + type Target = T; + } + + struct Foo(u32); + impl Foo { + fn a>(self: R) { } + } + + fn main() { + let foo = Foo(1); + let smart_ptr = SmartPtr(&foo); + smart_ptr.a(); // this compiles + smart_ptr.a::<&Foo>(); // currently results in "mismatched types"; we can probably do better + } + ``` +- If a method `m` is generic over `R: Receiver` (or, perhaps more commonly, `R: Deref`) and `self: R`, then someone calls it with `object_by_value.m()`, it won't work because Rust doesn't know to use `&object_by_value`, and the message `the trait bound Foo: 'Receiver/Deref' is not satisfied` is generated. While correct, this may be surprising because users expect to be able to use `object_by_value.m2()` where `fn m2(&self)`. The resulting error message already suggests that the user create a reference in order to match the `Receiver` trait, so this may be sufficient already, but we may add an additional note here. + +# Drawbacks +[drawbacks]: #drawbacks + +Why should we *not* do this? + +- Deref coercions can already be confusing and unexpected. Adding a new `Receiver` trait could cause similar confusion. +- Custom smart pointers are a niche use case (but they're very important for cross-language interoperability.) + +## Method shadowing +[method-shadowing]: #method-shadowing + +For a smart pointer `P`, a method call `p.m()` might call a method on the smart pointer type itself (`P::m`), or, if the smart pointer implements `Deref`, it might already call `T::m`. 
This already gives the possibility that `T::m` would be shadowed by `P::m`. + +Current Rust standard library smart pointers are designed with this shadowing behavior in mind: + +* `Box`, `Pin`, `Rc` and `Arc` already heavily use associated functions rather than methods +* Where they use methods, it's often with the _intention_ of shadowing a method in the inner type (e.g. `Arc::clone`) + +These method shadowing risks are effectively the same for `Deref` and `Receiver`. This RFC does not make things worse (it just adds additional flexibility to the `self` parameter type for `T::m`). However it does mean that the `Receiver` trait cannot be added to smart pointer types which were not designed with these concerns in mind. + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +As this feature has been cooking since 2017, many alternative implementations have been discussed. + +## Deref-based +[deref-based]: #deref-based + +As noted in the [motivation](#motivation), the current nightly implementation provides arbitrary self types using the `Deref` trait. + +## No blanket implementation for `Deref` +[no-blanket-implementation]: #no-blanket-implementation + +The other major approach previously discussed is to have a `Receiver` trait, as proposed in this RFC, but without a blanket implementation for `T: Deref`. Blanket implementations are unusual for core Rust traits, but the authors of this RFC believe it's necessary in this case. + +Specifically, this RFC proposes that the existing method search algorithm is modified to search the `Receiver` chain _instead of_ the `Deref` chain. + +It's therefore a major compatibility break if existing `Deref` implementors cease to be usable as `self` parameters. Just in the standard library, we'd have to add `Receiver` implementations for `Cow`, `Ref`, `ManuallyDrop` and possibly many other existing implementors of `Deref`: third party libraries would have to do the same. 
Without that, method calls on these types would not be possible: + +```rust +fn main() { + let ref_cell = RefCell::new(/* something cloneable */); + ref_cell.borrow().clone(); // no longer possible if: + // 1) we cease to explore Deref in identifying method candidates + // 2) Ref doesn't implement Receiver. +} +``` + +This doesn't just break people previously using the unstable Rust `arbitrary_self_types` feature; it breaks stable Rust usages as well. Obviously this is not acceptable, so we believe the blanket implementation is necessary. + +In any case, we think a blanket implementation is desirable: + +* It prevents `Deref` and `Receiver` having different `Target`s. That could possibly lead to confusion if it prompted the compiler to explore different chains for these two different purposes. +* If smart pointer type `P<T>` is in a crate, users of `P` who create `P<MyConcreteType>` will be able to use it as a `self` type for `MyConcreteType` without waiting for a new release of the `P` crate. + +We found that [some crates use `Deref` to express an is-a not a has-a relationship](https://gist.github.com/davidhewitt/d0ed031fb05f6db98ee249ae089b268e) and so, ideally, might have preferred the option of setting up `Deref` and `self` candidacy separately. But, on discussion, we concluded that traits would be a better way to model those relationships. + +## Explore both `Receiver` and `Deref` chains while identifying method candidates + +We could modify the method search algorithm to explore both `Deref` and `Receiver` targets when identifying method candidates. This would avoid breaking compatibility, yet would give the desired flexibility for folks who wish to implement `Receiver` but not `Deref`. 
+ +We don't think this is such a good option because: + +* It's more confusing for users; +* It could lead to a worst-case O(n^2) number of method candidates to explore (though possibly this could be limited to O(2n) if we added restrictions); +* It's a more invasive change to the compiler; +* We don't know of any use-cases which the `Receiver` and blanket implementation for `Deref` do not allow. + +If some use-case presents itself where a type _must_ implement `Deref` but not `Receiver`; or a use-case presents itself where `Deref` and `Receiver` _must_ have different `Target`s then we will have to consider this more complex option. + +## Generic parameter + +Change the trait definition to have a generic parameter instead of an associated type. There might be permutations here which could allow a single smart pointer type to dispatch method calls to multiple possible receivers - but this would add complexity, no known use case exists, and it might cause worst-case O(n^2) performance on method lookup. + +## Do not enable for pointers + +It would be possible to respect the `Receiver` trait without allowing dispatch onto raw pointers - they are essentially independent changes to the candidate deduction algorithm. + +We don't want to encourage the use of raw pointers, and would prefer rather that raw pointers are wrapped in a custom smart pointer that encodes and documents the invariants. So, there's an argument not to add the raw pointer support. + +However, the current unstable `arbitrary_self_types` feature provides support for raw pointer receivers, and with years of experience no major concerns have been spotted. We would prefer not to deviate from the existing proposal more than necessary. 
Moreover, we are led to believe that raw pointer receivers are quite important for the future of safe Rust, because stacked borrows makes it illegal to materialize references in many positions, and there are a lot of operations (like going from a raw pointer to a raw pointer to a field) where users don't need or want to do that. We think the utility of including raw pointer receivers outweighs the risks of tempting people to over-use raw pointers. + +## Provide compiler support for dereferencing pointers + +This RFC proposes to implement `Receiver` for `*mut T` and `*const T` within the library. This is slightly different from the unstable arbitrary self types support, which instead hard-codes pointer support into the candidate deduction algorithm in the compiler (because obviously `Deref` can't be implemented for pointers.) + +We prefer the option of specifying behavior in the library using the normal trait, though it's a compatibility break for users of Rust who don't adopt the `core` crate (including compiler tests). + +## Implement for `Weak` and `NonNull` + +`Weak` and `NonNull` were not supported by the prior unstable arbitrary self types support, but they share the property that it may be desirable to implement method calls to `T` using them as self types. Unfortunately they also share the property that these types have many Rust methods using `self`, `&self` or `&mut self`. If we added to the set of Rust methods in future, we'd [shadow any such method calls](#method-shadowing). We can't implement `Receiver` for these types unless we come up with a policy that all subsequent additions to these types would instead be associated functions. That would make the future APIs for these types a confusing mishmash of methods and associated functions, and the extra user complexity doesn't seem merited. + +## Not do it + +As always there is the option to not do this. But this feature already kind of half-exists (we are talking about `Box`, `Pin` etc.) 
and it makes a lot of sense to also take the last step and therefore enable non-libstd types to be used as self types. + +There is the option of using traits to fill a similar role, e.g. + +```rust +trait ForeignLanguageRef { + type Pointee; + fn read(&self) -> *const Self::Pointee; + fn write(&mut self, value: *const Self::Pointee); +} + +// -------------------------------------------------------- + +struct ConcreteForeignLanguageRef(T); + +impl ForeignLanguageRef for ConcreteForeignLanguageRef { + type Pointee = T; + + fn read(&self) -> *const Self::Pointee { + todo!() + } + + fn write(&mut self, _value: *const Self::Pointee) { + todo!() + } +} + +// -------------------------------------------------------- + +struct SomeForeignLanguageType; + +impl ConcreteForeignLanguageRef { + fn m(&self) { + todo!() + } +} + +trait Tr { + type RustType; + + fn tm(self) + where + Self: ForeignLanguageRef; +} + +impl Tr for ConcreteForeignLanguageRef { + type RustType = SomeForeignLanguageType; + fn tm(self) {} +} + +fn main() { + let a = ConcreteForeignLanguageRef(SomeForeignLanguageType); + a.m(); + a.tm(); +} +``` + +This successfully allows method calls to `m()` and even `tm()` without a reference to a `SomeForeignLanguageType` ever existing. However, due to the orphan rule, this forces every crate to have its own equivalent of `ConcreteForeignLanguageRef`. This workaround has been used by some interop tools, but use across multiple crates requires many generic parameters (`impl ForeignLanguageRef`). + +## Always use `unsafe` when interacting with other languages + +One main motivation here is cross-language interoperability. As noted in the rationale, C++ references can't be _safely_ represented by Rust references. Many would say that all C++ interop is intrinsically unsafe and that `unsafe` blocks are required. Maybe true: but that just moves the problem - an `unsafe` block requires a human to assert preconditions are met, e.g. 
that there are no other C++ pointers to the same data. But those preconditions are almost never true, because other languages don't have those rules. This means that a C++ reference can never be a Rust reference, because neither human nor computer can promise things that aren't true. + +Only in the very simplest interop scenarios can we claim that a human could audit all the C++ code to eliminate the risk of other pointers existing. In complex projects, that's not possible. + +However, a C++ reference _can_ be passed through Rust safely as an opaque token such that method calls can be performed on it. Those method calls actually happen back in the C++ domain where aliasing and concurrent modification are permitted. + +For instance, + +```rust +struct ForeignLanguageRef; + +fn main() { + let some_foreign_language_reference: ForeignLanguageRef<_> = CallSomeForeignLanguageFunctionToGetAReference(); + // There may be other foreign language references to the referent, with concurrent + // modification, so some_foreign_language_reference can't be a &T + // But we still want to be able to do this + some_foreign_language_reference.SomeForeignLanguageMethod(); // executes in the foreign language. Data is not + // dereferenced at all in Rust. +} +``` + +# Prior art +[prior-art]: #prior-art + +A previous PR based on the `Deref` alternative has been proposed before https://github.com/rust-lang/rfcs/pull/2362 and was postponed with the expectation that the lang team would [get back to `arbitrary_self_types` eventually](https://github.com/rust-lang/rfcs/pull/2362#issuecomment-527306157). + +# Feature gates + +This RFC is in an unusual position regarding feature gates. There are two existing gates: + +- `arbitrary_self_types` enables, roughly, the _semantics_ we're proposing, albeit [in a different way](#deref-based). It has been used by various projects. +- `receiver_trait` enables the specific trait we propose to use, albeit without the `Target` associated type. 
It has only been used within the Rust standard library, as far as we know. + +Although we presumably have no obligation to maintain compatibility for users of the unstable `arbitrary_self_types` feature, we should consider the least disruptive way to introduce this feature. + +Options are: + +* Use the `arbitrary_self_types` feature gate, and remove the `receiver_trait` feature gate immediately. +* Use the `receiver_trait` feature gate and remove the `arbitrary_self_types` feature gate immediately. +* Invent a new feature gate. + +This RFC proposes the first course of action, since `arbitrary_self_types` is used externally and we think all current use-cases should continue to work. + +# Summary + +This RFC is an example of replacing special casing (a.k.a. compiler magic) with clear and transparent definitions. We believe this is a good thing and should be done whenever possible. From d13c17050a2beadaa52b5cffd7061c96791a60f7 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 1 Nov 2023 18:12:51 +0000 Subject: [PATCH 02/47] Reflect RFC PR number. 
--- ...bitrary-self-types-v2.md => 3519-arbitrary-self-types-v2.md} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename text/{0000-arbitrary-self-types-v2.md => 3519-arbitrary-self-types-v2.md} (99%) diff --git a/text/0000-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md similarity index 99% rename from text/0000-arbitrary-self-types-v2.md rename to text/3519-arbitrary-self-types-v2.md index 7c6147237f8..2fc8e0a77e9 100644 --- a/text/0000-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -1,6 +1,6 @@ - Feature Name: Arbitrary Self Types 2.0 - Start Date: 2023-05-04 -- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) +- RFC PR: [rust-lang/rfcs#3519](https://github.com/rust-lang/rfcs/pull/3519) - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) # Summary From 827e3d48d872756d9aa1091f4c88c60b40db5311 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 1 Nov 2023 19:50:51 +0000 Subject: [PATCH 03/47] Fix pointer parameter. --- text/3519-arbitrary-self-types-v2.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 2fc8e0a77e9..032a016b648 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -73,8 +73,8 @@ impl Foo { fn by_value(self /* self: Self */); fn by_ref(&self /* self: &Self */); fn by_ref_mut(&mut self /* self: &mut Self */); - fn by_ptr(*const Self); - fn by_mut_ptr(*mut Self); + fn by_ptr(self: *const Self); + fn by_mut_ptr(self: *mut Self); fn by_box(self: Box); fn by_rc(self: Rc); fn by_custom_ptr(self: CustomPtr); From efaf68c624d869a2bd248c164a924e79a596aa56 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 2 Nov 2023 10:09:27 +0000 Subject: [PATCH 04/47] Switch diagnostics to Diagnostic Items. 
--- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 032a016b648..ac24f851beb 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -173,7 +173,7 @@ The existing branches in the compiler for "arbitrary self types" already emit ex - In the case where a self type is invalid because it doesn't implement `Receiver`, the existing excellent error message will be updated. - An easy mistake is to implement `Receiver` for `P<T>`, forgetting to specify `T: ?Sized`. `P<Self>` then only works as a `self` parameter in traits `where Self: Sized`, an unusual stipulation. It's not obvious that `Sized`ness is the problem here, so we will identify this case specifically and produce an error giving that hint. -- There are certain types which feel like they "should" implement `Receiver` but do not: `Weak` and `NonNull`. If these are encountered as a self type, we should produce a specific diagnostic explaining that they do not implement `Receiver` and suggesting that they could be wrapped in a newtype wrapper if method calls are important. This will require `Weak` and `NonNull` be marked as lang items so that the compiler is aware of the special nature of these types. (The authors of this RFC feel that these extra lang-items _are_ worthwhile to produce these improved diagnostics - if the reader disagrees, please let us know.) +- There are certain types which feel like they "should" implement `Receiver` but do not: `Weak` and `NonNull`. If these are encountered as a self type, we should produce a specific diagnostic explaining that they do not implement `Receiver` and suggesting that they could be wrapped in a newtype wrapper if method calls are important. We hope this can be achieved with [diagnostic items](https://rustc-dev-guide.rust-lang.org/diagnostics/diagnostic-items.html). 
- Under some circumstances, the compiler identifies method candidates but then discovers that the self type doesn't match. This results currently in a simple "mismatched types" error; we can provide a more specific error message here. The only known case is where a method is generic over `Receiver`, and the caller explicitly specifies the wrong type: ```rust #![feature(receiver_trait)] From 47be87bcfaa4a46cfdee7c7e2aae4fd441060dcc Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 2 Nov 2023 10:20:18 +0000 Subject: [PATCH 05/47] Adding example to motivation section as requested. --- text/3519-arbitrary-self-types-v2.md | 53 +++++++++++++++++++++++++++- 1 file changed, 52 insertions(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index ac24f851beb..8a96778b388 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -19,7 +19,57 @@ This RFC proposes some changes to the existing nightly feature based on the expe ## Motivation for the arbitrary self types feature overall -One use-case is cross-language interop (JavaScript, Python, C++), where other languages' references can’t guarantee the aliasing and exclusivity semantics required of a Rust reference. For example, the C++ `this` pointer can't be practically or safely represented as a Rust reference because C++ may retain other pointers to the data and it might mutate at any time. Yet, calling C++ methods intrinsically requires a `this` reference. +One use-case is cross-language interop (JavaScript, Python, C++), where other languages' references can’t guarantee the aliasing and exclusivity semantics required of a Rust reference. For example, the C++ `this` pointer can't be practically or safely represented as a Rust reference because C++ may retain other pointers to the data and it might mutate at any time. Yet, calling C++ methods intrinsically requires a `this` reference. 
With "arbitrary self types", smart pointer types can be created which obey foreign-language semantics and can be used in safe Rust code: + +```rust +#[repr(transparent)] +#[derive(Clone)] +/// A C++ reference. Obeys C++ reference semantics, not Rust reference semantics. +/// There is no exclusivity; the underlying data may mutate, etc. +pub struct CppRef { + ptr: *const T, +} + +impl Receiver for CppRef { + type Target = T; +} + +// generated by bindings generator +struct ConcreteCppType { + // ... +} + +// all generated by bindings generator; mostly calls into C++ +impl ConcreteCppType { + fn some_cpp_method(self: CppRef) { + } + fn get_int_field(self: &CppRef) -> u32 { + } + fn get_more_complex_field(self: &CppRef) -> CppRef { + } + fn equals(self: &CppRef, other: &CppRef) -> bool { + } +} + +// generated by bindings generator +fn get_cpp_reference() -> CppRef { + // ... +} + +fn main() { + // Safe Rust code manipulating C++ objects via C++-semantics references + let cpp_obj_reference: CppRef = get_cpp_reference(); + // cpp_obj_reference does not obey Rust reference semantics. Other + // "references" to the same data may exist in the Rust or C++ domain. + // But it can effectively be used as an opaque token to pass safely + // through Rust back into C++ + let some_value: u32 = cpp_obj_reference.get_int_field(); + let some_field = cpp_obj_reference.get_more_complex_field(); + cpp_obj_reference.equals(&get_cpp_reference()); +} +``` + +(fuller example [here](https://github.com/google/autocxx/blob/main/src/reference_wrapper.rs#L117), with various [trait-based attempts](#not-do-it) to work around the lack of arbitrary self types.) Another case is when the existence of a reference is, itself, semantically important: for example, reference counting, or if relayout of a UI should occur each time a mutable reference ceases to exist. 
In these cases it's not OK to allow a regular Rust reference to exist, and yet sometimes we still want to be able to call methods on a reference-like thing. @@ -293,6 +343,7 @@ We prefer the option of specifying behavior in the library using the normal trai `Weak` and `NonNull` were not supported by the prior unstable arbitrary self types support, but they share the property that it may be desirable to implement method calls to `T` using them as self types. Unfortunately they also share the property that these types have many Rust methods using `self`, `&self` or `&mut self`. If we added to the set of Rust methods in future, we'd [shadow any such method calls](#method-shadowing). We can't implement `Receiver` for these types unless we come up with a policy that all subsequent additions to these types would instead be associated functions. That would make the future APIs for these types a confusing mismash of methods and associated functions, and the extra user complexity doesn't seem merited. ## Not do it +[not-do-it]: #not-do-it As always there is the option to not do this. But this feature already kind of half-exists (we are talking about `Box`, `Pin` etc.) and it makes a lot of sense to also take the last step and therefore enable non-libstd types to be used as self types. From 9607769d860093fd7dc68819b4ff60c1695752e4 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 2 Nov 2023 10:33:04 +0000 Subject: [PATCH 06/47] Adding extra motivation suggested by @clarfonthey. Thanks! 
--- text/3519-arbitrary-self-types-v2.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 8a96778b388..5b0849c2b5f 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -73,7 +73,9 @@ fn main() { Another case is when the existence of a reference is, itself, semantically important — for example, reference counting, or if relayout of a UI should occur each time a mutable reference ceases to exist. In these cases it's not OK to allow a regular Rust reference to exist, and yet sometimes we still want to be able to call methods on a reference-like thing. -In theory, users can define their own smart pointers. In practice, they're second-class citizens compared to the smart pointers in Rust's standard library. User-defined smart pointers to `T` can accept method calls only if the receiver (`self`) type is `&T` or `&mut T`, which isn't acceptable if we can't safely create native Rust references to a `T`. +A third motivation is that taking smart pointer types as `self` parameters can enable functions to act on the smart pointer type, not just the underlying data. For example, taking `&Arc` allows the functions to both clone the smart pointer (noting that the underlying `T` might not implement `Clone`) in addition to access the data inside the type, which is useful for some methods. Also, being able to change a method from accepting `&self` to `self: &Arc` can be done in a mostly frictionless way, whereas changing from `&self` to a static method accepting `&Arc` will always require some amount of refactoring. These options are currently open only to Rust's built-in smart pointer types, not to custom smart pointer types. + +In theory, users can define their own smart pointers. In practice, they're second-class citizens compared to the smart pointers in Rust's standard library. 
A type `T` can accept method calls using smart pointers as the `self` type only if they're one of Rust's built-in smart pointers. This RFC proposes to loosen this restriction to allow custom smart pointer types to be accepted as a `self` type just like for the standard library types. From c7e917d0714b11d171567fcf9fa88758fa4928a2 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 2 Nov 2023 10:48:58 +0000 Subject: [PATCH 07/47] Provide example of current arbitrary_self_types. --- text/3519-arbitrary-self-types-v2.md | 51 +++++++++++++++++++++++----- 1 file changed, 43 insertions(+), 8 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 5b0849c2b5f..1479a8983d9 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -13,7 +13,7 @@ Allow types that implement the new `trait Receiver` to be the recei Today, methods can only be received by value, by reference, or by one of a few blessed smart pointer types from `core`, `alloc` and `std` (`Arc`, `Box`, `Pin<P>` and `Rc`). -It's been assumed that this will eventually be generalized to support any smart pointer, such as an `CustomPtr`. Since late 2017, it has been available on nightly under the `arbitrary_self_types` feature for types that implement `Deref` and for raw pointers. +It's been assumed that this will eventually be generalized to support any smart pointer, such as an `CustomPtr`. Since late 2017, it has been available on nightly under the `arbitrary_self_types` feature for types that implement `Deref` and for raw pointers. This RFC proposes some changes to the existing nightly feature based on the experience gained, with a view towards stabilizing the feature in the relatively near future. @@ -83,20 +83,55 @@ See also [this blog post](https://medium.com/@adetaylor/the-case-for-stabilizing ## Motivation for the v2 changes -Unstable Rust contains an implementation of arbitrary self types based around the `Deref` trait. Naturally, that trait also provides a means to create a `&T`. +Unstable Rust contains an implementation of arbitrary self types based around the `Deref` trait. Naturally, that trait also provides a means to create a `&T`. Example: -However, if it's OK to create a reference `&T`, you _probably don't need this feature_. You can probably simply use `&self` as your receiver type. +```rust +#[feature(arbitrary_self_types)] + +struct SmartPtr(*const T); + +impl Deref for SmartPtr { + type Target = T; + fn deref(&self) -> &Self::Target { + // never called, but smart pointers need to implement this method + // sometimes it's just not safe to create a reference to self.0 + } +} + +struct ConcreteType; + +impl ConcreteType { + fn some_method(self: SmartPtr) { + + } +} + +fn main() { + let concrete: SmartPtr = ...; + concrete.some_method(); +} +``` + +However, if it's OK to create a reference `&T`, you _probably_ don't need this feature. 
You can simply use `&self` as your receiver type: + +```rust +impl ConcreteType { + fn some_method(&self) { + + } +} +``` -This feature is fundamentally aimed at smart pointer types `P` where it's not safe to create a reference `&T`. As noted above, that's most commonly because of semantic differences to pointers in other languages, but it might be because references have special meaning or behavior in some pure Rust domain. Either way, it's not OK to create a Rust reference `&T` or `&mut T`, yet we may want to allow methods to be called on some reference-like thing. +This feature is mostly aimed at smart pointer types `P` where it's not safe to create a reference `&T`. As noted above, that's most commonly because of semantic differences to pointers in other languages, but it might be because references have special meaning or behavior in some pure Rust domain. Either way, it's not OK to create a Rust reference `&T` or `&mut T`, yet we may want to allow methods to be called on some reference-like thing. -For this reason, implementing `Deref::deref` is problematic for _nearly everyone who wants to use arbitrary self types_. +For this reason, implementing `Deref::deref` is problematic for most of the likely users of this "arbitrary self types" feature. -If you're implementing a smart pointer `P` yet you can't allow a reference `&T` to exist, any option for implementing `Deref::deref` has drawbacks: +If you're implementing a smart pointer `P`, and you need to allow `impl T { fn method(self: P) { ... }}`, yet you can't allow a reference `&T` to exist, any option for implementing `Deref::deref` has drawbacks: * Specify `Deref::Target=T` and panic in `Deref::deref`. Not good. -* Specify `Deref::Target=*const T`. This works with the current arbitrary self types feature, but is only possible if your smart pointer type contains a `*const T` which you can reference - this isn't the case for (for instance) weak pointers or types containing `NonNull`. 
+* Specify `Deref::Target=*const T`. This is only possible if your smart pointer type contains a `*const T` which you can reference - this isn't the case for (for instance) weak pointers or types containing `NonNull`. -Therefore, the current Arbitrary Self Types v2 provides a separate `Receiver` trait. +Therefore, the current Arbitrary Self Types v2 provides a separate `Receiver` trait, so that there's no need to provide an awkward `Deref::deref` implementation. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation From ce4aa78d469a9862e2f9671a26c876172c40f968 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Fri, 3 Nov 2023 11:28:48 +0000 Subject: [PATCH 08/47] Explain that the example is abridged. Add a few more notes on safety. --- text/3519-arbitrary-self-types-v2.md | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 1479a8983d9..a6b1f3d7610 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -26,6 +26,8 @@ One use-case is cross-language interop (JavaScript, Python, C++), where other la #[derive(Clone)] /// A C++ reference. Obeys C++ reference semantics, not Rust reference semantics. /// There is no exclusivity; the underlying data may mutate, etc. +/// (This is an abridged example: a real CppRef type would fully document invariants +/// here.) pub struct CppRef { ptr: *const T, } @@ -40,20 +42,20 @@ struct ConcreteCppType { } // all generated by bindings generator; mostly calls into C++ +// In this example these are not marked "unsafe" because we do not dereference +// CppRef::ptr in Rust. However, arbitrary bad things may occur in foreign +// languages so such functions are not fully "safe" either. Safety of FFI +// is orthogonal to this RFC. 
impl ConcreteCppType { - fn some_cpp_method(self: CppRef) { - } - fn get_int_field(self: &CppRef) -> u32 { - } - fn get_more_complex_field(self: &CppRef) -> CppRef { - } - fn equals(self: &CppRef) -> bool { - } + fn some_cpp_method(self: CppRef) {} + fn get_int_field(self: &CppRef) -> u32 {} + fn get_more_complex_field(self: &CppRef) -> CppRef {} + fn equals(self: &CppRef) -> bool {} } // generated by bindings generator fn get_cpp_reference() -> CppRef { - // ... + // also calls into C++ } fn main() { @@ -464,6 +466,8 @@ fn main() { } ``` +Even if the reader takes the view that all calls into foreign languages are intrinsically unsafe and must be marked as such, hopefully the reader would support building abstractions using the Rust type system to minimize the practical risk of undefined behavior. That's what this RFC aims to enable. + # Prior art [prior-art]: #prior-art From d3ea0b39043a2ebf387067445fe16aa8a1b72e08 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Fri, 3 Nov 2023 11:57:48 +0000 Subject: [PATCH 09/47] Note method candidacy of &T and &mut T. --- text/3519-arbitrary-self-types-v2.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index a6b1f3d7610..78244160945 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -222,6 +222,8 @@ It is also implemented for `&T`, `&mut T`, `*const T` and `*mut T`. The existing Rust [reference section for method calls describes the algorithm for assembling method call candidates](https://doc.rust-lang.org/reference/expressions/method-call-expr.html). This algorithm changes in one simple way: instead of dereferencing types (using the `Deref`) trait, we use the new `Receiver` trait to determine the next step. +(Note that the existing algorithm isn't quite as simple as following the chain of `Deref`. 
In particular, `&T` and `&mut T` are considered as candidates too at each step; this RFC does not change that.) + Because a blanket implementation is provided for users of the `Deref` trait and for `&T`/`&mut T`, the net behavior is similar. But this provides the opportunity for types which can't implement `Deref` to act as method receivers. Dereferencing a raw pointer usually needs `unsafe` (for good reason!) but in this case, no actual dereferencing occurs. This is used only to determine a list of method candidates; no memory access is performed and thus no `unsafe` is needed. From 3f4493d85940515b366b00199db4c3ff5db67703 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Fri, 3 Nov 2023 12:03:56 +0000 Subject: [PATCH 10/47] Note existing behavior of recursion. --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 78244160945..08a67a40ace 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -178,7 +178,7 @@ impl Receiver for CustomPtr { ## Recursive arbitrary receivers -Receivers are recursive and therefore allowed to be nested. If type `T` implements `Receiver`, and type `U` implements `Receiver`, `T` is a valid receiver (and so on outward). +Receivers are recursive and therefore allowed to be nested. If type `T` implements `Receiver`, and type `U` implements `Receiver`, `T` is a valid receiver (and so on outward). This is the behavior for the current special-cased self types (`Pin`, `Box` etc.) so as we remove the special-casing we need to retain this property. For example, this self type is valid: From 4a42fa3283f0a4a16e6dc4524c4938a320db0e41 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Fri, 3 Nov 2023 14:12:13 +0000 Subject: [PATCH 11/47] Rewrite lifetime elision section. 
--- text/3519-arbitrary-self-types-v2.md | 42 ++++++++++++++++++++++------ 1 file changed, 33 insertions(+), 9 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 08a67a40ace..3f9f4296586 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -238,26 +238,50 @@ This RFC does not propose any changes to `DispatchFromDyn`. Since `DispatchFromD ## Lifetime elision -As discussed in the [motivation](#motivation), this new facility is _most likely_ to be used in cases where a standard reference can't normally be used. But in other cases a smart pointer self type might wrap a standard Rust reference, and thus might be parameterized by a lifetime. +Arbitrary `self` parameters may involve lifetimes. -Lifetime elision works in the expected fashion: +This RFC currently proposes no changes to the standard lifetime elision rules. There are no known cases where ambiguity results. However, the opposite problem does apply: sometimes explicit lifetimes are required for methods with an arbitrary self type, even if they're not required for an equivalent free function. 
```rust -struct SmartPtr<'a, T: ?Sized>(&'a T); +use std::ops::Receiver; -impl<'a, T: ?Sized> Receiver for SmartPtr<'a, T> { +struct SmartPtrByValue(T); + +impl Receiver for SmartPtrByValue { type Target = T; } -struct MyType; +struct SmartPtrByRef<'a, T: ?Sized>(&'a T); -impl MyType { - fn m(self: SmartPtr) {} - fn n(self: SmartPtr<'_, Self>) {} - fn o<'a>(self: SmartPtr<'a, Self>) {} +impl<'a, T: ?Sized> Receiver for SmartPtrByRef<'a, T> { + type Target = T; +} + +struct Concrete(u32); + +impl Concrete { + // n a(self: &SmartPtrByValue) -> &u32 { &self.0.0 } // does not compile + fn b<'a>(self: &'a SmartPtrByValue) -> &'a u32 { &self.0.0 } + // fn c(self: &SmartPtrB) -> &u32 {} // does not compile + fn d<'a, 'b>(self: &'a SmartPtrByRef<'b, Self>) -> &'a u32 { &self.0.0 } + fn e<'a>(self: &'a SmartPtrByRef) -> &'a u32 { &self.0.0 } +} + +fn free_function(param: &SmartPtrByValue) -> &u32 { &param.0.0 } + +fn main() { + let by_val = SmartPtrByValue(Concrete(14)); + assert_eq!(*by_val.b(), 14); + let concrete = Concrete(16); + let by_ref = SmartPtrByRef(&concrete); + assert_eq!(*by_ref.d(), 16); + assert_eq!(*by_ref.e(), 16); + assert_eq!(*free_function(by_val), 16); } ``` +In case `a` the lifetime could be elided (as demonstrated by `free_function`) yet an explicit lifetime is currently demanded by the compiler. For now, this extra clarity seems actually desirable. We could relax this restriction in future. (The authors of this RFC are interested in other views here!) + ## Diagnostics The existing branches in the compiler for "arbitrary self types" already emit excellent diagnostics. We will largely re-use them, with the following improvements: From 3577733ea952f790d5d2fd1da9c41a5fa72cdbf2 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Fri, 3 Nov 2023 14:18:54 +0000 Subject: [PATCH 12/47] A bit more on lifetimes. 
--- text/3519-arbitrary-self-types-v2.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 3f9f4296586..4bb2b7c2a20 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -260,11 +260,15 @@ impl<'a, T: ?Sized> Receiver for SmartPtrByRef<'a, T> { struct Concrete(u32); impl Concrete { - // n a(self: &SmartPtrByValue) -> &u32 { &self.0.0 } // does not compile + // fn a(self: &SmartPtrByValue) -> &u32 { &self.0.0 } // does not compile fn b<'a>(self: &'a SmartPtrByValue) -> &'a u32 { &self.0.0 } // fn c(self: &SmartPtrB) -> &u32 {} // does not compile fn d<'a, 'b>(self: &'a SmartPtrByRef<'b, Self>) -> &'a u32 { &self.0.0 } fn e<'a>(self: &'a SmartPtrByRef) -> &'a u32 { &self.0.0 } + fn f<'a>(self: &'_ SmartPtrByRef<'a, Self>) -> &'a u32 { &self.0.0 } + fn g<'a, 'b>(self: &'a SmartPtrByRef<'b, Self>) -> &'b u32 { &self.0.0 } + fn h<'a>(self: SmartPtrByRef<'a, Self>) -> &'a u32 { &self.0.0 } + // fn i(self: SmartPtrByRef) -> &u32 { &self.0.0 } // does not compile } fn free_function(param: &SmartPtrByValue) -> &u32 { &param.0.0 } @@ -274,8 +278,7 @@ fn main() { assert_eq!(*by_val.b(), 14); let concrete = Concrete(16); let by_ref = SmartPtrByRef(&concrete); - assert_eq!(*by_ref.d(), 16); - assert_eq!(*by_ref.e(), 16); + assert_eq!(*by_ref.d(), 16); // same for e, f, g, h assert_eq!(*free_function(by_val), 16); } ``` From e269c006f9ae7a1fe1b0d16ef23a123ed2350156 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 8 Nov 2023 11:58:29 +0000 Subject: [PATCH 13/47] De-emphasize illegality of `&T` Feedback on the RFC points out that there are still lots of valid use-cases for arbitrary self types where it's OK to create `&T`. Adjust emphasis in the RFC to account for this, and cite historical context. 
--- text/3519-arbitrary-self-types-v2.md | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 4bb2b7c2a20..80cdfd5bcd6 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -19,6 +19,10 @@ This RFC proposes some changes to the existing nightly feature based on the expe ## Motivation for the arbitrary self types feature overall +Originally, "arbitrary self types" was built to allow self types of `Pin<&mut Self>` and similar types as part of async Rust work. At that time, certain types - `Pin`, but also `Rc`, `Box` etc. - were hard coded such that they could be supported as self types. That's been sufficient for many use-cases including async Rust, but it's resulted in these built-in smart pointer types having greater powers (in stable Rust) than user-contributed smart pointers, an undesirable state. + +Since then, other use-cases have become clear where crates need to make their own smart pointer types with similar powers. + One use-case is cross-language interop (JavaScript, Python, C++), where other languages' references can’t guarantee the aliasing and exclusivity semantics required of a Rust reference. For example, the C++ `this` pointer can't be practically or safely represented as a Rust reference because C++ may retain other pointers to the data and it might mutate at any time. Yet, calling C++ methods intrinsically requires a `this` reference. With "arbitrary self types", smart pointer types can be created which obey foreign-language semantics and can be used in safe Rust code: ```rust @@ -114,19 +118,11 @@ fn main() { } ``` -However, if it's OK to create a reference `&T`, you _probably_ don't need this feature. 
You can simply use `&self` as your receiver type: - -```rust -impl ConcreteType { - fn some_method(&self) { - - } -} -``` +This works well for some smart pointer types where it's OK to create `&T` (but not necessarily `&mut T`). This includes `Pin` and the reference counted pointers. For that reason, the original arbitrary self types feature could be based around `Deref`. But in other smart pointer use-cases (especially those relating to foreign language semantics) it's not OK to create even `&T`. -This feature is mostly aimed at smart pointer types `P` where it's not safe to create a reference `&T`. As noted above, that's most commonly because of semantic differences to pointers in other languages, but it might be because references have special meaning or behavior in some pure Rust domain. Either way, it's not OK to create a Rust reference `&T` or `&mut T`, yet we may want to allow methods to be called on some reference-like thing. +The arbitrary self types feature should be enhanced so it works even when we can't allow `&T`. As noted above, that's most commonly because of semantic differences to pointers in other languages, but it might be because references have special meaning or behavior in some pure Rust domain. Either way, it may not be OK to create a Rust reference `&T`, yet we may want to allow methods to be called on some reference-like thing. -For this reason, implementing `Deref::deref` is problematic for most of the likely users of this "arbitrary self types" feature. +For this reason, implementing `Deref::deref` is problematic for many of the likely users of this "arbitrary self types" feature. If you're implementing a smart pointer `P`, and you need to allow `impl T { fn method(self: P) { ... 
}}`, yet you can't allow a reference `&T` to exist, any option for implementing `Deref::deref` has drawbacks: From 544f1afa7d488d6bb73a66de9c4229beb9ccc9e1 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 8 Nov 2023 12:13:30 +0000 Subject: [PATCH 14/47] Discuss recursion of traits. --- text/3519-arbitrary-self-types-v2.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 80cdfd5bcd6..48eea70d65d 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -184,6 +184,8 @@ impl MyType { } ``` +The Rust language doesn't provide a way for user code to explore this recursion, so this trait is unlikely to be useful except to the compiler. Nevertheless, we don't intend to _prevent_ use of the `Receiver` trait by user code: since the same recursive property applies to `Deref` yet it's been occasionally useful to [introduce `Deref` bounds](https://doc.rust-lang.org/std/pin/struct.Pin.html#method.new_unchecked). + # Reference-level explanation [reference-level-explanation]: #reference-level-explanation From fad04aee7e432acc31fb39464300debfa9abd244 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 8 Nov 2023 12:24:48 +0000 Subject: [PATCH 15/47] Ban generic receiver types. --- text/3519-arbitrary-self-types-v2.md | 35 ++++++++-------------------- 1 file changed, 10 insertions(+), 25 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 48eea70d65d..e8ab483f720 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -131,6 +131,8 @@ If you're implementing a smart pointer `P`, and you need to allow `impl T { f Therefore, the current Arbitrary Self Types v2 provides a separate `Receiver` trait, so that there's no need to provide an awkward `Deref::deref` implementation. 
+In addition, this v2 proposes to block generic receivers, which are currently allowed by the v1 (unstable) arbitrary self types feature. See the [diagnostics section for reasoning](#diagnostics). + # Guide-level explanation [guide-level-explanation]: #guide-level-explanation @@ -284,37 +286,20 @@ fn main() { In case `a` the lifetime could be elided (as demonstrated by `free_function`) yet an explicit lifetime is currently demanded by the compiler. For now, this extra clarity seems actually desirable. We could relax this restriction in future. (The authors of this RFC are interested in other views here!) ## Diagnostics +[diagnostics]: #diagnostics The existing branches in the compiler for "arbitrary self types" already emit excellent diagnostics. We will largely re-use them, with the following improvements: - In the case where a self type is invalid because it doesn't implement `Receiver`, the existing excellent error message will be updated. - An easy mistake is to implement `Receiver` for `P`, forgetting to specify `T: ?Sized`. `P` then only works as a `self` parameter in traits `where Self: Sized`, an unusual stipulation. It's not obvious that `Sized`ness is the problem here, so we will identify this case specifically and produce an error giving that hint. - There are certain types which feel like they "should" implement `Receiver` but do not: `Weak` and `NotNull`. If these are encountered as a self type, we should produce a specific diagnostic explaining that they do not implement `Receiver` and suggesting that they could be wrapped in a newtype wrapper if method calls are important. We hope this can be achieved with [diagnostic items](https://rustc-dev-guide.rust-lang.org/diagnostics/diagnostic-items.html). -- Under some circumstances, the compiler identifies method candidates but then discovers that the self type doesn't match. This results currently in a simple "mismatched types" error; we can provide a more specific error message here. 
The only known case is where a method is generic over `Receiver`, and the caller explicitly specifies the wrong type: - ```rust - #![feature(receiver_trait)] - - use std::ops::Receiver; - - struct SmartPtr<'a, T: ?Sized>(&'a T); - - impl<'a, T: ?Sized> Receiver for SmartPtr<'a, T> { - type Target = T; - } - - struct Foo(u32); - impl Foo { - fn a>(self: R) { } - } - - fn main() { - let foo = Foo(1); - let smart_ptr = SmartPtr(&foo); - smart_ptr.a(); // this compiles - smart_ptr.a::<&Foo>(); // currently results in "mismatched types"; we can probably do better - } - ``` -- If a method `m` is generic over `R: Receiver` (or, perhaps more commonly, `R: Deref`) and `self: R`, then someone calls it with `object_by_value.m()`, it won't work because Rust doesn't know to use `&object_by_value`, and the message `the trait bound Foo: 'Receiver/Deref' is not satisfied` is generated. While correct, this may be surprising because users expect to be able to use `object_by_value.m2()` where `fn m2(&self)`. The resulting error message already suggests that the user create a reference in order to match the `Receiver` trait, so this may be sufficient already, but we may add an additional note here. +- The current unstable arbitrary self types feature allows generic receivers. For instance, + ```rust + impl Foo { + fn a>(self: R) { } + } + ``` + We don't know a use-case for this. There are several cases where this can result in misleading diagnostics. (For instance, if such a method is called with an incorrect type (for example `smart_ptr.a::<&Foo>()` instead of `smart_ptr.a::()`). We could attempt to find and fix all those cases. However, we feel that generic receiver types might risk subtle interactions with method resolutions and other parts of the language. We think it is a safer choice to generate an error on any declaration of a generic `self` type. 
# Drawbacks [drawbacks]: #drawbacks From 5e63e83facaedfc34af898b3697cba8da2b8f889 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 8 Nov 2023 12:51:14 +0000 Subject: [PATCH 16/47] Tiny clarification. --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index e8ab483f720..428988acc18 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -19,7 +19,7 @@ This RFC proposes some changes to the existing nightly feature based on the expe ## Motivation for the arbitrary self types feature overall -Originally, "arbitrary self types" was built to allow self types of `Pin<&mut Self>` and similar types as part of async Rust work. At that time, certain types - `Pin`, but also `Rc`, `Box` etc. - were hard coded such that they could be supported as self types. That's been sufficient for many use-cases including async Rust, but it's resulted in these built-in smart pointer types having greater powers (in stable Rust) than user-contributed smart pointers, an undesirable state. +Originally, "arbitrary self types" was built to allow self types of `Pin<&mut Self>` and similar types as part of async Rust work. At that time, certain types - `Pin`, but also `Rc`, `Box` etc. - became hard coded such that they could be supported as self types. That's been sufficient for many use-cases including async Rust, but it's resulted in these built-in smart pointer types having greater powers (in stable Rust) than user-contributed smart pointers, an undesirable state. Since then, other use-cases have become clear where crates need to make their own smart pointer types with similar powers. From 69b1edfc07a8926e7a60638a7a9cb336f2cb86ef Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 8 Nov 2023 13:00:05 +0000 Subject: [PATCH 17/47] Emphasize that interop cases are about automation. 
Based on feedback on the RFC, we need to better emphasize that foreign language interop cases here are mostly about the automatic generation of a foreign language pointer or reference type. --- text/3519-arbitrary-self-types-v2.md | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 428988acc18..9d93242d325 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -23,7 +23,15 @@ Originally, "arbitrary self types" was built to allow self types of `Pin<&mut Se Since then, other use-cases have become clear where crates need to make their own smart pointer types with similar powers. -One use-case is cross-language interop (JavaScript, Python, C++), where other languages' references can’t guarantee the aliasing and exclusivity semantics required of a Rust reference. For example, the C++ `this` pointer can't be practically or safely represented as a Rust reference because C++ may retain other pointers to the data and it might mutate at any time. Yet, calling C++ methods intrinsically requires a `this` reference. With "arbitrary self types", smart pointer types can be created which obey foreign-language semantics and can be used in safe Rust code: +One use-case is cross-language interop (JavaScript, Python, C++). In many cases, automatic code generation tools need to represent foreign language pointers or references somehow in Rust, and often, we want to call methods on such types. But, other languages' references can’t guarantee the aliasing and exclusivity semantics required of a Rust reference. For example, the C++ `this` pointer can't be practically or safely represented as a Rust reference because C++ may retain other pointers to the data and it might mutate at any time. + +What is a code generator to do? 
Its options in current stable Rust are poor: + +* It can represent foreign pointers/references as `&T`, with a virtual certainty of undefined behavior due to different guarantees in different languages +* It can represent foreign pointers/references as `*const T` or `*mut T` but can't attach methods. +* It can represent foreign pointers/references as a smart pointer type (`CppRef` or `CppPtr`) but can't attach methods. + + With "arbitrary self types", smart pointer types can be created which obey foreign-language semantics and yet allow method calls: ```rust #[repr(transparent)] @@ -63,7 +71,7 @@ fn get_cpp_reference() -> CppRef { } fn main() { - // Safe Rust code manipulating C++ objects via C++-semantics references + // Rust code manipulating C++ objects via C++-semantics references let cpp_obj_reference: CppRef = get_cpp_reference(); // cpp_obj_reference does not obey Rust reference semantics. Other // "references" to the same data may exist in the Rust or C++ domain. From 72095869a1ee8196728ae3b0482ac9f90665e072 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 8 Nov 2023 13:40:46 +0000 Subject: [PATCH 18/47] Simplify introduction of motivation a little. --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 9d93242d325..30df11cfb09 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -19,7 +19,7 @@ This RFC proposes some changes to the existing nightly feature based on the expe ## Motivation for the arbitrary self types feature overall -Originally, "arbitrary self types" was built to allow self types of `Pin<&mut Self>` and similar types as part of async Rust work. At that time, certain types - `Pin`, but also `Rc`, `Box` etc. - became hard coded such that they could be supported as self types. 
That's been sufficient for many use-cases including async Rust, but it's resulted in these built-in smart pointer types having greater powers (in stable Rust) than user-contributed smart pointers, an undesirable state. +The Rust async work identified a need to allow `self` types of `Pin<&mut Self>` (and similar). At that time, certain types - `Pin`, `Rc`, `Box` etc. - became hard coded in stable Rust as valid `self` types. That's been sufficient for many use-cases including async Rust, but this special power is currently restricted to these hard-coded types. Since then, other use-cases have become clear where crates need to make their own smart pointer types with similar powers. From 1e97685d51c1c52e25496a76afc47b3f1d7ba130 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 15 Nov 2023 18:03:41 +0000 Subject: [PATCH 19/47] Update text/3519-arbitrary-self-types-v2.md Co-authored-by: Mads Marquart --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 30df11cfb09..00fe033284b 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -79,7 +79,7 @@ fn main() { // through Rust back into C++ let some_value: u32 = cpp_pbj_refence.get_int_field(); let some_field = cpp_obj_reference.get_more_complex_field(); - cpp_obj_reference.compare_with(&get_cpp_reference()); + cpp_obj_reference.equals(&get_cpp_reference()); } ``` From 476d833ef9c40a70031126924e633b8df46714f0 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 15 Nov 2023 21:16:03 +0000 Subject: [PATCH 20/47] Update text/3519-arbitrary-self-types-v2.md Co-authored-by: Matthew House --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 00fe033284b..a99c6f0906a 100644 --- a/text/3519-arbitrary-self-types-v2.md 
+++ b/text/3519-arbitrary-self-types-v2.md @@ -77,7 +77,7 @@ fn main() { // "references" to the same data may exist in the Rust or C++ domain. // But it can effectively be used as an opaque token to pass safely // through Rust back into C++ - let some_value: u32 = cpp_pbj_refence.get_int_field(); + let some_value: u32 = cpp_obj_reference.get_int_field(); let some_field = cpp_obj_reference.get_more_complex_field(); cpp_obj_reference.equals(&get_cpp_reference()); } From 769c31bbe859b1ed102b28aad36aa4c91bc21a2a Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 16 Nov 2023 09:23:05 +0000 Subject: [PATCH 21/47] Update text/3519-arbitrary-self-types-v2.md Co-authored-by: Mads Marquart --- text/3519-arbitrary-self-types-v2.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index a99c6f0906a..5ef02e69b56 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -320,14 +320,16 @@ Why should we *not* do this? ## Method shadowing [method-shadowing]: #method-shadowing -For a smart pointer `P`, a method call `p.m()` might call a method on the smart pointer type itself (`P::m`), or, if the smart pointer implements `Deref`, it might already call `T::m`. This already gives the possibility that `T::m` would be shadowed by `P::m`. +For a smart pointer `P` that implements `Deref`, a method call `p.m()` might call a method `P::m` on the smart pointer type itself, or it might call `T::m`. If both methods are declared, this results in an error. -Current Rust standard library smart pointers are designed with this shadowing behavior in mind: +Rust standard library smart pointers are designed with this shadowing behavior in mind: -* `Box`, `Pin`, `Rc` and `Arc` already heavily use associated functions rather than methods -* Where they use methods, it's often with the _intention_ of shadowing a method in the inner type (e.g. 
`Arc::clone`) +* `Box`, `Pin`, `Rc` and `Arc` heavily use associated functions rather than methods. +* Where they use methods, it's often with the _intention_ of shadowing a method in the inner type (e.g. `Arc::clone`). -These method shadowing risks are effectively the same for `Deref` and `Receiver`. This RFC does not make things worse (it just adds additional flexibility to the `self` parameter type for `T::m`). However it does mean that the `Receiver` trait cannot be added to smart pointer types which were not designed with these concerns in mind. +Furthermore, the `Deref` trait itself [documents this possible compatibility hazard](https://doc.rust-lang.org/nightly/std/ops/trait.Deref.html#when-to-implement-deref-or-derefmut), and the Rust API Guidelines has [a guideline about avoiding inherent methods on smart pointers](https://rust-lang.github.io/api-guidelines/predictability.html#smart-pointers-do-not-add-inherent-methods-c-smart-ptr). + +These method shadowing risks also apply to `Receiver`. This RFC does not make things worse for types that implement `Deref`, it only adds additional flexibility to the `self` parameter type for `T::m`. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives From f51948c4169cb30dec56bcf0ecaf7a4c2d5118a9 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Tue, 21 Nov 2023 17:15:40 +0000 Subject: [PATCH 22/47] Clarify recursion point. --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 5ef02e69b56..e6e65b04061 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -194,7 +194,7 @@ impl MyType { } ``` -The Rust language doesn't provide a way for user code to explore this recursion, so this trait is unlikely to be useful except to the compiler. 
Nevertheless, we don't intend to _prevent_ use of the `Receiver` trait by user code: since the same recursive property applies to `Deref` yet it's been occasionally useful to [introduce `Deref` bounds](https://doc.rust-lang.org/std/pin/struct.Pin.html#method.new_unchecked). +The Rust language doesn't provide a way for user code to use this recursive property in generics or iteration, so this trait is unlikely to be useful except to the compiler. Nevertheless, we don't intend to _prevent_ use of the `Receiver` trait by user code: since the same recursive property applies to `Deref` yet it's been occasionally useful to [introduce `Deref` bounds](https://doc.rust-lang.org/std/pin/struct.Pin.html#method.new_unchecked). # Reference-level explanation [reference-level-explanation]: #reference-level-explanation From d9f2316d65ba1e7f2192c23cbe7fed70e4c41b18 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 22 Nov 2023 12:52:53 +0000 Subject: [PATCH 23/47] Emphasize *const Self as receiver. As suggested by @nikomatsakis. --- text/3519-arbitrary-self-types-v2.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index e6e65b04061..8d4295b8464 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -93,6 +93,8 @@ In theory, users can define their own smart pointers. In practice, they're secon This RFC proposes to loosen this restriction to allow custom smart pointer types to be accepted as a `self` type just like for the standard library types. +The current unstable `arbitrary_self_type` feature also allows raw pointers (e.g. `*const Self`) to be a method receiver. This is highly beneficial for unsafe code where the semantics of a reference cannot be guaranteed. + See also [this blog post](https://medium.com/@adetaylor/the-case-for-stabilizing-arbitrary-self-types-b07bab22bb45), especially for a list of more specific use-cases. 
## Motivation for the v2 changes @@ -141,6 +143,8 @@ Therefore, the current Arbitrary Self Types v2 provides a separate `Receiver` tr In addition, this v2 proposes to block generic receivers, which are currently allowed by the v1 (unstable) arbitrary self types feature. See the [diagnostics section for reasoning](#diagnostics). +Aside from these differences, Arbitrary Self Types v2 is similar to the existing unstable `arbitrary_self_types` feature, including in its support for raw pointers as method receivers. + # Guide-level explanation [guide-level-explanation]: #guide-level-explanation From 750ec48f9e556cb095229ac0f0477b389ab71022 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 22 Nov 2023 13:04:39 +0000 Subject: [PATCH 24/47] Rewrite Lifetime Elision section. --- text/3519-arbitrary-self-types-v2.md | 48 ++++------------------------ 1 file changed, 7 insertions(+), 41 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 8d4295b8464..0d8d6447558 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -252,50 +252,16 @@ This RFC does not propose any changes to `DispatchFromDyn`. Since `DispatchFromD Arbitrary `self` parameters may involve lifetimes. -This RFC currently proposes no changes to the standard lifetime elision rules. There are no known cases where ambiguity results. However, the opposite problem does apply: sometimes explicit lifetimes are required for methods with an arbitrary self type, even if they're not required for an equivalent free function. +Even in existing stable Rust, there are [bugs in lifetime elision for complex `Self` types such as `&Box`](https://github.com/rust-lang/rust/issues/117715). We're aiming to fix them whether or not this RFC is accepted. 
The net rules will be: -```rust -use std::ops::Receiver; - -struct SmartPtrByValue(T); - -impl Receiver for SmartPtrByValue { - type Target = T; -} - -struct SmartPtrByRef<'a, T: ?Sized>(&'a T); +* If a parameter is the first parameter, and +* Called `self`, and +* Its type involves `Self` anywhere, and +* Its type contains _exactly one_ lifetime anywhere -impl<'a, T: ?Sized> Receiver for SmartPtrByRef<'a, T> { - type Target = T; -} - -struct Concrete(u32); - -impl Concrete { - // fn a(self: &SmartPtrByValue) -> &u32 { &self.0.0 } // does not compile - fn b<'a>(self: &'a SmartPtrByValue) -> &'a u32 { &self.0.0 } - // fn c(self: &SmartPtrB) -> &u32 {} // does not compile - fn d<'a, 'b>(self: &'a SmartPtrByRef<'b, Self>) -> &'a u32 { &self.0.0 } - fn e<'a>(self: &'a SmartPtrByRef) -> &'a u32 { &self.0.0 } - fn f<'a>(self: &'_ SmartPtrByRef<'a, Self>) -> &'a u32 { &self.0.0 } - fn g<'a, 'b>(self: &'a SmartPtrByRef<'b, Self>) -> &'b u32 { &self.0.0 } - fn h<'a>(self: SmartPtrByRef<'a, Self>) -> &'a u32 { &self.0.0 } - // fn i(self: SmartPtrByRef) -> &u32 { &self.0.0 } // does not compile -} - -fn free_function(param: &SmartPtrByValue) -> &u32 { &param.0.0 } - -fn main() { - let by_val = SmartPtrByValue(Concrete(14)); - assert_eq!(*by_val.b(), 14); - let concrete = Concrete(16); - let by_ref = SmartPtrByRef(&concrete); - assert_eq!(*by_ref.d(), 16); // same for e, f, g, h - assert_eq!(*free_function(by_val), 16); -} -``` +then that lifetime may be used to elide lifetimes on return types, and will take precedence over any lifetimes in other parameters. -In case `a` the lifetime could be elided (as demonstrated by `free_function`) yet an explicit lifetime is currently demanded by the compiler. For now, this extra clarity seems actually desirable. We could relax this restriction in future. (The authors of this RFC are interested in other views here!)
+If this seems wrong, please discuss this over on [the linked bug](https://github.com/rust-lang/rust/issues/117715) rather than here in this RFC, because none of that should change with this RFC (though it does make it more likely users will run into the current inconsistencies). We'll try to keep this RFC up to date with the outcome of those discussions. ## Diagnostics [diagnostics]: #diagnostics From 0f6c0f62f6904865101c3e0b6d7ac25751f2f6b5 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 22 Nov 2023 13:11:19 +0000 Subject: [PATCH 25/47] Remove out-of-place paragraph. --- text/3519-arbitrary-self-types-v2.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 0d8d6447558..9760c573bf2 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -238,8 +238,6 @@ The existing Rust [reference section for method calls describes the algorithm fo Because a blanket implementation is provided for users of the `Deref` trait and for `&T`/`&mut T`, the net behavior is similar. But this provides the opportunity for types which can't implement `Deref` to act as method receivers. -Dereferencing a raw pointer usually needs `unsafe` (for good reason!) but in this case, no actual dereferencing occurs. This is used only to determine a list of method candidates; no memory access is performed and thus no `unsafe` is needed. - ## Object safety Receivers are object safe if they implement the (unstable) `core::ops::DispatchFromDyn` trait. 
From 1859f953ada29f83dbc6651077bea735d7966556 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 22 Nov 2023 17:16:08 +0000 Subject: [PATCH 26/47] Update text/3519-arbitrary-self-types-v2.md Co-authored-by: Josh Triplett --- text/3519-arbitrary-self-types-v2.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 9760c573bf2..7615d7b1ac6 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -54,10 +54,10 @@ struct ConcreteCppType { } // all generated by bindings generator; mostly calls into C++ -// In this example these are not marked "unsafe" because we do not dereference -// CppRef::ptr in Rust. However, arbitrary bad things may occur in foreign -// languages so such functions are not fully "safe" either. Safety of FFI -// is orthogonal to this RFC. +// In this example these are not marked "unsafe" because we do not directly use +// CppRef::ptr in Rust. This example assumes that the corresponding C++ functions +// do not themselves have unsafe behavior and thus can be presented to Rust as safe. +// Safety of FFI is orthogonal to this RFC. 
impl ConcreteCppType { fn some_cpp_method(self: CppRef) {} fn get_int_field(self: &CppRef) -> u32 {} From 156c5355888f57377f84cf9f3225cc6674197346 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 22 Nov 2023 17:24:09 +0000 Subject: [PATCH 27/47] Update text/3519-arbitrary-self-types-v2.md Co-authored-by: Josh Triplett --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 7615d7b1ac6..f51d5e71b2c 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -87,7 +87,7 @@ fn main() { Another case is when the existence of a reference is, itself, semantically important — for example, reference counting, or if relayout of a UI should occur each time a mutable reference ceases to exist. In these cases it's not OK to allow a regular Rust reference to exist, and yet sometimes we still want to be able to call methods on a reference-like thing. -A third motivation is that taking smart pointer types as `self` parameters can enable functions to act on the smart pointer type, not just the underlying data. For example, taking `&Arc` allows the functions to both clone the smart pointer (noting that the underlying `T` might not implement `Clone`) in addition to access the data inside the type, which is useful for some methods. Also, being able to change a method from accepting `&self` to `self: &Arc` can be done in a mostly frictionless way, whereas changing from `&self` to a static method accepting `&Arc` will always require some amount of refactoring. These options are currently open only to Rust's built-in smart pointer types, not to custom smart pointer types. +A third motivation is that taking smart pointer types as `self` parameters can enable functions to act on the smart pointer type, not just the underlying data. 
For example, taking `&Arc` allows the functions to both clone the smart pointer (noting that the underlying `T` might not implement `Clone`) in addition to access the data inside the type, which is useful for some methods; this also makes it ergonomic in more cases to make `Arc` explicit rather than having `SomeType` contain an `Arc` internally and have `Arc`-like `clone` semantics. Also, being able to change a method from accepting `&self` to `self: &Arc` can be done in a mostly frictionless way, whereas changing from `&self` to a static method accepting `&Arc` will always require some amount of refactoring. These options are currently open only to Rust's built-in smart pointer types, not to custom smart pointer types. In theory, users can define their own smart pointers. In practice, they're second-class citizens compared to the smart pointers in Rust's standard library. A type `T` can accept method calls using smart pointers as the `self` type only if they're one of Rust's built-in smart pointers. From 4b8ba42922edc5992293ef29ce613d0d41265f74 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 22 Nov 2023 17:24:26 +0000 Subject: [PATCH 28/47] Update text/3519-arbitrary-self-types-v2.md Co-authored-by: Josh Triplett --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index f51d5e71b2c..45ce59bbc71 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -150,7 +150,7 @@ Aside from these differences, Arbitrary Self Types v2 is similar to the existing When declaring a method, users can declare the type of the `self` receiver to be any type `T` where `T: Receiver` or `Self`. 
-The `Receiver` trait is simple and only requires to specify the `Target` type: +The `Receiver` trait is simple and only requires specifying the `Target` type: ```rust trait Receiver { From 12eac0bc45f0d064ef9914bb91c76d2c5ed61c58 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 22 Nov 2023 17:29:49 +0000 Subject: [PATCH 29/47] Update text/3519-arbitrary-self-types-v2.md Co-authored-by: Josh Triplett --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 45ce59bbc71..7d6f778692c 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -159,7 +159,7 @@ trait Receiver { ``` The `Receiver` trait is already implemented for many standard library types: -- smart pointers in the standard library: `Rc`, `Arc`, `Box`, and `Pin>` (and in fact, any type which implements `Deref`) +- smart pointers in the standard library: `Rc`, `Arc`, `Box`, and `Pin>` (and in fact, any type which implements `Deref`) - references: `&Self` and `&mut Self` - pointers: `*const Self` and `*mut Self` From b962d2d5229fbb34a937d095d12758f6ffe4478e Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 23 Nov 2023 12:46:56 +0000 Subject: [PATCH 30/47] Update text/3519-arbitrary-self-types-v2.md Co-authored-by: Josh Triplett --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 7d6f778692c..e2c688694c0 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -148,7 +148,7 @@ Aside from these differences, Arbitrary Self Types v2 is similar to the existing # Guide-level explanation [guide-level-explanation]: #guide-level-explanation -When declaring a method, users can declare the type of the `self` receiver to be any type `T` where `T: Receiver` or `Self`. 
+When declaring a method, users can also declare the type of the `self` receiver to be any type `T` where `T: Receiver`, in addition to using `Self` by value or reference. The `Receiver` trait is simple and only requires specifying the `Target` type: From 8b2b6df1064b3b179fe51b95a5d6e06fa028a12b Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 23 Nov 2023 13:22:55 +0000 Subject: [PATCH 31/47] Update text/3519-arbitrary-self-types-v2.md Co-authored-by: Johann Hemmann --- text/3519-arbitrary-self-types-v2.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index e2c688694c0..e6c8d7553b0 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -218,11 +218,11 @@ pub trait Receiver { A blanket implementation is provided for any type that implements `Deref`: ```rust -impl Receiver for P +impl Receiver for P where - P: Deref, + P: Deref, { - type Target = T; + type Target =

<P as Deref>::Target; } ``` From 113c8bd08f600b577ec2294d944ad4d56ea33dbc Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 21 Dec 2023 17:22:52 +0000 Subject: [PATCH 32/47] Update reference & deshadowing. This changes the RFC in two significant ways: * As requested widely, it now proposes that we implement Receiver for NonNull and Weak. This requires us, for the first time, to add explicit code to spot potentially shadowed methods and avoid such shadowing. This is described. * It expands the Reference section to describe changes to the probing algorithm, which are now a little more extensive than the previous version of the RFC described, because we now search two different chains - one for types into which the receiver can be converted, and another chain for locations to search for possible methods. --- text/3519-arbitrary-self-types-v2.md | 89 ++++++++++++++++++++++++---- 1 file changed, 79 insertions(+), 10 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index e6c8d7553b0..4c5ffc4d573 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -228,15 +228,46 @@ where (See [alternatives](#no-blanket-implementation) for discussion of the tradeoffs here.) -It is also implemented for `&T`, `&mut T`, `*const T` and `*mut T`. +It is also implemented for `&T`, `&mut T`, `Weak`, `NonNull`, `*const T` and `*mut T`. -## Compiler changes +## Compiler changes: method probing -The existing Rust [reference section for method calls describes the algorithm for assembling method call candidates](https://doc.rust-lang.org/reference/expressions/method-call-expr.html). This algorithm changes in one simple way: instead of dereferencing types (using the `Deref`) trait, we use the new `Receiver` trait to determine the next step.
+The existing Rust [reference section for method calls describes the algorithm for assembling method call candidates](https://doc.rust-lang.org/reference/expressions/method-call-expr.html), and there's more detail in the [rustc dev guide](https://rustc-dev-guide.rust-lang.org/method-lookup.html). -(Note that the existing algorithm isn't quite as simple as following the chain of `Deref`. In particular, `&T` and `&mut T` are considered as candidates too at each step; this RFC does not change that.) +The key part of the first page is this: -Because a blanket implementation is provided for users of the `Deref` trait and for `&T`/`&mut T`, the net behavior is similar. But this provides the opportunity for types which can't implement `Deref` to act as method receivers. +> Then, for each candidate type `T`, search for a visible method with a receiver of that type in the following places: +> - `T`'s inherent methods (methods implemented directly on `T`). +> Any of the methods provided by a visible trait implemented by `T`. + +This changes. + +The list of candidate types is assembled in exactly the same way, but we now search for a visible method with a receiver of that type in _more_ places. + +Specifically, instead of using the list of candidate types assembled using the `Deref` trait, we search a list assembled using the `Receiver` trait. As `Receiver` is implemented for all types that implement `Deref`, this is a longer list. + +It's particularly important to emphasize that the list of candidate receiver types _does not change_ - that's still assembled using the `Deref` trait just as now. But, a wider set of locations is searched for methods with those receiver types. + +For instance, `Weak` implements `Receiver` but not `Deref`. Imagine you have `let t: Weak = /* obtain */; t.some_method();`. We will now search `impl SomeStruct {}` blocks for an implementation of `fn some_method(self: Weak)`, `fn some_method(self: &Weak)`, etc. 
The possible self types are unchanged - they're still obtained by searching the `Deref` chain for `t` - but we'll look in more places for methods with those valid `self` types. + +## Compiler changes: deshadowing +[compiler-changes-deshadowing]: #compiler-changes-deshadowing + +The major functional change to the compiler is described above, but a couple of extra adjustments are necessary to avoid future compatibility breaks by method shadowing. + +Specifically, that page also states: + +> If this results in multiple possible candidates, then it is an error, and the receiver must be converted to an appropriate receiver type to make the method call. + +This changes. For smart pointer types which implement `Receiver` (such as `NonNull`) the future addition of any method would become an incompatible change, because it would run the risk of this ambiguity if there were a method of the same name within `T`. So, if there are multiple candidates, and if one of those candidates is in a more "nested" level of receiver than the others (that is, further along the chain of `Receiver`), we will choose that candidate and warn instead of producing a fatal error. + +Similarly, + +> Note: the lookup is done for each type in order, which can occasionally lead to surprising results. + +This changes too, for the same reason. We check for matching candidates for `T`, `&T` and `&mut T`, and again, if there's a candidate on an "inner" type (that, is, further along the chain of `Receiver`) we will choose that type in preference to less nested types and emit a warning. + +(The current reference doesn't describe it, but the current algorithm also searches for method receivers of type `*const Self` and handles them explicitly in case the receiver type was `*mut Self`. We do not check for cases where a new `self: *mut Self` method on an outer type might shadow an existing `self: *const SomePtr` method on an inner type. 
Although this is a theoretical risk, such compatibility breaks should be easy to avoid because `self: *mut Self` are rare. It's not readily possible to apply the same de-shadowing approach to these, because we already intentionally shadow `*const::cast` with `*mut::cast`.) ## Object safety @@ -276,6 +307,11 @@ The existing branches in the compiler for "arbitrary self types" already emit ex } ``` We don't know a use-case for this. There are several cases where this can result in misleading diagnostics. (For instance, if such a method is called with an incorrect type (for example `smart_ptr.a::<&Foo>()` instead of `smart_ptr.a::()`). We could attempt to find and fix all those cases. However, we feel that generic receiver types might risk subtle interactions with method resolutions and other parts of the language. We think it is a safer choice to generate an error on any declaration of a generic `self` type. +- As noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing) we will downgrade an existing error to a warning if there are multiple + method candidates found, if one of those candidates is further along the chain of `Receiver`s than the others. +- As also noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing), we will produce a new warning if a method in an inner type is chosen + in preference to a method in an outer type ("inner" = further along the `Receiver` chain) and the inner type is either `self: &T` or `self: &mut T` and we're choosing it + in preference to `self: T` or `self: &T` in the outer type. 
# Drawbacks [drawbacks]: #drawbacks @@ -297,7 +333,44 @@ Rust standard library smart pointers are designed with this shadowing behavior i Furthermore, the `Deref` trait itself [documents this possible compatibility hazard](https://doc.rust-lang.org/nightly/std/ops/trait.Deref.html#when-to-implement-deref-or-derefmut), and the Rust API Guidelines has [a guideline about avoiding inherent methods on smart pointers](https://rust-lang.github.io/api-guidelines/predictability.html#smart-pointers-do-not-add-inherent-methods-c-smart-ptr). -These method shadowing risks also apply to `Receiver`. This RFC does not make things worse for types that implement `Deref`, it only adds additional flexibility to the `self` parameter type for `T::m`. +This RFC does not make things worse for types that implement `Deref`. + +_However_, this RFC allow types to implement `Receiver`, and in fact does so for `NonNull` and `Weak`. `NonNull` and `Weak` were not designed with method shadowing concerns in mind. This would run the risk of breakage: + +```rust +struct Concrete; + +impl Concrete { + fn wardrobe(self: Weak) { } +} + +fn main() { + let concrete: Weak = /* obtain */; + concrete.wardrobe() +} +``` + +If Rust now adds `Weak::wardrobe(self)`, the above valid code would start to error. + +The same would apply in this slightly different circumstance: + +```rust +struct Concrete; + +impl Concrete { + fn wardrobe(self: &Weak) { } // this is now a reference +} + +fn main() { + let concrete: Weak = /* obtain */; + concrete.wardrobe() +} +``` + +If Rust added `Weak::wardrobe(&self)` we would start to produce an error here. If Rust added `Weak::wardrobe(self)` then it would be +even worse - code would start to call `Weak::wardrobe` where it had previously called `Concrete::wardrobe`. + +The [#compiler-changes-deshadowing](deshadowing section of the compiler changes, above), describes how we avoid this. The compiler will take pains to identify any such ambiguities. 
If it finds them, it will warn of the situation and then choose the innermost method (in the example above, always `Concrete::wardrobe`). # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -367,10 +440,6 @@ This RFC proposes to implement `Receiver` for `*mut T` and `*const T` within the We prefer the option of specifying behavior in the library using the normal trait, though it's a compatibility break for users of Rust who don't adopt the `core` crate (including compiler tests). -## Implement for `Weak` and `NonNull` - -`Weak` and `NonNull` were not supported by the prior unstable arbitrary self types support, but they share the property that it may be desirable to implement method calls to `T` using them as self types. Unfortunately they also share the property that these types have many Rust methods using `self`, `&self` or `&mut self`. If we added to the set of Rust methods in future, we'd [shadow any such method calls](#method-shadowing). We can't implement `Receiver` for these types unless we come up with a policy that all subsequent additions to these types would instead be associated functions. That would make the future APIs for these types a confusing mismash of methods and associated functions, and the extra user complexity doesn't seem merited. - ## Not do it [not-do-it]: #not-do-it From 9c593663115361fb50e4088312ca1c668fa687cc Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Mon, 12 Feb 2024 16:58:39 +0000 Subject: [PATCH 33/47] Include example lint. 
--- text/3519-arbitrary-self-types-v2.md | 31 +++++++++++++++++++++++++--- 1 file changed, 28 insertions(+), 3 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 4c5ffc4d573..75b2dc27a44 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -309,9 +309,34 @@ The existing branches in the compiler for "arbitrary self types" already emit ex We don't know a use-case for this. There are several cases where this can result in misleading diagnostics. (For instance, if such a method is called with an incorrect type (for example `smart_ptr.a::<&Foo>()` instead of `smart_ptr.a::()`). We could attempt to find and fix all those cases. However, we feel that generic receiver types might risk subtle interactions with method resolutions and other parts of the language. We think it is a safer choice to generate an error on any declaration of a generic `self` type. - As noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing) we will downgrade an existing error to a warning if there are multiple method candidates found, if one of those candidates is further along the chain of `Receiver`s than the others. -- As also noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing), we will produce a new warning if a method in an inner type is chosen - in preference to a method in an outer type ("inner" = further along the `Receiver` chain) and the inner type is either `self: &T` or `self: &mut T` and we're choosing it - in preference to `self: T` or `self: &T` in the outer type. 
+- As also noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing), we will produce a new warning if a method in an inner type is chosen in preference to a method in an outer type ("inner" = further along the `Receiver` chain) and the inner type is either `self: &T` or `self: &mut T` and we're choosing it in preference to `self: T` or `self: &T` in the outer type. An example warning might be: + + ``` + warning[W0666]: ambiguous function call + --> src/main.rs:13:4 + | + 13 | orbit_weak.retrograde(); + | ^^^^^^^^^^^^ + | + = note: you may have intended a call to `Orbit::retrograde` or + to `Weak::retrograde` + = note: this method won't be called + --> src/rc/rc.rs:136:21 + | + 136 | fn retrograde(&self) { + | ^^^^^^^^^^^^^^^^^ + | + = note: because we'll call this method instead + --> src/space/near_earth.rs:357:68 + | + 357 | fn retrograde(self: Weak) { + | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + | + = help: call as a function not a method: + ~ Orbit::retrograde(orbit_weak) + = help: call as a function not a method: + ~ Weak::retrograde(orbit_weak) + ``` # Drawbacks [drawbacks]: #drawbacks From a5e274b3353275d93dfb23a99338b918de663630 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 14 Feb 2024 18:45:05 +0000 Subject: [PATCH 34/47] Update text/3519-arbitrary-self-types-v2.md Co-authored-by: Mads Marquart --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 75b2dc27a44..b2f42524570 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -248,7 +248,7 @@ Specifically, instead of using the list of candidate types assembled using the ` It's particularly important to emphasize that the list of candidate receiver types _does not change_ - that's still assembled using the `Deref` trait just as now. 
But, a wider set of locations is searched for methods with those receiver types. -For instance, `Weak` implements `Receiver` but not `Deref`. Imagine you have `let t: Weak = /* obtain */; t.some_method();`. We will now search `impl SomeStruct {}` blocks for an implementation of `fn some_method(self: Weak)`, `fn some_method(self: &Weak)`, etc. The possible self types are unchanged - they're still obtained by searching the `Deref` chain for `t` - but we'll look in more places for methods with those valid `self` types. +For instance, `Weak` implements `Receiver` but not `Deref`. Imagine you have `let t: Weak = /* obtain */; t.some_method();`. We will now search `impl SomeStruct {}` blocks for an implementation of `fn some_method(self: Weak)`, `fn some_method(self: &Weak)`, etc. The possible self types in the method call expression are unchanged - they're still obtained by searching the `Deref` chain for `t` - but we'll look in more places for methods with those valid `self` types. ## Compiler changes: deshadowing [compiler-changes-deshadowing]: #compiler-changes-deshadowing From 7ea82bd025eaf910d31396ccedf526aec91f087e Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 15 Feb 2024 13:49:14 +0000 Subject: [PATCH 35/47] Move lint to the correct place. --- text/3519-arbitrary-self-types-v2.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index b2f42524570..c4ac97a114e 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -307,10 +307,7 @@ The existing branches in the compiler for "arbitrary self types" already emit ex } ``` We don't know a use-case for this. There are several cases where this can result in misleading diagnostics. (For instance, if such a method is called with an incorrect type (for example `smart_ptr.a::<&Foo>()` instead of `smart_ptr.a::()`). We could attempt to find and fix all those cases. 
However, we feel that generic receiver types might risk subtle interactions with method resolutions and other parts of the language. We think it is a safer choice to generate an error on any declaration of a generic `self` type. -- As noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing) we will downgrade an existing error to a warning if there are multiple - method candidates found, if one of those candidates is further along the chain of `Receiver`s than the others. -- As also noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing), we will produce a new warning if a method in an inner type is chosen in preference to a method in an outer type ("inner" = further along the `Receiver` chain) and the inner type is either `self: &T` or `self: &mut T` and we're choosing it in preference to `self: T` or `self: &T` in the outer type. An example warning might be: - +- As noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing) we will downgrade an existing error to a warning if there are multiple method candidates found, if one of those candidates is further along the chain of `Receiver`s than the others. An example warning might be: ``` warning[W0666]: ambiguous function call --> src/main.rs:13:4 @@ -337,6 +334,7 @@ The existing branches in the compiler for "arbitrary self types" already emit ex = help: call as a function not a method: ~ Weak::retrograde(orbit_weak) ``` +- As also noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing), we will produce a new warning if a method in an inner type is chosen in preference to a method in an outer type ("inner" = further along the `Receiver` chain) and the inner type is either `self: &T` or `self: &mut T` and we're choosing it in preference to `self: T` or `self: &T` in the outer type. (The warning would be very similar to the above.) 
# Drawbacks [drawbacks]: #drawbacks From 01c6ca7e611ef9d627000e8f7636d12d6f8ca3da Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 15 Feb 2024 13:57:15 +0000 Subject: [PATCH 36/47] Add future work section for MaybeUninit etc. --- text/3519-arbitrary-self-types-v2.md | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index c4ac97a114e..fc97d73f907 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -555,6 +555,30 @@ Even if the reader takes the view that all calls into foreign languages are intr A previous PR based on the `Deref` alternative has been proposed before https://github.com/rust-lang/rfcs/pull/2362 and was postponed with the expectation that the lang team would [get back to `arbitrary_self_types` eventually](https://github.com/rust-lang/rfcs/pull/2362#issuecomment-527306157). +# Future work + +We could consider implementing `Receiver` for other types, e.g. [`std::cell`](https://doc.rust-lang.org/std/cell/index.html) types, [`std::sync`](https://doc.rust-lang.org/std/sync/index.html) types, [`std::cmp::Reverse`](https://doc.rust-lang.org/std/cmp/struct.Reverse.html), [`std::num::Wrapping`](https://doc.rust-lang.org/nightly/std/num/struct.Wrapping.html), [`std::mem::MaybeUninit`](https://doc.rust-lang.org/std/mem/union.MaybeUninit.html), [`std::task::Poll`](https://doc.rust-lang.org/nightly/std/task/enum.Poll.html), and so on - possibly even for arrays, `Vec`, `BTreeSet` etc. + +There seems to be no disadvantage to doing this - taking `Vec` as an example, it would only have any effect on the behavior of code if somebody implemented a method taking `Vec` as a receiver. On the other hand, it's hard to imagine use-cases for some of these. It seems best to consider these future possibilities based on whether the end-result seems natural or strange. 
+ +```rust +impl Vexation { + fn do_something_to_vec(self: Vec<Self>) { } + fn do_something_to_maybeuninit(self: MaybeUninit<Self>) {} +} + +fn main() { + let v = Vec::new(); + v.push(Vexation); + v.do_something_to_vec(); // this seems weird and I can't imagine a use-case + + let mut m = MaybeUninit::<Vexation>::uninit(); + m.do_something_to_maybeuninit(); // this seems fine and useful and so maybe we should in future implement Receiver for MaybeUninit +} +``` + +For now, though, we should clearly restrict `Receiver` to those types for which there's a demonstrated need. + # Feature gates This RFC is in an unusual position regarding feature gates. There are two existing gates: From 09f46f7c87fe4f6abc06b1f89058546da72cca14 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 15 Feb 2024 13:58:33 +0000 Subject: [PATCH 37/47] Code error. --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index fc97d73f907..feaceacc5d1 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -568,7 +568,7 @@ impl Vexation { } fn main() { - let v = Vec::new(); + let mut v = Vec::new(); v.push(Vexation); v.do_something_to_vec(); // this seems weird and I can't imagine a use-case From 6eb91e99181b6e3d94a02884ea0deb425af05aee Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 15 Feb 2024 15:15:45 +0000 Subject: [PATCH 38/47] Suggestions for compiler changes section. 
--- text/3519-arbitrary-self-types-v2.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index feaceacc5d1..b53fc5adbd2 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -236,17 +236,19 @@ The existing Rust [reference section for method calls describes the algorithm fo The key part of the first page is this: +> The first step is to build a list of **candidate receiver types**. Obtain these by repeatedly dereferencing the receiver expression's type, adding each type encountered to the list, then finally attempting an unsized coercion at the end, and adding the result type if that is successful. Then, for each candidate `T`, add `&T` and `&mut T` to the list immediately after `T`. + > Then, for each candidate type `T`, search for a visible method with a receiver of that type in the following places: > - `T`'s inherent methods (methods implemented directly on `T`). > Any of the methods provided by a visible trait implemented by `T`. -This changes. +We'll call this second list the **candidate methods**. -The list of candidate types is assembled in exactly the same way, but we now search for a visible method with a receiver of that type in _more_ places. +With this RFC, the candidate receiver types are assembled the same way - nothing changes. But, the **candidate methods** are assembled in a different way. Specifically, instead of iterating the candidate receiver types, we assemble a new list of types by following the chain of `Receiver` implementations. As `Receiver` is implemented for all types that implement `Deref`, this may be the same list or a longer list. Aside from following a different trait, the list is assembled the same way, including the insertion of equivalent reference types. 
-Specifically, instead of using the list of candidate types assembled using the `Deref` trait, we search a list assembled using the `Receiver` trait. As `Receiver` is implemented for all types that implement `Deref`, this is a longer list. +We then search each type for inherent methods or trait methods in the existing fashion - the only change is that we search a potentially longer list of types. -It's particularly important to emphasize that the list of candidate receiver types _does not change_ - that's still assembled using the `Deref` trait just as now. But, a wider set of locations is searched for methods with those receiver types. +It's particularly important to emphasize also that the list of candidate receiver types _does not change_. But, a wider set of locations is searched for methods with those receiver types. For instance, `Weak` implements `Receiver` but not `Deref`. Imagine you have `let t: Weak = /* obtain */; t.some_method();`. We will now search `impl SomeStruct {}` blocks for an implementation of `fn some_method(self: Weak)`, `fn some_method(self: &Weak)`, etc. The possible self types in the method call expression are unchanged - they're still obtained by searching the `Deref` chain for `t` - but we'll look in more places for methods with those valid `self` types. From e4fe352bca95018ee1e1e43c5c394685a3147732 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Fri, 5 Apr 2024 15:52:16 +0100 Subject: [PATCH 39/47] Substantial rewrites. There's been much discussion at recent Rust lang team meetings about this feature, followed by further discussion on Zulip and face-to-face at RustNation. The main conclusions have been: * yes we want to do this; Rust for Linux has equally important use-cases * the previously proposed deshadowing algorithm is believed to be sound but is also complex and counterintuitive, so we may want to more broadly rethink method resolution. 
* so, for now, let's do the most conservative possible version - NOT supporting raw pointers, Weak or NonNull - but instead erroring on any case where there is a possible method conflict between an outer smart pointer type and its contained type. This will give us maximal flexibility to relax restrictions in future. This PR updates the RFC correspondingly. Prototype code can be found at: https://github.com/rust-lang/rust/compare/master...adetaylor:rust:receiver_trait_with_target_simplified_per_rustnation --- text/3519-arbitrary-self-types-v2.md | 178 +++++++++++++++++---------- 1 file changed, 114 insertions(+), 64 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index b53fc5adbd2..65ab0349e50 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -89,12 +89,10 @@ Another case is when the existence of a reference is, itself, semantically impor A third motivation is that taking smart pointer types as `self` parameters can enable functions to act on the smart pointer type, not just the underlying data. For example, taking `&Arc` allows the functions to both clone the smart pointer (noting that the underlying `T` might not implement `Clone`) in addition to access the data inside the type, which is useful for some methods; this also makes it ergonomic in more cases to make `Arc` explicit rather than having `SomeType` contain an `Arc` internally and have `Arc`-like `clone` semantics. Also, being able to change a method from accepting `&self` to `self: &Arc` can be done in a mostly frictionless way, whereas changing from `&self` to a static method accepting `&Arc` will always require some amount of refactoring. These options are currently open only to Rust's built-in smart pointer types, not to custom smart pointer types. -In theory, users can define their own smart pointers. In practice, they're second-class citizens compared to the smart pointers in Rust's standard library. 
A type `T` can accept method calls using smart pointers as the `self` type only if they're one of Rust's built-in smart pointers. +Finally, there's a matter of symmetry with Rust's own smart pointer types. [The Rust for Linux project, for instance, requires a custom `Arc` type](https://rust-for-linux.com/arc-in-the-linux-kernel#arbitrary-self-types). In theory, users can define their own smart pointers. In practice, they're second-class citizens compared to the smart pointers in Rust's standard library. A type `T` can accept method calls using smart pointers as the `self` type only if they're one of Rust's built-in smart pointers. This RFC proposes to loosen this restriction to allow custom smart pointer types to be accepted as a `self` type just like for the standard library types. -The current unstable `arbitrary_self_type` feature also allows raw pointers (e.g. `*const Self`) to be a method receiver. This is highly beneficial for unsafe code where the semantics of a reference cannot be guaranteed. - See also [this blog post](https://medium.com/@adetaylor/the-case-for-stabilizing-arbitrary-self-types-b07bab22bb45), especially for a list of more specific use-cases. ## Motivation for the v2 changes @@ -141,9 +139,11 @@ If you're implementing a smart pointer `P`, and you need to allow `impl T { f Therefore, the current Arbitrary Self Types v2 provides a separate `Receiver` trait, so that there's no need to provide an awkward `Deref::deref` implementation. -In addition, this v2 proposes to block generic receivers, which are currently allowed by the v1 (unstable) arbitrary self types feature. See the [diagnostics section for reasoning](#diagnostics). +This v2 proposal has two other differences relative to the existing unstable `arbitrary_self_types` feature: +* We won't allow raw pointer receivers yet. It's highly desirable that we do so in future - this is discussed under the [enable for pointers](#enable-for-pointers) section. +* We will block generic receivers. 
See the [diagnostics section for reasoning](#diagnostics). -Aside from these differences, Arbitrary Self Types v2 is similar to the existing unstable `arbitrary_self_types` feature, including in its support for raw pointers as method receivers. +Aside from these differences, Arbitrary Self Types v2 is similar to the existing unstable `arbitrary_self_types` feature. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation @@ -200,6 +200,22 @@ impl MyType { The Rust language doesn't provide a way for user code to use this recursive property in generics or iteration, so this trait is unlikely to be useful except to the compiler. Nevertheless, we don't intend to _prevent_ use of the `Receiver` trait by user code: since the same recursive property applies to `Deref` yet it's been occasionally useful to [introduce `Deref` bounds](https://doc.rust-lang.org/std/pin/struct.Pin.html#method.new_unchecked). +## Implementing methods on smart pointers + +If your smart pointer type implements `Receiver`, you should not add methods to that smart pointer type after its initial creation. As soon as anyone is using your smart pointer type outside of your crate, they may add methods on a contained type; for example: + +```rust +impl SomeType { + fn do_something(self: your_crate::SmartPointer) {} +} +``` + +If you then add `SmartPointer::do_something`, this is a conflict, and the compiler will produce an error. It's therefore considered to be a compatibility break to add additional methods to `your_crate::SmartPointer`. It's OK to add methods at the outset when you create `SmartPointer`, until the point at which other people start using it. + +This principle has been followed for the types in Rust's standard library which implement `Receiver`; for instance, `Box` and `Rc`. Mostly they offer associated functions rather than methods. 
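As a rough illustration of why associated functions avoid the conflict, here is a minimal sketch using only stable Rust and `std::rc::Rc`. The `Document` type and the `describe` helper are hypothetical names invented for this example; the only library facts relied on are that `Rc::strong_count` is an associated function (it takes `this: &Self`, not `self`) and that `self: Rc<Self>` is already an accepted receiver type on stable.

```rust
use std::rc::Rc;

// Hypothetical contained type, invented for this sketch.
struct Document {
    title: String,
}

impl Document {
    // A method on the contained type that deliberately reuses the name of
    // `Rc::strong_count`. Because the `Rc` function is an associated function
    // rather than a method, method-call syntax can only find this one.
    fn strong_count(self: Rc<Self>) -> String {
        format!("Document::strong_count called on {}", self.title)
    }
}

// Hypothetical helper that exercises both call forms.
fn describe() -> (String, usize) {
    let doc = Rc::new(Document { title: String::from("draft") });
    // Path syntax reaches the smart pointer's associated function.
    let count = Rc::strong_count(&doc);
    // Method syntax reaches the contained type's method (and consumes `doc`).
    let message = doc.strong_count();
    (message, count)
}
```

Had `Rc` exposed `strong_count` as a `&self` method instead, the two names would compete during method lookup, which is the situation the guidance above asks smart pointer authors to avoid.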
+ +In the future there might be a deshadowing algorithm that can relax this rule - see the [method shadowing section below](#method-shadowing) for discussion. + # Reference-level explanation [reference-level-explanation]: #reference-level-explanation @@ -228,7 +244,7 @@ where (See [alternatives](#no-blanket-implementation) for discussion of the tradeoffs here.) -It is also implemented for `&T`, `&mut T`, `Weak`, `NonNull`, `*const T` and `*mut T`. +It is also implemented for `&T` and `&mut T`. ## Compiler changes: method probing @@ -250,7 +266,7 @@ We then search each type for inherent methods or trait methods in the existing f It's particularly important to emphasize also that the list of candidate receiver types _does not change_. But, a wider set of locations is searched for methods with those receiver types. -For instance, `Weak` implements `Receiver` but not `Deref`. Imagine you have `let t: Weak = /* obtain */; t.some_method();`. We will now search `impl SomeStruct {}` blocks for an implementation of `fn some_method(self: Weak)`, `fn some_method(self: &Weak)`, etc. The possible self types in the method call expression are unchanged - they're still obtained by searching the `Deref` chain for `t` - but we'll look in more places for methods with those valid `self` types. +For instance, suppose `SmartPtr` implements `Receiver` but not `Deref`. Imagine you have `let t: SmartPtr = /* obtain */; t.some_method();`. We will now search `impl SomeStruct {}` blocks for an implementation of `fn some_method(self: SmartPtr)`, `fn some_method(self: &SmartPtr)`, etc. The possible self types in the method call expression are unchanged - they're still obtained by searching the `Deref` chain for `t` - but we'll look in more places for methods with those valid `self` types. 
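The lookup described above can be observed today with the standard library's `Arc`, which already enjoys the treatment this RFC wants to extend to custom pointers. The following is a sketch on stable Rust only; `SomeStruct`, its methods, and `demo` are hypothetical names, and `Arc` stands in for the `SmartPtr` of the text (unlike that `SmartPtr`, `Arc` also implements `Deref`).

```rust
use std::sync::Arc;

// Hypothetical contained type, invented for this sketch.
struct SomeStruct {
    name: &'static str,
}

impl SomeStruct {
    // Receiver is the smart pointer by value.
    fn by_value(self: Arc<Self>) -> usize {
        Arc::strong_count(&self)
    }

    // Receiver is a reference to the smart pointer, so the method can clone it.
    fn by_ref(self: &Arc<Self>) -> Arc<Self> {
        Arc::clone(self)
    }

    // Ordinary `&self` method, reached by dereferencing the `Arc`.
    fn plain(&self) -> &'static str {
        self.name
    }
}

// Hypothetical helper: each call below matches a different candidate receiver type.
fn demo() -> (&'static str, usize, usize) {
    let t: Arc<SomeStruct> = Arc::new(SomeStruct { name: "inner" });
    let name = t.plain(); // candidate `&SomeStruct`, after one deref step
    let t2 = t.by_ref(); // candidate `&Arc<SomeStruct>`, via autoref
    let count_outside = Arc::strong_count(&t); // `t` and `t2` are both alive
    let count_inside = t2.by_value(); // candidate `Arc<SomeStruct>`, by value
    (name, count_outside, count_inside)
}
```

In all three calls the methods live in `impl SomeStruct`, not in an `impl` block for `Arc`; the RFC's change is about letting the same search reach `impl SomeStruct` through `Receiver` for pointer types that cannot offer `Deref`.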
## Compiler changes: deshadowing [compiler-changes-deshadowing]: #compiler-changes-deshadowing @@ -261,15 +277,40 @@ Specifically, that page also states: > If this results in multiple possible candidates, then it is an error, and the receiver must be converted to an appropriate receiver type to make the method call. -This changes. For smart pointer types which implement `Receiver` (such as `NonNull`) the future addition of any method would become an incompatible change, because it would run the risk of this ambiguity if there were a method of the same name within `T`. So, if there are multiple candidates, and if one of those candidates is in a more "nested" level of receiver than the others (that is, further along the chain of `Receiver`), we will choose that candidate and warn instead of producing a fatal error. +With arbitrary self types v2, the compiler will actively search for additional conflicts in order to produce this error in more cases. Specifically, it will consider whether autoreffed candidates conflict with by-value candidates, in order to produce an error in situations like this: -Similarly, +```rust +struct Foo; +struct SmartPtr<T>(T); // implements Receiver -> Note: the lookup is done for each type in order, which can occasionally lead to surprising results. +impl<T> SmartPtr<T> { + fn a(&self) {} // by reference +} + +impl Foo { + fn a(self: SmartPtr<Self>) {} // by value +} + +fn main() { + let a = SmartPtr(Foo); + a.a(); // produces an error +} +``` -This changes too, for the same reason. We check for matching candidates for `T`, `&T` and `&mut T`, and again, if there's a candidate on an "inner" type (that, is, further along the chain of `Receiver`) we will choose that type in preference to less nested types and emit a warning. 
+To be precise, the compiler will: +* Search for the best by-value pick +* Search for the best autoreffed pick +* Search for the best autorefmut pick +* For each pair from the above list, consider the first to be the 'shadowing' pick and the second to be the 'shadowed' pick. Show an error if: + * The same number of autoderefs has been applied (confirming the `self` type is identical, aside from any autoreffing) + * One is further along the chain of `Receiver` than the other (confirms that it's arbitrary self types causing the conflict) + * The shadowing pick is an inherent impl (we are concerned about the case that a smart pointer is adding inherent methods shadowing inner types, not cases where traits bring further methods into play) + * The picks don't refer to the same resulting item (which could happen with things like blanket impls for any type) +* Otherwise, choose the pick in order of by-value, autoreffed, autorefmut, or const ptr as it does now. -(The current reference doesn't describe it, but the current algorithm also searches for method receivers of type `*const Self` and handles them explicitly in case the receiver type was `*mut Self`. We do not check for cases where a new `self: *mut Self` method on an outer type might shadow an existing `self: *const SomePtr` method on an inner type. Although this is a theoretical risk, such compatibility breaks should be easy to avoid because `self: *mut Self` are rare. It's not readily possible to apply the same de-shadowing approach to these, because we already intentionally shadow `*const::cast` with `*mut::cast`.) +Aside from production of errors in more cases, there is no change to method picking here. That said, the production of errors requires us to interrogate more candidates to look for potential conflicts, so this could have a compile-time performance penalty which we should measure. 
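The conflict check listed above can be read as a small predicate over pairs of picks. The following is a toy model only, not rustc's data structures: the `Pick` struct, its fields (in particular `receiver_depth`, standing for "how far along the `Receiver` chain the method's `impl` type sits"), and both function names are assumptions made for this sketch. It also assumes the four listed conditions must all hold together, which the text implies but does not state outright.

```rust
// Toy stand-in for a method pick; every field here is hypothetical.
#[derive(Debug, Clone, PartialEq)]
struct Pick {
    item: &'static str,    // path of the method the pick resolves to
    autoderefs: usize,     // autoderef steps applied to the receiver expression
    receiver_depth: usize, // position of the impl's type along the Receiver chain
    inherent: bool,        // true if the method comes from an inherent impl
}

// Assumed reading of the four bullet conditions, combined with logical AND.
fn is_shadowing_conflict(shadowing: &Pick, shadowed: &Pick) -> bool {
    shadowing.autoderefs == shadowed.autoderefs
        && shadowing.receiver_depth != shadowed.receiver_depth
        && shadowing.inherent
        && shadowing.item != shadowed.item
}

// Check the pairs in the order given: by-value first, then autoref, then autorefmut.
fn first_conflict(
    by_value: Option<&Pick>,
    autoref: Option<&Pick>,
    autorefmut: Option<&Pick>,
) -> Option<(Pick, Pick)> {
    let pairs = [(by_value, autoref), (by_value, autorefmut), (autoref, autorefmut)];
    for &(first, second) in pairs.iter() {
        if let (Some(shadowing), Some(shadowed)) = (first, second) {
            if is_shadowing_conflict(shadowing, shadowed) {
                return Some((shadowing.clone(), shadowed.clone()));
            }
        }
    }
    None
}

// The `SmartPtr`/`Foo` situation: `Foo::a` takes the pointer by value,
// `SmartPtr::a` takes `&self`, and both apply to the same receiver type.
fn smart_ptr_example_conflicts() -> bool {
    let by_value = Pick { item: "Foo::a", autoderefs: 0, receiver_depth: 1, inherent: true };
    let autoref = Pick { item: "SmartPtr::a", autoderefs: 0, receiver_depth: 0, inherent: true };
    first_conflict(Some(&by_value), Some(&autoref), None).is_some()
}
```

When `first_conflict` returns `None`, the model corresponds to the final bullet: the usual by-value, autoref, autorefmut ordering decides the pick, exactly as before.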
+ +(The current reference doesn't describe it, but the current algorithm also searches for method receivers of type `*const Self` and handles them explicitly in case the receiver type was `*mut Self`. We do not check for cases where a new `self: *mut Self` method on an outer type might shadow an existing `self: *const SomePtr` method on an inner type. Although this is a theoretical risk, such compatibility breaks should be easy to avoid because `self: *mut Self` receivers are rare. It's not readily possible to produce errors in these cases, because we already intentionally shadow `*const::cast` with `*mut::cast`.) ## Object safety @@ -279,6 +320,8 @@ As not all receivers might want to permit object safety or are unable to support This RFC does not propose any changes to `DispatchFromDyn`. Since `DispatchFromDyn` is unstable at the moment, object-safe receivers might be delayed until `DispatchFromDyn` is stabilized. `Receiver` is not blocked on further `DispatchFromDyn` work, since non-object-safe receivers already cover a big chunk of the use-cases. +It's been proposed that a `#[derive(SmartPointer)]` mechanism may be stabilized instead of `DispatchFromDyn`. Again, this doesn't block our work on `Receiver`. There are some use cases for `Receiver` that won't suit either `DispatchFromDyn` or `#[derive(SmartPointer)]`, most notably the [Rust for Linux `Wrapper` type described here](https://rust-for-linux.com/arc-in-the-linux-kernel#nextprev-pointers-and-dynamic-dispatch). + ## Lifetime elision Arbitrary `self` parameters may involve lifetimes. @@ -309,34 +352,7 @@ The existing branches in the compiler for "arbitrary self types" already emit ex } ``` We don't know a use-case for this. There are several cases where this can result in misleading diagnostics. (For instance, if such a method is called with an incorrect type (for example `smart_ptr.a::<&Foo>()` instead of `smart_ptr.a::()`). We could attempt to find and fix all those cases. 
However, we feel that generic receiver types might risk subtle interactions with method resolutions and other parts of the language. We think it is a safer choice to generate an error on any declaration of a generic `self` type. -- As noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing) we will downgrade an existing error to a warning if there are multiple method candidates found, if one of those candidates is further along the chain of `Receiver`s than the others. An example warning might be: - ``` - warning[W0666]: ambiguous function call - --> src/main.rs:13:4 - | - 13 | orbit_weak.retrograde(); - | ^^^^^^^^^^^^ - | - = note: you may have intended a call to `Orbit::retrograde` or - to `Weak::retrograde` - = note: this method won't be called - --> src/rc/rc.rs:136:21 - | - 136 | fn retrograde(&self) { - | ^^^^^^^^^^^^^^^^^ - | - = note: because we'll call this method instead - --> src/space/near_earth.rs:357:68 - | - 357 | fn retrograde(self: Weak) { - | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - | - = help: call as a function not a method: - ~ Orbit::retrograde(orbit_weak) - = help: call as a function not a method: - ~ Weak::retrograde(orbit_weak) - ``` -- As also noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing), we will produce a new warning if a method in an inner type is chosen in preference to a method in an outer type ("inner" = further along the `Receiver` chain) and the inner type is either `self: &T` or `self: &mut T` and we're choosing it in preference to `self: T` or `self: &T` in the outer type. (The warning would be very similar to the above.) 
+- As noted in [the section about compiler changes for deshadowing](#compiler-changes-deshadowing), we will produce a "multiple method candidates" error if a method in an inner type is chosen in preference to a method in an outer type ("inner" = further along the `Receiver` chain) and the inner type is either `self: &T` or `self: &mut T` and we're choosing it in preference to `self: T` or `self: &T` in the outer type. # Drawbacks [drawbacks]: #drawbacks @@ -360,22 +376,22 @@ Furthermore, the `Deref` trait itself [documents this possible compatibility haz This RFC does not make things worse for types that implement `Deref`. -_However_, this RFC allow types to implement `Receiver`, and in fact does so for `NonNull` and `Weak`. `NonNull` and `Weak` were not designed with method shadowing concerns in mind. This would run the risk of breakage: +_However_, this RFC allows types to implement `Receiver`. This would run the risk of breakage: ```rust struct Concrete; impl Concrete { - fn wardrobe(self: Weak) { } + fn wardrobe(self: SmartPointerWhichImplementsReceiver<Self>) { } } fn main() { - let concrete: Weak = /* obtain */; + let concrete: SmartPointerWhichImplementsReceiver<Concrete> = /* obtain */; concrete.wardrobe() } ``` -If Rust now adds `Weak::wardrobe(self)`, the above valid code would start to error. +If `SmartPointerWhichImplementsReceiver` now adds `SmartPointerWhichImplementsReceiver::wardrobe(self)`, the above valid code would start to error. 
The same would apply in this slightly different circumstance: @@ -383,19 +399,21 @@ The same would apply in this slightly different circumstance: struct Concrete; impl Concrete { - fn wardrobe(self: &Weak) { } // this is now a reference + fn wardrobe(self: &SmartPointerWhichImplementsReceiver) { } // this is now a reference } fn main() { - let concrete: Weak = /* obtain */; + let concrete: SmartPointerWhichImplementsReceiver = /* obtain */; concrete.wardrobe() } ``` -If Rust added `Weak::wardrobe(&self)` we would start to produce an error here. If Rust added `Weak::wardrobe(self)` then it would be -even worse - code would start to call `Weak::wardrobe` where it had previously called `Concrete::wardrobe`. +If Rust added `SmartPointerWhichImplementsReceiver::wardrobe(&self)` we would start to produce an error here. If `SmartPointerWhichImplementsReceiver` added `SmartPointerWhichImplementsReceiver::wardrobe(self)` then it would be +even worse - code would start to call `SmartPointerWhichImplementsReceiver::wardrobe` where it had previously called `Concrete::wardrobe`. + +The [#compiler-changes-deshadowing](deshadowing section of the compiler changes, above), describes how we avoid this. The compiler will take pains to identify any such ambiguities and it will show an error. -The [#compiler-changes-deshadowing](deshadowing section of the compiler changes, above), describes how we avoid this. The compiler will take pains to identify any such ambiguities. If it finds them, it will warn of the situation and then choose the innermost method (in the example above, always `Concrete::wardrobe`). +We have (extensively) considered algorithms to pick the intended method instead - see [picking the shadowed method](#pick-shadowed-methods-instead-of-erroring), below.
# Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -410,7 +428,7 @@ As noted in the rationale section, the currently nightly implementation implemen ## No blanket implementation for `Deref` [no-blanket-implementation]: #no-blanket-implementation -The other major approach previously discussed is to have a `Receiver` trait, as proposed in this RFC, but without a blanket implementation for `T: Deref`. Blanket implementations are unusual for core Rust traits, but the authors of this RFC believe it's necessary in this case. +Another major approach previously discussed is to have a `Receiver` trait, as proposed in this RFC, but without a blanket implementation for `T: Deref`. Blanket implementations are unusual for core Rust traits, but the authors of this RFC believe it's necessary in this case. Specifically, this RFC proposes that the existing method search algorithm is modified to search the `Receiver` chain _instead of_ the `Deref` chain. @@ -451,19 +469,51 @@ If some use-case presents itself where a type _must_ implement `Deref` but not ` Change the trait definition to have a generic parameter instead of an associated type. There might be permutations here which could allow a single smart pointer type to dispatch method calls to multiple possible receivers - but this would add complexity, no known use case exists, and it might cause worst-case O(n^2) performance on method lookup. -## Do not enable for pointers +## Enable for raw pointers (or `Weak` or `NonNull`) +[enable-for-pointers]: #enable-for-pointers -It would be possible to respect the `Receiver` trait without allowing dispatch onto raw pointers - they are essentially independent changes to the candidate deduction algorithm. +This RFC, unlike the original Arbitrary Self Types nightly feature, does not allow raw pointer `self` types. 
We are led to believe that raw pointer receivers are quite important for the future of safe Rust, because stacked borrows makes it illegal to materialize references in many positions, and there are a lot of operations (like going from a raw pointer to a raw pointer to a field) where users don't need to or want to do that. -We don't want to encourage the use of raw pointers, and would prefer rather that raw pointers are wrapped in a custom smart pointer that encodes and documents the invariants. So, there's an argument not to add the raw pointer support. +On the other hand, we don't want to encourage the use of raw pointers, and would prefer rather that raw pointers are wrapped in a custom smart pointer that encodes and documents the invariants. -However, the current unstable `arbitrary_self_types` feature provides support for raw pointer receivers, and with years of experience no major concerns have been spotted. We would prefer not to deviate from the existing proposal more than necessary. Moreover, we are led to believe that raw pointer receivers are quite important for the future of safe Rust, because stacked borrows makes it illegal to materialize references in many positions, and there are a lot of operations (like going from a raw pointer to a raw pointer to a field) where users don't need to or want to do that. We think the utility of including raw pointer receivers outweighs the risks of tempting people to over-use raw pointers. +The main problem, though, is that raw pointers _have methods_ and Rust wants to add more methods to them in future - especially around pointer provenance. As noted in the [deshadowing section](#compiler-changes-deshadowing), we would start to generate errors in arbitrary crates if ever we added such additional methods to raw pointers. That's clearly not OK. So, to add support for raw pointers as self types, we'd need to use a cleverer deshadowing algorithm. 
This is discussed in the next section, but overall has been judged to be too complicated _for now_. -## Provide compiler support for dereferencing pointers +Instead, this version of Arbitrary Self Types is as conservative as possible, such that we ought to be able to adopt such an algorithm in a future enhancement. -This RFC proposes to implement `Receiver` for `*mut T` and `*const T` within the library. This is slightly different from the unstable arbitrary self types support, which instead hard-codes pointer support into the candidate deduction algorithm in the compiler (because obviously `Deref` can't be implemented for pointers.) +## Pick shadowed methods instead of erroring +[pick-shadowed-methods-instead-of-erroring]: #pick-shadowed-methods-instead-of-erroring -We prefer the option of specifying behavior in the library using the normal trait, though it's a compatibility break for users of Rust who don't adopt the `core` crate (including compiler tests). +As explained in the [deshadowing section](#compiler-changes-deshadowing), the Rust compiler will generate errors in case of a conflict between a method on a smart pointer and an inner type. For example: + +```rust +struct Foo; +struct SmartPtr<T>(T); // implements Receiver + +impl<T> SmartPtr<T> { + fn a(self) {} +} + +impl Foo { + fn a(self: SmartPtr<Foo>) {} +} + +fn main() { + let a = SmartPtr(Foo); + a.a(); // produces an error +} +``` + +There has been extensive discussion (and prototyping) about cleverer "deshadowing" algorithms here. The current leading contender is to: + +* If there are conflicts, + * Always pick the "inner" method; + * Show a warning, and ask the user to disambiguate using UFC syntax (or [future alternatives](https://internals.rust-lang.org/t/idea-paths-in-method-names/6834?u=scottmcm)). + +The rationale is that the author of the "inner" method is always aware of pre-existing methods on the "outer" (smart pointer) type.
If a conflict arises, this means that the new method was added to the outer type, and therefore Rust can maintain existing behavior by picking the method on the inner type. (This logic falls down in the case of race conditions as crates are published, but it's broadly true.) This logic is believed to be sound, but it's counterintuitive: in all other circumstances Rust method probing works outside-in. This algorithm is also quite complex, and there's a risk of unknown unknowns. + +There has also been some discussion about broader changes to method resolution in future, for example a crate-by-crate approach or even a `name-resolution.lock` file. + +The decision has been taken, then, to restrict the current RFC to the most conservative possible version - one which errors on _any_ conflicts, and firmly advises the creators of smart pointers to avoid adding new methods. This gives us maximum flexibility in future to allow more possibilities by relaxing some of those errors to warnings. This is a high priority primarily because of the desire to allow method calls on raw pointers (see the previous section). ## Not do it [not-do-it]: #not-do-it @@ -559,7 +609,9 @@ A previous PR based on the `Deref` alternative has been proposed before https:// # Future work -We could consider implementing `Receiver` for other types, e.g. [`std::cell`](https://doc.rust-lang.org/std/cell/index.html) types, [`std::sync`](https://doc.rust-lang.org/std/sync/index.html) types, [`std::cmp::Reverse`](https://doc.rust-lang.org/std/cmp/struct.Reverse.html), [`std::num::Wrapping`](https://doc.rust-lang.org/nightly/std/num/struct.Wrapping.html), [`std::mem::MaybeUninit`](https://doc.rust-lang.org/std/mem/union.MaybeUninit.html), [`std::task::Poll`](https://doc.rust-lang.org/nightly/std/task/enum.Poll.html), and so on - possibly even for arrays, `Vec`, `BTreeSet` etc.
+As [discussed above](#pick-shadowed-methods-instead-of-erroring) we anticipate a future version which will relax some errors into warnings, and thus allow us to add support for raw pointers, `Weak` and `NonNull` as self types. + +Thereafter, we could consider implementing `Receiver` for other types, e.g. [`std::cell`](https://doc.rust-lang.org/std/cell/index.html) types, [`std::sync`](https://doc.rust-lang.org/std/sync/index.html) types, [`std::cmp::Reverse`](https://doc.rust-lang.org/std/cmp/struct.Reverse.html), [`std::num::Wrapping`](https://doc.rust-lang.org/nightly/std/num/struct.Wrapping.html), [`std::mem::MaybeUninit`](https://doc.rust-lang.org/std/mem/union.MaybeUninit.html), [`std::task::Poll`](https://doc.rust-lang.org/nightly/std/task/enum.Poll.html), and so on - possibly even for arrays, `Vec`, `BTreeSet` etc. There seems to be no disadvantage to doing this - taking `Vec` as an example, it would only have any effect on the behavior of code if somebody implemented a method taking `Vec` as a receiver. On the other hand, it's hard to imagine use-cases for some of these. It seems best to consider these future possibilities based on whether the end-result seems natural or strange. @@ -590,13 +642,11 @@ This RFC is in an unusual position regarding feature gates. There are two existi Although we presumably have no obligation to maintain compatibility for users of the unstable `arbitrary_self_types` feature, we should consider the least disruptive way to introduce this feature. -Options are: - -* Use the `arbitrary_self_types` feature gate, and remove the `receiver_trait` feature gate immediately. -* Use the `receiver_trait` feature gate and remove the `arbitrary_self_types` feature gate immediately. -* Invent a new feature gate. +The plan is: -This RFC proposes the first course of action, since `arbitrary_self_types` is used externally and we think all currently use-cases should continue to work. 
+- the `receiver_trait` gate continues to control the existing `Receiver` trait used solely within the standard library, which is renamed to `LegacyReceiver` or `FixedReceiver` or something (and will be removed assuming we stabilize this feature) +- `arbitrary_self_types` comes to control the new behavior, with a new `Receiver` trait containing a `Target` associated type. As noted, this does not include raw pointers, though we hope to find a way to stabilize this in a future RFC. +- Add a new `arbitrary_self_types_pointers` feature gate which retains support for raw pointers. # Summary From b367f590ea56b28741c65f32f65f754fe9c2da71 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Tue, 16 Apr 2024 16:24:29 +0100 Subject: [PATCH 40/47] Fix anchor error Co-authored-by: Slanterns --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 65ab0349e50..5390d5b34dd 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -242,7 +242,7 @@ where } ``` -(See [alternatives](#no-blanket-implementation) for discussion of the tradeoffs here.) +(See [alternatives](#no-blanket-implementation-for-deref) for discussion of the tradeoffs here.) It is also implemented for `&T` and `&mut T`. 
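A minimal sketch of why a blanket implementation over `Deref` also covers `&T` and `&mut T`: both already implement `Deref`, so an `impl` of this shape picks them up along with `Box`, `Rc` and `Pin`. This is an illustration on stable Rust, not the implementation proposed for `core`. The local `Receiver` trait is a stand-in for the proposed one, and `steps_to`, `Foo` and `demo` are hypothetical names invented for the example.

```rust
use std::ops::Deref;
use std::pin::Pin;
use std::rc::Rc;

/// Local stand-in for the proposed trait (assumption: same shape as in the RFC).
pub trait Receiver {
    type Target: ?Sized;
}

/// Blanket implementation: every `Deref` type is a `Receiver` with the same target.
impl<P: ?Sized + Deref> Receiver for P {
    type Target = <P as Deref>::Target;
}

/// Hypothetical helper: only type-checks when `P` is a `Receiver` whose target is `T`.
pub fn steps_to<P: ?Sized + Receiver<Target = T>, T: ?Sized>() -> bool {
    true
}

pub struct Foo;

/// Each call compiles only because the blanket impl applies to that pointer type.
pub fn demo() -> bool {
    steps_to::<&Foo, Foo>()
        && steps_to::<&mut Foo, Foo>()
        && steps_to::<Box<Foo>, Foo>()
        && steps_to::<Rc<Foo>, Foo>()
        && steps_to::<Pin<Box<Foo>>, Foo>()
}
```

The check here is at compile time: each `steps_to::<P, T>()` call is well-typed only when `P: Receiver<Target = T>`, so the fact that `demo` compiles is the point. Adding a separate `impl Receiver for &T` next to this blanket implementation would be rejected as overlapping, which is one reason to rely on the blanket form.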
From 8d018789357381f65bf444f7cadfe653392ed0c6 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 18 Apr 2024 22:35:38 +0100 Subject: [PATCH 41/47] Update text/3519-arbitrary-self-types-v2.md Clarify a long sentence Co-authored-by: teor --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 5390d5b34dd..843bda5925f 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -188,7 +188,7 @@ impl Receiver for CustomPtr { ## Recursive arbitrary receivers -Receivers are recursive and therefore allowed to be nested. If type `T` implements `Receiver`, and type `U` implements `Receiver`, `T` is a valid receiver (and so on outward). This is the behavior for the current special-cased self types (`Pin`, `Box` etc.) so as we remove the special-casing we need to retain this property. +Receivers are recursive and therefore allowed to be nested. If type `T` implements `Receiver`, and type `U` implements `Receiver`, `T` is a valid receiver (and so on outward). This is the behavior for the current special-cased self types (`Pin`, `Box` etc.), so as we remove the special-casing, we need to retain this property. For example, this self type is valid: From 3413135d8df1e6575af69bcddc7c27186abc737e Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 18 Apr 2024 22:38:10 +0100 Subject: [PATCH 42/47] Fix a couple of mistakes A few mentions of implementing `Receiver` for raw pointers snuck through from earlier drafts. That's not in scope for this RFC, for the reasons explained in the "Enable for raw pointers" alternatives section. 
--- text/3519-arbitrary-self-types-v2.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 843bda5925f..6d4e917d935 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -161,7 +161,6 @@ trait Receiver { The `Receiver` trait is already implemented for many standard library types: - smart pointers in the standard library: `Rc`, `Arc`, `Box`, and `Pin>` (and in fact, any type which implements `Deref`) - references: `&Self` and `&mut Self` -- pointers: `*const Self` and `*mut Self` Shorthand exists for references, so that `self` with no ascription is of type `Self`, `&self` is of type `&Self` and `&mut self` is of type `&mut Self`. @@ -172,8 +171,6 @@ impl Foo { fn by_value(self /* self: Self */); fn by_ref(&self /* self: &Self */); fn by_ref_mut(&mut self /* self: &mut Self */); - fn by_ptr(self: *const Self); - fn by_mut_ptr(self: *mut Self); fn by_box(self: Box); fn by_rc(self: Rc); fn by_custom_ptr(self: CustomPtr); From a93ac4b2bd5863cd59795d7096251d4528ef6d38 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 18 Apr 2024 22:41:29 +0100 Subject: [PATCH 43/47] Fix link syntax. --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 6d4e917d935..062e275da16 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -408,7 +408,7 @@ fn main() { If Rust added `SmartPointerWhichImplementsReceiver::wardrobe(&self)` we would start to produce an error here. If `SmartPointerWhichImplementsReceiver` added `SmartPointerWhichImplementsReceiver::wardrobe(self)` then it would be even worse - code would start to call `SmartPointerWhichImplementsReceiver::wardrobe` where it had previously called `SmartPointerWhichImplementsReceiver::wardrobe`. 
-The [#compiler-changes-deshadowing](deshadowing section of the compiler changes, above), describes how we avoid this. The compiler will take pains to identify any such ambiguities and it will show an error. +The [deshadowing section of the compiler changes](#compiler-changes-deshadowing), describes how we avoid this. The compiler will take pains to identify any such ambiguities and it will show an error. We have (extensively) considered algorithms to pick the intended method instead - see [picking the shadowed method](#picking-the-shadowed-method), below. From 7c43c5018b61c83211a1c894e7a938a373bc1082 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Fri, 19 Apr 2024 07:53:36 +0100 Subject: [PATCH 44/47] Update text/3519-arbitrary-self-types-v2.md Fix a link Co-authored-by: Tyler Mandry --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 062e275da16..91c208423c2 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -349,7 +349,7 @@ The existing branches in the compiler for "arbitrary self types" already emit ex } ``` We don't know a use-case for this. There are several cases where this can result in misleading diagnostics. (For instance, if such a method is called with an incorrect type (for example `smart_ptr.a::<&Foo>()` instead of `smart_ptr.a::()`). We could attempt to find and fix all those cases. However, we feel that generic receiver types might risk subtle interactions with method resolutions and other parts of the language. We think it is a safer choice to generate an error on any declaration of a generic `self` type. 
-- As noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing) we will produce a "multiple method candidates" error if a method in an inner type is chosen in preference to a method in an outer type ("inner" = further along the `Receiver` chain) and the inner type is either `self: &T` or `self: &mut T` and we're choosing it in preference to `self: T` or `self: &T` in the outer type. +- As noted in [the section about compiler changes for deshadowing](#compiler-changes-deshadowing) we will produce a "multiple method candidates" error if a method in an inner type is chosen in preference to a method in an outer type ("inner" = further along the `Receiver` chain) and the inner type is either `self: &T` or `self: &mut T` and we're choosing it in preference to `self: T` or `self: &T` in the outer type. # Drawbacks [drawbacks]: #drawbacks From 5cec046a2a610daff8fe24073faf2cc09d28f775 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Fri, 19 Apr 2024 07:54:03 +0100 Subject: [PATCH 45/47] Update text/3519-arbitrary-self-types-v2.md Fix a link Co-authored-by: Tyler Mandry --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 91c208423c2..36bd85822f4 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -447,7 +447,7 @@ In any case, we think a blanket implementation is desirable: * It prevents `Deref` and `Receiver` having different `Target`s. That could possible lead to confusion if it prompted the compiler to explore different chains for these two different purposes. * If smart pointer type `P` is in a crate, users of `P` to create `P` will be able to use it as a `self` type for `MyConcreteType` without waiting for a new release of the `P` crate. 
-We found that [some crates use `Deref` to express an is-a not a has-a relationship](ttps://gist.github.com/davidhewitt/d0ed031fb05f6db98ee249ae089b268e) and so, ideally, might have preferred the option of setting up `Deref` and `self` candidacy separately. But, on discussion, we concluded that traits would be a better way to model those relationships. +We found that [some crates use `Deref` to express an is-a not a has-a relationship](https://gist.github.com/davidhewitt/d0ed031fb05f6db98ee249ae089b268e) and so, ideally, might have preferred the option of setting up `Deref` and `self` candidacy separately. But, on discussion, we concluded that traits would be a better way to model those relationships. ## Explore both `Receiver` and `Deref` chains while identifying method candidates From 131d7e9f6bd383844f3c2ab5edda04a59ee67af6 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Fri, 19 Apr 2024 08:05:41 +0100 Subject: [PATCH 46/47] Remove mention of impl Receiver for Vec Since Vec already implements Deref. --- text/3519-arbitrary-self-types-v2.md | 22 ++-------------------- 1 file changed, 2 insertions(+), 20 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 36bd85822f4..a5fab0a45fc 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -608,27 +608,9 @@ A previous PR based on the `Deref` alternative has been proposed before https:// As [discussed above](#pick-shadowed-methods-instead-of-erroring) we anticipate a future version which will relax some errors into warnings, and thus allow us to add support for raw pointers, `Weak` and `NonNull` as self types. -Thereafter, we could consider implementing `Receiver` for other types, e.g. 
[`std::cell`](https://doc.rust-lang.org/std/cell/index.html) types, [`std::sync`](https://doc.rust-lang.org/std/sync/index.html) types, [`std::cmp::Reverse`](https://doc.rust-lang.org/std/cmp/struct.Reverse.html), [`std::num::Wrapping`](https://doc.rust-lang.org/nightly/std/num/struct.Wrapping.html), [`std::mem::MaybeUninit`](https://doc.rust-lang.org/std/mem/union.MaybeUninit.html), [`std::task::Poll`](https://doc.rust-lang.org/nightly/std/task/enum.Poll.html), and so on - possibly even for arrays, `Vec`, `BTreeSet` etc. +Thereafter, we could consider implementing `Receiver` for other types, e.g. [`std::cell`](https://doc.rust-lang.org/std/cell/index.html) types, [`std::sync`](https://doc.rust-lang.org/std/sync/index.html) types, [`std::cmp::Reverse`](https://doc.rust-lang.org/std/cmp/struct.Reverse.html), [`std::num::Wrapping`](https://doc.rust-lang.org/nightly/std/num/struct.Wrapping.html), [`std::mem::MaybeUninit`](https://doc.rust-lang.org/std/mem/union.MaybeUninit.html), [`std::task::Poll`](https://doc.rust-lang.org/nightly/std/task/enum.Poll.html), and so on - possibly even for arrays, etc. -There seems to be no disadvantage to doing this - taking `Vec` as an example, it would only have any effect on the behavior of code if somebody implemented a method taking `Vec` as a receiver. On the other hand, it's hard to imagine use-cases for some of these. It seems best to consider these future possibilities based on whether the end-result seems natural or strange. 
- -```rust -impl Vexation { - fn do_something_to_vec(self: Vec) { } - fn do_something_to_maybeuninit(self: MaybeUninit) {} -} - -fn main { - let mut v = Vec::new(); - v.push(Vexation); - v.do_something_to_vec(); // this seems weird and I can't imagine a use-case - - let mut m = MaybeUninit::::uninit(); - m.do_something_to_maybeuninit(); // this seems fine and useful and so maybe we should in future implement Receiver for MaybeUninit -} -``` - -For now, though, we should clearly restrict `Receiver` to those types for which there's a demonstrated need. +There seems to be no disadvantage to doing this - taking `Cell` as an example, it would only have any effect on the behavior of code if somebody implemented a method taking `Cell` as a receiver. On the other hand, it's hard to imagine use-cases for some of these. For now, though, we should clearly restrict `Receiver` to those types for which there's a demonstrated need. # Feature gates From ab248bbd88fe549f78a2a445931980fe7417a18a Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Mon, 6 May 2024 00:49:08 +0000 Subject: [PATCH 47/47] Prepare RFC 3519 to be merged The FCP for RFC 3519 ("Arbitrary self types v2") is complete. Let's reuse the existing `arbitrary_self_types` feature flag for this work, and similarly, let's reuse the existing tracking issue. 
--- text/3519-arbitrary-self-types-v2.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index a5fab0a45fc..6ffc600ba12 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -1,7 +1,7 @@ -- Feature Name: Arbitrary Self Types 2.0 +- Feature Name: `arbitrary_self_types` - Start Date: 2023-05-04 - RFC PR: [rust-lang/rfcs#3519](https://github.com/rust-lang/rfcs/pull/3519) -- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) +- Tracking Issue: [rust-lang/rust#44874](https://github.com/rust-lang/rust/issues/44874) # Summary [summary]: #summary