Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arbitrary self types v2 #3519

Merged
merged 49 commits into from
May 6, 2024

Conversation

adetaylor
Copy link
Contributor

@adetaylor adetaylor commented Nov 1, 2023

This PR suggests small changes to the existing unstable "aribtrary self types" feature to make it more flexible. In particular, it suggests driving this feature from a new (ish) Receiver trait instead of from Deref, but to maintain compatibility by having a blanket implementation for all Deref types.

This is a squashed commit of many edits by various folks including @Urhengulas, @Veykril , @madsmtm and myself. Thanks also to @davidhewitt, @Manishearth and many folks over on Zulip for feedback.

Rendered

Tracking issue:

adetaylor and others added 2 commits November 1, 2023 18:06
This PR suggests small changes to the existing unstable "aribtrary self types"
feature to make it more flexible. In particular, it suggests driving this
feature from a new (ish) "Receiver" trait instead of from Deref, but to
maintain compatibility by having a blanket implementation for all Deref types.

This is a squashed commit of much work by various folks including Johann
Hemmann, Lukas Wirth, Mads Marquart and myself. Thanks also to David Hewitt and
Manish Goregaokar for feedback.

Co-authored-by: Johann Hemmann <johann.hemmann@code.berlin>
@adetaylor
Copy link
Contributor Author

@traviscross traviscross added the T-lang Relevant to the language team, which will review and decide on the RFC. label Nov 1, 2023

We don't want to encourage the use of raw pointers, and would prefer rather that raw pointers are wrapped in a custom smart pointer that encodes and documents the invariants. So, there's an argument not to add the raw pointer support.

However, the current unstable `arbitrary_self_types` feature provides support for raw pointer receivers, and with years of experience no major concerns have been spotted. We would prefer not to deviate from the existing proposal more than necessary. Moreover, we are led to believe that raw pointer receivers are quite important for the future of safe Rust, because stacked borrows makes it illegal to materialize references in many positions, and there are a lot of operations (like going from a raw pointer to a raw pointer to a field) where users don't need to or want to do that. We think the utility of including raw pointer receivers outweighs the risks of tempting people to over-use raw pointers.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't that mean that adding methods on *const T, like add(), is a breaking change? How has this been dealt with so far?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. There are various Rust standard library types - Box, Arc etc. - where by policy no new methods are added because it would be a breaking change. Instead, associated functions are added.

If arbitrary self types is enabled (either the existing nightly unstable version, or our slightly tweaked version here) then that policy would need to be extended to the raw pointer types as well.

The Rust community therefore needs to decide which is higher priority:

  • being able to add new methods to raw pointer types; or
  • being able to receive method calls by raw pointer.

Unless someone can see some workaround or compromise?

@Manishearth hello, I think you were especially keen that arbitrary self types continues to support raw pointers. Do you have any views?

(Personally I think this might be a good argument to enable arbitrary self types without raw pointers, and therefore to require people to write their own newtype wrappers if they want method dispatch).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(there's a little bit more discussion of this in this comment thread below which I'm going to close to avoid duplication)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to reply to your comment about changing priority: I think this new problem is a much better problem. The original issue is "the pointer type *const T can't add any new methods ever", the new issue is "crate A can add foo(*const Self) methods, except that if they're from the set of known *const T methods, this is a major semver change". This is a bit surprising but highly manageable.

What worries me more is the high confusion cost of making method resolution more complicated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still thinking this through but I wanted to note that we Rust doesn't current shadow things invisibly; instead it gives an error message if there's ambiguity.

   = warning: once this associated item is added to the standard library, the ambiguity may cause an error or change in behavior!
   = note: for more information, see issue #48919 <https://github.com/rust-lang/rust/issues/48919>
   = help: call with fully qualified syntax `MyType::addr(...)` to keep using the current method
   = help: add `#![feature(strict_provenance)]` to the crate attributes to enable `ptr::const_ptr::<impl *const T>::addr`
   = note: `#[warn(unstable_name_collisions)]` on by default

This RFC should be more explicit somewhere about the existence of these errors and warnings, especially in the Method Shadowing section - I'll adjust.

Of course, this doesn't solve the issue here: if Rust adds a new <raw pointer>::repaint() method, that will cause downstream crates to cease building if they had:

struct Bedroom;
impl Bedroom {
  fn repaint(self: *const Bedroom) {}
}

@Nadrieril 's proposal, as I understand it, is that if method resolution does result in ambiguities in such a situation, where raw pointers are involved, we don't show an error for the ambiguity but instead automatically pick Bedroom::repaint and show a warning.

I do agree that seems to be manageable and I don't think it makes method resolution cognitively more complex, because any such situations will result in a nice clear warning explaining the situation. It might make the code significantly more complex - I'll look into it.

Copy link
Member

@Nadrieril Nadrieril Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We mustn't change what type would be picked by Deref indeed. My guess is that the rule we want would apply only in case of an ambiguity that involves a Receiver type, and would favor the self: P<Self> method over P's inherent methods. This shouldn't be a breaking change if there isn't a Receiver involved, or if the Receiver has no inherent methods (like Box or Rc). I'd need to write it down clearly to be fully convinced that works though

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had an example the other day: say x: &Arc<Box<T>> and I call x.foo(). Here are the methods that could be called, in my proposed order of priority ("most specific first"):

  1. foo(&Arc<Box<Self>>) on T
  2. foo(&Arc<Self>) on Box<T>
  3. foo(&Box<Self>) on T
  4. foo(&self) on Arc<T>
  5. foo(&self) on Box<T>
  6. foo(&self) on T
  7. any trait methods

If we want "adding inherent methods to a Receiver type is not a breaking change", we need 4, 5 and 6 to be the lowest priority before trait methods. Question is: does this order break anything compared to what's allowed on stable today?

The current implementation of arbitrary_self_types sometimes errors on some of these combinations. The interesting cases for us is when 3 and 5 are available, or 1 and 4. This currently errors as ambiguous; we would instead pick 3 and 1 respectively.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so I did a thorough experiment (try out my madness here). It appears the algorithm picks 4 over 3 in my list above, which could appear to break my proposal. The rest of the order is ok, so today's method resolution algorithm is compatible with this ordering:

a. foo(&Arc<Box<Self>>) on T
b. foo(&Arc<Self>) on Box<T>
c. foo(&self) on Arc<T>
d. foo(&Box<Self>) on T
e. foo(&self) on Box<T>
f. foo(&self) on T

However! All is still fine. Indeed, if Arc was not Deref, we would only consider the following cases:

a. foo(&Arc<Box<Self>>) on T
b. foo(&Arc<Self>) on Box<T>
c. foo(&self) on Arc<T>

Here adding an inherent method to Arc would not be a breaking change. The reason adding an inherent method to Arc is breaking is because Arc: Deref, and we already know that adding methods to Deref types is a breaking change.

In short: if we take the current implementation of arbitrary_self_types and simply accept more cases that are currently errors, we can get the order above and the property that "adding inherent methods to a Receiver type is not a breaking change".

Moreover! Given that the only stable Receiver types don't have inherent methods (I think, right?), I believe we can even change this order a bit without breakage. It will only break users of the unstable arbitrary_self_types feature in that one corner case. So we can pick my initial "most specific first" order too if we prefer.

I think that proves that we're good? Did I miss anything?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I grepped through std: the stable Receivers are exactly Box, Rc, Arc, Pin, &T, and &mut T. None of these have inherent methods, except Pin, which has e.g. Pin::as_ref(). So we can't change the order. But we can still do the first thing

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Manishearth hello, I think you were especially keen that arbitrary self types continues to support raw pointers. Do you have any views?

Missed this. Yeah, I think it's important for the ergonomics of unsafe code. Wrapper types are an okay workaround, as are free functions.

Raw pointers are basically the native rust version of "I want to be able to hold a CppPointer<T> without breaking it", so it would be ideal for

I think #[fundamental]'s behavior may be useful for dealing with raw pointer receivers: I don't think there's anything new here that doesn't apply to &T as well. I think it's fine for Rust to need new editions for new methods on *const T; and there are some ways of making that work even without editions.

Comment on lines 117 to 126
A blanket implementation is provided for any type that implements `Deref`:

```rust
impl<P: ?Sized, T: ?Sized> Receiver for P
where
P: Deref<Target = T>,
{
type Target = T;
}
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is actually a problem here that I didn't realize back when we talked about this. Implementing Receiver is effectively a promise to not add any new methods to the implementing type (given the reasons already outlined here, as adding methods could break downstream users for these kinds of types). This impl means, this promise now propagates to Deref implementations as well which seems like a very much unwanted side effect, especially given all the current Deref impls out there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this already true of Deref implementations? Adding a new method to a type A implementing Deref<Target = B> will shadow a method of the same name on B, so I think there is no change in semantics by adding Receiver to the mix?

Copy link
Contributor

@davidhewitt davidhewitt Nov 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@Veykril Veykril Nov 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because you can't add inherent impls to a foreign type, so you can't add new inherent methods to them (ignoring fundamental types). With the Receiver trait you can kind of add new inherent methods to foreign types though.

//- crate: foo
pub struct Foo<T>(pub T);
impl<T> Receiver for Foo<T> {
    type Target = T;
}
//- crate: bar (depends on foo)
use foo::Foo;
struct Bar;
impl Bar {
    fn foobar(self: Foo<Bar>) {}
}
fn main() {
    Foo(Bar).foobar();
}

Adding a fn foobar(self) method to Foo would break the crate bar in this case, so adding a method to a Receiver implementing type is a breaking change. And since Deref here implies implementing Receiver, the same issue arises from just Deref implementations, that is the same issue would occur if instead of the Receiver impl in that example we had

impl<T> Deref for Foo<T> {
    type Target = T;
    fn deref(&self) -> &Self::Target { &self.0 }
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discarding Receiver from this RFC for the moment, I'm still missing what makes you take the stance that Deref implementors on today's Rust don't already carry the same promise not to add new methods. As far as I understand, this problem already exists with Deref, so adding Receiver to all Deref types changes nothing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also agree that it's a breaking but non-major change. Adding a blanket impl for an existing trait, implementing a new method on an existing type, and newly implementing a trait for an existing type are all breaking changes (and ones deemed non-major) regardless of arbitrary self types.

Copy link
Member

@Nadrieril Nadrieril Nov 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. Doesn't Deref already have this exact problem though? That's exactly why Box/Rc/etc have no normal methods, since adding a fn foo(&self) to Box would break x.foo() on x: Box<Foo> if Foo has fn foo(&self)

Copy link
Member

@Veykril Veykril Nov 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how Deref already has this problem (unless you are referring to the current Deref version of this feature prior to this RFC). Box, Rc, Pin etc have htis problem because they are already allowed as arbitrary receivers as they have been special cased. Deref has no say in that. With the RFC as written, all Deref implementations will have this problem. Anyways, even if this is considered a minor breakage only (which I'd argue against personally, it very much feels like a major breakage to me), the RFC should definitely talk about this, both in that implementing Receiver has this semver problem as well as that the blanket impl will extend it to Deref impls.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I mean plain Deref. Let me spell it out:

//- crate: foo (depends on nothing)
pub struct Foo<T>(pub T);
impl<T> Deref for Foo<T> {
    type Target = T;
    fn deref(...) {...}
}
//- crate: bar (depends on nothing)
pub struct Bar;
impl Bar {
    pub fn foobar(&self) {}
}
//- crate: qux (depends on bar and foo)
fn main() {
    Foo(Bar).foobar(); // resolves to `Bar::foobar`
}

Now, if crate foo adds the following:

impl<T> Foo<T> {
    pub fn foobar(&self) {}
}

then Foo(Bar).foobar() in crate qux now resolves to Foo<T>::foobar. I imagine this is a major breaking change since now qux could still compile yet be using a completely different function.

This is the same problem; Deref pointers already have to deal with it today. Hence I conclude: adding Receiver into the mix should not change anything, all is good.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @Nadrieril, this is a non-issue for this RFC, since the problem is already present on Deref types.

The RFC actually has a section that talks about this, I've updated that in #3519 (review) to include a few more references to external resources that further explain the problem.

I'll also link to rust-lang/rust-clippy#11820, which proposes a new clippy lint to catch inherent methods on smart pointers, to hopefully help implementers of such types (including in the future, implementers of Receiver) to avoid this pitfall.

@programmerjake
Copy link
Member

another good use case for arbitrary self types that may be worth mentioning in the rfc:

#[derive(Default)]
pub struct GCArena {
    objects: Vec<Rc<dyn Any>>,
}

impl GCArena {
    pub fn alloc<T: 'static>(&mut self, value: T) -> Gc<T> {
        let rc: Rc<T> = Rc::new(value);
        let weak = Rc::downgrade(&rc);
        self.objects.push(rc);
        Gc(weak)
    }
}

#[derive(Clone)]
pub struct Gc<T: 'static + ?Sized>(Weak<T>);

impl PartialEq for Gc<T: 'static + ?Sized> {
    fn eq(&self, other: &Gc<T>) -> bool {
        self.0.ptr_eq(&other.0)
    }
}

impl Eq for Gc<T: 'static + ?Sized> {}

impl Hash for Gc<T: 'static + ?Sized> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        (self.0.as_ptr() as *const ()).hash(state);
    }
}

impl<T: 'static + ?Sized> Receiver for Gc<T> {
    type Target = T;
}

impl<T: 'static + ?Sized> Receiver for Gc<T> {
    pub fn get(&self) -> Rc<T> {
        self.0.upgrade().expect("GCArena has been dropped")
    }
}

// demo of using Gc:

pub struct Node<T> {
    pub edges: Vec<Gc<Node<T>>>,
    pub data: T,
}

impl<T> Node<T> {
    pub fn walk(self: &Gc<Self>, seen: &mut HashSet<Gc<Self>>, f: &mut impl FnMut(&Gc<Self>)) {
        if seen.insert(self.clone()) {
            f(self);
            for i in &self.get().edges {
                i.walk(seen, f);
            }
        }
    }
}

@clarfonthey
Copy link
Contributor

@programmerjake I think this is now covered by the paragraph I wrote that is now part of the RFC?

For example, taking &Arc allows me to both clone the smart pointer (noting that the underlying T might not implement Clone) in addition to access the data inside the type, which is useful for some methods. Also, being able to change a method from accepting &self to self: &Arc can be done in a mostly frictionless way, whereas changing from &self to a static method accepting &Arc will always require some amount of refactoring.

This feels like the same case you present: you want the ability to both access the inner type and clone the smart pointer, and simply accessing the inner type isn't enough.

@programmerjake
Copy link
Member

This feels like the same case you present: you want the ability to both access the inner type and clone the smart pointer, and simply accessing the inner type isn't enough.

well, i had intended to demonstrate a type where you can't just impl Deref, since the inner value may have already been dropped -- I didn't do a very good job of that.

another use case: a type where the inner value doesn't exist locally, e.g. for RPC with promise pipelining:

pub struct Id(u32);
pub struct Promise<T>(Id, PhantomData<T>);

impl<T> Receiver for Promise<T> {
    type Target = T;
}

pub struct MyRemoteData;
pub struct SomeIntermediate;

impl MyRemoteData {
    pub fn get() -> Promise<Self> {
        ...
    }
    pub fn foo(self: Promise<Self>, a: i32) -> Promise<SomeIntermediate> {
        ...
    }
}

impl SomeIntermediate {
    pub fn bar(self: Promise<Self>) -> Promise<()> {
        ...
    }
}

pub async fn demo() {
    MyRemoteData::get().foo(5).bar().await;
}

@rfcbot
Copy link
Collaborator

rfcbot commented Apr 19, 2024

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

@traviscross traviscross removed the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label Apr 24, 2024
The FCP for RFC 3519 ("Arbitrary self types v2") is complete.  Let's
reuse the existing `arbitrary_self_types` feature flag for this work,
and similarly, let's reuse the existing tracking issue.
@traviscross traviscross merged commit 0d835b1 into rust-lang:master May 6, 2024
@traviscross
Copy link
Contributor

The lang team has accepted this RFC, and we've now merged it.

Thanks to @adetaylor for pushing forward this important work, and thanks to all of the reviewers who contributed helpful feedback.

For further updates on this work, follow the tracking issue:

bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 16, 2024
…_deshadowing_probes, r=<try>

Arbitrary self types v2: probe for more methods.

This PR is one of the first steps towards arbitrary self types v2, RFC [3519](rust-lang/rfcs#3519) (rust-lang#44874). Specifically, it is [step 2 of the plan outlined here](rust-lang#44874 (comment)). That says:

> 1. **Search for potentially conflicting method candidates** per the [deshadowing part of the RFC](https://github.com/rust-lang/rfcs/blob/master/text/3519-arbitrary-self-types-v2.md#compiler-changes-deshadowing). This is the riskiest part - per
    > the production of errors requires us to interrogate more candidates to look for potential conflicts, so this could have a compile-time performance penalty which we should measure
>    let's attempt to land this first and see if performance issues show up. If so, it's back to the drawing board.

So - this commit isn't useful in itself but we need to determine its performance and any unexpected compatibility problems. That's what this PR is for. We shouldn't necessarily merge it.

Here's what this commit does.

Rust prioritizes method candidates in this order:
1. By value;
2. By reference;
3. By mutable reference;
4. By const ptr.

Previously, if a suitable candidate was found in one of these earlier categories, Rust wouldn't even move onto probing the other categories.

As part of the arbitrary self types work, we're going to need to change that - even if we choose a method from one of the earlier categories, we will sometimes need to probe later categories to search for methods that we may be shadowing.

This commit adds those extra searches for shadowing, but it does not yet produce an error when such shadowing problems are found. That will come in a subsequent commit, by filling out the 'check_for_shadowing' method.

We're adding the extra searches in a preliminary commit because they are the risky bit. We want to find any functional or performance problems from these extra searches before we proceed with the actual work on arbitrary self types and specifically on extra deshadowing errors.
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 8, 2024
…_deshadowing_probes_v2, r=<try>

Arbitrary self types v2: probe for more methods, take two

This is a new version of rust-lang#127812. Here's what I said last time, which still applies:

This PR is one of the first steps towards arbitrary self types v2, RFC [3519](rust-lang/rfcs#3519) (rust-lang#44874). Specifically, it is [step 2 of the plan outlined here](rust-lang#44874 (comment)). That says:

> 1. **Search for potentially conflicting method candidates** per the [deshadowing part of the RFC](https://github.com/rust-lang/rfcs/blob/master/text/3519-arbitrary-self-types-v2.md#compiler-changes-deshadowing). This is the riskiest part - per
    > the production of errors requires us to interrogate more candidates to look for potential conflicts, so this could have a compile-time performance penalty which we should measure
>    let's attempt to land this first and see if performance issues show up. If so, it's back to the drawing board.

So - this commit isn't useful in itself but we need to determine its performance and any unexpected compatibility problems. That's what this PR is for. We shouldn't necessarily merge it.

Here's what this PR does.

Rust prioritizes method candidates in this order:
1. By value;
2. By reference;
3. By mutable reference;
4. By const ptr.

Previously, if a suitable candidate was found in one of these earlier categories, Rust wouldn't even move onto probing the other categories.

As part of the arbitrary self types work, we're going to need to change that - even if we choose a method from one of the earlier categories, we will sometimes need to probe later categories to search for methods that we may be shadowing.

This commit adds those extra searches for shadowing, but it does not yet produce an error when such shadowing problems are found. That will come in a subsequent commit, by filling out the 'check_for_shadowing' method.

We're adding the extra searches in a preliminary commit because they are the risky bit. We want to find any functional or performance problems from these extra searches before we proceed with the actual work on arbitrary self types and specifically on extra deshadowing errors.

--

What's different this time relative to rust-lang#127812? Well, this is a more invasive change to the probing algorithm which hopefully minimizes the amount of work done during the extra probing for shadowed methods. Specifically, we only need to warn about potentially shadowed methods if three conditions are true:

* the shadowed method is not the same as the shadower (different `DefId`s)
* the potential shadower involves the same `self` type as the shadowed method
* the potential shadower has been found at a different position along the chain of `Receiver` traits than the shadowed.

Previously, we checked these conditions only when a potentially shadowed pick was found. We now apply them during the search process, which should minimize the amount of work required.
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 8, 2024
…_deshadowing_probes_v2, r=<try>

Arbitrary self types v2: probe for more methods, take two

This is a new version of rust-lang#127812. Here's what I said last time, which still applies:

This PR is one of the first steps towards arbitrary self types v2, RFC [3519](rust-lang/rfcs#3519) (rust-lang#44874). Specifically, it is [step 2 of the plan outlined here](rust-lang#44874 (comment)). That says:

> 1. **Search for potentially conflicting method candidates** per the [deshadowing part of the RFC](https://github.com/rust-lang/rfcs/blob/master/text/3519-arbitrary-self-types-v2.md#compiler-changes-deshadowing). This is the riskiest part - per
    > the production of errors requires us to interrogate more candidates to look for potential conflicts, so this could have a compile-time performance penalty which we should measure
>    let's attempt to land this first and see if performance issues show up. If so, it's back to the drawing board.

So - this commit isn't useful in itself but we need to determine its performance and any unexpected compatibility problems. That's what this PR is for. We shouldn't necessarily merge it.

Here's what this PR does.

Rust prioritizes method candidates in this order:
1. By value;
2. By reference;
3. By mutable reference;
4. By const ptr.

Previously, if a suitable candidate was found in one of these earlier categories, Rust wouldn't even move onto probing the other categories.

As part of the arbitrary self types work, we're going to need to change that - even if we choose a method from one of the earlier categories, we will sometimes need to probe later categories to search for methods that we may be shadowing.

This commit adds those extra searches for shadowing, but it does not yet produce an error when such shadowing problems are found. That will come in a subsequent commit, by filling out the 'check_for_shadowing' method.

We're adding the extra searches in a preliminary commit because they are the risky bit. We want to find any functional or performance problems from these extra searches before we proceed with the actual work on arbitrary self types and specifically on extra deshadowing errors.

--

What's different this time relative to rust-lang#127812? Well, this is a more invasive change to the probing algorithm which hopefully minimizes the amount of work done during the extra probing for shadowed methods. Specifically, we only need to warn about potentially shadowed methods if three conditions are true:

* the shadowed method is not the same as the shadower (different `DefId`s)
* the potential shadower involves the same `self` type as the shadowed method
* the potential shadower has been found at a different position along the chain of `Receiver` traits than the shadowed.

Previously, we checked these conditions only when a potentially shadowed pick was found. We now apply them during the search process, which should minimize the amount of work required.
adetaylor added a commit to adetaylor/rust that referenced this pull request Sep 4, 2024
Adding tests for cases where diagnostics aren't as good as they should
be in the case of arbitrary self types.

These are slightly duplicative of the existing
arbitrary-self-from-method-substs.rs
test, but test more permutations so it seems worth adding to the
coverage as we explore improving the diagnostics here.

Improved diagnostics were suggested in commit 05c5caa

This is a part of the arbitrary self types v2 project,
rust-lang/rfcs#3519
and specifically the sub-issue exploring questions around
generics,
rust-lang#129147
adetaylor added a commit to adetaylor/rust that referenced this pull request Sep 4, 2024
Adding tests for cases where diagnostics aren't as good as they should
be in the case of arbitrary self types.

These are slightly duplicative of the existing
arbitrary-self-from-method-substs.rs
test, but test more permutations so it seems worth adding to the
coverage as we explore improving the diagnostics here.

Restritions here were suggested in commit 05c5caa

This is a part of the arbitrary self types v2 project,
rust-lang/rfcs#3519
rust-lang#44874
and specifically the sub-issue exploring questions around
generics,
rust-lang#129147
adetaylor added a commit to adetaylor/rust that referenced this pull request Sep 4, 2024
Adding tests for cases where diagnostics aren't as good as they should
be in the case of arbitrary self types.

These are slightly duplicative of the existing
arbitrary-self-from-method-substs.rs
test, but test more permutations so it seems worth adding to the
coverage as we explore improving the diagnostics here.

Restritions here were suggested in commit 05c5caa

This is a part of the arbitrary self types v2 project,
rust-lang/rfcs#3519
rust-lang#44874
and specifically the sub-issue exploring questions around
generics,
rust-lang#129147
adetaylor added a commit to adetaylor/rust that referenced this pull request Sep 5, 2024
Adding tests for cases where diagnostics aren't as good as they should
be in the case of arbitrary self types.

These are slightly duplicative of the existing
arbitrary-self-from-method-substs.rs
test, but test more permutations so it seems worth adding to the
coverage as we explore improving the diagnostics here.

Restritions here were suggested in commit 05c5caa

This is a part of the arbitrary self types v2 project,
rust-lang/rfcs#3519
rust-lang#44874
and specifically the sub-issue exploring questions around
generics,
rust-lang#129147
adetaylor added a commit to adetaylor/rust that referenced this pull request Sep 8, 2024
The RFC for arbitrary self types v2 declares that we should reject
"generic" self types. This commit does so.

The definition of "generic" was unclear in the RFC, but has been
explored in
rust-lang#129147
and the conclusion is that "generic" means any `self` type which
is a type parameter defined on the method itself, or references
to such a type.

This approach was chosen because other definitions of "generic"
don't work. Specifically,
* we can't filter out generic type _arguments_, because that would
  filter out Rc<Self> and all the other types of smart pointer
  we want to support;
* we can't filter out all type params, because Self itself is a
  type param, and because existing Rust code depends on other
  type params declared on the type (as opposed to the method).

This PR decides to make a new error code for this case, instead of
reusing the existing E0307 error. This makes the code a
bit more complex, but it seems we have an opportunity to provide
specific diagnostics for this case so we should do so.

This PR filters out generic self types whether or not the
'arbitrary self types' feature is enabled. However, it's believed
that it can't have any effect on code which uses stable Rust, since
there are no stable traits which can be used to indicate a valid
generic receiver type, and thus it would have been impossible to
write code which could trigger this new error case.
It is however possible that this could break existing code which
uses either of the unstable `arbitrary_self_types` or
`receiver_trait` features. This breakage is intentional; as
we move arbitrary self types towards stabilization we don't want
to continue to support generic such types.

This PR adds lots of extra tests to arbitrary-self-from-method-substs.
Most of these are ways to trigger a "type mismatch" error which
https://github.com/rust-lang/rust/blob/9b82580c7347f800c2550e6719e4218a60a80b28/compiler/rustc_hir_typeck/src/method/confirm.rs#L519
hopes can be minimized by filtering out generics in this way.
We remove a FIXME from confirm.rs suggesting that we make this change.
It's still possible to cause type mismatch errors, and a subsequent
PR may be able to improve diagnostics in this area, but it's harder
to cause these errors without contrived uses of the turbofish.

This is a part of the arbitrary self types v2 project,
rust-lang/rfcs#3519
rust-lang#44874

r? @wesleywiser
adetaylor added a commit to adetaylor/rust that referenced this pull request Sep 8, 2024
The RFC for arbitrary self types v2 declares that we should reject
"generic" self types. This commit does so.

The definition of "generic" was unclear in the RFC, but has been
explored in
rust-lang#129147
and the conclusion is that "generic" means any `self` type which
is a type parameter defined on the method itself, or references
to such a type.

This approach was chosen because other definitions of "generic"
don't work. Specifically,
* we can't filter out generic type _arguments_, because that would
  filter out Rc<Self> and all the other types of smart pointer
  we want to support;
* we can't filter out all type params, because Self itself is a
  type param, and because existing Rust code depends on other
  type params declared on the type (as opposed to the method).

This PR is a second attempt at achieving this, based on producing a new
well-formedness checking context which is only aware of the type params
of the impl block, not of the method itself.

It doesn't currently work because it turns out we do need some method
params under some circumstances, but raising it for completeness.

This PR adds lots of extra tests to arbitrary-self-from-method-substs.
Most of these are ways to trigger a "type mismatch" error which
https://github.com/rust-lang/rust/blob/9b82580c7347f800c2550e6719e4218a60a80b28/compiler/rustc_hir_typeck/src/method/confirm.rs#L519
hopes can be minimized by filtering out generics in this way.
We remove a FIXME from confirm.rs suggesting that we make this change.
It's still possible to cause type mismatch errors, and a subsequent
PR may be able to improve diagnostics in this area, but it's harder
to cause these errors without contrived uses of the turbofish.

This is a part of the arbitrary self types v2 project,
rust-lang/rfcs#3519
rust-lang#44874

r? @wesleywiser
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Sep 10, 2024
The `arbitrary_self_types` RFC [1] is accepted and the feature is being
worked on [2]. As part of the RFC, a new `Receiver` trait is to be added,
and it will be automatically implemented for all types that have `Deref`
implementation. This is different from the existing `Receiver` trait
that we use, which is a custom implementation that opts-in a type into
being used as receiver.

To prepare us for the change, remove the `Receiver` implementation and
the associated feature. To still allow `Arc` and others to be used as
method receivers, turn on `arbitrary_self_types` feature instead.

Cc: Adrian Taylor <ade@hohum.me.uk>
Link: rust-lang/rfcs#3519 [1]
Link: rust-lang/rust#44874 [2]
Signed-off-by: Gary Guo <gary@garyguo.net>
adetaylor added a commit to adetaylor/rust that referenced this pull request Sep 11, 2024
As part of the "arbitrary self types v2" project, we are going to
replace the current `Receiver` trait with a new mechanism based on a
new, different `Receiver` trait.

This PR renames the old trait to get it out the way. Naming is hard.
Options considered included:
* HardCodedReceiver (because it should only be used for things in the
  standard library, and hence is sort-of hard coded)
* LegacyReceiver
* TargetLessReceiver
* OldReceiver

These are all bad names, but fortunately this will be temporary.
Assuming the new mechanism proceeds to stabilization as intended, the
legacy trait will be removed altogether.

Although we expect this trait to be used only in the standard library,
we suspect it may be in use elsehwere, so we're landing this change
separately to identify any surprising breakages.

It's known that this trait is used within the Rust for Linux project; a
patch is in progress to remove their dependency.

This is a part of the arbitrary self types v2 project,
rust-lang/rfcs#3519
rust-lang#44874

r? @wesleywiser
adetaylor added a commit to adetaylor/rust that referenced this pull request Sep 11, 2024
As part of the "arbitrary self types v2" project, we are going to
replace the current `Receiver` trait with a new mechanism based on a
new, different `Receiver` trait.

This PR renames the old trait to get it out the way. Naming is hard.
Options considered included:
* HardCodedReceiver (because it should only be used for things in the
  standard library, and hence is sort-of hard coded)
* LegacyReceiver
* TargetLessReceiver
* OldReceiver

These are all bad names, but fortunately this will be temporary.
Assuming the new mechanism proceeds to stabilization as intended, the
legacy trait will be removed altogether.

Although we expect this trait to be used only in the standard library,
we suspect it may be in use elsehwere, so we're landing this change
separately to identify any surprising breakages.

It's known that this trait is used within the Rust for Linux project; a
patch is in progress to remove their dependency.

This is a part of the arbitrary self types v2 project,
rust-lang/rfcs#3519
rust-lang#44874

r? @wesleywiser
adetaylor added a commit to adetaylor/rust that referenced this pull request Sep 11, 2024
As part of the "arbitrary self types v2" project, we are going to
replace the current `Receiver` trait with a new mechanism based on a
new, different `Receiver` trait.

This PR renames the old trait to get it out the way. Naming is hard.
Options considered included:
* HardCodedReceiver (because it should only be used for things in the
  standard library, and hence is sort-of hard coded)
* LegacyReceiver
* TargetLessReceiver
* OldReceiver

These are all bad names, but fortunately this will be temporary.
Assuming the new mechanism proceeds to stabilization as intended, the
legacy trait will be removed altogether.

Although we expect this trait to be used only in the standard library,
we suspect it may be in use elsehwere, so we're landing this change
separately to identify any surprising breakages.

It's known that this trait is used within the Rust for Linux project; a
patch is in progress to remove their dependency.

This is a part of the arbitrary self types v2 project,
rust-lang/rfcs#3519
rust-lang#44874

r? @wesleywiser
adetaylor added a commit to adetaylor/rust that referenced this pull request Sep 11, 2024
As part of the "arbitrary self types v2" project, we are going to
replace the current `Receiver` trait with a new mechanism based on a
new, different `Receiver` trait.

This PR renames the old trait to get it out the way. Naming is hard.
Options considered included:
* HardCodedReceiver (because it should only be used for things in the
  standard library, and hence is sort-of hard coded)
* LegacyReceiver
* TargetLessReceiver
* OldReceiver

These are all bad names, but fortunately this will be temporary.
Assuming the new mechanism proceeds to stabilization as intended, the
legacy trait will be removed altogether.

Although we expect this trait to be used only in the standard library,
we suspect it may be in use elsehwere, so we're landing this change
separately to identify any surprising breakages.

It's known that this trait is used within the Rust for Linux project; a
patch is in progress to remove their dependency.

This is a part of the arbitrary self types v2 project,
rust-lang/rfcs#3519
rust-lang#44874

r? @wesleywiser
adetaylor added a commit to adetaylor/rust that referenced this pull request Sep 11, 2024
As part of the "arbitrary self types v2" project, we are going to
replace the current `Receiver` trait with a new mechanism based on a
new, different `Receiver` trait.

This PR renames the old trait to get it out the way. Naming is hard.
Options considered included:
* HardCodedReceiver (because it should only be used for things in the
  standard library, and hence is sort-of hard coded)
* LegacyReceiver
* TargetLessReceiver
* OldReceiver

These are all bad names, but fortunately this will be temporary.
Assuming the new mechanism proceeds to stabilization as intended, the
legacy trait will be removed altogether.

Although we expect this trait to be used only in the standard library,
we suspect it may be in use elsehwere, so we're landing this change
separately to identify any surprising breakages.

It's known that this trait is used within the Rust for Linux project; a
patch is in progress to remove their dependency.

This is a part of the arbitrary self types v2 project,
rust-lang/rfcs#3519
rust-lang#44874

r? @wesleywiser
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. T-lang Relevant to the language team, which will review and decide on the RFC. to-announce
Projects
None yet
Development

Successfully merging this pull request may close these issues.