Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closure PartialEq and Eq #3538

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

ColonelThirtyTwo
Copy link

@ColonelThirtyTwo ColonelThirtyTwo commented Dec 7, 2023

@pthariensflame
Copy link
Contributor

Rendered link is broken after second commit.

@ColonelThirtyTwo
Copy link
Author

Rendered link is broken after second commit.

Fixed

@SOF3
Copy link

SOF3 commented Dec 7, 2023

Adding an explicit impl PartialEq doesn't make the writer more aware of the equality constraint than not adding it when it comes to large closures, because one might still add a new variable to capture without realizing its impacts on PartialEq. If explicit annotation to impl PartialEq is a concern, perhaps it would be useful to have the captures explicit too (e.g. #3512).

@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Dec 7, 2023
@clarfonthey
Copy link
Contributor

clarfonthey commented Dec 7, 2023

Honestly, I feel like the subtlety of having PartialEq for closures isn't really worth it, but I'll defer to others in terms of that argument. The main issue is of what you're comparing, and it feels like a majority of this exists to get around poorly designed APIs rather than anything particularly fundamental.

One concern I have is with the syntax, since the + in the impl PartialEq + Eq could be ambiguous with addition, and closures are expressions. I don't know enough about the parser to be sure; hopefully someone more knowledgeable can comment.

The main thing that comes to mind here is that it feels very weird to want to ask if something has changed, by cloning it? For example, if we have an example like:

let v = 10;
let closure = move || println!("{v}");
Wrapper(closure)

Why not instead make the value parameterised, and then pass it to the closure?

let v = 10;
let closure = |v: i32| println!("{v}");
Wrapper(v, closure)

Or, if you can't permit this, perhaps the closure can take ownership of a cell or atomic instead?

let v = Arc::new(AtomicI32::new(10));
let v2 = v.clone();
let closure = move || println!("{}", v2.load(Ordering::Relaxed));
Wrapper(v, closure)

Like, something certainly doesn't feel right to me about the examples provided, like they're just trying to work around a poorly designed API instead of doing the right thing. But they could also just be toy examples and more representative examples could benefit from the proposal.

I dunno, it just feels like this has too many caveats to really be useful, and that the ability to implement PartialEq for closures will ultimately be more harmful than helpful. But again, this is after a first pass and not understanding the full context, so, I could be wrong. Just want to put out my initial thoughts.

@burdges
Copy link

burdges commented Dec 7, 2023

As noted in zulip, you want this declared explicitly both in the trait bounds, as well as when instantiating the closure, so maybe..

fn foo(f: impl PartialEq+Fn() -> T) { ... }
fn bar() {
    foo(#[derive(PartialEq)] || { ... })
}

If you missed either side, then you'll have counter intuitive behavior.

As serde_closure exists, I suppose "procmacro closures" sound promising, by which I mean some procmacros interface which exposes information about the captured variables, etc, so the procmacro code then write impls.

Already PartialEq can change meaning, but closures make this much more surprising: Raw pointer types use pointer equality, not structural equality, so refining unsafe code can cause breakage. Cow<'a, B> uses B not its <B as ToOwned>::Owned. HashMap, Vec, etc cause unpredictable performance. Also, if you compare Fn*() -> T closures, then you'd typically have meant to compare their evaluations, but maybe forgot the ().

@ColonelThirtyTwo
Copy link
Author

One concern I have is with the syntax, since the + in the impl PartialEq + Eq could be ambiguous with addition, and closures are expressions. I don't know enough about the parser to be sure; hopefully someone more knowledgeable can comment.

I don't think this is an issue since impl Trait on its own doesn't have any meaning as an expression.

The main thing that comes to mind here is that it feels very weird to want to ask if something has changed, by cloning it? For example, if we have an example like:

...

The main use case I have in mind for this is React-like UI frameworks, where the user's render function returns a tree of the components they want, and the library compares that tree with the current state of the UI, adding, removing, and moving components based on what the render function returned.

The library tries not to burden the render function with the current state of the UI so that the render function can be written in a declarative manner, in which the render function will create new closures every time it's called. The tree diffing algorithm would then need to compare those closures with the current state of the tree in order to know whether to refresh subcomponents or not.

Here's a bit more of a concrete example using Yew:

#[functional_component]
fn TodoView() -> Html {
    let todos = use_state(|| (0..3).map(|i| Todo { id: i }).collect::<Vec<_>>());
    html!{
        <main>
            {for todos.iter().map(|todo| html!{
                <SingleTodo
                    key={todo.id}
                    todo={todo.clone()}
                    delete_button_cb={impl PartailEq || {
                        // Delete code here
                    }}
                />
            })}
        </main>
    }
}

Today, delete_button_cb has to be put behind an Rc<dyn Fn> wrapper that implements PartialEq via Rc::ptr_eq - in this case, that would mean re-rendering every SingleTodo any time the TodoView changes since the delete_button_cb would be "different". If we could get PartialEq closures, the diff algorithm could see that the delete_button_cb won't change and prevent re-rendering subcomponents.

@ColonelThirtyTwo
Copy link
Author

As noted in zulip, you want this declared explicitly both in the trait bounds, as well as when instantiating the closure, so maybe..

fn foo(f: impl PartialEq+Fn() -> T) { ... }
fn bar() {
    foo(#[derive(PartialEq)] || { ... })
}

If you missed either side, then you'll have counter intuitive behavior.

I'm not sure if I'm understanding you properly, because I believe this is what my proposal already says, just with a different proposed syntax:

fn foo(f: impl PartialEq + Fn() -> T) { ... }
fn bar() {
    foo(impl PartailEq || { ... });
}

As serde_closure exists, I suppose "procmacro closures" sound promising, by which I mean some procmacros interface which exposes information about the captured variables, etc, so the procmacro code then write impls.

serde_closure doesn't have great ergonomics. Arguments have to be specified in generics and passed as tuples since it can't use the Fn* traits on its own objects, and I haven't been able to figure out how to work around rust-lang/rust#91966 with serde_closure.

Already PartialEq can change meaning, but closures make this much more surprising: Raw pointer types use pointer equality, not structural equality, so refining unsafe code can cause breakage. Cow<'a, B> uses B not its <B as ToOwned>::Owned. HashMap, Vec, etc cause unpredictable performance.

I'm not sure what you mean here. I don't see how closures are special w.r.t. pointer equality, or what Cow, HashMap, or Vec have to do with this.

Also, if you compare Fn*() -> T closures, then you'd typically have meant to compare their evaluations, but maybe forgot the ().

I don't think this is a huge footgun, since it exists in every dynamically typed language with first class functions, and doesn't appear to be a huge issue there.

@VitWW
Copy link

VitWW commented Dec 7, 2023

Technically saying, functions require checks for full permutations of possible arguments (which grows to infinity quite fast) and all inner states to make a conclusion of Equality and for practical purposes such checking is useless.

@burdges
Copy link

burdges commented Dec 7, 2023

I never claimed existing procmacros should be used for this feature. I suggested that "procmacro closures" would make rustc expose information about the captured variables, which then makes better ergonomics possible in serde_closure, PartialEq + Fn*, etc.

Cow<'a,B>: PartialEq is defined from B: PartialEq. If you change a captured x: <B as ToOwned>::Owned to x: Cow<'a,B> then your PartialEq can change, perhaps dramatically. It's common that A: Borrow<B> with A: PartialEq differing from B: PartialEq. It's less common with ToOwned, but still perfectly given some Default, so expect examples form rust-lang/rust#77688 reappear.

Afaik Vec<T> and [T] wind up just being preformance footguns because someone could suck them into some closure. In optimized code, you manually switch to pointer equality, by analyizing the underlying false negatives, but that's not an option here.

Anyways..

At a high level, this feels niche & footgun prone. If "procmacro closures" could provide this, then (a) fewer people use this, which overall reduces footgun exposure, while simultaniously (b) more projects benefit from closure syntax by doing whatever traits their niche requires, including core ones like Default, PartialOrd, etc. I'm arguing worse maybe better here.

@ColonelThirtyTwo
Copy link
Author

I never claimed existing procmacros should be used for this feature. I suggested that "procmacro closures" would make rustc expose information about the captured variables, which then makes better ergonomics possible in serde_closure, PartialEq + Fn*, etc.

I see - you mean a new form of procedural macros that expose the underlying closure fields in a way that the proc macro can see into. Is there an RFC for that?

Cow<'a,B>: PartialEq is defined from B: Partial. If you change a captured x: <B as ToOwned>::Owned to x: Cow<'a,B> then your PartialEq can change, perhaps dramatically. It's common that A: Borrow<B> with A: PartialEq differing from B: PartialEq. It's less common with ToOwned, but still perfectly given some Default, so expect examples form rust-lang/rust#77688 reappear.

If I'm understanding this right, the concern is that it's hard to tell what variables are being closed over and their types, so the types could change (ex. String to Cow<'a, str>) and that may affect the closure's PartialEq implementation?

Afaik Vec<T> and [T] wind up just being preformance footguns because someone could suck them into some closure. In optimized code, you manually switch to pointer equality, by analyizing the underlying false negatives, but that's not an option here.

And this is a concern that a closure closing over, say, a Vec<T> may cause the closure PartialEq implementation to take a long time to check each element? Which is more obvious in a structure where you have to name out the type, but not in a closure where you don't.

At a high level, this feels niche & footgun prone. If "procmacro closures" could provide this, then (a) fewer people use this, which overall reduces footgun exposure, while simultaniously (b) more projects benefit from closure syntax by doing whatever traits their niche requires, including core ones like Default, PartialOrd, etc. I'm arguing worse maybe better here.

I'm not going to disagree that its niche. Procmacro closures indeed sounds like a more flexible alternative. I brought up this RFC because from what I understand, it should be fairly easy to implement via copying what is already done for Clone.

@Kobzol
Copy link

Kobzol commented Dec 9, 2023

I understand the motivation of this feature for virtual DOM frameworks, but the syntax seems very verbose for that use case. Already move || is relatively verbose, but writing impl PartialEq move || for each UI callback, which can range in tens or hundreds even for a moderately sized app, seems like a lot of overhead.

@Aloso
Copy link

Aloso commented Dec 9, 2023

Perhaps a #[component] macro could add the required trait bounds automatically.

@scottmcm
Copy link
Member

This makes the exact details of what's being captured -- particularly fields vs full structs -- matter even more than it did before. At least with Copy, it was fine for us to change from capturing the whole value to capturing the field. But it's a huge difference, observable as more than just "did it or didn't it compile" to have PartialEq on the struct, even with an explicit opt-in on the closure.

As such, I remain skeptical of more traits for anything with implicit captures.

With an explicit exhaustive list of captures, though, things might be be different -- the fragility of behaviour to seemingly-irrelevant changes would turn into "no, you didn't capture that" errors instead. That could also give an order to be able to extend this to Ord, say, and names for use in a Debug.

@ahicks92
Copy link

So, two things I don't see spelled out at least on the GitHub side of things. In languages like JS one is getting reference equality, but in Rust this would be value equality.

First, it is very easy to capture (say) a large Vec for one mostly inconsequential operation but the closure must compare all fields rather than some explicit subsets of captures. While one can assume that this won't happen if fully in control of both sides, an API which invisibly relies on this could easily capture enormous data structures in contexts where it is assumed that comparisons are reasonably cheap, thus defeating the purpose. This would be basically impossible to debug, not only because it's hard to spot but because it is often the case that you end up with collections that mostly compare unequal until suddenly it has to do the first 99% to find the element at the end because of some pathological path.

Secondly, in JS at least, evaluating the expression () => { body } does not return the same reference each time. This means that you already have to pull such things out, because JS equality is reference equality. It seems to me that the problem could be solved in Rust with a little bit of macro or something like that--possibly even a non-macro API will do--by offering the same "actually I need to evaluate construction of this callback once" abstractions, which box closures behind a type which compares them by reference.

Notably, React's state abstractions will re-render if the closure is changed even if the new closure wouldn't do something different; today's status quo in Rust is at least as good, in the sense that a library should be able to offer closures-but-reference-equality to users that is at least as good as JS.

I honestly see the no explicit list of captures problem as large enough that the explicit syntax here seems like a waste, if nothing else; it's nice, but it's not going to really lead to better error messages, and it's not going to guard against my first concern around accidentally expensive impls. Public API surface area primarily occurs at function parameter and return value boundaries, and that is expressed in the constraints on type parameters. I don't see why an error message along the lines of "this closure should be Eq because it's used here and this capture isn't" doesn't solve the problem, modulo implementation difficulty which I can't comment on. The syntax is currently an annoying middle ground where it's not explicit enough to be useful, and it's not required enough on the face of it to give good error messages or constrain public APIs that we wouldn't get anyway.

@ColonelThirtyTwo
Copy link
Author

ColonelThirtyTwo commented Dec 22, 2023

I certainly could undo the changes I made to add in the explicit impl ... || {} syntax - I added it because of the zulip discussion. I could look into capturing syntax too.

Though I'm not sure if I should bother. I'm not sure what the next steps are for RFC proposals after posting them and it seems like this one is just sitting here with not really great reception.

@ahicks92
Copy link

The team has to look at it essentially. They have limited bandwidth and this isn't solving a huge ecosystem-wide problem. E.g. it's not Tait for instance. Not every good RFC gets a quick response, not that I've seen. I'd not take this as a sign of their opinion on your own especially not this close to the holidays.

My reception is neutral because while I agree that this solves a problem for some definition of solves and some definition of problem, it's also entirely possible to port React hooks to Rust and also to get JS-style reference semantics on closure comparisons without language help. I'm not sure if anyone has written crates for either, but I have given both thought in the past as well as ways to place dyn Fn optionally inside a SmallVec equivalent.

I'm just more concerned about the problems it introduces. Otherwise yes, neat solution to a somewhat rare but still common enough use case. I ask myself questions like "if we were doing a code review, how could I spot problems before production?" and here I cannot, when that problem is either "comparing things that don't make sense" or "comparing things that can be hugely expensive to compare". Consequently, on a team, this feature would--by whatever amount--reduce the reliability of the software the team produces. It is hard to quantify by how much in practice without widespread usage to draw conclusions from, but certainly in theory.

Also has anyone brought up the point that this can unintentionally cause one to rely on Eq impls from libraries, which may or may not change what they do? That's not necessarily a problem because it seems it'd be rare to have a library introduce a bug in PartialEq, but performance regressions seem like they'd be a reasonable thing to see happen and--as with the large containers style bug--this makes it super easy to just eagerly use every PartialEq impl on the planet.

@ahicks92
Copy link

So, here is a problem. Rust supports splitting borrows when using closures. See this: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=74e660a2bb07b2223ae831eba03cb828

In that example, Split doesn't have a derive, but c1 and c2 are still, technically, comparable. If there were a syntax for explicit captures, it'd need to handle this case; if not, what should the behavior here be? I don't think that letting a closure be comparable because it happens only to capture fields of a non-comparable type is very sensible, but doing anything else widens the impact of this change by turning off splitting these borrows.

@burdges
Copy link

burdges commented Dec 25, 2023

It shows the splitting better if you use mutable borrows like https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=95984003646c3144ac94f5ca3d201a35

If we #[derive(PartialEq)] for Split then c1 and c2 cannot access Split: PartialEq, since they cannot access Split. Yes, this looks problematic for explicit captures proposals.

As I said above, rustc could expose the anonymous type and implicit caputres to proc macros, which simplifies proc macros for this, serde_closure, and other traits.

@ahicks92
Copy link

I think it would be possible to write a proc macro to find the captured identifiers of a closure, though not split fields, and make that generic. I think that would let me write generic closures with the serde_closure trick so I've been thinking about it but it's not a small undertaking.

Good call on the mutable borrows. Didn't spot that myself.

Something else I thought of: it's not strictly guaranteed that comparison traits are pure. That is, a cmp b may return a different value each time. The trivial way to do it is random numbers, but the in-practice way is something like arc_swap or even just stdlib atomics and a wrapper struct that implements them. I can't immediately think of a good reason to do this, but it probably should be considered.

There's also the case of mutable statics. Both this and the above point may mean that for any closure c c == c may return different values at different times. Again not a common case, but I've certainly reached for mutable statics when I need to enable some testing logic that's not easily captured via cfg for instance.

Neither of those are condemnations from the language perspective--it's entirely possible to define the semantics and say you deserve what you get if you're evil--but in my view they do decrease the motivation. Reference equality avoids these pitfalls.

But here's a good one: if a closure is equal to itself and then mutates something it captures, what should libraries trying to use this feature do? For example just incrementing a counter. Saying "well Fn" isn't enough; atomics exist and are commonly used for such.

This RFC conflates Eq as in data with Eq as in execution, and is trying (or at least will be perceived as trying) to say that if two closures are Eq then they do the same thing. I see a few pieces here:

  • Something to propose explicit captures. Might be nice to get capture by clone on that but that's beside the point.
  • Something around function purity, so that Rust gets proper "no really this can't modify anything at all really we swear" semantics somewhere.
  • Finally, this RFC, but (probably) limited to pure closures

If you have explicit captures (thus avoiding the pitfalls around just what's included) and the requirement of purity, I think this gets somewhere; in that case you can know if two closures do the same thing. Otherwise, this is the much weaker guarantee that some bytes happen to be the same right now, effectively; in many ways that's even less useful than the reference equality of JS, since the thing involved in the comparison can mutate its own representation.

Finally I'll throw out that I'm pretty sure people want generic closures. I certainly do at any rate and I'm pretty sure that's had a lot of discussion over the years. This should probably consider that future as well.

The more I think about this RFC the less I like it in this form but the more I think of it as interesting. Useful, I'm not sure, but there's certainly good ideas here. It's just all the corners that nibble at it until there's enough problems that it falls over.

@burdges
Copy link

burdges commented Dec 25, 2023

I doubt explicit captures serve much purpose, likely #![feature(fn_traits)] superceeds them in practice.

I think it would be possible to write a proc macro to find the captured identifiers of a closure, though not split fields

We discussed this above, presumably serde_closure does this, and breaks on split borrows, but anyways this sounds nasty & unmaintainable.

Ideally, rustc could tell proc macros how it writes a given closure, and give the proc macro field access, so then derive code looks relatively normal. It's fine if humans cannot write this code, specify captures, etc.

Also, there are many traits which almost resemble closures, but cannot be closures for technical reasons, like they provide default methods that a few types override. #![derive(MyTrait)] |x,y| { .. } would permit interfaces that implement MyTrait from the supplied code.

@ahicks92
Copy link

If rustc tells proc macros how it writes a given closure, then that means running macros after that point. That would likely be a new kind of macro, would it not? Seems equivalent to solving the general problem wherein figuring out properties of types is entirely left to the ecosystem, which is fragile in way more ways than just this problem in any case.

In this case, I'd imagine split borrows are computed only after the borrow checker. So, maybe the macro spits out code but presumably that'd then need to backtrack and run the borrow checker again.

Opening this up to custom derives also makes the representation of closures fixed. So for example no raw pointers or compiler-generated undefined behavior but this is the compiler optimization or anything. In particular with the comparison traits, and with closures as they are today, I don't think anything stops the closure representation from being a single pointer to the current stack frame from which captures are taken under the condition that the closure only borrows from that frame. I don't think Rust does but it could if it wanted. Derives of built-in traits could still optimize some traits of course but the general case suddenly makes closures bigger/slower in the theoretical world where rustc got clever. I'm pretty sure that llvm couldn't do this kind of optimization because llvm doesn't do type layout changes and a generated struct with n fields always has n fields. Such things very much seem in rustc's wheelhouse unless someone knows things I don't.

Also this again circles back to generic closures. With derives as they are today, it is unlikely Rust will add a new kind of core type on which one may wish to use a custom derive, but generic closures would be a significant new closure syntax that would almost certainly break such derives. Are generic closures dead at this point? If so that's a shame, because I could really use them to port some C tricks that I'm having to use proc macros for.

@ColonelThirtyTwo
Copy link
Author

If rustc tells proc macros how it writes a given closure, then that means running macros after that point. That would likely be a new kind of macro, would it not? Seems equivalent to solving the general problem wherein figuring out properties of types is entirely left to the ecosystem, which is fragile in way more ways than just this problem in any case.

I think that's what @burdges was saying above - adding a new type of proc macro for closures that gets the info about the closed-over values. Though, I'm not sure how possible that is, especially with borrowing struct fields - the compiler would have to parse and typecheck enough to know that a field access is a direct access and not though a Deref, but not enough so that the tokens that the proc macro implementation returns can still be inserted.

@burdges
Copy link

burdges commented Dec 29, 2023

Yes, rustc would run the proc macro after rustc generates the closure. It'd only add code though, so I doubt this requires rustc "backtrack", merely that earlier passes run over the added code. I'd expect this imposes restrictions upon generated code of course.

Opening this up to custom derives also makes the representation of closures fixed.

Interesting, we seemingly do not care about the representation of the closure itself, but merely what values the closure provides. Again not impossible, but I think your point is: This is not easy. Alright fair, thanks.

@ahicks92
Copy link

Well, the thing is that nothing stops the macro from being evil and spitting out code that tries to access the closure's captures. Modifying the closure is also a problem but nothing prevents doing something like:

capture.x = 5;

// Actual closure stuff

What should one do if this is after type/borrow checking, and capture is immutable? You have to run again to know the write is happening.

But really the bigger problem is that running code this late is running after type inference. You can't just plop new code in because it can change type inference all the way up to the boundary established by the fn item, and even beyond if the fn is using impl trait (because the return type would change after this rewrite). Indeed it is quite possible that the "first pass" of type checking wouldn't be able to complete without the expansion, since the derived trait is presumably required. But the boundary here is potentially the entire crate graph if the user returns a closure.

I suspect macros like this are possible, but only if it's something more like "generates new files" and even then it wouldn't be easy and still has lots of these problems since that file could implement a trait, thus causing the same chaos. But if one is generating entirely new files rather than being able to modify the local scope, it's not possible to get access to the closure in a way which would allow for this.

I'm pretty sure that I could invent a counterexample for any simple enough to be useful version of custom derives on closures which would require rebuilding significant amounts of code, even without being up to date on compiler internals.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.