Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add JoinHandle::into_join_future(). #131389

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

Conversation

kpreid
Copy link
Contributor

@kpreid kpreid commented Oct 8, 2024

This allows spawned threads to be incorporated into Future-based concurrency control without needing to add separate result-reporting channels and an additional layer of catch_unwind() to the thread functions. I believe this will be useful to async/blocking interop and for various applications which want to manage parallel tasks in a lightweight way.

There is a small additional cost which is paid even if the mechanism is unused: the algorithm built in to the shutdown of a spawned thread must obtain and invoke a Waker, and the Packet internal struct is larger by one Mutex<Waker>. In the future, this Mutex should be replaced by something equivalent to futures::task::AtomicWaker, which will be more efficient and eliminate deadlock and blocking hazards, but std doesn't contain one of those yet. I am unsure whether the added cost is significant relative to the overall cost of thread creation and destruction.

This is not an impl IntoFuture for JoinHandle so that it can avoid being insta-stable; particularly because during the design discussion, concerns were raised that a proper implementation should obey structured concurrency via an AsyncDrop that forces waiting for the thread. I personally think that would be a mistake, and structured spawning should be its own thing, but this choice of API permits either option in the future by keeping everything unstable, where a trait implementation would not.

Previous discussion, including why this is worth doing:

@rustbot label: +T-libs-api -T-libs

r? libs-api

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Oct 8, 2024
@kpreid
Copy link
Contributor Author

kpreid commented Oct 8, 2024

@rustbot label -T-libs

@rustbot rustbot removed the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Oct 8, 2024
@bors
Copy link
Contributor

bors commented Oct 12, 2024

☔ The latest upstream changes (presumably #131612) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Oct 24, 2024

☔ The latest upstream changes (presumably #123550) made this pull request unmergeable. Please resolve the merge conflicts.

@Amanieu
Copy link
Member

Amanieu commented Nov 3, 2024

The implementation looks fine, however it suffers from a similar problem as #116237: this isn't a proper "join" operation since the thread may still be executed when the future indicates readiness. This can be a problem since TLS destructors may still be executing in that thread.

@Amanieu
Copy link
Member

Amanieu commented Nov 3, 2024

I don't see a good solution except to document this as a caveat (and do the same for Thread::scope).

We could try to register a TLS destructor for Packet before any other code runs, but TLS destruction order is unspecified in both pthread and C++ so it's not something we can rely on.

This allows spawned threads to be incorporated into `Future`-based
concurrency control without needing to add separate result-reporting
channels and an additional layer of `catch_unwind()` to the thread
functions. I believe this will be useful to async/blocking interop
and for various applications which want to manage parallel tasks in
a lightweight way.

There is a small additional cost which is paid even if the mechanism is
unused: the algorithm built into the shutdown of a spawned thread must
obtain and invoke a `Waker`, and the `Packet` internal struct is larger
by one `Mutex<Waker>`. In the future, this `Mutex` should be replaced by
something equivalent to `futures::task::AtomicWaker`, which will be more
efficient and eliminate deadlock and blocking hazards, but `std` doesn't
contain one of those yet.

This is not an `impl IntoFuture for JoinHandle` so that it can avoid
being insta-stable; particularly because during the design discussion,
concerns were raised that a proper implementation should obey structured
concurrency via an `AsyncDrop` that forces waiting for the thread.
I personally think that would be a mistake, and structured spawning
should be its own thing, but this choice of API permits either option in
the future by keeping everything unstable, where a trait implementation
would not.
@kpreid
Copy link
Contributor Author

kpreid commented Nov 14, 2024

I don't see a good solution except to document this as a caveat (and do the same for Thread::scope).

I’ve added a mention of this consideration to the documentation that this PR adds (i.e. not including Thread::scope). I wasn’t sure how best to insert it into the documentation without too many miscellaneous paragraphs on scattered topics; I decided to create a list of # Behavior details, but I’d like to know if there's something that would be more in line with existing std documentation idiom.

/// * Unlike [`JoinHandle::join()`], the thread may still exist when the future resolves.
///   In particular, it may still be executing destructors for thread-local values.

I also note that this API design prevents one possible solution: run the JoinFuture, then after it completes, call blocking JoinHandle::join() on the assumption that it will complete quickly enough. But, that result can still be achieved by using an explicit channel instead of JoinFuture (the status quo), so this is fine as long as we see into_join_future() as a convenience feature rather than “should be able to be used for everything”.

Another approach, not reasonable in current std, would be to poll for thread completion on a timer, perhaps informed by a TLS destructor hook of its own, to avoid blocking — hm, actually, Thread::is_finished() has the exact same considerations, and it does have some documentation:

/// This might return `true` for a brief moment after the thread's main
/// function has returned, but before the thread itself has stopped running.
/// However, once this returns `true`, [`join`][Self::join] can be expected
/// to return quickly, without blocking for any significant amount of time.

This doesn’t mention TLS specifically, and it makes the implicit claim that TLS destruction will complete quickly, which isn’t guaranteed. Should this be changed too? Regardless, it’s some additional precedent for not worrying too much about TLS destruction.

library/std/src/thread/mod.rs Outdated Show resolved Hide resolved
Comment on lines 1984 to 1985
/// * Unlike [`JoinHandle::join()`], the thread may still exist when the future resolves.
/// In particular, it may still be executing destructors for thread-local values.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not correct. Since JoinInner::join is called in take_result, the thread will be properly joined, resulting in blocking during the execution of the TLS destructors. IMHO that's a very reasonable behaviour as TLS destructors shouldn't be doing anything except freeing resources anyway, but it's not what's described here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point: take_result will actually block here if the thread is marked as finished but still executing TLS destructors. This is arguably the more correct behavior, but maybe not what async users are expecting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now updated the documentation to say that it currently blocks on TLS destruction, but is not guaranteed to continue doing that (because that seems an appropriately conservative starting point).

mem::replace(&mut *guard, placeholder)
};

// Here `their_packet` gets dropped, and if this is the last `Arc` for that packet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just randomly looking at this patch and found a potential issue.

Don't you have a race here?
What if the Future is polled right now?
We already took the waker which was still the noop when we cloned it and released the Mutex on the waker.
Now the Future sets the waker and then check again that it is not finished. But we as we haven't decremented the ref-count yet (we do that only in the next line) so the Future will return Pending and nobody is ever going to wake it.

Since you are using a Mutex for the synchronization of the Waker, I suggest to also use this lock to communicate to the future that we are finished and that the result can be taken out even if the ref-count is still 2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants