Make `task::Builder::spawn*` methods fallible #4823

ipetkov · 2022-07-11T20:23:25Z

Motivation

Tokio's current stable spawn* methods are all infallible but may internally panic on unrecoverable errors (for example, if the blocking threadpool is empty and the OS refuses to spawn additional threads). Unfortunately, this means that applications cannot opt into gracefully handling such situations.

Solution

Convert all task::Builder::spawn* methods to return a fallible io::Result<_>, giving applications the opportunity to opt into handling spawn errors themselves. This is also a fitting analogue to std::thread::Builder, which has fallible spawn methods (in contrast to std::thread::spawn, which panics on failure).
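For readers unfamiliar with the std::thread::Builder analogy, here is a minimal sketch of what a fallible spawn looks like on the standard-library side. It uses only std; the helper name `spawn_worker` and the thread name are hypothetical and chosen purely for illustration. It does not show tokio's task::Builder, which is gated behind unstable configuration.

```rust
use std::io;
use std::thread;

/// Hypothetical helper: spawn a named thread and report spawn failures
/// to the caller as an `io::Error` instead of panicking.
fn spawn_worker(name: &str) -> io::Result<thread::JoinHandle<u32>> {
    thread::Builder::new()
        .name(name.to_string())
        // `Builder::spawn` returns `io::Result<JoinHandle<T>>`, whereas
        // `std::thread::spawn` panics if the OS cannot create the thread.
        .spawn(|| 21 * 2)
}

fn main() {
    match spawn_worker("worker-0") {
        Ok(handle) => {
            let value = handle.join().expect("worker panicked");
            println!("worker returned {value}");
        }
        // The application decides how to react, e.g. retry later or shed load.
        Err(err) => eprintln!("could not spawn thread: {err}"),
    }
}
```

The point of the comparison is only the shape of the return type: the caller gets an `io::Result` and can choose its own recovery strategy.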

Given that task::Builder and its APIs are currently marked as unstable, this is a good time to make the change, before we start stabilizing them. As far as I could tell, spawn_blocking was the only API that internally panics on some errors instead of yielding them, so it has been updated to surface those errors via the task::Builder::spawn_blocking* APIs.

Note that the behavior of tokio::task::spawn_blocking is maintained as it was before this change:

- Invocations will panic if the threadpool has no available threads and the OS cannot spawn additional ones.
- If the runtime is shutting down, a dummy JoinHandle is returned without panicking.

Making the `task::Builder::spawn*` methods fallible allows applications to gracefully handle spawn errors (e.g. due to resource exhaustion) without tokio panicking internally. This change is also a good analogue for `std::thread::Builder`, which has fallible spawn methods (whereas `std::thread::spawn` internally panics).

Using `tokio::task::spawn_blocking` continues to exhibit the previous behavior (panic if there aren't any worker threads available to accept the task, but return a dummy handle if the runtime is shutting down).

tokio/src/runtime/blocking/pool.rs

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [tokio](https://tokio.rs) ([source](https://github.com/tokio-rs/tokio)) | dependencies | minor | `1.20.1` -> `1.21.0` |
| [tokio](https://tokio.rs) ([source](https://github.com/tokio-rs/tokio)) | dev-dependencies | minor | `1.20.1` -> `1.21.0` |

---

### Release Notes

<details>
<summary>tokio-rs/tokio</summary>

### [`v1.21.0`](https://github.com/tokio-rs/tokio/releases/tag/tokio-1.21.0)

[Compare Source](tokio-rs/tokio@tokio-1.20.1...tokio-1.21.0)

##### 1.21.0 (September 2, 2022)

This release is the first release of Tokio to intentionally support WASM. The `sync,macros,io-util,rt,time` features are stabilized on WASM. Additionally the wasm32-wasi target is given unstable support for the `net` feature.

##### Added

- net: add `device` and `bind_device` methods to TCP/UDP sockets ([#4882])
- net: add `tos` and `set_tos` methods to TCP and UDP sockets ([#4877])
- net: add security flags to named pipe `ServerOptions` ([#4845])
- signal: add more windows signal handlers ([#4924])
- sync: add `mpsc::Sender::max_capacity` method ([#4904])
- sync: implement Weak version of `mpsc::Sender` ([#4595])
- task: add `LocalSet::enter` ([#4765])
- task: stabilize `JoinSet` and `AbortHandle` ([#4920])
- tokio: add `track_caller` to public APIs ([#4805], [#4848], [#4852])
- wasm: initial support for `wasm32-wasi` target ([#4716])

##### Fixed

- miri: improve miri compatibility by avoiding temporary references in `linked_list::Link` impls ([#4841])
- signal: don't register write interest on signal pipe ([#4898])
- sync: add `#[must_use]` to lock guards ([#4886])
- sync: fix hang when calling `recv` on closed and reopened broadcast channel ([#4867])
- task: propagate attributes on task-locals ([#4837])

##### Changed

- fs: change panic to error in `File::start_seek` ([#4897])
- io: reduce syscalls in `poll_read` ([#4840])
- process: use blocking threadpool for child stdio I/O ([#4824])
- signal: make `SignalKind` methods const ([#4956])

##### Internal changes

- rt: extract `basic_scheduler::Config` ([#4935])
- rt: move I/O driver into `runtime` module ([#4942])
- rt: rename internal scheduler types ([#4945])

##### Documented

- chore: fix typos and grammar ([#4858], [#4894], [#4928])
- io: fix typo in `AsyncSeekExt::rewind` docs ([#4893])
- net: add documentation to `try_read()` for zero-length buffers ([#4937])
- runtime: remove incorrect panic section for `Builder::worker_threads` ([#4849])
- sync: doc of `watch::Sender::send` improved ([#4959])
- task: add cancel safety docs to `JoinHandle` ([#4901])
- task: expand on cancellation of `spawn_blocking` ([#4811])
- time: clarify that the first tick of `Interval::tick` happens immediately ([#4951])

##### Unstable

- rt: add unstable option to disable the LIFO slot ([#4936])
- task: fix incorrect signature in `Builder::spawn_on` ([#4953])
- task: make `task::Builder::spawn*` methods fallible ([#4823])

[#4595]: tokio-rs/tokio#4595
[#4716]: tokio-rs/tokio#4716
[#4765]: tokio-rs/tokio#4765
[#4805]: tokio-rs/tokio#4805
[#4811]: tokio-rs/tokio#4811
[#4823]: tokio-rs/tokio#4823
[#4824]: tokio-rs/tokio#4824
[#4837]: tokio-rs/tokio#4837
[#4840]: tokio-rs/tokio#4840
[#4841]: tokio-rs/tokio#4841
[#4845]: tokio-rs/tokio#4845
[#4848]: tokio-rs/tokio#4848
[#4849]: tokio-rs/tokio#4849
[#4852]: tokio-rs/tokio#4852
[#4858]: tokio-rs/tokio#4858
[#4867]: tokio-rs/tokio#4867
[#4877]: tokio-rs/tokio#4877
[#4882]: tokio-rs/tokio#4882
[#4886]: tokio-rs/tokio#4886
[#4893]: tokio-rs/tokio#4893
[#4894]: tokio-rs/tokio#4894
[#4897]: tokio-rs/tokio#4897
[#4898]: tokio-rs/tokio#4898
[#4901]: tokio-rs/tokio#4901
[#4904]: tokio-rs/tokio#4904
[#4920]: tokio-rs/tokio#4920
[#4924]: tokio-rs/tokio#4924
[#4928]: tokio-rs/tokio#4928
[#4935]: tokio-rs/tokio#4935
[#4936]: tokio-rs/tokio#4936
[#4937]: tokio-rs/tokio#4937
[#4942]: tokio-rs/tokio#4942
[#4945]: tokio-rs/tokio#4945
[#4951]: tokio-rs/tokio#4951
[#4953]: tokio-rs/tokio#4953
[#4956]: tokio-rs/tokio#4956
[#4959]: tokio-rs/tokio#4959

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

- [ ] If you want to rebase/retry this PR, click this checkbox.

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Co-authored-by: cabr2-bot <cabr2.help@gmail.com>
Reviewed-on: https://codeberg.org/Calciumdibromid/CaBr2/pulls/1532
Reviewed-by: crapStone <crapstone@noreply.codeberg.org>
Co-authored-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>
Co-committed-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>

huntc · 2022-09-15T07:34:36Z

A question in relation to this: if a JoinHandle's is_finished method returns true, does that also signify that a thread has become available, assuming there are no other spawn_blocking calls? I'm hoping so, as I can then defer my spawn_blocking while is_finished is false, at least until this fix becomes widely available.

Darksonn · 2022-09-17T14:12:26Z

@huntc This question is quite confusing to me. What problem do you consider this a fix for?

huntc · 2022-09-17T20:13:29Z

> @huntc This question is quite confusing to me. What problem do you consider this a fix for?

If I can test that a previous task has entirely completed then I’m in a position to spawn a new one. My question is whether this test is reliable. If the spawn method were fallible then I wouldn’t need the test. Make sense?

Darksonn · 2022-09-17T21:11:38Z

No, it doesn't really make sense. The spawn_blocking function would not fail just because there already are tasks, even with this change. The failures have to do with stuff like "runtime has shut down" or "we've hit the OS limit on the number of threads". If there are enough spawn_blocking tasks to fill the thread-pool, then new tasks just go in a queue and they will start when another task exits.

As for what is_finished means, well, it means that the task has finished. Whether this means that a thread is available would depend on whether the queue of new tasks is empty or not.

huntc · 2022-09-17T21:58:21Z

> If there are enough spawn_blocking tasks to fill the thread-pool, then new tasks just go in a queue and they will start when another task exits.

I didn’t appreciate the queue aspect to this. Thanks. In the case of spawn_blocking where I’ve allocated a pool size of two blocking threads, what would be the queue size? Again, I’m looking to rely on determining whether a blocking job has finished before spawning another.

Darksonn · 2022-09-17T22:13:18Z

The queue of pending spawn_blocking tasks is unbounded. I do not recommend using the limit on the total number of spawn_blocking threads to impose a limit on one particular type of task. If you wish to give it a bound, then I recommend a tokio::sync::Semaphore.

huntc · 2022-09-17T22:38:37Z

> The queue of pending spawn_blocking tasks is unbounded. I do not recommend using the limit on the total number of spawn_blocking threads to impose a limit on one particular type of task. If you wish to give it a bound, then I recommend a tokio::sync::Semaphore.

Thanks for the clarifications. I'm limiting the number of blocking threads to minimise memory usage (running embedded Linux). I'm unsure that a semaphore would assist here, as I'm looking for the best indication, outside of an async context, that a previous task has finished, so that I can avoid queuing another. That is, I don't want to queue to an unbounded buffer, as I'd potentially run out of memory very quickly. Anyhow, you've answered my question by informing me of the unbounded queue, so I'm good. I must say though that the queue being unbounded is a surprise. Perhaps separately being able to specify a bound, and having the spawn methods fail when the bound is exceeded, would be fine, but then that may be another topic.

Darksonn · 2022-09-18T09:35:35Z

You can definitely put a limit on the number of tasks using a Semaphore. Its non-async methods work outside of async just fine.

huntc · 2022-09-19T01:57:38Z

> You can definitely put a limit on the number of tasks using a Semaphore. Its non-async methods work outside of async just fine.

I see now that I can use a Semaphore from a non-async context via its try_ methods. However, why would using a semaphore be better than calling the job handle's is_finished method? Thanks again for this continued dialog. I hope it isn't going too off-topic now given that we've already established that there's an unbounded queue and I'd not expect spawn_blocking to fail on exhausting its pool of threads.

Darksonn · 2022-09-19T05:25:10Z

Unless you only want one task running concurrently, I would find the Semaphore more convenient to use. But they would both work.

ipetkov added 2 commits July 11, 2022 11:52

task: surface spawn_blocking errors through the Builder

c09b90e

ipetkov requested review from carllerche and hawkw July 11, 2022 20:23

github-actions bot added the R-loom Run loom tests on this PR label Jul 11, 2022

ipetkov added M-runtime Module: tokio/runtime A-tokio Area: The main tokio crate labels Jul 11, 2022

ipetkov force-pushed the ivan/fallible-task-builder branch from 1f24b72 to f77957d Compare July 11, 2022 20:43

udoprog reviewed Jul 11, 2022

tokio/src/runtime/blocking/pool.rs (outdated, resolved)

fix build

cf181e8

ipetkov force-pushed the ivan/fallible-task-builder branch from 1e323f2 to cf181e8 Compare July 12, 2022 02:40

Noah-Kennedy approved these changes Jul 12, 2022

ipetkov merged commit 3b6c74a into master Jul 12, 2022

ipetkov deleted the ivan/fallible-task-builder branch July 12, 2022 22:56

ipetkov mentioned this pull request Jul 20, 2022

task: stabilize Builder #4850

Open

4 tasks

ipetkov mentioned this pull request Aug 29, 2022

API proposal: tokio::task::try_spawn_blocking #4955

Open

Darksonn mentioned this pull request Sep 1, 2022

chore: prepare Tokio v1.21.0 #4967

Merged

This was referenced Sep 2, 2022

build error on main due to tokio 1.21.0 unstable feature penumbra-zone/penumbra#1355

Closed

console-subscriber build error vs tokio v1.21.0 tokio-rs/console#373

Closed

zackradisic mentioned this pull request Mar 4, 2023

Unwrap .spawn() to support updated tokio task builder API serenity-rs/serenity#2349

Merged
