Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make task::Builder::spawn* methods fallible #4823

Merged
merged 3 commits into from
Jul 12, 2022
Merged

Conversation

ipetkov
Copy link
Member

@ipetkov ipetkov commented Jul 11, 2022

Motivation

Tokio's current stable spawn* methods are all infallible but may internally panic on unrecoverable errors (for example, if the blocking threadpool is empty and the OS refuses to spawn additional threads). Unfortunately, this means that applications cannot opt-into gracefully handling such situations.

Solution

Convert all task::Builder::spawn* methods to return a fallible io::Result<_>, giving applications the opportunity to opt-into handling spawn errors themselves. This also is a fitting analogue to std::thread::Builder which also has fallible spawn methods (contrasted with std::thread::spawn which panic on failure).

Given that the task::Builder and its APIs are currently marked as unstable, this is a good time to make the change before we start stabilizing them. Currently spawn_blocking was the only API which I could tell internally panics on some errors instead of yielding them, so it has been updated to surface those errors via the task::Builder::spawn_blocking* APIs.

Note that the behavior of tokio::task::spawn_blocking is maintained as it was before this change:

  • invocations will panic if the threadpool has no available threads and the OS cannot spawn additional ones
  • if the runtime is shutting down a dummy JoinHandle is returned without panicking.

Making the `task::Builder::spawn*` methods fallible allows applications
to gracefully handle spawn errors (e.g. due to resource exhaustion)
without tokio panicking internally.

This change is also a good analogue for `std::thread::Builder` which has
fallible spawn methods (whereas `std::thread::spawn` internally panics)
Using `tokio::task::spawn_blocking` continues to exhibit the previous
behavior (panic if there aren't any worker threads available to accept
the task, but return a dummy handle if the runtime is shutting down)
@ipetkov ipetkov requested review from carllerche and hawkw July 11, 2022 20:23
@github-actions github-actions bot added the R-loom Run loom tests on this PR label Jul 11, 2022
@ipetkov ipetkov added M-runtime Module: tokio/runtime A-tokio Area: The main tokio crate labels Jul 11, 2022
@ipetkov ipetkov force-pushed the ivan/fallible-task-builder branch from 1f24b72 to f77957d Compare July 11, 2022 20:43
@ipetkov ipetkov force-pushed the ivan/fallible-task-builder branch from 1e323f2 to cf181e8 Compare July 12, 2022 02:40
@ipetkov ipetkov merged commit 3b6c74a into master Jul 12, 2022
@ipetkov ipetkov deleted the ivan/fallible-task-builder branch July 12, 2022 22:56
@ipetkov ipetkov mentioned this pull request Jul 20, 2022
4 tasks
crapStone pushed a commit to Calciumdibromid/CaBr2 that referenced this pull request Sep 11, 2022
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [tokio](https://tokio.rs) ([source](https://github.com/tokio-rs/tokio)) | dependencies | minor | `1.20.1` -> `1.21.0` |
| [tokio](https://tokio.rs) ([source](https://github.com/tokio-rs/tokio)) | dev-dependencies | minor | `1.20.1` -> `1.21.0` |

---

### Release Notes

<details>
<summary>tokio-rs/tokio</summary>

### [`v1.21.0`](https://github.com/tokio-rs/tokio/releases/tag/tokio-1.21.0)

[Compare Source](tokio-rs/tokio@tokio-1.20.1...tokio-1.21.0)

##### 1.21.0 (September 2, 2022)

This release is the first release of Tokio to intentionally support WASM. The `sync,macros,io-util,rt,time` features are stabilized on WASM. Additionally the wasm32-wasi target is given unstable support for the `net` feature.

##### Added

-   net: add `device` and `bind_device` methods to TCP/UDP sockets ([#&#8203;4882])
-   net: add `tos` and `set_tos` methods to TCP and UDP sockets ([#&#8203;4877])
-   net: add security flags to named pipe `ServerOptions` ([#&#8203;4845])
-   signal: add more windows signal handlers ([#&#8203;4924])
-   sync: add `mpsc::Sender::max_capacity` method ([#&#8203;4904])
-   sync: implement Weak version of `mpsc::Sender` ([#&#8203;4595])
-   task: add `LocalSet::enter` ([#&#8203;4765])
-   task: stabilize `JoinSet` and `AbortHandle` ([#&#8203;4920])
-   tokio: add `track_caller` to public APIs ([#&#8203;4805], [#&#8203;4848], [#&#8203;4852])
-   wasm: initial support for `wasm32-wasi` target ([#&#8203;4716])

##### Fixed

-   miri: improve miri compatibility by avoiding temporary references in `linked_list::Link` impls ([#&#8203;4841])
-   signal: don't register write interest on signal pipe ([#&#8203;4898])
-   sync: add `#[must_use]` to lock guards ([#&#8203;4886])
-   sync: fix hang when calling `recv` on closed and reopened broadcast channel ([#&#8203;4867])
-   task: propagate attributes on task-locals ([#&#8203;4837])

##### Changed

-   fs: change panic to error in `File::start_seek` ([#&#8203;4897])
-   io: reduce syscalls in `poll_read` ([#&#8203;4840])
-   process: use blocking threadpool for child stdio I/O ([#&#8203;4824])
-   signal: make `SignalKind` methods const ([#&#8203;4956])

##### Internal changes

-   rt: extract `basic_scheduler::Config` ([#&#8203;4935])
-   rt: move I/O driver into `runtime` module ([#&#8203;4942])
-   rt: rename internal scheduler types ([#&#8203;4945])

##### Documented

-   chore: fix typos and grammar ([#&#8203;4858], [#&#8203;4894], [#&#8203;4928])
-   io: fix typo in `AsyncSeekExt::rewind` docs ([#&#8203;4893])
-   net: add documentation to `try_read()` for zero-length buffers ([#&#8203;4937])
-   runtime: remove incorrect panic section for `Builder::worker_threads` ([#&#8203;4849])
-   sync: doc of `watch::Sender::send` improved ([#&#8203;4959])
-   task: add cancel safety docs to `JoinHandle` ([#&#8203;4901])
-   task: expand on cancellation of `spawn_blocking` ([#&#8203;4811])
-   time: clarify that the first tick of `Interval::tick` happens immediately ([#&#8203;4951])

##### Unstable

-   rt: add unstable option to disable the LIFO slot ([#&#8203;4936])
-   task: fix incorrect signature in `Builder::spawn_on` ([#&#8203;4953])
-   task: make `task::Builder::spawn*` methods fallible ([#&#8203;4823])

[#&#8203;4595]: tokio-rs/tokio#4595

[#&#8203;4716]: tokio-rs/tokio#4716

[#&#8203;4765]: tokio-rs/tokio#4765

[#&#8203;4805]: tokio-rs/tokio#4805

[#&#8203;4811]: tokio-rs/tokio#4811

[#&#8203;4823]: tokio-rs/tokio#4823

[#&#8203;4824]: tokio-rs/tokio#4824

[#&#8203;4837]: tokio-rs/tokio#4837

[#&#8203;4840]: tokio-rs/tokio#4840

[#&#8203;4841]: tokio-rs/tokio#4841

[#&#8203;4845]: tokio-rs/tokio#4845

[#&#8203;4848]: tokio-rs/tokio#4848

[#&#8203;4849]: tokio-rs/tokio#4849

[#&#8203;4852]: tokio-rs/tokio#4852

[#&#8203;4858]: tokio-rs/tokio#4858

[#&#8203;4867]: tokio-rs/tokio#4867

[#&#8203;4877]: tokio-rs/tokio#4877

[#&#8203;4882]: tokio-rs/tokio#4882

[#&#8203;4886]: tokio-rs/tokio#4886

[#&#8203;4893]: tokio-rs/tokio#4893

[#&#8203;4894]: tokio-rs/tokio#4894

[#&#8203;4897]: tokio-rs/tokio#4897

[#&#8203;4898]: tokio-rs/tokio#4898

[#&#8203;4901]: tokio-rs/tokio#4901

[#&#8203;4904]: tokio-rs/tokio#4904

[#&#8203;4920]: tokio-rs/tokio#4920

[#&#8203;4924]: tokio-rs/tokio#4924

[#&#8203;4928]: tokio-rs/tokio#4928

[#&#8203;4935]: tokio-rs/tokio#4935

[#&#8203;4936]: tokio-rs/tokio#4936

[#&#8203;4937]: tokio-rs/tokio#4937

[#&#8203;4942]: tokio-rs/tokio#4942

[#&#8203;4945]: tokio-rs/tokio#4945

[#&#8203;4951]: tokio-rs/tokio#4951

[#&#8203;4953]: tokio-rs/tokio#4953

[#&#8203;4956]: tokio-rs/tokio#4956

[#&#8203;4959]: tokio-rs/tokio#4959

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, click this checkbox.

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzMi4xODcuMCIsInVwZGF0ZWRJblZlciI6IjMyLjE4Ny4wIn0=-->

Co-authored-by: cabr2-bot <cabr2.help@gmail.com>
Reviewed-on: https://codeberg.org/Calciumdibromid/CaBr2/pulls/1532
Reviewed-by: crapStone <crapstone@noreply.codeberg.org>
Co-authored-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>
Co-committed-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>
@huntc
Copy link
Contributor

huntc commented Sep 15, 2022

A question in relation to this: does a JobHandle's is_finished method return of true also signify that a thread has become available given the absence of any other spawn_blocking? I'm hoping so as I can then defer my spawn_blocking if is_finished is false, at least until this fix becomes widely available.

@Darksonn
Copy link
Contributor

@huntc This question is quite confusing to me. What problem do you consider this a fix for?

@huntc
Copy link
Contributor

huntc commented Sep 17, 2022

@huntc This question is quite confusing to me. What problem do you consider this a fix for?

If I can test that a previous task has entirely completed then I’m in a position to spawn a new one. My question is whether this test is reliable. If the spawn method was failable then I wouldn’t need the test. Make sense?

@Darksonn
Copy link
Contributor

Darksonn commented Sep 17, 2022

No, it doesn't really make sense. The spawn_blocking function would not fail just because there already are tasks, even with this change. The failures have to do with stuff like "runtime has shut down" or "we've hit the OS limit on the number of threads". If there are enough spawn_blocking tasks to fill the thread-pool, then new tasks just go in a queue and they will start when another task exits.

As for what is_finished means, well, it means that the task has finished. Whether this means that a thread is available would depend on whether the queue of new tasks is empty or not.

@huntc
Copy link
Contributor

huntc commented Sep 17, 2022

If there are enough spawn_blocking tasks to fill the thread-pool, then new tasks just go in a queue and they will start when another task exits.

I didn’t appreciate the queue aspect to this. Thanks. In the case of spawn_blocking where I’ve allocated a pool size of two blocking threads, what would be the queue size? Again, I’m looking to rely on determining whether a blocking job has finished before spawning another.

@Darksonn
Copy link
Contributor

The queue of pending spawn_blocking tasks is unbounded. I do not recommend using the limit on the total number of spawn_blocking threads to impose a limit on one particular type of task. If you wish to give it a bound, then I recommend a tokio::sync::Semaphore.

@huntc
Copy link
Contributor

huntc commented Sep 17, 2022

The queue of pending spawn_blocking tasks is unbounded. I do not recommend using the limit on the total number of spawn_blocking threads to impose a limit on one particular type of task. If you wish to give it a bound, then I recommend a tokio::sync::Semaphore.

Thanks for the clarifications. I'm limiting the number of blocking threads to minimise memory usage (running embedded Linux). I'm unsure that a semaphore would assist here as I'm looking for the best indication that a previous task has finished outside of an async context, and so that I can avoid queuing another i.e. I don't want to queue to an unbounded buffer as I'd potentially run out of memory very quickly. Anyhow, you've answered my question by informing me of the unbounded queue so I'm good. I must say though that the queue being unbounded is a surprise. Perhaps separately being able to specify a bound and having the spawn methods fail given the bounds being exceeded would be fine, but then that may be another topic.

@Darksonn
Copy link
Contributor

You can definitely put a limit on the number of tasks using a Semaphore. Its non-async methods works outside of async just fine.

@huntc
Copy link
Contributor

huntc commented Sep 19, 2022

You can definitely put a limit on the number of tasks using a Semaphore. Its non-async methods works outside of async just fine.

I see now that I can use a Semaphore from a non-async context via its try_ methods. However, why would using a semaphore be better than calling the job handle's is_finished method? Thanks again for this continued dialog. I hope it isn't going too off-topic now given that we've already established that there's an unbounded queue and I'd not expect spawn_blocking to fail on exhausting its pool of threads.

@Darksonn
Copy link
Contributor

Unless you only want one task running concurrently, I would find the Semaphore more convenient to use. But they would both work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate M-runtime Module: tokio/runtime R-loom Run loom tests on this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants