Optimize thread_worker #13056

rockwotj · 2023-08-28T21:31:37Z

Use a ring buffer to bulk pass tasks around, and also support creating a
twin worker thread for every reactor thread.

This amortizes the cost of running a task on an alien thread. The change came about because we're going to be switching transforms over to Wasmtime running on an alien thread.
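For readers unfamiliar with the batching idea, here is a minimal sketch of it, not the code in src/v/ssx/thread_worker.h. It assumes a mutex-protected, fixed-capacity ring buffer; the names `task_ring`, `try_push`, and `drain` are hypothetical. The point is that the consumer takes every pending task in one lock acquisition, so the hand-off cost is paid once per batch rather than once per task.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical fixed-capacity ring buffer of tasks (illustration only).
template <std::size_t Capacity>
class task_ring {
public:
    using task = std::function<void()>;

    // Returns false when the ring is full so the caller can apply backpressure.
    bool try_push(task t) {
        std::lock_guard<std::mutex> guard(_mu);
        if (_size == Capacity) {
            return false;
        }
        _slots[(_head + _size) % Capacity] = std::move(t);
        ++_size;
        return true;
    }

    // Moves every pending task out, in FIFO order, under a single lock.
    std::vector<task> drain() {
        std::lock_guard<std::mutex> guard(_mu);
        std::vector<task> batch;
        batch.reserve(_size);
        while (_size > 0) {
            batch.push_back(std::move(_slots[_head]));
            _slots[_head] = nullptr; // reset the moved-from slot
            _head = (_head + 1) % Capacity;
            --_size;
        }
        return batch;
    }

private:
    std::mutex _mu;
    std::array<task, Capacity> _slots{};
    std::size_t _head = 0; // index of the oldest pending task
    std::size_t _size = 0; // number of pending tasks
};
```

A worker thread would call `drain()` after each wake-up and run the returned batch before going back to sleep.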

Before:

test                                      iterations      median         mad         min         max      allocs       tasks        inst
thread_worker_test.1                            3354   216.749us    47.062ns   216.590us   216.951us       5.000       4.000   1610844.2
thread_worker_test.10                            460     2.088ms   870.335ns     2.087ms     2.089ms      70.000      58.000  25391981.4
thread_worker_test.100                            61    16.465ms     1.733us    16.461ms    16.469ms     700.000     598.000 208106393.1
thread_worker_test.1000                           50    19.925ms     6.187us    19.919ms    19.937ms    7007.000    5998.000 214776807.2

After:

test                                      iterations      median         mad         min         max      allocs       tasks        inst
thread_worker_test.1                            1855   216.515us    77.552ns   216.330us   216.593us       4.000       4.000   1549639.0
thread_worker_test.10                            413     2.085ms   394.971ns     2.084ms     2.086ms      40.000      40.000  24921287.6
thread_worker_test.100                           113     8.518ms     3.891us     8.513ms     8.524ms     400.000     400.000 106121583.5
thread_worker_test.1000                           93    10.424ms   605.968ns    10.423ms    10.426ms    4880.000    5744.000 127114775.7

Backports Required

Release Notes

none

src/v/ssx/thread_worker.h

dotnwat

lgtm

src/v/ssx/thread_worker.h

BenPope

Looks great.

I wonder if it makes sense to add an abort_source, or fail pending requests on stop()?

src/v/ssx/thread_worker.h

Use a ring buffer to bulk pass tasks around, and also support creating a
twin worker thread for every reactor thread.

Before:

```
test                                      iterations      median         mad         min         max      allocs       tasks        inst
thread_worker_test.1                            3354   216.749us    47.062ns   216.590us   216.951us       5.000       4.000   1610844.2
thread_worker_test.10                            460     2.088ms   870.335ns     2.087ms     2.089ms      70.000      58.000  25391981.4
thread_worker_test.100                            61    16.465ms     1.733us    16.461ms    16.469ms     700.000     598.000 208106393.1
thread_worker_test.1000                           50    19.925ms     6.187us    19.919ms    19.937ms    7007.000    5998.000 214776807.2
```

After:

```
test                                      iterations      median         mad         min         max      allocs       tasks        inst
thread_worker_test.1                            1855   216.515us    77.552ns   216.330us   216.593us       4.000       4.000   1549639.0
thread_worker_test.10                            413     2.085ms   394.971ns     2.084ms     2.086ms      40.000      40.000  24921287.6
thread_worker_test.100                           113     8.518ms     3.891us     8.513ms     8.524ms     400.000     400.000 106121583.5
thread_worker_test.1000                           93    10.424ms   605.968ns    10.423ms    10.426ms    4880.000    5744.000 127114775.7
```

Signed-off-by: Tyler Rockwood <rockwood@redpanda.com>

rockwotj · 2023-08-30T14:02:11Z

> I wonder if it makes sense to add an abort_source, or fail pending requests on stop()?

I am not really sure how to do this correctly, as I don't think Seastar classes are designed to support work outside of Seastar, unless there is a different technique you're suggesting.

The same goes for failing pending tasks on stop: I am not sure how to tell the alien thread that it needs to stop. Maybe with an atomic boolean that it checks before every task? Thoughts?
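The atomic-boolean idea could look roughly like the sketch below. It is a standalone illustration built on `std::thread` rather than Seastar's alien-thread machinery, and `stoppable_worker` with its members is a hypothetical name, not anything in this PR. The worker re-checks the flag before every task and abandons whatever is still queued once it is set.

```cpp
#include <atomic>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical worker that checks an atomic stop flag before every task.
class stoppable_worker {
public:
    stoppable_worker() : _thread([this] { run(); }) {}
    ~stoppable_worker() { stop(); }

    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> guard(_mu);
            _queue.push_back(std::move(task));
        }
        _cv.notify_one();
    }

    // Idempotent: sets the flag, wakes the thread, and joins it.
    void stop() {
        {
            // Set the flag under the mutex so the wake-up cannot be lost
            // between the worker's predicate check and its sleep.
            std::lock_guard<std::mutex> guard(_mu);
            _stopped.store(true);
        }
        _cv.notify_one();
        if (_thread.joinable()) {
            _thread.join();
        }
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(_mu);
                _cv.wait(lock, [this] {
                    return _stopped.load() || !_queue.empty();
                });
                if (_stopped.load()) {
                    return; // checked before every task; queued work is dropped
                }
                task = std::move(_queue.front());
                _queue.pop_front();
            }
            task(); // run outside the lock
        }
    }

    std::mutex _mu;
    std::condition_variable _cv;
    std::deque<std::function<void()>> _queue;
    std::atomic<bool> _stopped{false};
    std::thread _thread; // declared last so the other members exist first
};
```

Note that a task already running when `stop()` is called still runs to completion; the flag only prevents further tasks from starting.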

BenPope · 2023-08-30T14:15:31Z

> I wonder if it makes sense to add an abort_source, or fail pending requests on stop()?

> I am not really sure how to do this correctly, as I don't think Seastar classes are designed to support work outside of Seastar, unless there is a different technique you're suggesting.

> The same goes for failing pending tasks on stop: I am not sure how to tell the alien thread that it needs to stop. Maybe with an atomic boolean that it checks before every task? Thoughts?

The stop signal was previously passed via the ss::writeable_eventfd. Once stop is detected, I imagine it would be possible to call fail_tasks.

I'm not saying it has to be done, it's just a thought. Propagating an abort (somehow) might be more useful than just cancelling the work. Or it might just make sense to drain the queue; it probably depends what kind of work it's running.
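The two shutdown policies mentioned here, failing what is pending versus draining the queue, can be contrasted in a small sketch. It assumes each queued request carries a `std::promise` that its submitter waits on; `pending_task`, `fail_pending`, and `drain_pending` are hypothetical names and do not reflect the signature of the `fail_tasks` function referred to above.

```cpp
#include <deque>
#include <exception>
#include <functional>
#include <future>
#include <stdexcept>
#include <utility>

// Hypothetical queued request: the work plus the promise its submitter holds.
struct pending_task {
    std::function<int()> work;
    std::promise<int> result;
};

// Policy 1: fail everything still queued, so waiters get an error
// instead of hanging forever.
inline void fail_pending(std::deque<pending_task>& queue) {
    for (auto& t : queue) {
        t.result.set_exception(
          std::make_exception_ptr(std::runtime_error("worker stopped")));
    }
    queue.clear();
}

// Policy 2: drain the queue, running each remaining task to completion.
inline void drain_pending(std::deque<pending_task>& queue) {
    while (!queue.empty()) {
        pending_task t = std::move(queue.front());
        queue.pop_front();
        try {
            t.result.set_value(t.work());
        } catch (...) {
            // Forward a failure of the task itself to its waiter.
            t.result.set_exception(std::current_exception());
        }
    }
}
```

Failing is the cheaper choice when the caller is shutting down anyway; draining only makes sense when the backlog is short and the work has side effects that must not be skipped.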

rockwotj · 2023-08-30T14:27:56Z

> The stop signal was previously passed via the ss::writeable_eventfd, once stop is detected, I imagine it would be possible to call fail_tasks.

That's a little hard to do now that there are multiple signals in the normal submit path. From my reading of the man pages, signal adds to the eventfd's counter and read drains the whole accumulated value, so calling signal(1) twice means that a single read returns 2.
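The counter behaviour described here can be checked against the raw Linux `eventfd(2)` interface. The sketch below bypasses Seastar entirely and assumes that the wrapper's signal and read map onto plain `write` and `read` on the descriptor; `signal_then_read` is a hypothetical helper and its error handling is reduced to returning 0.

```cpp
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstdint>

// Writes the value 1 to a fresh eventfd `signals` times, then reads once.
// Linux-only. Returns 0 on any error or if `signals` is not positive
// (reading an eventfd whose counter is 0 would block).
inline std::uint64_t signal_then_read(int signals) {
    if (signals <= 0) {
        return 0;
    }
    // Counter starts at 0; flags = 0 means no EFD_SEMAPHORE, so a read
    // returns the accumulated sum and resets the counter.
    int fd = ::eventfd(0, 0);
    if (fd < 0) {
        return 0;
    }
    const std::uint64_t one = 1;
    for (int i = 0; i < signals; ++i) {
        if (::write(fd, &one, sizeof(one)) != static_cast<ssize_t>(sizeof(one))) {
            ::close(fd);
            return 0;
        }
    }
    std::uint64_t value = 0;
    if (::read(fd, &value, sizeof(value)) != static_cast<ssize_t>(sizeof(value))) {
        value = 0;
    }
    ::close(fd);
    return value;
}
```

With two writes of 1, the single read returns 2, which is why one particular value can no longer be reserved as a stop marker once several submits may be coalesced into one wake-up.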

> it probably depends what kind of work it's running.

Agreed. I don't know about Kerberos, but for Wasm, cancelling does make sense. That will have already happened, though, as the transform processor needs to shut down before Wasm does.

BenPope · 2023-08-31T12:46:08Z

> it probably depends what kind of work it's running.

> Agreed. I don't know about Kerberos, but for Wasm, cancelling does make sense. That will have already happened, though, as the transform processor needs to shut down before Wasm does.

Kerberos is usually just reading a local file, so I/O, but not too bad, and unlikely to build a large backlog.

BenPope

LGTM, let's keep an eye on the queue length at shutdown. Should there be metrics for queue length, time in queue, etc?

Feel free to add a followup issue.

rockwotj · 2023-08-31T13:28:50Z

CI Failures: #12120 and #12659

rockwotj · 2023-08-31T13:29:48Z

> LGTM, let's keep an eye on the queue length at shutdown.

I added a follow-up task for metrics. Also, the queue is limited to 128 items for now.

github-actions bot added the area/redpanda label Aug 28, 2023

rockwotj requested review from BenPope, dotnwat and michael-redpanda August 29, 2023 01:30

rockwotj marked this pull request as ready for review August 29, 2023 01:30

rockwotj force-pushed the optimize-thread-worker branch from 392564c to 4980680 Compare August 29, 2023 01:37

BenPope reviewed Aug 29, 2023

rockwotj force-pushed the optimize-thread-worker branch 2 times, most recently from dfab236 to 9e3490d Compare August 29, 2023 14:01

rockwotj requested a review from BenPope August 29, 2023 14:02

rockwotj force-pushed the optimize-thread-worker branch from 9e3490d to 76ab1e6 Compare August 29, 2023 14:20

dotnwat reviewed Aug 30, 2023

dotnwat previously approved these changes Aug 30, 2023

rockwotj mentioned this pull request Aug 30, 2023

Consider using a cpuset for sharded alien threads #13090

Closed

BenPope reviewed Aug 30, 2023

rockwotj dismissed dotnwat’s stale review via 1edf4c8 August 30, 2023 13:59

rockwotj force-pushed the optimize-thread-worker branch from 76ab1e6 to 1edf4c8 Compare August 30, 2023 13:59

rockwotj requested review from BenPope and dotnwat August 30, 2023 14:02

dotnwat approved these changes Aug 31, 2023

BenPope approved these changes Aug 31, 2023

rockwotj mentioned this pull request Aug 31, 2023

Metrics for ssx::thread_worker #13164

Open

michael-redpanda self-assigned this Aug 31, 2023

michael-redpanda assigned rockwotj and unassigned michael-redpanda Aug 31, 2023

rockwotj merged commit 603c9bc into redpanda-data:dev Aug 31, 2023

rockwotj deleted the optimize-thread-worker branch August 31, 2023 18:28
