[DNM] Don't queue tasks on workers #6584

gjoseph92 · 2022-06-16T01:20:58Z

With this change, workers are not sent more tasks than they have threads. This eliminates root task overproduction and removes the need for work stealing.

I didn't originally intend to make this change (originally, I wanted to remove queuing of root tasks only, not all tasks). I just discovered that it was the natural outcome of #6560 without co-assignment. I thought that was interesting, so I'm opening this for discussion.

This is the minimal diff to enact this change. There is lots we would change/rip out if we went forward with this. The purpose of this PR is just to try the change on some common workloads and see how performance is affected.

Performance will likely be poor on some workloads without speculative task assignment and root task co-assignment.

I believe there's a relatively simple optimization we could make here that would bring performance about in line with current scheduling (minus co-assignment): do allow queuing extra tasks onto workers when those tasks meet specific criteria. The criteria would match the fan-out style tasks you see in a task-based shuffle, rechunk, or map_overlap. But that's an optimization to play with later.

Why?

I was working on the "ignore root task co-assignment" side of withholding root tasks #6560. Initially I was going to try withholding just root tasks.

The root task co-assignment logic becomes useless, so it should just be ripped out. All it would do is co-assign the first ws.nthreads tasks per worker. All other tasks would end up assigned via the normal logic. So I just removed the co-assign code for simplicity. Now root tasks aren't special anymore.
I added logic saying "pick a worker with decide_worker, but if that worker is full, put the task in no-worker instead". But that raises an obvious question: what if there was a different, not-full worker available we could have picked instead?
If we're going to toss out assignments to busy workers, then we shouldn't even consider busy workers in decide_worker.
If you restrict decide_worker to only the idle workers set, you get this PR.

See the test added for an example of an embarrassingly-parallel workload that's currently unrunnable on main (workers repeatedly run out of memory and die), which runs smoothly in a constant amount of memory with this PR.

Tests added / passed
Passes pre-commit run --all-files

With this change, workers are not sent more tasks than they have threads. This eliminates root task overproduction and removes the need for work stealing. This is optimized for a minimal diff. There is _lots_ we might change/rip out if we went forward with this. Performance will likely be poor on some workloads without speculative task assignment and root task co-assignment.

gjoseph92 · 2022-06-16T01:22:10Z

distributed/tests/test_scheduler.py

+        "distributed.scheduler.work-stealing": False,
+    },
+)
+async def test_root_task_overproduction(c, s, *nannies):


This simple test fails on main with a KilledWorkerError after the root tasks run a worker out of memory and restart it 3 times.

github-actions · 2022-06-16T02:26:16Z

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

      15 files ±0       15 suites ±0 9h 37m 35s ⏱️ + 3h 6m 21s
  2 869 tests +1   2 654 ✔️ - 134   81 💤 +  1 133 ❌ +133 1 🔥 +1
21 254 runs +9 19 324 ✔️ - 983 954 💤 +16 975 ❌ +975 1 🔥 +1

For more details on these failures and errors, see this check.

Results for commit d93fd4b. ± Comparison against base commit cb88e3b.

fjetter · 2022-06-16T08:54:00Z

Principally, removing the need for work stealing is very exciting since we're having a bunch of problems with it.

However, just assigning as many tasks to workers as there are threads is something I am skeptical of. Assigning more tasks to the workers ahead of time has a couple of benefits, for instance

Reduced latency for very fast tasks (this is to a certain degree orthogonal to what you're trying to achieve)
Allow fetching data ahead of time

I could also see the scheduler and network becoming more of a bottleneck for a couple of workloads

gjoseph92 · 2022-06-22T23:09:56Z

Closing in favor of #6614, which only applies to root tasks.

gjoseph92 commented Jun 16, 2022

View reviewed changes

This was referenced Jun 21, 2022

[WIP] Queue root tasks on scheduler, not workers [with co-assignment] #6598

Draft

Withhold root tasks [no co assignment] #6614

Merged

gjoseph92 closed this Jun 22, 2022

gjoseph92 deleted the no-task-queuing branch June 22, 2022 23:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DNM] Don't queue tasks on workers #6584

[DNM] Don't queue tasks on workers #6584

gjoseph92 commented Jun 16, 2022

gjoseph92 Jun 16, 2022

github-actions bot commented Jun 16, 2022

fjetter commented Jun 16, 2022

gjoseph92 commented Jun 22, 2022

[DNM] Don't queue tasks on workers #6584

[DNM] Don't queue tasks on workers #6584

Conversation

gjoseph92 commented Jun 16, 2022

Why?

gjoseph92 Jun 16, 2022

Choose a reason for hiding this comment

github-actions bot commented Jun 16, 2022

Unit Test Results

fjetter commented Jun 16, 2022

gjoseph92 commented Jun 22, 2022