Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store ready and constrained tasks in heapsets #6711

Merged
merged 3 commits into from
Jul 13, 2022

Conversation

@crusaderky crusaderky self-assigned this Jul 11, 2022
@crusaderky crusaderky marked this pull request as ready for review July 11, 2022 18:17
@github-actions
Copy link
Contributor

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

       15 files  ±  0         15 suites  ±0   6h 49m 48s ⏱️ - 12m 41s
  2 971 tests +  9    2 881 ✔️ +10       88 💤 +1  2  - 2 
22 033 runs  +62  20 992 ✔️ +60  1 038 💤 +3  3  - 1 

For more details on these failures, see this check.

Results for commit 1ddce73. ± Comparison against base commit 91a9e5d.

Copy link
Collaborator

@gjoseph92 gjoseph92 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this refactor! _next_ready_task is great, and using HeapSets makes things simpler and more readable.

I'm concerned about this making test_steal_concurrent_simple and maybe other stealing tests flaky. Indeed, that seems to be happening: https://github.com/dask/distributed/runs/7288303161. Adding something like popright to HeapSet would be a very simple fix that would let us get this in quickly (instead of having to rewrite those tests to not rely on race conditions).

@@ -2182,12 +2175,11 @@ def _transition_constrained_executing(
self, ts: TaskState, *, stimulus_id: str
) -> RecsInstrs:
if self.validate:
assert ts.state == "constrained"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I thought these sorts of assertions were superfluous (the transition function is guaranteed to only be called if this is the case (unless of course it's been registered for tasks of the wrong state))?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily; there are a lot of _transition_* methods that are called directly from other _transition_ methods.

@@ -1063,12 +1063,12 @@ async def test_steal_concurrent_simple(c, s, *workers):
await asyncio.sleep(0.1)

# ready is a heap but we don't need last, just not the next
_, victim_key = w0.state.ready[-1]
victim_key = w0.state.ready.peek().key
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment above would make me think peek isn't right for these (since it is the next). Seems like this test is relying on race conditions already (that move_task_request happens faster than slowinc); picking the first task might significantly increase the probability of those race conditions being triggered. I would think assert ws1.has_what may fail intermittently if ws0 starts executing victim_key before move_task_request lands?

In #6598 I added a HeapSet.popright method which might be useful here: https://github.com/dask/distributed/pull/6598/files#diff-8d01f1d2f0aff66e9a55d31f036edddb0c44d82c8bfcc1eefc015b9ac8d661edR100-R109. Note that this isn't a PR I'd ever expect to merge, so not sure if popright would ever have any usage beyond this test—just mentioning that it's come up once before.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks for spotting it!

@crusaderky crusaderky merged commit 3248353 into dask:main Jul 13, 2022
@crusaderky crusaderky deleted the ready_heapset branch July 13, 2022 00:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants