Skip to content

Commit

Permalink
hash TaskStates by id, not key
Browse files Browse the repository at this point in the history
This fixes `test_deadlock_resubmit_queued_tasks_fast`. The problem:
- an "old" TaskState `f-0` is forgotten and removed from `queued` HeapSet. Internally, this removes it from a set, but a weakref is still on the heap
- a new TaskState `f-0` is added to the HeapSet. It's pushed onto the heap, but since its priority is higher (newer generation), it comes after the old `f-0` object that's still on the front of the heap
- `pop` pops old `f-0` off the heap. The weakref is still alive (for whatever reason). `value in self._data` (the set) is True, because the _hash_ of old `f-0` and new `f-0` are the same (same key). So we return the stale, old TaskState object.

Much like `WorkerState`s, there should be exactly 1 `TaskState` instance per task. If there are multiple instances with the same key, they are different tasks, and have different state. xref dask#7372, dask#7356.
  • Loading branch information
gjoseph92 committed Dec 16, 2022
1 parent 9b057d8 commit 68a1e51
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions distributed/scheduler.py
Original file line number Diff line number Diff line change
Expand Up @@ -1328,7 +1328,7 @@ class TaskState:

def __init__(self, key: str, run_spec: object, state: TaskStateState):
self.key = key
self._hash = hash(key)
self._hash = hash(id(self))
self.run_spec = run_spec
self._state = state
self.exception = None
Expand Down Expand Up @@ -1366,7 +1366,7 @@ def __hash__(self) -> int:
return self._hash

def __eq__(self, other: object) -> bool:
return isinstance(other, TaskState) and self.key == other.key
return self is other

@property
def state(self) -> TaskStateState:
Expand Down

0 comments on commit 68a1e51

Please sign in to comment.