Commit
Check task pause first when driver leaves suspended state (facebookincubator#11006)

Summary:
Pull Request resolved: facebookincubator#11006

When a driver thread leaves the suspended state, it first checks whether the task has been terminated. If it has, the thread returns without waiting for the task to be resumed. This assumes that a driver thread only leaves the suspended state when there is no spilling on the associated query task. That assumption might not always hold with the global arbitration optimization, which decouples the memory arbitration request from the memory arbitration operation. So we need to let the driver thread wait until the task has been resumed. Otherwise, the driver thread might continue execution after leaving the suspended state until it next checks the task state, which might cause concurrent updates to the operator state.

The sequence that triggers this with the global arbitration optimization:
T1. The driver calls try reserve memory (which puts the driver thread in the reclaimable state), and this triggers memory arbitration.
T2. The driver's memory arbitration succeeds and the driver is about to leave.
T3. The background global memory arbitration kicks off and tries to reclaim from the driver, as it is in the reclaimable state. The memory arbitration pauses the task execution.
T4. The task is terminated by the coordinator for some reason.
T5. The driver thread tries to leave the suspended state, realizes that the task has been terminated, and so just leaves the suspended state.
T6. The driver thread continues execution after the memory reservation, as it doesn't notice the task has been terminated.

Given that, the driver execution and the spill could operate on the same operator state in parallel.

This PR changes the driver thread to wait for the task resume signal first when it leaves the suspended state in T5.

A unit test is added, and the change is verified with the global arbitration optimization shadow.

Reviewed By: tanjialiang, oerling

Differential Revision: D62727855

fbshipit-source-id: c711e8c7c90873a6ea14ec249f78aa0e071f724b
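
The ordering this change describes (wait for the task resume signal first, then look at termination) can be illustrated with a small standalone sketch. This is not the Velox implementation: ToyTask, leaveSuspended and driverLeftAfterResume are hypothetical names made up for illustration, and the sketch models only the pause, terminate and resume flags from steps T3 to T5.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a query task; NOT the Velox Task class.
struct ToyTask {
  std::mutex mu;
  std::condition_variable resumed;
  bool paused = false;
  bool terminated = false;

  void pause() { std::lock_guard<std::mutex> l(mu); paused = true; }
  void terminate() { std::lock_guard<std::mutex> l(mu); terminated = true; }
  void resume() {
    { std::lock_guard<std::mutex> l(mu); paused = false; }
    resumed.notify_all();
  }

  // Called by a driver thread leaving the suspended state.
  // Returns true if the driver may keep running, false if it must stop.
  bool leaveSuspended() {
    std::unique_lock<std::mutex> l(mu);
    // Wait for the resume signal first, even if the task is already
    // terminated, so a reclaimer that paused the task has finished.
    resumed.wait(l, [this] { return !paused; });
    assert(!paused);
    // Only now look at the termination state.
    return !terminated;
  }
};

// Replays T3 to T5: the task is paused and then terminated while the driver
// is suspended. Returns true if the driver left suspension only after the
// resume and was told to stop.
bool driverLeftAfterResume() {
  ToyTask task;
  task.pause();      // T3: arbitration pauses the task.
  task.terminate();  // T4: the task is terminated.

  std::atomic<bool> resumeIssued{false};
  bool leftAfterResume = false;
  bool mayContinue = true;
  std::thread driver([&] {
    mayContinue = task.leaveSuspended();  // T5 with the new ordering.
    leftAfterResume = resumeIssued.load();
  });

  // Give the driver a chance to block; correctness does not depend on this.
  std::this_thread::sleep_for(std::chrono::milliseconds(50));
  resumeIssued.store(true);  // Set before the pause flag is cleared.
  task.resume();
  driver.join();
  return leftAfterResume && !mayContinue;
}
```

In this toy model the driver cannot return from leaveSuspended while the pause flag is set, so it never runs alongside whatever paused the task, and it still observes the termination once it is released.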