Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stageless: avoid deadlock in TaskPool::scope by polling all tasks #7492

Closed
wants to merge 12 commits into from

Conversation

maniwani
Copy link
Contributor

@maniwani maniwani commented Feb 3, 2023

Objective

Replace #7448 with the "fair polling" alternative it mentioned, so that we can continue to mirror std::thread::scope (i.e. keep "nested scope spawns").

It might be possible to poll all the tasks in get_results in the scope. If we could do that, then the scope could panic if one of the tasks panics. It would require a data structure that we could both poll and push futures through a shared ref to it. I tried FuturesUnordered, but it requires an exclusive ref to poll it.

Solution

  • Revert those changes.
  • Eagerly take spawned tasks from the queue and maintain a list of the pending ones.
  • Poll pending tasks for completion.
  • If any task fails, cancel all taken-but-unfinished tasks and then panic.

@maniwani maniwani changed the title Stageless: avoid executor deadlock Stageless: avoid deadlock in TaskPool::scope by polling all tasks Feb 3, 2023
@alice-i-cecile alice-i-cecile added A-ECS Entities, components, systems, and events A-Tasks Tools for parallel and async work labels Feb 3, 2023
@alice-i-cecile alice-i-cecile added this to the 0.10 milestone Feb 3, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Feb 3, 2023

Example alien_cake_addict failed to run, please try running it locally and check the result.

@hymm
Copy link
Contributor

hymm commented Feb 4, 2023

I did some preliminary checking with cargo bench systems. Both the linked list and the current pr version are faster than main. With the linked list version being a little faster on busy systems and contrived systems. I think I like the linked list version better than the double buffer, but I wonder if there's a cleaner approach than that.

In general I'm in favor of something like this. It's pretty close to what I was thinking of for the "fair polling" alternative I mentioned in the other pr.

@hymm
Copy link
Contributor

hymm commented Feb 5, 2023

edit: reran on main and still seeing a regression on many_foxes with this merged into stageless

image

@hymm
Copy link
Contributor

hymm commented Feb 6, 2023

Thinking about this some more and I'm not sure this is the right approach. This doesn't properly cooperate with the tasks. The version of get_results on main will wait for the task that it is awaiting to complete before waking. This version doesn't do that. It will constantly poll the tasks.

We probably want to write a custom future that can properly wait for a wake signal from all the tasks before rechecking things.

@cart
Copy link
Member

cart commented Feb 7, 2023

Yeah I'm also not a fan of the constant polling. Feels too much like we're circumventing the executor.

@cart cart removed this from the 0.10 milestone Feb 7, 2023
@alice-i-cecile
Copy link
Member

@maniwani shall we close this out for now?

@maniwani
Copy link
Contributor Author

Sure.

@maniwani maniwani closed this Feb 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events A-Tasks Tools for parallel and async work
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants