This repository has been archived by the owner on Aug 18, 2020. It is now read-only.

Window based scheduling #67

Merged
merged 16 commits into from
Jul 17, 2020

Conversation

magik6k
Collaborator

@magik6k magik6k commented Jul 9, 2020

This should fix resource starvation caused by small tasks continuously using all resources, which made it impossible to run bigger tasks.

This is particularly visible if a worker is set up to run both precommit1 and precommit2 (PC1 and PC2). Assume there is a constant stream of both task types, and the worker can run either 2 PC1 tasks or 1 PC2 task. When we were running 2 PC1 tasks and one finished, we'd immediately schedule another PC1, even if we had 100 PC2 tasks waiting.

Visually, this would look something like this:

|-----PC1-----|-----PC1-----|-----PC1-----|-----PC1-----|.....
   |-----PC1-----|-----PC1-----|-----PC1-----|-----PC1-----|..

With this PR, we schedule tasks in batches / windows, where one window is defined by what a worker can run in parallel (this is the current definition; it can be made configurable later).

With this PR, it should look something like this:

|-----PC1-----|  |---PC2---|---PC2---|-----PC1-----|---PC2---|---PC2---|...
   |-----PC1-----|                   |-----PC1-----|

|      WIN0      |   WIN1  |   WIN2  |     WIN3    |   WIN4  |   WIN5  |
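The batching idea in the diagram can be sketched roughly as follows. This is a minimal illustration, not the scheduler code from this PR: the `Task` type, the abstract `Cost` units, and the `fillWindow` / `planWindows` helpers are all hypothetical, and the sketch assumes a single worker draining a strict FIFO queue.

```go
package main

import "fmt"

// Task is a hypothetical unit of work. Cost is in abstract resource units
// (assumption: a worker with capacity 2 fits two PC1s or one PC2).
type Task struct {
	Name string
	Cost int
}

// fillWindow takes tasks from the front of the queue, in order, until the
// next task no longer fits next to the ones already in the window.
func fillWindow(queue []Task, capacity int) (window, rest []Task) {
	used := 0
	for i, t := range queue {
		if used+t.Cost > capacity {
			return window, queue[i:]
		}
		used += t.Cost
		window = append(window, t)
	}
	return window, nil
}

// planWindows splits the whole queue into consecutive windows.
func planWindows(queue []Task, capacity int) [][]Task {
	var windows [][]Task
	for len(queue) > 0 {
		w, rest := fillWindow(queue, capacity)
		if len(w) == 0 {
			// The head task is larger than the worker; no window can run it.
			break
		}
		windows = append(windows, w)
		queue = rest
	}
	return windows
}

func main() {
	pc1 := Task{Name: "PC1", Cost: 1}
	pc2 := Task{Name: "PC2", Cost: 2}
	queue := []Task{pc1, pc1, pc2, pc2, pc1, pc1, pc2}
	for i, w := range planWindows(queue, 2) {
		fmt.Printf("WIN%d: %v\n", i, w)
	}
}
```

Because a window is closed before the next one is planned, a finished PC1 cannot be replaced by another PC1 ahead of a waiting PC2, which is the starvation case described above.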

This already passes tests in lotus; now testing on my miner.

@magik6k magik6k changed the base branch from next to master July 9, 2020 13:19
if scheduled {
heap.Remove(sh.schedQueue, i)
i--
if len(acceptableWindows[sqi]) == 0 {
Member

does this mean that the task will not get scheduled?

Collaborator Author

Correct, because there's no window which can run it
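The behaviour confirmed here, a task staying unscheduled when no window can take it, might look roughly like the sketch below. It is an illustration only: the `window` and `request` types and the `acceptableFor` / `assign` helpers are hypothetical and do not mirror the PR's actual data structures; only the idea of an empty list of acceptable windows comes from the reviewed line.

```go
package main

import "fmt"

// window is a hypothetical stand-in for a worker's scheduling window;
// free is the amount of resources not yet claimed in it.
type window struct {
	free int
}

// request is a hypothetical queued task that needs `need` resource units.
type request struct {
	name string
	need int
}

// acceptableFor returns the indices of windows with enough free resources.
func acceptableFor(windows []window, need int) []int {
	var idx []int
	for i, w := range windows {
		if w.free >= need {
			idx = append(idx, i)
		}
	}
	return idx
}

// assign places each request in its first acceptable window, updating the
// free resources in `windows` in place. Requests with no acceptable window
// are returned in `waiting` and stay queued.
func assign(windows []window, queue []request) (placed map[string]int, waiting []request) {
	placed = make(map[string]int)
	for _, r := range queue {
		acceptable := acceptableFor(windows, r.need)
		if len(acceptable) == 0 {
			// No window can run this task right now: leave it in the queue.
			waiting = append(waiting, r)
			continue
		}
		wi := acceptable[0]
		windows[wi].free -= r.need
		placed[r.name] = wi
	}
	return placed, waiting
}

func main() {
	windows := []window{{free: 1}, {free: 2}}
	queue := []request{{"pc2-a", 2}, {"pc2-b", 2}, {"pc1-a", 1}}
	placed, waiting := assign(windows, queue)
	fmt.Println("placed:", placed)
	fmt.Println("still waiting:", waiting)
}
```

In this sketch a skipped request is simply retried the next time windows are offered, so "not scheduled" means "not scheduled yet" rather than dropped.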

@whyrusleeping
Member

@magik6k

|-----PC1-----|  |---PC2---|---PC2---|-----PC1-----|---PC2---|---PC2---|...
   |-----PC1-----|                   |-----PC1-----|

|      WIN0      |   WIN1  |   WIN2  |     WIN3    |   WIN4  |   WIN5  |

It seems like the offset PC1s in your diagram here can never actually happen. All tasks in a window start when the window starts, and new tasks won't be added to the window while it's running, correct?

@whyrusleeping
Member

@magik6k it would be really great to have some tests of the scheduler that mock out time, so we can simulate different tasks taking different amounts of time and resources, and assert that things look like we expect them to.

We can also use this to test the efficiency of the scheduler
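One rough shape such a test could take is a simulation with its own tick counter instead of wall-clock time. Everything in the sketch below is hypothetical: the `simTask` type, the `simulate` helper, and the tick and cost numbers are made up for illustration. It also assumes that all tasks in a window start together and that the next window starts only when the longest task finishes, which is the reading asked about earlier in this thread, not a confirmed property of the scheduler.

```go
package main

import "fmt"

// simTask is a hypothetical task with a resource cost and a simulated
// duration in ticks; no real time passes while simulating.
type simTask struct {
	Name  string
	Cost  int
	Ticks int
}

// simulate runs the windows back to back on a simulated clock. It returns
// the total number of ticks and the fraction of capacity*ticks that was
// actually occupied by running tasks.
func simulate(windows [][]simTask, capacity int) (total int, utilization float64) {
	busy := 0
	for _, w := range windows {
		longest := 0
		for _, t := range w {
			if t.Ticks > longest {
				longest = t.Ticks
			}
			busy += t.Cost * t.Ticks
		}
		// Assumption: a window lasts as long as its longest task.
		total += longest
	}
	if total == 0 || capacity <= 0 {
		return total, 0
	}
	return total, float64(busy) / float64(total*capacity)
}

func main() {
	pc1 := simTask{Name: "PC1", Cost: 1, Ticks: 4}
	pc2 := simTask{Name: "PC2", Cost: 2, Ticks: 3}
	windows := [][]simTask{{pc1, pc1}, {pc2}, {pc2}}
	total, util := simulate(windows, 2)
	fmt.Printf("ticks=%d utilization=%.2f\n", total, util)
}
```

A test built this way can assert on both the ordering of windows and the utilization figure, so a change that lowers scheduling efficiency shows up as a failing number rather than a slower miner.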

@magik6k magik6k changed the base branch from master to next July 17, 2020 11:04
Member

@whyrusleeping whyrusleeping left a comment

This is not perfect yet, but it looks better

@Kubuxu Kubuxu left a comment

SGWM


3 participants