Conversation
if scheduled {
	heap.Remove(sh.schedQueue, i)
	i--

if len(acceptableWindows[sqi]) == 0 {
does this mean that the task will not get scheduled?
Correct, because there's no window which can run it
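In other words, the request simply stays in the queue until a later scheduling pass finds a window that fits it. A rough sketch of that behaviour — the types and numbers below are made up for illustration, not the actual sched.go code:

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the scheduler's request queue and
// open scheduling windows; not the actual lotus sector-storage types.
type schedRequest struct {
	id      int
	needGiB int // resources the task needs
}

type schedWindow struct {
	freeGiB int // resources the window's worker can still offer
}

func main() {
	queue := []schedRequest{{id: 1, needGiB: 64}, {id: 2, needGiB: 1024}}
	windows := []schedWindow{{freeGiB: 128}}

	var remaining []schedRequest
	for _, req := range queue {
		// Collect windows that could run this request.
		var acceptable []int
		for wi, w := range windows {
			if w.freeGiB >= req.needGiB {
				acceptable = append(acceptable, wi)
			}
		}

		// Mirrors the check discussed above: with no acceptable windows the
		// request is not scheduled now; it stays queued for a later pass.
		if len(acceptable) == 0 {
			remaining = append(remaining, req)
			continue
		}

		wi := acceptable[0]
		windows[wi].freeGiB -= req.needGiB
		fmt.Printf("scheduled request %d into window %d\n", req.id, wi)
	}

	for _, req := range remaining {
		fmt.Printf("request %d left in queue (no window can run it)\n", req.id)
	}
}
```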
It seems like the offset PC1s in your diagram here can never actually happen. All tasks in a window start when the window starts, and new tasks won't be added to the window while it's running, correct?
@magik6k it would be really great to have some tests of the scheduler that mock out time, so we can simulate different tasks taking different amounts of time and resources, and assert that things look like we expect them to. We can also use this to test the efficiency of the scheduler.
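Something along these lines might work for the time-mocking part — a sketch with a hypothetical Clock interface injected into the scheduler; none of these helpers exist in lotus today:

```go
package sched

import (
	"testing"
	"time"
)

// Clock is a hypothetical abstraction the scheduler could use instead of
// calling time.Now directly, so tests can control time explicitly.
type Clock interface {
	Now() time.Time
}

// fakeClock only advances when the test tells it to.
type fakeClock struct {
	now time.Time
}

func (f *fakeClock) Now() time.Time          { return f.now }
func (f *fakeClock) Advance(d time.Duration) { f.now = f.now.Add(d) }

// taskRun records when a simulated task started and how long it "took".
type taskRun struct {
	start    time.Time
	duration time.Duration
}

func TestSimulatedTaskDurations(t *testing.T) {
	clk := &fakeClock{now: time.Unix(0, 0)}

	// Simulate two tasks of different lengths without sleeping for real.
	pc1 := taskRun{start: clk.Now(), duration: 3 * time.Hour}
	clk.Advance(pc1.duration)

	pc2 := taskRun{start: clk.Now(), duration: 30 * time.Minute}
	clk.Advance(pc2.duration)

	if got := clk.Now().Sub(pc1.start); got != 3*time.Hour+30*time.Minute {
		t.Fatalf("expected 3h30m of simulated time, got %s", got)
	}
}
```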
This is not perfect yet, but it looks better
SGWM
This should fix resource starvation issues caused by small tasks continuously using all resources, making it impossible to run bigger tasks.
This is particularly visible if a worker is set up to run both precommit1 and precommit2. Assuming a constant stream of both task types, and a worker that can run either 2 PC1 tasks or 1 PC2 task: when 2 PC1 tasks were running and one finished, we'd immediately schedule another PC1, even if 100 PC2 tasks were waiting.
Visually, this would look something like this:
With this PR, tasks are scheduled in batches / windows, where one window is defined by what a worker can run in parallel (currently; this can be made configurable later).
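As a rough illustration of the windowing idea — hypothetical types and numbers below, not the PR's actual implementation — queued tasks are packed into one window at a time, sized to the worker's parallel capacity, instead of being pulled one by one as slots free up:

```go
package main

import "fmt"

// Hypothetical resource model: the worker can run 2 PC1 tasks or 1 PC2 task.
const (
	workerCapacity = 2
	pc1Cost        = 1
	pc2Cost        = 2
)

type task struct {
	name string
	cost int
}

// fillWindow greedily packs queued tasks into one window up to capacity.
func fillWindow(queue []task, capacity int) (window, rest []task) {
	free := capacity
	for _, t := range queue {
		if t.cost <= free {
			window = append(window, t)
			free -= t.cost
		} else {
			rest = append(rest, t)
		}
	}
	return window, rest
}

func main() {
	queue := []task{
		{"PC1-a", pc1Cost}, {"PC1-b", pc1Cost},
		{"PC2-a", pc2Cost}, {"PC1-c", pc1Cost},
	}

	// Windows are filled one after another, so the big PC2 task gets a whole
	// window to itself instead of being starved by the stream of PC1 tasks.
	for len(queue) > 0 {
		var win []task
		win, queue = fillWindow(queue, workerCapacity)
		fmt.Println("window:", names(win))
	}
}

func names(ts []task) []string {
	out := make([]string, 0, len(ts))
	for _, t := range ts {
		out = append(out, t.name)
	}
	return out
}
```

With per-window packing, the backlog of PC2 work gets the next whole window instead of losing every freed slot to incoming PC1 tasks.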
With this PR, it should look something like this:
This already passes tests in lotus; now testing on my miner.