scheduler: fix quadratic performance with spread blocks #11712

Merged: 1 commit, Dec 21, 2021

Commits on Dec 21, 2021

  1. scheduler: fix quadratic performance with spread blocks

    When the scheduler picks a node for each evaluation, the
    `LimitIterator` provides at most 2 eligible nodes for the
    `MaxScoreIterator` to choose from. This keeps scheduling fast while
    still producing acceptable placements, because allocations are binpacked.
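
    The interaction between the limit and the max-score selection can be
    sketched as below. This is a simplified illustration, not Nomad's
    implementation: `rankedNode` and `maxScoreWithLimit` are hypothetical
    names, and the real iterators are chained lazily rather than operating
    on a slice.

```go
package main

import "fmt"

// rankedNode is a hypothetical stand-in for a scored candidate node.
type rankedNode struct {
	ID    string
	Score float64
}

// maxScoreWithLimit inspects at most `limit` candidates, in the order given,
// and returns the highest-scoring one. ok is false when nothing was inspected.
func maxScoreWithLimit(candidates []rankedNode, limit int) (best rankedNode, ok bool) {
	for i, n := range candidates {
		if i >= limit {
			break // the limit caps how many candidates are ever scored
		}
		if !ok || n.Score > best.Score {
			best, ok = n, true
		}
	}
	return best, ok
}

func main() {
	nodes := []rankedNode{{"a", 0.2}, {"b", 0.7}, {"c", 0.9}}
	best, _ := maxScoreWithLimit(nodes, 2)
	// Node "c" has the highest score but is never inspected with a limit of 2.
	fmt.Println(best.ID) // prints "b"
}
```

    With a limit of 2 the work per placement is constant, at the cost of
    sometimes missing a better node further down the list.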
    
    Jobs with a `spread` block (or node affinity) remove this limit in
    order to produce correct spread scoring. This means that every
    allocation within a job with a `spread` block is evaluated against
    _all_ eligible nodes, so the scoring work grows with the product of
    the allocation count and the eligible node count. Operators of large
    clusters have reported that jobs with `spread` blocks that are
    eligible on a large number of nodes can take longer than the nack
    timeout (60s) to evaluate. Typical evaluations are processed in
    milliseconds.
    
    In practice, it's not necessary to evaluate every eligible node for
    every allocation on large clusters, because the `RandomIterator` at
    the base of the scheduler stack produces enough variation in each pass
    that the likelihood of an uneven spread is negligible. Note that
    feasibility is checked before the limit, so this only impacts the
    number of _eligible_ nodes available for scoring, not the total number
    of nodes.
    
    This changeset sets the iterator limit for "large" jobs with a
    `spread` block or node affinity to the number of desired
    allocations. This brings an example problematic job evaluation down
    from ~3min to ~10s. The included tests ensure that we have acceptable
    spread results across a variety of large cluster topologies.
    tgross committed Dec 21, 2021
    252f93b