Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement affinity support in the scheduler (batch and service jobs) #4513

Merged
merged 8 commits into from
Jul 24, 2018

Conversation

preetapan
Copy link
Contributor

This PR builds on top of #4512 and implements incorporating affinities into the placement algorithm. Changes include:

  • The scheduler changes from having different ranges of scores from bin packing (0-18), job anti affinity (-30 to -10) and node anti affinity (-50), to all scores from different factors being normalized to a value between [-1, 1]. Each scoring iterator appends its scores to a list, and a normalization iterator as the last step averages them.

  • Node affinities/anti affinities as a new step in the scoring task. Inter affinity weights are also normalized. (e.g if a job has two affinities each with a weight of 100, their weight in the scoring layer becomes 0.5 each).

  • If the job or task group has affinities, then the scheduler examines all nodes rather than log(n) nodes.

for _, in := range input {
cleaned := strings.TrimSpace(in)
lookup[cleaned] = struct{}{}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this get a test?

if collisions > 0 {
scorePenalty := -1 * float64(collisions) * iter.penalty
option.Score += scorePenalty
scorePenalty := -1 * float64(collisions) / float64(iter.desiredCount)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be desiredCount - 1? If you have count = 2 and they both end up on the same node, shouldn't that be the max negative score?

Copy link
Contributor Author

@preetapan preetapan Jul 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we would want (collisions +1)/desiredCount) as the score then? i.e get the count of current and proposed allocs in this node, and add one for the current placement being decided in the iterator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah what you said is correct

iter.penaltyNodes = make(map[string]struct{})
iter.source.Reset()
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments

source RankIterator
}

func NewScoreNormalizationIterator(ctx Context, source RankIterator) *ScoreNormalizationIterator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments


func (iter *ScoreNormalizationIterator) Next() *RankedNode {
option := iter.source.Next()
if option == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

        if option == nil || len(option.Scores) == 0 {
		return nil
	}

	sum := 0.0
	for _, score := range option.Scores {
		sum += score
	}
	option.FinalScore = sum / float64(numScorers)

}
// Score should be averaged between both scorers
// -0.5 from job anti affinity and -1 from node rescheduling penalty
if out[0].FinalScore != -0.75 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use require package

}

func (iter *NodeAffinityIterator) SetJob(job *structs.Job) {
if job.Affinities != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need the guard

if option == nil {
return nil
}
if len(iter.affinities) == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if !iter.hasAffinities()

return option
}

func matchesAffinity(ctx Context, affinity *structs.Affinity, option *structs.Node) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should put a TODO to use the computed node class as a cache. We could skip resolving and checking based on the computed node class. Since we are examine all nodes this can be a decent speed up

@@ -304,6 +302,7 @@ func (s *SystemStack) SetJob(job *structs.Job) {

func (s *SystemStack) Select(tg *structs.TaskGroup, options *SelectOptions) (*RankedNode, *structs.Resources) {
// Reset the binpack selector and context
s.scoreNorm.Reset()
s.binPack.Reset()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete binPack reset as the scoreNorm will reset everything

@@ -76,11 +83,25 @@ job "binstore-storagelocker" {
healthy_deadline = "11m"
}

affinity {
attribute = "${node.datacenter}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vertical align equal signs and values

@@ -16,6 +16,13 @@ job "binstore-storagelocker" {
value = "windows"
}

affinity {
attribute = "${meta.team}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vertical align equal signs and values

task "binstore" {
driver = "docker"
user = "bob"
leader = true

affinity {
attribute = "${meta.foo}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vertical align equal signs and values

[]string{"str"},
"Affinity",
contextual)
if affinitiesDiff != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

append() will actually Do The Right Thing (nothing) with nil and empty slices, so you don't need this
https://play.golang.org/p/Cvgda2DSO1r

skipScoreThreshold = -10.0
// that have a score lower than this. -1 is the lowest possible score for a
// node with penalties (based on job anti affinity and node rescheduling penalties
skipScoreThreshold = 0.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment is missing a closing parenthesis, but I'm curious why this changed from being equal to the value in the comment to 0. Seems strange for the comment to mention one number and the const is set to another.

Copy link
Contributor

@dadgar dadgar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nit but LGTM

if collisions > 0 {
scorePenalty := -1 * float64(collisions) * iter.penalty
option.Score += scorePenalty
scorePenalty := -1 * float64(collisions) / float64(iter.desiredCount)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah what you said is correct

totalAffinityScore := 0.0
for _, affinity := range iter.affinities {
if matchesAffinity(iter.ctx, affinity, option.Node) {
normScore := affinity.Weight / sumWeight
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be more accurate to just keep adding the weights and dividing only once at the end. By dividing each time we will accumulate more floating point losses in accuracy

@preetapan preetapan merged commit 8ec5642 into f-affinities-spread Jul 24, 2018
@preetapan preetapan deleted the f-affinities-backend branch July 24, 2018 16:41
@preetapan preetapan mentioned this pull request Sep 4, 2018
@github-actions
Copy link

github-actions bot commented Mar 1, 2023

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 1, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants