Implement affinity support in the scheduler (batch and service jobs) #4513

preetapan · 2018-07-16T13:56:42Z

This PR builds on top of #4512 and implements incorporating affinities into the placement algorithm. Changes include:

- Scores used to be on a different scale for each factor: bin packing (0 to 18), job anti-affinity (-30 to -10), and node anti-affinity (-50). Every factor's score is now normalized to the range [-1, 1]. Each scoring iterator appends its score to a list, and a normalization iterator averages them as the last step.
- Node affinities and anti-affinities are scored in a new step of the scoring stack. Weights are also normalized across affinities (e.g., if a job has two affinities, each with a weight of 100, each one's weight in the scoring layer becomes 0.5).
- If the job or task group has affinities, the scheduler examines all nodes rather than log(n) nodes.
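
To illustrate the last point, here is a minimal sketch of how the number of scored nodes could be chosen. The helper name `nodeLimit`, the base-2 logarithm, and the handling of very small clusters are assumptions for illustration; the description above only says log(n).

```go
package main

import (
	"fmt"
	"math"
)

// nodeLimit is a hypothetical helper that returns how many candidate nodes
// get scored. The base-2 logarithm is an assumption, not taken from the PR.
func nodeLimit(numNodes int, hasAffinities bool) int {
	if hasAffinities || numNodes <= 2 {
		// Affinities can only be honored if every node is scored.
		return numNodes
	}
	return int(math.Ceil(math.Log2(float64(numNodes))))
}

func main() {
	fmt.Println(nodeLimit(1024, false)) // 10
	fmt.Println(nodeLimit(1024, true))  // 1024
}
```

The trade-off is scheduling latency: scoring every node costs more, but it is the only way to be sure the best-matching node is considered.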

dadgar · 2018-07-16T22:25:28Z

scheduler/feasible.go

+	for _, in := range input {
+		cleaned := strings.TrimSpace(in)
+		lookup[cleaned] = struct{}{}
+	}


Can this get a test?
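
A rough sketch of what such a test could exercise is below. `buildLookup` and `containsAny` are hypothetical names standing in for the helper under review; only the trim-and-insert loop comes from the diff, and comma-separated input is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// buildLookup mirrors the loop in the diff: trim each entry and add it to a set.
func buildLookup(input []string) map[string]struct{} {
	lookup := make(map[string]struct{}, len(input))
	for _, in := range input {
		cleaned := strings.TrimSpace(in)
		lookup[cleaned] = struct{}{}
	}
	return lookup
}

// containsAny (hypothetical) reports whether any comma-separated value in
// want appears in the comma-separated list have, ignoring surrounding spaces.
func containsAny(have, want string) bool {
	lookup := buildLookup(strings.Split(have, ","))
	for _, w := range strings.Split(want, ",") {
		if _, ok := lookup[strings.TrimSpace(w)]; ok {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(containsAny("a, b ,c", "x,b"))  // true
	fmt.Println(containsAny("a, b ,c", "x, y")) // false
}
```

The cases worth covering are the ones the trimming exists for: values with leading or trailing spaces on either side of the comparison.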

dadgar · 2018-07-16T22:34:52Z

scheduler/rank.go

 		if collisions > 0 {
-			scorePenalty := -1 * float64(collisions) * iter.penalty
-			option.Score += scorePenalty
+			scorePenalty := -1 * float64(collisions) / float64(iter.desiredCount)


Should it be desiredCount - 1? If you have count = 2 and they both end up on the same node, shouldn't that be the max negative score?

preetapan · 2018-07-18 (edited)

Seems like we would want (collisions + 1) / desiredCount as the score then? That is, get the count of current and proposed allocs on this node, and add one for the current placement being decided in the iterator.

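
The formula discussed in this thread can be sketched as a standalone function. The name `antiAffinityPenalty` and the zero-value guards are assumptions for illustration; the arithmetic follows the (collisions + 1) / desiredCount proposal.

```go
package main

import "fmt"

// antiAffinityPenalty sketches the proposed formula. collisions counts the
// existing and proposed allocations of the job already on the node; the +1
// accounts for the placement currently being decided.
func antiAffinityPenalty(collisions, desiredCount int) float64 {
	if collisions <= 0 || desiredCount <= 0 {
		// No collision (or no meaningful count) means no penalty.
		return 0
	}
	return -1 * float64(collisions+1) / float64(desiredCount)
}

func main() {
	fmt.Println(antiAffinityPenalty(1, 2)) // -1
	fmt.Println(antiAffinityPenalty(1, 4)) // -0.5
	fmt.Println(antiAffinityPenalty(0, 4)) // 0
}
```

With count = 2 and one allocation already on the node, the second placement gets the maximum negative score of -1, which is the case raised in the review.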

dadgar · 2018-07-16T22:35:28Z

scheduler/rank.go

 	iter.penaltyNodes = make(map[string]struct{})
 	iter.source.Reset()
 }
+


dadgar · 2018-07-16T22:37:15Z

scheduler/rank.go

+	source RankIterator
+}
+
+func NewScoreNormalizationIterator(ctx Context, source RankIterator) *ScoreNormalizationIterator {


dadgar · 2018-07-16T22:39:51Z

scheduler/rank.go

+
+func (iter *ScoreNormalizationIterator) Next() *RankedNode {
+	option := iter.source.Next()
+	if option == nil {


	if option == nil || len(option.Scores) == 0 {
		return nil
	}
	sum := 0.0
	for _, score := range option.Scores {
		sum += score
	}
	option.FinalScore = sum / float64(numScorers)
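
As a self-contained illustration of the averaging step, here is a minimal sketch. `averageScores` is a hypothetical name, and dividing by `len(scores)` assumes that `numScorers` equals the number of recorded scores.

```go
package main

import "fmt"

// averageScores returns the mean of the per-scorer scores and true, or
// 0 and false when there is nothing to average. Each input is assumed to
// lie in [-1, 1], so the mean stays in that range as well.
func averageScores(scores []float64) (float64, bool) {
	if len(scores) == 0 {
		return 0, false
	}
	sum := 0.0
	for _, score := range scores {
		sum += score
	}
	return sum / float64(len(scores)), true
}

func main() {
	// -0.5 from job anti-affinity and -1 from a node rescheduling penalty.
	final, ok := averageScores([]float64{-0.5, -1})
	fmt.Println(final, ok) // -0.75 true
}
```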

dadgar · 2018-07-16T22:41:57Z

scheduler/rank_test.go

+	}
+	// Score should be averaged between both scorers
+	// -0.5 from job anti affinity and -1 from node rescheduling penalty
+	if out[0].FinalScore != -0.75 {


Use the require package here.

dadgar · 2018-07-16T23:16:50Z

scheduler/rank.go

+}
+
+func (iter *NodeAffinityIterator) SetJob(job *structs.Job) {
+	if job.Affinities != nil {


Don't need the guard

dadgar · 2018-07-16T23:17:33Z

scheduler/rank.go

+	if option == nil {
+		return nil
+	}
+	if len(iter.affinities) == 0 {


if !iter.hasAffinities()

dadgar · 2018-07-16T23:22:37Z

scheduler/rank.go

+	return option
+}
+
+func matchesAffinity(ctx Context, affinity *structs.Affinity, option *structs.Node) bool {


You should put a TODO to use the computed node class as a cache. We could skip resolving and checking based on the computed node class. Since we are examining all nodes, this can be a decent speedup.

dadgar · 2018-07-16T23:26:11Z

scheduler/stack.go

@@ -304,6 +302,7 @@ func (s *SystemStack) SetJob(job *structs.Job) {

 func (s *SystemStack) Select(tg *structs.TaskGroup, options *SelectOptions) (*RankedNode, *structs.Resources) {
 	// Reset the binpack selector and context
+	s.scoreNorm.Reset()
 	s.binPack.Reset()


Delete the binPack reset, since scoreNorm's Reset will reset everything.
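
The reasoning is that each iterator's Reset calls Reset on its source, so resetting the outermost iterator reaches the whole chain. A minimal sketch, assuming a simplified two-stage chain; the `stage` type and `resetOrder` function are hypothetical, and the real stack has more iterators in between:

```go
package main

import "fmt"

// resetter is a minimal stand-in for the iterator interface.
type resetter interface {
	Reset()
}

// stage is a hypothetical iterator that wraps an upstream source.
type stage struct {
	name   string
	source resetter
	log    *[]string
}

// Reset records this stage and then propagates the reset to its source.
func (s *stage) Reset() {
	*s.log = append(*s.log, s.name)
	if s.source != nil {
		s.source.Reset()
	}
}

// resetOrder builds scoreNorm on top of binPack and resets only the
// outermost stage, returning the order in which stages were reset.
func resetOrder() []string {
	var log []string
	binPack := &stage{name: "binPack", log: &log}
	scoreNorm := &stage{name: "scoreNorm", source: binPack, log: &log}
	scoreNorm.Reset()
	return log
}

func main() {
	fmt.Println(resetOrder()) // [scoreNorm binPack]
}
```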

schmichael · 2018-07-23T23:12:28Z

jobspec/test-fixtures/basic.hcl

@@ -76,11 +83,25 @@ job "binstore-storagelocker" {
        healthy_deadline = "11m"
    }

+    affinity {
+      attribute = "${node.datacenter}"


Vertically align the equal signs and values.

schmichael · 2018-07-23T23:12:33Z

jobspec/test-fixtures/basic.hcl

@@ -16,6 +16,13 @@ job "binstore-storagelocker" {
    value     = "windows"
  }

+  affinity {
+    attribute = "${meta.team}"


Vertically align the equal signs and values.

schmichael · 2018-07-23T23:12:41Z

jobspec/test-fixtures/basic.hcl

    task "binstore" {
      driver = "docker"
      user   = "bob"
      leader = true

+      affinity {
+        attribute = "${meta.foo}"


Vertically align the equal signs and values.

schmichael · 2018-07-23T23:15:32Z

nomad/structs/diff.go

+		[]string{"str"},
+		"Affinity",
+		contextual)
+	if affinitiesDiff != nil {


append() will actually Do The Right Thing (nothing) with nil and empty slices, so you don't need this
https://play.golang.org/p/Cvgda2DSO1r
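
For reference, a minimal sketch of that behavior. It uses `[]string` as a stand-in for the diff slices in the hunk, and `appendAll` is a hypothetical helper:

```go
package main

import "fmt"

// appendAll appends extra to base without any nil or length guard.
func appendAll(base, extra []string) []string {
	return append(base, extra...)
}

func main() {
	diffs := []string{"Constraint"}
	var nilSlice []string

	fmt.Println(len(appendAll(diffs, nilSlice)))   // 1
	fmt.Println(len(appendAll(diffs, []string{}))) // 1
	fmt.Println(len(appendAll(nil, nil)))          // 0
}
```

Appending a nil or empty slice leaves the destination unchanged, so the `!= nil` check adds nothing.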

schmichael · 2018-07-23T23:46:33Z

scheduler/stack.go

-	skipScoreThreshold = -10.0
+	// that have a score lower than this. -1 is the lowest possible score for a
+	// node with penalties (based on job anti affinity and node rescheduling penalties
+	skipScoreThreshold = 0.0


comment is missing a closing parenthesis, but I'm curious why this changed from being equal to the value in the comment to 0. Seems strange for the comment to mention one number and the const is set to another.

dadgar

Small nit but LGTM

dadgar · 2018-07-24T00:19:39Z

scheduler/rank.go

 		if collisions > 0 {
-			scorePenalty := -1 * float64(collisions) * iter.penalty
-			option.Score += scorePenalty
+			scorePenalty := -1 * float64(collisions) / float64(iter.desiredCount)


Yeah what you said is correct

dadgar · 2018-07-24T00:23:54Z

scheduler/rank.go

+	totalAffinityScore := 0.0
+	for _, affinity := range iter.affinities {
+		if matchesAffinity(iter.ctx, affinity, option.Node) {
+			normScore := affinity.Weight / sumWeight


It may be more accurate to just keep adding the weights and divide only once at the end. Dividing each time accumulates more floating-point error.
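
A minimal sketch of the suggested approach: sum the matched weights and divide once. The `affinity` struct and `affinityScore` function are hypothetical stand-ins, and using the absolute value of each weight in the denominator (so that negative, anti-affinity weights still normalize into [-1, 1]) is an assumption.

```go
package main

import (
	"fmt"
	"math"
)

// affinity is a simplified stand-in for the real struct.
type affinity struct {
	weight  float64
	matches bool // stand-in for the result of matchesAffinity
}

// affinityScore sums the weights of matching affinities and divides once
// by the total absolute weight, instead of dividing inside the loop.
func affinityScore(affinities []affinity) float64 {
	sumWeight, matched := 0.0, 0.0
	for _, a := range affinities {
		sumWeight += math.Abs(a.weight)
		if a.matches {
			matched += a.weight
		}
	}
	if sumWeight == 0 {
		return 0
	}
	return matched / sumWeight
}

func main() {
	// Two affinities with weight 100 each; only one matches the node.
	fmt.Println(affinityScore([]affinity{{100, true}, {100, false}})) // 0.5
	// Both match.
	fmt.Println(affinityScore([]affinity{{100, true}, {100, true}})) // 1
}
```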

github-actions · 2023-03-01T02:22:53Z

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

preetapan force-pushed the f-affinities-backend branch from 0f4f8df to 2c61158 Compare July 16, 2018 14:26

dadgar requested changes Jul 16, 2018

preetapan force-pushed the f-affinities-backend branch 2 times, most recently from 14dc4ef to 65d7e22 Compare July 23, 2018 16:16

preetapan mentioned this pull request Jul 23, 2018

Implement spreading allocations based on a target node attribute #4527

Merged

schmichael reviewed Jul 23, 2018

dadgar approved these changes Jul 24, 2018

preetapan added 8 commits July 24, 2018 10:37

Implement affinity support in generic scheduler

9675ef1

Back out changes to propertyset that were not necessary for affinities

eaaa563

Address some review feedback

96e134f

test for setcontainsany, and treat set_contains same as set_contains_all

a44169c

Remove unnecessary reset

3ed7dc4

Fix linting

8c0f9a0

Fix after rename to ConstraintSetContainsAny

1563486

Some minor changes from code review

f2dc2fe

preetapan force-pushed the f-affinities-backend branch from 65d7e22 to f2dc2fe Compare July 24, 2018 15:50

preetapan merged commit 8ec5642 into f-affinities-spread Jul 24, 2018

preetapan deleted the f-affinities-backend branch July 24, 2018 16:41

jippi mentioned this pull request Aug 6, 2018

Feature proposal - Affinities #2509

Closed

preetapan mentioned this pull request Sep 4, 2018

Affinities and spread #4640

Merged

github-actions bot locked as resolved and limited conversation to collaborators Mar 1, 2023
