This repository has been archived by the owner on May 25, 2023. It is now read-only.

Deserved attr is not correctly calculated in proportion plugin #729

Closed
zionwu opened this issue Apr 10, 2019 · 2 comments · Fixed by #730
Labels
kind/bug: Categorizes issue or PR as related to a bug.
sig/scheduling: Categorizes an issue or PR as relevant to SIG Scheduling.
Comments

zionwu commented Apr 10, 2019

Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug

What happened:

I have a job that is not being scheduled even though the cluster has enough resources. Looking at the log, I found that the allocate action marks the queue as overused:

I0410 04:06:16.001089 1 allocate.go:72] Queue <11073333> is overused, ignore it.
I0410 04:06:16.001094 1 allocate.go:72] Queue <11073333> is overused, ignore it

The queue is overused because the proportion plugin does not calculate the queue's deserved value correctly:

			attr.deserved.Add(remaining.Clone().Multi(float64(attr.weight) / float64(totalWeight)))
			if !attr.deserved.LessEqual(attr.request) {
				attr.deserved = helpers.Min(attr.deserved, attr.request)
				meet[attr.queueID] = struct{}{}
			}

For example, if attr.deserved is <cpu 523750.00, memory 3076011404288.00, GPU 3750.00> and attr.request is <cpu 608000.00, memory 5153960755200.00, GPU 0.00>, then !attr.deserved.LessEqual(attr.request) returns true, because deserved exceeds request in the GPU dimension. The queue is therefore added to meet and is not allocated enough resources, even though its CPU and memory requests are not yet satisfied.

I think we should use attr.request.LessEqual(attr.deserved) instead.
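To make the difference between the two checks concrete, here is a minimal sketch using the values from the example above. The Resource type and its LessEqual method are simplified stand-ins written for this illustration (a plain per-dimension comparison), not the scheduler's actual resource type.

```go
package main

import "fmt"

// Resource is a simplified, hypothetical stand-in for the scheduler's
// resource type; it only tracks the three dimensions from the example.
type Resource struct {
	CPU, Memory, GPU float64
}

// LessEqual reports whether every dimension of r is <= the same dimension of o.
// This is an assumed semantic for illustration purposes.
func (r Resource) LessEqual(o Resource) bool {
	return r.CPU <= o.CPU && r.Memory <= o.Memory && r.GPU <= o.GPU
}

func main() {
	deserved := Resource{CPU: 523750, Memory: 3076011404288, GPU: 3750}
	request := Resource{CPU: 608000, Memory: 5153960755200, GPU: 0}

	// Current check: true, because GPU 3750 > 0 makes LessEqual fail.
	fmt.Println("current check marks queue as met:", !deserved.LessEqual(request))

	// Proposed check: false, because the CPU and memory requests exceed deserved.
	fmt.Println("proposed check marks queue as met:", request.LessEqual(deserved))
}
```

With a per-dimension comparison, neither resource is less than or equal to the other here, so negating one direction is not equivalent to testing the other direction.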

Another problem is in the calculation of the total increase in deserved:

			deserved.Add(attr.deserved.Clone().Sub(oldDeserved))

This assumes that attr.deserved is greater than oldDeserved in every dimension, which is not always true. For example, oldDeserved can be <cpu 523750.00, memory 3076011404288.00, GPU 3750.00> and attr.deserved can be <cpu 500750.00, memory 5076011404288.00, GPU 0.00>: memory has increased, but CPU and GPU have decreased.

We should handle both the increased and the decreased values.
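One possible way to handle both directions is to split the change into an increased part and a decreased part, each non-negative, and account for them separately. The sketch below is only an illustration under that assumption: the Resource type and the diff helper are hypothetical names, not part of the plugin.

```go
package main

import "fmt"

// Resource is a simplified, hypothetical stand-in for the scheduler's
// resource type; it only tracks the three dimensions from the example.
type Resource struct {
	CPU, Memory, GPU float64
}

// diff is a hypothetical helper that splits the change from old to cur into
// two non-negative parts: how much each dimension grew and how much it shrank.
func diff(cur, old Resource) (increased, decreased Resource) {
	split := func(c, o float64) (inc, dec float64) {
		if c > o {
			return c - o, 0
		}
		return 0, o - c
	}
	increased.CPU, decreased.CPU = split(cur.CPU, old.CPU)
	increased.Memory, decreased.Memory = split(cur.Memory, old.Memory)
	increased.GPU, decreased.GPU = split(cur.GPU, old.GPU)
	return increased, decreased
}

func main() {
	oldDeserved := Resource{CPU: 523750, Memory: 3076011404288, GPU: 3750}
	newDeserved := Resource{CPU: 500750, Memory: 5076011404288, GPU: 0}

	inc, dec := diff(newDeserved, oldDeserved)
	fmt.Printf("increased: %+v\n", inc) // only memory grew
	fmt.Printf("decreased: %+v\n", dec) // CPU and GPU shrank
}
```

The caller could then subtract the increased part from the remaining resources and add the decreased part back, so that no subtraction ever has to produce a negative dimension.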

k82cn commented Apr 15, 2019

/kind bugs
/sig scheduling

@k8s-ci-robot k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Apr 15, 2019

k82cn commented Apr 15, 2019

/kind bug

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Apr 15, 2019
@k82cn k82cn added this to the v0.5 milestone Apr 22, 2019