This repository has been archived by the owner on May 25, 2023. It is now read-only.

All jobs are pending when some jobs set resources and others do not. #409

Closed
chenyangxueHDU opened this issue Oct 10, 2018 · 5 comments · Fixed by #433
Labels
area/policy, kind/bug, priority/important-soon, sig/scheduling
Milestone
v0.3

Comments

@chenyangxueHDU
Contributor

I created 4 jobs with the same group-name: ps-0, ps-1, worker-0, and worker-1. Only the worker jobs had a resources section.

- apiVersion: batch/v1
  kind: Job
  metadata:
    name: cyx2-worker-0
    annotations:
      scheduling.k8s.io/group-name: cyx2
  spec:
    template:
      spec:
        containers:
        - resources:
            limits:
              nvidia.com/gpu: "1"
            requests:
              cpu: "1"
              memory: 1Gi
- apiVersion: batch/v1
  kind: Job
  metadata:
    name: cyx2-ps-0
    annotations:
      scheduling.k8s.io/group-name: cyx2
  spec:
    template:
      spec:
        containers:

I found that all jobs were pending, so I checked the log.

I1009 21:42:36.472045   21605 allocate.go:42] Enter Allocate ...
I1009 21:42:36.472224   21605 allocate.go:118] Binding Task <mind-automl/cyx2-worker-0-2mr7q> to node <192.168.47.52>
I1009 21:42:36.472399   21605 allocate.go:118] Binding Task <mind-automl/cyx2-worker-1-hdz8r> to node <192.168.47.52>
I1009 21:42:36.472426   21605 allocate.go:72] Queue <mind-automl> is overused, ignore it.
I1009 21:42:36.472431   21605 allocate.go:155] Leaving Allocate ..

I found that the queue was overused and only the worker tasks were bound, but I had not set any queue. I found the key logic in kube-batch/pkg/scheduler/plugins/proportion/proportion.go:

			// Calculates the deserved of each Queue.
			attr.deserved.Add(remaining.Clone().Multi(float64(attr.weight) / float64(totalWeight)))
			if !attr.deserved.LessEqual(attr.request) {
				attr.deserved = helpers.Min(attr.deserved, attr.request)
				meet[attr.queueID] = struct{}{}
			}

This means a queue can only use the resources it requests: deserved is capped at request, and because the BestEffort ps pods request nothing, the queue looks fully used as soon as the worker pods are bound. After I set resources on the ps jobs, all jobs were running. I think the tutorial should remind users to set resources.

			if !attr.deserved.LessEqual(attr.request) {
				attr.deserved = helpers.Min(attr.deserved, attr.request)
				meet[attr.queueID] = struct{}{}
			}

Or we can change this.
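
To make the effect concrete with made-up numbers: the sketch below uses a single scalar resource instead of kube-batch's Resource type, and assumes the queue is treated as overused once its allocation reaches its deserved amount, which is what the log above suggests. It is only an illustration, not the plugin's actual code.

	package main

	import "fmt"

	func min(a, b float64) float64 {
		if a < b {
			return a
		}
		return b
	}

	func main() {
		share := 8.0   // fair share of the queue from its weight
		request := 2.0 // total requests in the queue; the BestEffort ps pods add nothing
		deserved := min(share, request) // deserved is capped at what was requested

		allocated := 2.0 // after the two worker pods are bound
		overused := allocated >= deserved
		fmt.Println(deserved, overused) // 2 true -> the queue is skipped, ps pods stay pending
	}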

@k82cn
Contributor

k82cn commented Oct 10, 2018

We should fix it :)

In the scheduler, we ignore BestEffort resources and leave them to ResourceQuota.

@k82cn k82cn added the area/policy, kind/bug, priority/important-soon, sig/scheduling labels Oct 10, 2018
@k82cn k82cn added this to the v0.3 milestone Oct 10, 2018
@chenyangxueHDU
Contributor Author

chenyangxueHDU commented Oct 10, 2018

We should fix it :)

In the scheduler, we ignore BestEffort resources and leave them to ResourceQuota.

Yes. I think we should bind BestEffort tasks before other QoS tasks in kube-batch.

@k82cn
Contributor

k82cn commented Oct 11, 2018

I think we should bind BestEffort tasks before other QoS tasks in kube-batch.

Hmm, yes, we need to handle BestEffort separately :) There are two options in my mind:

  1. Handle BestEffort after Burstable/Guaranteed, so we do not need to re-schedule it when pod affinity/anti-affinity is ready.
  2. Handle BestEffort in another goroutine, but that will make pod anti-affinity complex :(

Anyway, both need node priority support to avoid dispatching all BestEffort pods to one node. I prefer option 1 for now :)

@chenyangxueHDU
Contributor Author

chenyangxueHDU commented Oct 11, 2018

I think we should bind BestEffort tasks before other QoS tasks in kube-batch.

Hmm, yes, we need to handle BestEffort separately :) There are two options in my mind:

  1. Handle BestEffort after Burstable/Guaranteed, so we do not need to re-schedule it when pod affinity/anti-affinity is ready.
  2. Handle BestEffort in another goroutine, but that will make pod anti-affinity complex :(

Anyway, both need node priority support to avoid dispatching all BestEffort pods to one node. I prefer option 1 for now :)

If we handle BestEffort after Burstable/Guaranteed, we need to change the overused logic, because if we handle Burstable/Guaranteed first, the queue will already be overused.

To avoid this, I will handle BestEffort before Burstable/Guaranteed. To do this, I think we can add a compareQoS check in taskOrderFn, like this:

// make BestEffort > Burstable/Guarantee
func compareQoS(l, r *v1.Pod) int {}

But it will ignore some cases of Priority, because I will add this check before the Priority comparison in taskOrderFn:

	taskOrderFn := func(l interface{}, r interface{}) int {
		lv := l.(*api.TaskInfo)
		rv := r.(*api.TaskInfo)

		// Compare QoS first, before comparing Priority.
		if res := compareQoS(lv.Pod, rv.Pod); res != 0 {
			return res
		}

		glog.V(3).Infof("Priority TaskOrder: <%v/%v> priority is %v, <%v/%v> priority is %v",
			lv.Namespace, lv.Name, lv.Priority, rv.Namespace, rv.Name, rv.Priority)

		if lv.Priority == rv.Priority {
			return 0
		}

		if lv.Priority > rv.Priority {
			return -1
		}

		return 1
	}
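
For illustration, compareQoS might look roughly like the sketch below. It assumes the usual Kubernetes QoS rule that a pod is BestEffort when none of its containers set any requests or limits; isBestEffort and the package name are made up for this sketch and are not existing kube-batch code.

	package qos // hypothetical location, for illustration only

	import (
		v1 "k8s.io/api/core/v1"
	)

	// isBestEffort reports whether the pod sets no resource requests or limits
	// in any container, i.e. it would be classified as BestEffort
	// (init containers are ignored to keep the sketch short).
	func isBestEffort(pod *v1.Pod) bool {
		for _, c := range pod.Spec.Containers {
			if len(c.Resources.Requests) != 0 || len(c.Resources.Limits) != 0 {
				return false
			}
		}
		return true
	}

	// compareQoS orders BestEffort pods ahead of Burstable/Guaranteed pods:
	// -1 means l is scheduled first, 1 means r is scheduled first, 0 means no preference.
	func compareQoS(l, r *v1.Pod) int {
		lBE, rBE := isBestEffort(l), isBestEffort(r)
		if lBE == rBE {
			return 0
		}
		if lBE {
			return -1
		}
		return 1
	}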

If you agree, I will make the PR for it.

@k82cn
Contributor

k82cn commented Oct 11, 2018

If we handle BestEffort after Burstable/Guaranteed, we need to change the overused logic, because if we handle Burstable/Guaranteed first, the queue will already be overused.

I'm thinking of a new action, named backfill, to handle such cases. We do not consider pod count right now, so we do not need to check the Queue's overused state in backfill for BestEffort.

We may also use this action to reuse resources that were allocated but not bound because of gang-scheduling/co-scheduling.
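
To make the idea concrete, here is a toy sketch of what such a backfill pass could do. All types and helpers below are made up for illustration and are not kube-batch's actual framework API: after the normal allocate pass, walk the remaining zero-request (BestEffort) tasks and bind each one to any node that passes the predicates, without consulting the queue's overused state.

	package main

	import "fmt"

	type Task struct {
		Name       string
		RequestCPU int // 0 for BestEffort in this toy model
		Bound      bool
	}

	type Node struct{ Name string }

	// predicatesOK is a stand-in for the real predicate checks (taints, affinity, ...).
	func predicatesOK(t *Task, n *Node) bool { return true }

	// backfill binds pending zero-request tasks; the queue's overused state is
	// deliberately not checked here.
	func backfill(tasks []*Task, nodes []*Node) {
		for _, t := range tasks {
			if t.Bound || t.RequestCPU != 0 {
				continue // only pending BestEffort tasks are backfilled
			}
			for _, n := range nodes {
				if predicatesOK(t, n) {
					t.Bound = true
					fmt.Printf("backfill: bound %s to %s\n", t.Name, n.Name)
					break
				}
			}
		}
	}

	func main() {
		tasks := []*Task{
			{Name: "cyx2-worker-0", RequestCPU: 1, Bound: true}, // already placed by allocate
			{Name: "cyx2-ps-0"},                                 // BestEffort, still pending
		}
		nodes := []*Node{{Name: "192.168.47.52"}}
		backfill(tasks, nodes)
	}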

I'm OK to take your fix as a quick fix, as it may take some time to rework the structure of Job for BestEffort :)
