Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support fair share #358

Merged
merged 2 commits into from
Aug 26, 2019
Merged

Conversation

lminzhw
Copy link
Contributor

@lminzhw lminzhw commented Jul 19, 2019

Implementation of: https://github.com/volcano-sh/volcano/blob/master/docs/design/fairshare.md

support fair share based on weight among namespaces.

Part of: #244

@volcano-sh-bot volcano-sh-bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jul 19, 2019
@TravisBuddy
Copy link

Travis tests have failed

Hey @lminzhw,
Please read the following log in order to understand the failure reason.
It'll be awesome if you fix what's wrong and commit the changes.

TravisBuddy Request Identifier: 83fd1f90-a9cb-11e9-b808-dd0e8a2f961c

@k82cn
Copy link
Member

k82cn commented Jul 19, 2019

/cc @hex108 @Jeffwan @kinglion811 @jeefy

@volcano-sh-bot
Copy link
Contributor

@k82cn: GitHub didn't allow me to request PR reviews from the following users: jeefy.

Note that only volcano-sh members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @hex108 @Jeffwan @kinglion811 @jeefy

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@TravisBuddy
Copy link

Travis tests have failed

Hey @lminzhw,
Please read the following log in order to understand the failure reason.
It'll be awesome if you fix what's wrong and commit the changes.

TravisBuddy Request Identifier: ffa49300-a9d3-11e9-b808-dd0e8a2f961c

// map[queueId]PriorityQueue(namespaceName)
namespaceMap := map[api.QueueID]*util.PriorityQueue{}
// map[queueId]map[namespaceName]PriorityQueue(*api.JobInfo)
jobInNamespaceMap := map[api.QueueID]map[string]*util.PriorityQueue{}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add more comments

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add more comments to describe these two maps.

}

// AddResourceQuota add ResourceQuota to scheduler cache
func (sc *SchedulerCache) AddResourceQuota(obj interface{}) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can not see where you filterout resource quota that does not set vaolcano.sh/xx.weight

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The namespace without this label would default to 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In itemFromQuota function, any ResourceQuota without xx.weight key would have weight with default value 1


for _, preemptee := range preemptees {
if namespaceOrderEnabled && preemptor.Namespace != preemptee.Namespace {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more comments

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better refactor this part. since the change is little too hardcoded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactor this part, namespace order policy is now a separate code block out of job order policy.

Copy link
Collaborator

@hzxuzhonghu hzxuzhonghu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need more tests, especially for namespace job allocate part.

Also i noticed a problem: The namespce may not get requested resources even when there is enough quota. Say we have queue1 and queue2, ns1 and ns2, all with weight 1. Jobs in ns1 (requires 50% of all resources) are using queue1 and queue2. Jobs in ns2(requires 50% all resources) are using queue2 only.

If queue order is queue2 ->queue1, then ns2 can only get 25% of the resources. Is this correct?

pkg/scheduler/api/namespace_info.go Outdated Show resolved Hide resolved
pkg/scheduler/cache/event_handlers.go Outdated Show resolved Hide resolved
pkg/scheduler/cache/event_handlers.go Outdated Show resolved Hide resolved
pkg/scheduler/cache/event_handlers.go Outdated Show resolved Hide resolved
}

// AddResourceQuota add ResourceQuota to scheduler cache
func (sc *SchedulerCache) AddResourceQuota(obj interface{}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The namespace without this label would default to 1

// TODO(lminzhw): if all NamespaceOrderFn treat these two namespace as the same,
// we should make the job order have its affect among namespaces.
// or just schedule namespace one by one
lv := l.(string)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check if this converting failing?

Copy link
Contributor Author

@lminzhw lminzhw Jul 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this actions follows the code pattern in other CompareFn, like taskOrder or jobOrder in priority plugin.

}

// NamespaceOrderFn invoke namespaceorder function of the plugins
func (ssn *Session) NamespaceOrderFn(l, r interface{}) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the order of namespace is decided by the first plugin if it has enabled the namespace order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if some plugin decide namespace A is prior to B, this decision is just the final result.

This logic follows the code pattern in other CompareFn, like JobOrderFn.


for _, preemptee := range preemptees {
if namespaceOrderEnabled && preemptor.Namespace != preemptee.Namespace {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better refactor this part. since the change is little too hardcoded.

@TommyLike
Copy link
Contributor

I think we need more tests, especially for namespace job allocate part.

Also i noticed a problem: The namespce may not get requested resources even when there is enough quota. Say we have queue1 and queue2, ns1 and ns2, all with weight 1. Jobs in ns1 (requires 50% of all resources) are using queue1 and queue2. Jobs in ns2(requires 50% all resources) are using queue2 only.

If queue order is queue2 ->queue1, then ns2 can only get 25% of the resources. Is this correct?

ns2 would both get resource from q1 and q2 I guess so.

@volcano-sh-bot volcano-sh-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 23, 2019
@TravisBuddy
Copy link

Travis tests have failed

Hey @lminzhw,
Please read the following log in order to understand the failure reason.
It'll be awesome if you fix what's wrong and commit the changes.

TravisBuddy Request Identifier: cca67310-ad20-11e9-a8f1-bbdc807b8a3b

@lminzhw
Copy link
Contributor Author

lminzhw commented Jul 23, 2019

LGTM, but i would expect e2e tests as well.

We have 2 more e2e test cases and a XXL PR now. : )

@hzxuzhonghu
Copy link
Collaborator

@lminzhw Would suggest not squash all commits, it is hard to review.

@@ -387,4 +392,196 @@ var _ = Describe("Job E2E Test", func() {
err = waitJobPhaseReady(context, job)
Expect(err).NotTo(HaveOccurred())
})

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better add a case with both queue and namespace fair share?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This RP is too big now, I'll submit a new test next PR.

)

// NamespaceName is name of namespace
type NamespaceName string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would Namespace be other types other than string? If not, it might be better to use string directly. Then it avoids many type conversion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want to make codes more readable, so choose to mark it as a new type but not mark it in comment.

// pick namespace from namespaces PriorityQueue
namespace := namespaces.Pop().(api.NamespaceName)

queueInNamespace := jobsMap[namespace]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little confused. What is the relationship between queue and namespace? According to the doc motivation, a namespace works like a user, and there might be many users(namespaces) in a queue. Will be many queues in a namespace?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hex108 As I understand, generally speaking, different users (namespace) could schedule jobs in different Queues. This doc I think concentrates more on the faire resource sharing in one queue between users. This solution is mainly trying to resolve starvation problem in resource sharing.

var queue *api.QueueInfo
for queueId := range queueInNamespace {
currentQueue := ssn.Queues[queueId]
if ssn.Overused(currentQueue) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the queue is overused, could not the queue get resource even the cluster has idle resource?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the overused queue will be skipped in master code too.

pkg/scheduler/api/namespace_info.go Show resolved Hide resolved
@@ -60,12 +64,27 @@ func (drf *drfPlugin) Name() string {
return PluginName
}

// NamespaceOrderEnabled returns the NamespaceOrder for this plugin is enabled in this session or not
func (drf *drfPlugin) NamespaceOrderEnabled(ssn *framework.Session) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between it and func (ssn *Session) NamespaceOrderEnabled()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The NamespaceOrderEnabled in plugin just return the value of that plugin. This func return the result of this session and used in allocate action.

And in the newest version code, I delete this func and will open a Issue to adjust the behavior when the NamespaceOrder is disabled or other cases.

@k82cn
Copy link
Member

k82cn commented Jul 26, 2019

/approve

@volcano-sh-bot volcano-sh-bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jul 26, 2019
// pick namespace from namespaces PriorityQueue
namespace := namespaces.Pop().(api.NamespaceName)

queueInNamespace := jobsMap[namespace]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hex108 As I understand, generally speaking, different users (namespace) could schedule jobs in different Queues. This doc I think concentrates more on the faire resource sharing in one queue between users. This solution is mainly trying to resolve starvation problem in resource sharing.

// Name is the name of this namespace
Name NamespaceName
// Weight is the highest weight among many ResourceQuota.
Weight int64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we use signed integer, what's the best practice here? Do we suggest users to always use weights > 0 or it could be negative numbers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, weight < 1 is meaningless in this case, because we have a default value which is 1. Any weight < 1 will be ignored.

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Jul 30, 2019
@volcano-sh-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Jeffwan, k82cn, lminzhw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot removed the lgtm Indicates that a PR is ready to be merged. label Aug 23, 2019
@TravisBuddy
Copy link

Travis tests have failed

Hey @lminzhw,
Please read the following log in order to understand the failure reason.
It'll be awesome if you fix what's wrong and commit the changes.

TravisBuddy Request Identifier: ef788370-c557-11e9-8712-75d78f9b457f

@volcano-sh-bot volcano-sh-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 23, 2019
@lminzhw
Copy link
Contributor Author

lminzhw commented Aug 23, 2019

@ALL, sorry for the delayed reply.

And this PR leaves two unsolved problem now:

  1. add a case with both Queue and Namespace fair share
  2. the behavior when disable NamespaceOrder is not so good, I will adjust it.

Waiting for your response~

@k82cn
Copy link
Member

k82cn commented Aug 26, 2019

/lgtm

help to re-label lgtm, and it LGTM to :)

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Aug 26, 2019
@volcano-sh-bot volcano-sh-bot merged commit c7126c6 into volcano-sh:master Aug 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants