Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: issue#1261 config "overcommit-factor" less than 1 leads scheduler crash #1267

Closed
wants to merge 1 commit into from

Conversation

huone1
Copy link
Contributor

@huone1 huone1 commented Jan 26, 2021

kaihua give better advice: check OverCommitFactor in webhook validate。
so add webhook validate ,just for scheduler configmap;
currently, there are only some check for enqueue action, and add other actions or plugins ‘s check to it in future
Signed-off-by: huone1 huwanxing@huawei.com

@volcano-sh-bot volcano-sh-bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Jan 26, 2021
@huone1 huone1 changed the title fix: issueconfig "overcommit-factor" less than 1 leads scheduler crash fix: issue#1261 config "overcommit-factor" less than 1 leads scheduler crash Jan 26, 2021
if factor < minOverCommitFactor {
factor = minOverCommitFactor
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a unit test for that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added ut just now.

@volcano-sh-bot volcano-sh-bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jan 26, 2021
@volcano-sh-bot volcano-sh-bot added lgtm Indicates that a PR is ready to be merged. and removed lgtm Indicates that a PR is ready to be merged. labels Jan 26, 2021
@volcano-sh-bot
Copy link
Contributor

New changes are detected. LGTM label has been removed.

@volcano-sh-bot volcano-sh-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 1, 2021
@huone1 huone1 force-pushed the OverCommitFactor branch 4 times, most recently from b415434 to 00df8bd Compare February 2, 2021 08:12
Signed-off-by: huone1 <huwanxing@huawei.com>
@@ -34,6 +34,7 @@ tiers:
- plugins:
- name: priority
- name: gang
- name: conformance
- plugins:
- name: drf
- name: predicates
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

binpack is also needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was attend to add binpack because it is not fit all scenarios, so it wasn't added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default cm under installer/volcano-development.yaml contains binpack, maybe we should remove binpack in that file?


"volcano.sh/volcano/pkg/scheduler/conf"
"volcano.sh/volcano/pkg/webhooks/router"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dependency order needed to formated?


/*
allow config to create and update when
1. configmap name must be "volcano-scheduler-configmap"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

l think volcano-scheduler-configmap is not a necessity, it's defined by user

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I check helm files , you're right。
configmap name Transfer to admission webhook through ENV , is it OK ??

@@ -0,0 +1,43 @@
package validate
Copy link
Member

@shinytang6 shinytang6 Feb 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The functions in this file should already exist in codebase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes , but it is in scheduler folder, so I repeat some

@volcano-sh-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: huone1
To complete the pull request process, please assign hzxuzhonghu
You can assign the PR to them by writing /assign @hzxuzhonghu in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

confKey := ""
confNum := 0
for key := range cm.Data {
if strings.HasSuffix(key, ".conf") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also .conf is not a necessity

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1,it watch all configmap, we must filter them, so it is necessary to adjust the configmap name
2。i think it is necessary to certain restrictions on the format of scheduler configmap data

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay got it, I have no opinion on this change, but this introduces some name restrictions indeed.

@k82cn
Copy link
Member

k82cn commented Feb 6, 2021

/hold

@volcano-sh-bot volcano-sh-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 6, 2021
@k82cn
Copy link
Member

k82cn commented Feb 6, 2021

prefer not to check scheduler's arguments in webhook.

@huone1
Copy link
Contributor Author

huone1 commented Feb 7, 2021

prefer not to check scheduler's arguments in webhook.

Another way : scheduler process gets the scheduler's arguments to check them, and if one arguments is illegal,reset it to default val;

@k82cn
Copy link
Member

k82cn commented Feb 7, 2021

Another way : scheduler process gets the scheduler's arguments to check them, and if one arguments is illegal,reset it to default val;

That's better to me, please also record an event and log for that :)

@huone1
Copy link
Contributor Author

huone1 commented Mar 20, 2021

the question is resloved by #1324 , so closed it

@huone1 huone1 closed this Mar 20, 2021
@huone1 huone1 deleted the OverCommitFactor branch March 20, 2021 07:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants