Hotfix: keep updating task's min available number same as creating podgroup #2802

Merged
merged 1 commit into from
Aug 8, 2023

Conversation

lowang-bh
Member

@lowang-bh lowang-bh commented Apr 19, 2023

When a podgroup is created, a task's min available member count is set to task.Replicas if task.MinAvailable is nil,
but this logic is not applied when the podgroup is updated.

fix #2792
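
To make the intended rule concrete, here is a minimal sketch of the defaulting logic described above. `TaskSpec`, `taskMinAvailable`, and `minTaskMembers` are hypothetical names used for illustration only; they are not the controller's actual types or functions. The point is that the create and update paths should derive the per-task minimum the same way.

```go
package main

import "fmt"

// TaskSpec is a hypothetical, trimmed-down stand-in for a job task.
type TaskSpec struct {
	Name         string
	Replicas     int32
	MinAvailable *int32 // nil means the user did not set it
}

// taskMinAvailable returns the effective minimum for one task:
// the explicit MinAvailable if set, otherwise Replicas.
func taskMinAvailable(t TaskSpec) int32 {
	if t.MinAvailable != nil {
		return *t.MinAvailable
	}
	return t.Replicas
}

// minTaskMembers builds the per-task minimum map. Calling this single helper
// from both the create path and the update path keeps the two consistent.
func minTaskMembers(tasks []TaskSpec) map[string]int32 {
	out := make(map[string]int32, len(tasks))
	for _, t := range tasks {
		out[t.Name] = taskMinAvailable(t)
	}
	return out
}

func main() {
	two := int32(2)
	tasks := []TaskSpec{
		{Name: "master", Replicas: 1},                     // MinAvailable unset
		{Name: "worker", Replicas: 4, MinAvailable: &two}, // explicit minimum
	}
	fmt.Println(minTaskMembers(tasks)) // prints map[master:1 worker:2]
}
```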

@volcano-sh-bot volcano-sh-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Apr 19, 2023
Member

@hwdef hwdef left a comment

The code is OK, but you submitted an extra commit.

@volcano-sh-bot volcano-sh-bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 23, 2023
@lowang-bh
Member Author

The code is OK, but you submitted an extra commit.

I have rebased. The extra commit fixed a lint error; that fix has now been merged to master.

@lowang-bh
Member Author

The failure message is an update conflict, which looks unrelated to this change?

              Status: "Failure",
              Message: "Operation cannot be fulfilled on queues.scheduling.volcano.sh \"reclaim-q4\": the object has been modified; please apply your changes to the latest version and try again",
              Reason: "Conflict",

@wangyang0616
Member

The failure message is an update conflict, which looks unrelated to this change?

              Status: "Failure",
              Message: "Operation cannot be fulfilled on queues.scheduling.volcano.sh \"reclaim-q4\": the object has been modified; please apply your changes to the latest version and try again",
              Reason: "Conflict",

This may be the same problem as #2783.

That problem has not been solved yet, but it occurs with relatively low probability, so you can try again.

@lowang-bh
Member Author

Unit tests pass:

root@7de62a7cb07f:/go/src/volcano.sh# make unit-test
go clean -testcache
go test -p 8 -race $(find pkg cmd -type f -name '*_test.go' | sed -r 's|/[^/]+$||' | sort | uniq | sed "s|^|volcano.sh/volcano/|")
ok  	volcano.sh/volcano/cmd/controller-manager/app/options	0.044s
ok  	volcano.sh/volcano/cmd/scheduler/app/options	0.067s
ok  	volcano.sh/volcano/pkg/cli/job	0.119s
ok  	volcano.sh/volcano/pkg/cli/queue	0.139s
ok  	volcano.sh/volcano/pkg/cli/util	0.246s
ok  	volcano.sh/volcano/pkg/cli/vcancel	0.157s
ok  	volcano.sh/volcano/pkg/cli/vresume	0.289s
ok  	volcano.sh/volcano/pkg/cli/vsuspend	0.202s
ok  	volcano.sh/volcano/pkg/controllers/apis	0.067s
ok  	volcano.sh/volcano/pkg/controllers/cache	0.065s
ok  	volcano.sh/volcano/pkg/controllers/garbagecollector	0.189s
ok  	volcano.sh/volcano/pkg/controllers/job	1.927s
ok  	volcano.sh/volcano/pkg/controllers/job/helpers	0.170s
ok  	volcano.sh/volcano/pkg/controllers/job/plugins/distributed-framework/mpi	0.176s
ok  	volcano.sh/volcano/pkg/controllers/job/plugins/distributed-framework/pytorch	0.221s
ok  	volcano.sh/volcano/pkg/controllers/job/plugins/distributed-framework/tensorflow	0.130s
ok  	volcano.sh/volcano/pkg/controllers/job/plugins/ssh	0.137s
ok  	volcano.sh/volcano/pkg/controllers/jobflow	0.209s
ok  	volcano.sh/volcano/pkg/controllers/jobtemplate	0.170s
ok  	volcano.sh/volcano/pkg/controllers/podgroup	0.213s
ok  	volcano.sh/volcano/pkg/controllers/queue	0.216s
ok  	volcano.sh/volcano/pkg/controllers/util	0.265s
ok  	volcano.sh/volcano/pkg/scheduler	0.229s
ok  	volcano.sh/volcano/pkg/scheduler/actions/allocate	0.188s
ok  	volcano.sh/volcano/pkg/scheduler/actions/preempt	0.462s
ok  	volcano.sh/volcano/pkg/scheduler/actions/reclaim	0.181s
ok  	volcano.sh/volcano/pkg/scheduler/actions/shuffle	2.207s
ok  	volcano.sh/volcano/pkg/scheduler/api	0.156s
ok  	volcano.sh/volcano/pkg/scheduler/api/devices/nvidia/gpushare	0.263s
ok  	volcano.sh/volcano/pkg/scheduler/api/devices/nvidia/vgpu	0.206s
ok  	volcano.sh/volcano/pkg/scheduler/api/helpers	0.147s
ok  	volcano.sh/volcano/pkg/scheduler/cache	0.162s
ok  	volcano.sh/volcano/pkg/scheduler/capabilities/volumebinding	28.655s
ok  	volcano.sh/volcano/pkg/scheduler/framework	0.182s
ok  	volcano.sh/volcano/pkg/scheduler/metrics/source	0.027s
ok  	volcano.sh/volcano/pkg/scheduler/plugins/binpack	0.181s
ok  	volcano.sh/volcano/pkg/scheduler/plugins/cdp	0.170s
ok  	volcano.sh/volcano/pkg/scheduler/plugins/drf	0.359s
ok  	volcano.sh/volcano/pkg/scheduler/plugins/numaaware/policy	0.173s
ok  	volcano.sh/volcano/pkg/scheduler/plugins/numaaware/provider/cpumanager	0.137s
ok  	volcano.sh/volcano/pkg/scheduler/plugins/predicates	0.160s
ok  	volcano.sh/volcano/pkg/scheduler/plugins/proportion	9.165s
ok  	volcano.sh/volcano/pkg/scheduler/plugins/task-topology	0.125s
ok  	volcano.sh/volcano/pkg/scheduler/plugins/tdm	7.491s
ok  	volcano.sh/volcano/pkg/scheduler/util	0.171s
ok  	volcano.sh/volcano/pkg/webhooks/admission/jobs/mutate	0.081s
ok  	volcano.sh/volcano/pkg/webhooks/admission/jobs/plugins/mpi	0.090s
ok  	volcano.sh/volcano/pkg/webhooks/admission/jobs/validate	0.096s
ok  	volcano.sh/volcano/pkg/webhooks/admission/pods/mutate	0.138s
ok  	volcano.sh/volcano/pkg/webhooks/admission/pods/validate	0.082s
ok  	volcano.sh/volcano/pkg/webhooks/admission/queues/mutate	0.072s
ok  	volcano.sh/volcano/pkg/webhooks/admission/queues/validate	0.069s

fix pg not exist when get by client due to it is not added

Signed-off-by: lowang_bh <lhui_wang@163.com>
Member

@hwdef hwdef left a comment

/lgtm

@stale

stale bot commented Aug 7, 2023

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 7, 2023
@lowang-bh
Member Author

Still needed.

/assign @Thor-wl

@hzxuzhonghu
Collaborator

/approve

@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hzxuzhonghu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 8, 2023
@stale stale bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 8, 2023
@hzxuzhonghu
Collaborator

Do we need to backport this?

@volcano-sh-bot volcano-sh-bot merged commit d57f9f3 into volcano-sh:master Aug 8, 2023
@lowang-bh
Member Author

Do we need to backport this?

Do you mean cherry-picking it to other release branches?

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
6 participants