Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge kubeflow/common to training-operator #1813

Merged
merged 30 commits into from
May 31, 2023
Merged

Merge kubeflow/common to training-operator #1813

merged 30 commits into from
May 31, 2023

Conversation

johnugeorge
Copy link
Member

@johnugeorge johnugeorge commented May 24, 2023

This PR tracks kubeflow/common to training operator repo. More optimizations will be tracked in a separate PR.

Related: #1714

@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: johnugeorge

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@terrytangyuan
Copy link
Member

/lgtm

@johnugeorge
Copy link
Member Author

/cc @alculquicondor

Copy link
Member

@tenzen-y tenzen-y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

Followups: We need to clean up redundantly and duplicate codes in separate PRs.

/lgtm

@alculquicondor
Copy link

I don't have bandwidth to review this, but the idea sgtm!

@johnugeorge
Copy link
Member Author

/cc @gaocegege
/cc @zw0610

Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @johnugeorge, I left few comments

hack/boilerplate/boilerplate.go.txt Show resolved Hide resolved
pkg/apis/kubeflow.org/v1/common_types.go Show resolved Hide resolved
type SchedulingPolicy struct {
MinAvailable *int32 `json:"minAvailable,omitempty"`
Queue string `json:"queue,omitempty"`
MinResources *map[v1.ResourceName]resource.Quantity `json:"minResources,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tenzen-y @johnugeorge Why we can't just re-use Kubernetes Types ? e.g.

MinResources corev1.ResourceList

pkg/apis/kubeflow.org/v1/util.go Outdated Show resolved Hide resolved
pkg/apis/kubeflow.org/v1/util.go Outdated Show resolved Hide resolved
pkg/apis/kubeflow.org/v1/util.go Outdated Show resolved Hide resolved
Comment on lines +9 to +17
// initializeReplicaStatuses initializes the ReplicaStatuses for replica.
func initializeReplicaStatuses(jobStatus *apiv1.JobStatus, rtype apiv1.ReplicaType) {
core.InitializeReplicaStatuses(jobStatus, rtype)
}

// updateJobReplicaStatuses updates the JobReplicaStatuses according to the pod.
func updateJobReplicaStatuses(jobStatus *apiv1.JobStatus, rtype apiv1.ReplicaType, pod *corev1.Pod) {
core.UpdateJobReplicaStatuses(jobStatus, rtype, pod)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we using this somewhere ?
I can see that each Controller has its own func: https://github.com/kubeflow/training-operator/blob/master/pkg/controller.v1/tensorflow/util.go#L110

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


type PriorityClassGetFunc func(string) (*schedulingv1.PriorityClass, error)

func CalcPGMinResources(minMember int32, replicas map[apiv1.ReplicaType]*apiv1.ReplicaSpec, pcGetFunc PriorityClassGetFunc) *v1.ResourceList {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't find where we use this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


type FillPodGroupSpecFunc func(object metav1.Object) error

func (jc *JobController) SyncPodGroup(job metav1.Object, specFunc FillPodGroupSpecFunc) (metav1.Object, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, in the following PRs we should identify part of common repo that we are not using today.
For example, I was not able to find where we use this scheduling plugin.

@@ -0,0 +1,29 @@
## Test Job Controller
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this test Job Controller ?

@google-oss-prow google-oss-prow bot removed the lgtm label May 26, 2023
@johnugeorge
Copy link
Member Author

We can take cleanups in a separate PR? This PR can be restricted to merge so as track @andreyvelich

@andreyvelich
Copy link
Member

We can take cleanups in a separate PR? This PR can be restricted to merge so as track @andreyvelich

Sure, sounds good. LGTM. It would be nice if we could create an issue to track those clean-ups
/assign @tenzen-y @terrytangyuan

@tenzen-y
Copy link
Member

We can take cleanups in a separate PR? This PR can be restricted to merge so as track @andreyvelich

Sure, sounds good. LGTM. It would be nice if we could create an issue to track those clean-ups /assign @tenzen-y @terrytangyuan

Agree.

@tenzen-y
Copy link
Member

Created: #1816

Copy link
Member

@tenzen-y tenzen-y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @johnugeorge!

/lgtm

@google-oss-prow google-oss-prow bot added the lgtm label May 31, 2023
@tenzen-y
Copy link
Member

@johnugeorge Feel free to do /hold cancel.

@johnugeorge
Copy link
Member Author

Thanks @tenzen-y

/hold cancel

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants