Migrate v2 MPI operator to the unified operator #1479

Open
terrytangyuan opened this issue Nov 22, 2021 · 8 comments

Comments

@terrytangyuan
Member

Now that the v1 MPI operator has been migrated to this repo (#1457), let's use this issue to track the progress on v2.

https://github.com/kubeflow/mpi-operator/tree/master/v2

cc @hackerboy01 @zw0610 @alculquicondor @kubeflow/wg-training-leads

@andreyvelich
Member

@alculquicondor What is the status of MPI Operator v2?
Do we have plans to deliver MPI Operator v2 as part of the Universal Training Operator in Kubeflow 1.5?
The Kubeflow 1.5 release deadline is January 15th.

@alculquicondor

We need a contributor to do it. I don't currently have the capacity to handle it, which means it likely won't be possible by January 15th. But I don't think the v1 operator is ready either.

@terrytangyuan
Member Author

cc @ArangoGutierrez

@johnugeorge
Member

johnugeorge commented Feb 9, 2023

I want to resurrect this thread. There have been many requests from the community to have the v2 MPI operator in the training operator. Currently, newer features are merged into the v2 MPI operator. Time has passed since the last discussion, and the v2 API is stable now. What is our plan regarding migration? What are the roadblocks? There is also confusion in the community about the future of the v1 MPI operator.

Can we prioritize this? @alculquicondor @terrytangyuan @tenzen-y

@tenzen-y
Member

tenzen-y commented Feb 9, 2023

IIRC, we are planning to donate mpi-operator v2 to kubernetes-sigs. So we should decide whether to donate it to kubernetes-sigs or merge the v2 operator into the training-operator, to avoid maintaining it in two places.

kubeflow/community#557

cc: @ArangoGutierrez @denkensk @ahg-g

@kuizhiqing
Member

kuizhiqing commented Aug 28, 2023

Do we have any new plan here? Since the donation of mpi-operator v2 to kubernetes-sigs seems to have been aborted, should we merge mpi-operator v2 into the training-operator?

@terrytangyuan
Member Author

terrytangyuan commented Aug 28, 2023

There's also discussion around donating the Spark-on-K8s project to Kubeflow (no open issue yet, since we are still waiting for a governance update). I personally think that project is similar to the MPI Operator in that it does not focus only on training. So I am not sure whether the MPI Operator would be a good fit for the training-operator.


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.


7 participants