Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why do we need kubernetes-sigs/application controller in Kubeflow? #1715

Closed
Bobgy opened this issue Jan 14, 2021 · 15 comments
Closed

Why do we need kubernetes-sigs/application controller in Kubeflow? #1715

Bobgy opened this issue Jan 14, 2021 · 15 comments

Comments

@Bobgy
Copy link
Contributor

Bobgy commented Jan 14, 2021

When I was investigating issue: GoogleCloudPlatform/kubeflow-distribution#184,

I realized there's a huge volume of logs coming from application controller, and so far I think that's normal behavior, we need a way to change log level, but I don't think it exposed such a config.

However, a bigger question is why is Kubeflow using application controller? Does it solve anyone's problem? It seems https://github.com/kubernetes-sigs/application doesn't have much activity any more.

As a developer, I almost never look at application CRs, because I know the low-level details. There doesn't seem to be enough UI/guidance that teach entrance level people to use them.

This is what I feel about it, I'd like to know more context.

/cc @jlewi @kubeflow/wg-deployment-leads

@Bobgy
Copy link
Contributor Author

Bobgy commented Jan 14, 2021

I can see it's enforced through https://github.com/kubeflow/community/blob/master/guidelines/application_requirements.md, but I don't fully get the reasoning behind. Was kubernetes-sigs/application ambitious at that time, but they seem to fail to achieve that goal now?

@Bobgy
Copy link
Contributor Author

Bobgy commented Jan 14, 2021

Additionally, I think we've got many issues with application maintenance like #1573, no one before thesuperzapper noticed the problem seems to be an evidence that no one used it at all.

@Bobgy
Copy link
Contributor Author

Bobgy commented Jan 14, 2021

Sorry, I mentioned the wrong group.
I meant @kubeflow/wg-manifests-leads, not @kubeflow/wg-deployment-leads, but maybe it's worth both groups' discussion.

@yanniszark
Copy link
Contributor

@Bobgy you ninja'd me there, I was planning to ask this question myself in the doc we're preparing for manifests best practices, as mentioned here: #1710

It could also lead to weird and hard-to-debug issues, like the one seen here:
#540
kubeflow/kubeflow#4767 (comment)

If adding cluster-wide resources by mistake to an Application CR can lead to resources being accidentally deleted, this is very worrying by itself.

As to the value it provides, it enables users to see and handle the high-level apps installed in the cluster. However, given the inactivity of the upstream repo (https://github.com/kubernetes-sigs/application) and the limitations it has (what about cluster-level resources?, excessive logging), I don't know if it makes sense to prescribe this everywhere.

@Bobgy
Copy link
Contributor Author

Bobgy commented Jan 15, 2021

Haha, good to know you have also thought of it.

@yanniszark
Copy link
Contributor

I prepared a small overview of the application cr in Kubeflow.

What is it used for:

  • The ability to describe an applications metadata (e.g., that an application like notebook-controller is running). This is why the GKE marketplace demands it.
  • Tying all configuration (e.g., CRDs, Deployments, etc) relevant to an App together. It doesn't support cross-namespace references.

Limitations:

Proposed Solution
The Application CR has failed to achieve its goals. It is an extra maintainance burden, so we should remove it.
@Bobgy what do you think? Perhaps we should have a quick slot one of these days to discuss further?

@Bobgy
Copy link
Contributor Author

Bobgy commented Feb 16, 2021

I agree with that, plus.. we may use kpt live status instead for the monitoring functionality.

@thesuperzapper
Copy link
Member

@yanniszark agree with removal, but since its the responsibility of component owners to do this, we should do a quick run through before the 1.3 release to check everyone has removed them

yanniszark added a commit to arrikto/katib that referenced this issue Apr 1, 2021
Remove Application CR as per kubeflow/manifests#1715

Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
yanniszark added a commit to arrikto/katib that referenced this issue Apr 1, 2021
Remove Application CR as per kubeflow/manifests#1715

Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
yanniszark added a commit to arrikto/katib that referenced this issue Apr 1, 2021
Remove Application CR as per kubeflow/manifests#1715

Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
google-oss-robot pushed a commit to kubeflow/katib that referenced this issue Apr 1, 2021
Remove Application CR as per kubeflow/manifests#1715

Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
yanniszark added a commit to arrikto/katib that referenced this issue Apr 1, 2021
Remove Application CR as per kubeflow/manifests#1715

Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
yanniszark added a commit to yanniszark/katib that referenced this issue Apr 1, 2021
Remove Application CR as per kubeflow/manifests#1715

Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
yanniszark added a commit to yanniszark/katib that referenced this issue Apr 1, 2021
Remove Application CR as per kubeflow/manifests#1715

Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
yanniszark added a commit to yanniszark/katib that referenced this issue Apr 1, 2021
Remove Application CR as per:
kubeflow/manifests#1715

Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
google-oss-robot pushed a commit to kubeflow/katib that referenced this issue Apr 1, 2021
…ry pick of #1507 on release-0.11. #1507: manifests: Remove Application CR (#1509)

Remove Application CR as per:
kubeflow/manifests#1715

Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
pvaneck added a commit to pvaneck/kserve that referenced this issue Apr 19, 2021
@akdigitalself
Copy link

Thanks for giving a background on this issue. The other issues where this issue is referenced are resolved. The application manifest exists in the 1.3 branch. Is there a plan to remove it in v1.3.1?

@yanniszark
Copy link
Contributor

@akdigitalself the application controller exists under contrib, which contains 3rd-party packages not supported by a Kubeflow WG. But yes, we should clean it up for 1.4. Would you like to open a PR for that?

@Bobgy
Copy link
Contributor Author

Bobgy commented Aug 13, 2021

Some installations need it, so we don't want to remove it yet. It's already not included in platform-agnostic

@juliusvonkohout
Copy link
Member

Some installations need it, so we don't want to remove it yet. It's already not included in platform-agnostic

@Bobgy are you sure that you did not accidentally leave it in ?

@Bobgy
Copy link
Contributor Author

Bobgy commented Aug 24, 2021

@juliusvonkohout that's metacontroller, not application controller.
https://github.com/metacontroller/metacontroller is a dependency of KFP profile controller (for multi-user mode), but KFP profile controller is actually a pretty simple script, if you don't like the dependency you can totally just deploy KFP namespaced resources on your own or use a different controller.

@stale
Copy link

stale bot commented Mar 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in one week if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Mar 2, 2022
@stale
Copy link

stale bot commented Apr 17, 2022

This issue has been closed due to inactivity.

@stale stale bot closed this as completed Apr 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants