[Epic] Kubeflow 1.3 upgrade #530

Closed
wg102 opened this issue May 19, 2021 · 9 comments
Labels
component/kubeflow Kubeflow Related kind/epic An epic

Comments

@wg102
Contributor

wg102 commented May 19, 2021

Epic to describe and track what needs to be done to upgrade Kubeflow to 1.3

Kubeflow 1.3 upgrade

The Kubeflow upgrade might be doable in two stages: first update to 1.3, then deal with the jupyter-api related work. This way one person could work on upstreaming central-dashboard without a big rebase, while another could get the CWA up to our needs.

The concerns:

  • Losing features
  • 'Merging' jupyter-apis back in
  • Unknown changes in other places
  • New security issues

The items from kubeflow/kubeflow

  • jupyter-api -> crud: the big item, and the biggest trouble, is getting back under a single repo
    • Python -> Go: go back to the same CWA structure, with a new folder. Create a backend-go folder for every backend folder?
    • Missing things in CWA when they converted it
      • gpu selecting validation
      • max cpu
      • loading messages when error
      • notebook without a value "Erreur in backend"
      • A few other validation messages
    • Features/fixes we made to jupyter-apis that are our own
      • Kubecost
      • image help text
      • Advanced settings
      • KF_LANG
    • Deployment needs to be adjusted as well
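
To make the missing validation items more concrete, here is a minimal sketch of the kind of server-side checks (GPU selection, max CPU, empty notebook name) that would need to be re-added to the CWA. It is an illustration only: `NotebookRequest`, `validate_notebook`, `MAX_CPU` and `ALLOWED_GPU_COUNTS` are hypothetical names and limits, not taken from jupyter-apis or the upstream CWA.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical limits; real values would come from the deployment's config.
MAX_CPU = 15.0
ALLOWED_GPU_COUNTS = {0, 1}


@dataclass
class NotebookRequest:
    """Hypothetical shape of a notebook creation form submission."""
    name: str
    cpu: float
    gpus: int


def validate_notebook(req: NotebookRequest) -> List[str]:
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []
    if not req.name.strip():
        errors.append("Notebook name must not be empty.")
    if req.cpu <= 0 or req.cpu > MAX_CPU:
        errors.append(f"CPU must be greater than 0 and at most {MAX_CPU}.")
    if req.gpus not in ALLOWED_GPU_COUNTS:
        allowed = ", ".join(str(n) for n in sorted(ALLOWED_GPU_COUNTS))
        errors.append(f"GPU count must be one of: {allowed}.")
    return errors
```

Returning every error at once, rather than failing on the first one, lets the form show all validation messages together.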

Things not upstreamed (yet)

 - The KF_LANG Variable + settings
 - Validation for mounted PVC

Anything upstream that is not merged yet...

 - drop down
 - font service
 - CWA OL
  • Volume Manager:
    • Remove from jupyter-apis (or don't touch if we just change to CWA)
    • Possibly need to disable browse option, or create.
    • Friendlier message for delete?
  • Updated Manifest (more integrated with actual central dashboard structure)*
  • KFP client supports authorization checks by using service accounts
  • Manage Contributors (does not seem to be upstream; see "Managing contributors manually", kubeflow/kubeflow#5847)
  • Kubeflow Pipelines (KFP):
    • UI reorganization for better User Experience
    • Manage recurring Runs via new “Jobs” page (exact name on UI is TBD)
    • Simplified view of dependency graphs
  • New UI for Tensorboard
  • Need to reconfigure so we use CWA instead of jupyter-apis
    • Move code
    • Redo all the subitems not in CWA...

Could it be possible to do it in two stages?
Upgrade to 1.3 first, then switch to using the CWA?

Note on upstreaming: central dashboard i18n and Manage Contributors are not upstream yet

The other items

@wg102 wg102 added component/kubeflow Kubeflow Related kind/epic An epic labels May 19, 2021
@davidspek

@wg102 I think updating to Kubeflow 1.3 will probably be much easier if you move to an Argo CD based installation. This also removes the reliance on the kubeflow manifests repo as you can use all the upstream component manifests directly. This also allows for easier maintenance in the future where you can simply upgrade each component as new versions are released, and PRs for this can be automated with Renovate (here is a good example of such a PR to update KFP).

Given that work needs to be put into reformatting your manifests now anyways, it seems like a good opportunity to move to a deployment that is easier to maintain and version track changes using Git. As I've already told @ca-scribner I'm happy to help with migrating your manifests so they can be deployed by Argo CD.
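
For readers unfamiliar with the approach, the following is a rough sketch of what one per-component Argo CD `Application` could look like, built as a Python dict and printed as JSON. The `Application` fields are the standard Argo CD ones; the helper name `argo_application`, the `argocd` and `kubeflow` namespaces, and the path and revision in the usage example are assumptions for illustration, not values from this thread.

```python
import json


def argo_application(name, repo_url, path, revision, namespace="kubeflow"):
    """Build an Argo CD Application manifest for one upstream component.

    `revision` is the pinned tag or commit; a tool such as Renovate would
    open a PR that bumps this value when a new release appears.
    """
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumes Argo CD itself is installed in the "argocd" namespace.
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "path": path,
                "targetRevision": revision,
            },
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": namespace,
            },
            # Keep the cluster in sync with Git automatically.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


if __name__ == "__main__":
    app = argo_application(
        name="kubeflow-pipelines",
        repo_url="https://github.com/kubeflow/pipelines",
        path="path/to/kustomize/overlay",  # placeholder, not a real path
        revision="x.y.z",                  # placeholder version tag
    )
    print(json.dumps(app, indent=2))
```

Pinning `targetRevision` per component is what would make the "upgrade each component as new versions are released" workflow a one-line change in Git.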

@davidspek

Regarding the Volumes Web App, there is no browse option by default, so this is not a problem. What could be a problem is that the Volumes Web App allows users to choose the Storage Class when creating a new PVC, something the Jupyter Web App doesn't allow users to do which was a conscious decision at the time.
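
If the Storage Class choice does turn out to be a problem, one option would be an allowlist check on the PVC before it is created. The sketch below assumes the PVC is available as a plain dict in the usual Kubernetes shape (`spec.storageClassName`); `ALLOWED_STORAGE_CLASSES` and `check_storage_class` are hypothetical names, and this is not how the Volumes Web App currently behaves.

```python
from typing import Set

# Hypothetical allowlist; an empty set would mean "cluster default only".
ALLOWED_STORAGE_CLASSES: Set[str] = {"default"}


def check_storage_class(pvc: dict, allowed: Set[str] = ALLOWED_STORAGE_CLASSES) -> dict:
    """Reject a PVC manifest whose storage class is not on the allowlist.

    A PVC with no storageClassName is accepted, since Kubernetes then
    falls back to the cluster's default Storage Class.
    """
    storage_class = pvc.get("spec", {}).get("storageClassName")
    if storage_class is not None and storage_class not in allowed:
        raise ValueError(f"Storage class '{storage_class}' is not allowed.")
    return pvc
```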

@davidspek

Also, the CRUD Jupyter Web App does have a few Advanced Options sections during notebook creation that might be useful if you need to transition some options.

@blairdrummond
Contributor

> @wg102 I think updating to Kubeflow 1.3 will probably be much easier if you move to an Argo CD based installation. This also removes the reliance on the kubeflow manifests repo as you can use all the upstream component manifests directly. This also allows for easier maintenance in the future where you can simply upgrade each component as new versions are released, and PRs for this can be automated with Renovate (here is a good example of such a PR to update KFP).
>
> Given that work needs to be put into reformatting your manifests now anyways, it seems like a good opportunity to move to a deployment that is easier to maintain and version track changes using Git. As I've already told @ca-scribner I'm happy to help with migrating your manifests so they can be deployed by Argo CD.

@brendangadd maybe an interesting opportunity here. Could try to tackle the ArgoCD install of Kubeflow into the kind experiment?

@davidspek

@blairdrummond I’ve run the entire deployment on a 3 “node” kind cluster on my desktop. You’ll need a good amount of RAM, but it works as you would expect. I can also help with this if needed.

I’ve also started playing with KubeCost and extending the Prometheus and Grafana I’ve added to support namespace isolation. The next step in that journey is integrating KubeCost in some way into the central dashboard and possibly the crud Jupyter Web App down the road for per notebook metrics (Volumes Web App would need something similar for the PVC costs).

@brendangadd
Contributor

@sylus is optimistically looking into the Argo CD approach for KF 1.3. The approach we choose for deployment is his call.

@davidspek: Thank you for offering your support on this transition. If @sylus gives the 👍 on this deployment method, beginning with KF 1.3, then I'm sure we'll take you up on your offer! ❤️

@davidspek

Just wanted to let you guys know I've reached out to @berndverst from Azure / Microsoft and he's interested in possibly contributing to the ArgoFlow-Azure setup and potentially making that the default method for Azure customers to install Kubeflow going forward, as it is hopefully easier to maintain (by not needing to deal with the Kubeflow manifests repo). If this ends up happening, that could hopefully take some load off of StatCan, as the deployment method could be maintained more by the community.

@rohank07
Contributor

Upstream pushed crud-web-app i18n support using localize, reverting ngx-translate. However, it uses AOT compilation, with the locale selected at build time. To start the server in French you'd run ng serve --configuration=fr. With the ngx-translate approach we would change the browser language to French, but it seems localize does not support this. A possible approach would be toggling the language via the URL ('/fr'), which could mean changing how it will be deployed (and a lot more).
https://angular.io/guide/i18n
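
As a rough illustration of the URL-toggle idea, here is a sketch of how a serving layer could decide which per-locale build to serve. It assumes one build per locale served under a path prefix such as '/fr'. The fallback to the browser's `Accept-Language` header is an added assumption meant to approximate the old ngx-translate behaviour. All function names here are hypothetical.

```python
from typing import Optional

SUPPORTED_LOCALES = ("en", "fr")
DEFAULT_LOCALE = "en"


def locale_from_path(path: str) -> Optional[str]:
    """Return the locale if the first path segment names one, e.g. '/fr/...'."""
    first = path.strip("/").split("/", 1)[0].lower()
    return first if first in SUPPORTED_LOCALES else None


def locale_from_header(header: str) -> Optional[str]:
    """Pick the best supported language from an Accept-Language header."""
    candidates = []
    for index, part in enumerate(header.split(",")):
        tag, _, params = part.strip().partition(";")
        if not tag:
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # 'fr-CA' and 'fr' both map to the 'fr' build.
        primary = tag.split("-")[0].lower()
        candidates.append((-quality, index, primary))
    # Highest quality first; ties keep the order given by the browser.
    for _, _, primary in sorted(candidates):
        if primary in SUPPORTED_LOCALES:
            return primary
    return None


def resolve_locale(path: str, accept_language: str = "") -> str:
    """An explicit URL prefix wins, then the browser language, then the default."""
    return (
        locale_from_path(path)
        or locale_from_header(accept_language)
        or DEFAULT_LOCALE
    )
```

With this shape, a request for '/notebooks' from a French browser could be redirected to '/fr/notebooks', while an explicit '/en/...' link would still be honoured.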

@wg102
Contributor Author

wg102 commented Jan 17, 2022

The Angular i18n of the crud-web-apps and its impact are out of scope for the 1.3 upgrade, since it was merged upstream in 1.4 (see the 1.4 release notes).


5 participants