feat: Removed traces of argo server from default deployment. #587

Merged
1 commit merged into opendatahub-io:main
Mar 7, 2024

Conversation

amadhusu
Contributor

@amadhusu amadhusu commented Mar 6, 2024

The issue resolved by this Pull Request:

Resolves RHOAIENG-3895

Description of your changes:

Removed the deployment files for Argo Server, as it isn't required by default. This was also the team's consensus in RHOAIENG-2873.

Testing instructions

  1. Deploy DSPO using make deployODH IMG="quay.io/opendatahub/data-science-pipelines-operator:pr-587"
  2. Deploy any sample DSPA. Example - /config/samples/v2/dspa-simple
  3. Run Pipelines and ensure everything works like it normally does.

NOTE
If you deploy DSPO using make deploy IMG="..." instead, also run make argodeploy, which installs the CRDs, RBAC resources, and other prerequisites that the Argo Workflow Controller needs.
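
The note above comes down to a choice between two deployment paths. A minimal sketch of a wrapper, assuming the Makefile targets behave as described in this PR. The names `run_make`, `deploy_dspo`, and `DRY_RUN` are hypothetical, and running `argodeploy` after `deploy` is an assumed order:

```shell
# Hypothetical wrapper around the Makefile targets named in this PR.
# Set DRY_RUN=1 to print the make commands instead of running them.
run_make() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "make $*"
  else
    make "$@"
  fi
}

deploy_dspo() {
  target="$1"
  img="$2"
  case "$target" in
    deployODH)
      # Per the PR discussion, this path needs no extra step.
      run_make deployODH "IMG=$img"
      ;;
    deploy)
      # Assumed order: operator first, then the Argo prerequisites.
      run_make deploy "IMG=$img" && run_make argodeploy
      ;;
    *)
      echo "unknown target: $target" >&2
      return 1
      ;;
  esac
}
```

For example, `DRY_RUN=1 deploy_dspo deploy "quay.io/opendatahub/data-science-pipelines-operator:pr-587"` would print the two make commands without touching the cluster.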

Checklist

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

@dsp-developers
Contributor

A new image has been built to help with testing out this PR: quay.io/opendatahub/data-science-pipelines-operator:pr-587
An OCP cluster where you are logged in as cluster admin is required.

To use this image run the following:

cd $(mktemp -d)
git clone git@github.com:opendatahub-io/data-science-pipelines-operator.git
cd data-science-pipelines-operator/
git fetch origin pull/587/head
git checkout -b pullrequest 57db955658fe67d03316b61a4839c6e24bebea77
oc new-project opendatahub
make deploy IMG="quay.io/opendatahub/data-science-pipelines-operator:pr-587"

More instructions here on how to deploy and test a Data Science Pipelines Application.

@dsp-developers
Contributor

Change to PR detected. A new PR build was completed.
A new image has been built to help with testing out this PR: quay.io/opendatahub/data-science-pipelines-operator:pr-587

@amadhusu amadhusu changed the title from "WIP: Removed traces of argo server from default deployment." to "feat: Removed traces of argo server from default deployment." Mar 6, 2024
Member

@DharmitD DharmitD left a comment

/lgtm

@HumairAK
Contributor

HumairAK commented Mar 6, 2024

let's get at least 2 people to deploy this themselves and confirm it works before approving

Contributor

@hbelmiro hbelmiro left a comment

@amadhusu ds-pipeline-workflow-controller-sample fails as soon as it is deployed.

oc logs ds-pipeline-workflow-controller-sample-586c55b489-8xqxh                                                                
time="2024-03-07T12:51:49Z" level=info msg="index config" indexWorkflowSemaphoreKeys=true
time="2024-03-07T12:51:49Z" level=info msg="cron config" cronSyncPeriod=10s
time="2024-03-07T12:51:49Z" level=info msg="Memoization caches will be garbage-collected if they have not been hit after" gcAfterNotHitDuration=30s
time="2024-03-07T12:51:49.785Z" level=info msg="not enabling pprof debug endpoints"
time="2024-03-07T12:51:49.786Z" level=info msg="config map" name=ds-pipeline-workflow-controller-sample
time="2024-03-07T12:51:49.794Z" level=info msg="Get configmaps 200"
time="2024-03-07T12:51:49.798Z" level=info msg="Configuration:\nartifactRepository:\n  archiveLogs: false\n  s3:\n    accessKeySecret:\n      key: accesskey\n      name: mlpipeline-minio-artifact\n    bucket: mlpipeline\n    endpoint: http://minio-sample.kubeflow.svc.cluster.local:9000\n    insecure: false\n    secretKeySecret:\n      key: secretkey\n      name: mlpipeline-minio-artifact\ncontainerRuntimeExecutor: emissary\nexecutor:\n  imagePullPolicy: IfNotPresent\n  name: \"\"\n  resources: {}\ninitialDelay: 0s\nmetricsConfig: {}\nnodeEvents: {}\npodSpecLogStrategy: {}\ntelemetryConfig: {}\n"
time="2024-03-07T12:51:49.798Z" level=info msg="Persistence configuration disabled"
I0307 12:51:49.799153       1 leaderelection.go:248] attempting to acquire leader lease kubeflow/workflow-controller...
time="2024-03-07T12:51:49.802Z" level=info msg="Get leases 200"
time="2024-03-07T12:51:49.807Z" level=info msg="Update leases 200"
I0307 12:51:49.807676       1 leaderelection.go:258] successfully acquired lease kubeflow/workflow-controller
time="2024-03-07T12:51:49.807Z" level=info msg="new leader" leader=ds-pipeline-workflow-controller-sample-586c55b489-8xqxh
time="2024-03-07T12:51:49.807Z" level=info msg="Starting Workflow Controller" defaultRequeueTime=10s version=v3.3.10
time="2024-03-07T12:51:49.807Z" level=info msg="Current Worker Numbers" podCleanup=4 workflow=32 workflowTtl=4
time="2024-03-07T12:51:49.807Z" level=info msg="Watching task results" labelSelector="!workflows.argoproj.io/controller-instanceid,workflows.argoproj.io/workflow"
time="2024-03-07T12:51:49.807Z" level=info msg=Plugins executorPlugins=false
time="2024-03-07T12:51:49.809Z" level=info msg="List workflows 404"
time="2024-03-07T12:51:49.809Z" level=fatal msg="the server could not find the requested resource (get workflows.argoproj.io)"
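
The decisive line in the log above is the final `level=fatal` entry; everything before it is routine startup output. A small sketch for pulling only fatal entries out of such logs, assuming the `level=...` field format shown above. The function name `fatal_lines` is hypothetical:

```shell
# Hypothetical helper: keep only log entries whose level field is "fatal".
# Reads log text on stdin and writes matching lines to stdout.
# "|| true" keeps the exit status at 0 when nothing matches.
fatal_lines() {
  grep -E '(^| )level=fatal( |$)' || true
}
```

It could be used as `oc logs <pod-name> | fatal_lines`, where `<pod-name>` is the workflow controller pod.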

@amadhusu
Contributor Author

amadhusu commented Mar 7, 2024

@hbelmiro you need to deploy with make deployODH IMG=... If you only ran make deploy IMG=..., the CRDs required by the Workflow Controller are not installed. That is what your logs are complaining about: the resource workflows.argoproj.io cannot be found.
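
One way to confirm this diagnosis before redeploying is to ask the cluster whether the CRD exists. A minimal sketch, assuming `oc` is logged in to the target cluster. The function name `crd_installed` and the `OC` override variable are hypothetical:

```shell
# Hypothetical helper: succeeds if the named CRD exists on the cluster.
# OC selects the client binary and defaults to oc.
crd_installed() {
  "${OC:-oc}" get crd "$1" >/dev/null 2>&1
}
```

For example, `crd_installed workflows.argoproj.io || echo "Argo CRDs are missing"` would flag the situation seen in the logs.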

@hbelmiro
Contributor

hbelmiro commented Mar 7, 2024

Sorry @amadhusu. I misunderstood it.
I tested it on my (non-fresh, but cleaned) cluster and it works fine.
I'll try to spin up a new cluster to retest it.

/lgtm

@VaniHaripriya
Contributor

/lgtm

Signed-off-by: Achyut Madhusudan <amadhusu@redhat.com>
@dsp-developers
Contributor

Change to PR detected. A new PR build was completed.
A new image has been built to help with testing out this PR: quay.io/opendatahub/data-science-pipelines-operator:pr-587

@HumairAK
Contributor

HumairAK commented Mar 7, 2024

nice work!
/approve
/lgtm

Contributor

openshift-ci bot commented Mar 7, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: HumairAK

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Mar 7, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit 992a365 into opendatahub-io:main Mar 7, 2024
6 checks passed
6 participants