Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RHOAIENG-6877] - odh-model-controller breaks Knative if KServe-Serve… #203

Merged
merged 1 commit into from
May 9, 2024

Conversation

spolti
Copy link
Member

@spolti spolti commented May 9, 2024

…rless is not enabled

chore: Make the validating.odh-model-controller.opendatahub.io intercept only ksvc that
contains the serving.kserve.io/inferenceservice label. This label is added by KServe
to any ksvc handled by it, this way we avoid other OpenShift Serverless user not be
affected by this webhook in case it is not started by odh-model-controller.

Description

Steps to test:

  • Deploy the RHOAI Operator with KServe and ServiceMesh disable, make sure the odh-model-controller does not starts the webhook.
  • Try to deploy the ksvc again, you will notice that it will fail indicating the validating webhook is not available.

Applying the patch:

  • Update the DSC with this content:
spec:
  components:
    codeflare:
      managementState: Removed
    kserve:
      managementState: Removed
      serving:
        ingressGateway:
          certificate:
            type: SelfSigned
        managementState: Removed
        name: knative-serving
    trustyai:
      managementState: Removed
    ray:
      managementState: Removed
    kueue:
      managementState: Removed
    workbenches:
      managementState: Removed
    dashboard:
      managementState: Removed
    modelmeshserving:
      devFlags:
        manifests:
          - contextDir: config
            sourcePath: 'base'
            uri: 'https://github.com/spolti/odh-model-controller/tarball/RHOAIENG-6877-test'
      managementState: Managed
    datasciencepipelines:
      managementState: Removed
  • After the reconcile loops have ended, make sure that odh-model-operator didn't start the webhook, and can be checked in the logs.
  • try to deploy the ksvc again. It should succeed as the objectSelector is only filtering, to confirm that, add the kserve label serving.kserve.io/inferenceservice with any value and try to deploy the ksvc again, it will fail.

Note that, if the odh-model-controller image does not starts and it is not overridden, the test can be done anyways as the webhook will not be functional.
The validation webhook is correctly updated, you can check by querying it:

oc get ValidatingWebhookConfiguration validating.odh-model-controller.opendatahub.io -n redhat-ods-applications -o yaml

And check for these line:

  objectSelector:
    matchExpressions:
    - key: serving.kserve.io/inferenceservice
      operator: Exists

How Has This Been Tested?

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

…rless is not enabled

chore:	Make the validating.odh-model-controller.opendatahub.io intercept only ksvc that
	contains the serving.kserve.io/inferenceservice label. This label is added by KServe
	to any ksvc handled by it, this way we avoid other OpenShift Serverless user not be
	affected by this webhook in case it is not started by odh-model-controller.

Signed-off-by: Spolti <fspolti@redhat.com>
- name: validating.ksvc.odh-model-controller.opendatahub.io
clientConfig:
service:
name: odh-model-controller-webhook-service
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May the patch include the namespace?

Running kustomize build config/base on the main branch, then on this PR, and doing a diff of both results shows that the clientConfig keeps system namespace, which would be a bug:

image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yesterday, we talked about it and you suggested not to patch the namespace as it would be replaced anyway.
I can revert back to the patch I have added.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the cluster:

    service:
      name: odh-model-controller-webhook-service
      namespace: redhat-ods-applications
      path: /validate-serving-knative-dev-v1-service
      port: 443
  failurePolicy: Fail
  matchPolicy: Equivalent
  name: validating.ksvc.odh-model-controller.opendatahub.io
  namespaceSelector: {}
  objectSelector:
    matchExpressions:
    - key: serving.kserve.io/inferenceservice
      operator: Exists

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I just tested with odh-operator and the result is the expected.
It looks like the odh-operator is doing some additional processing than kustomize.

Copy link
Contributor

@israel-hdez israel-hdez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Copy link
Contributor

openshift-ci bot commented May 9, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: israel-hdez, spolti

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit c5de0a4 into opendatahub-io:main May 9, 2024
5 checks passed
@spolti spolti deleted the RHOAIENG-6877 branch May 10, 2024 00:29
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants