Have pre-req retry. #571

Merged 1 commit on Feb 16, 2024

Contributor

@HumairAK HumairAK commented Feb 13, 2024

The issue resolved by this Pull Request:

Resolves https://issues.redhat.com/browse/RHOAIENG-2099

Description of your changes:

Fixes a bug where no error was returned when the DB connection check failed to query the configured DB.

Also adds a fallback reconcile requeue: when a DSPO health check fails for the object store or the database, DSPO requeues the DSPA to reconcile again after 20s (the default, configurable at the DSPO level).

Other considerations

This adds infinite retry logic when prereqs fail. No exponential backoff is added here; the logic is pretty simple. My thinking is to iterate on this and eventually add a maximum number of retry attempts or some sort of backoff time in the status field.

Testing instructions

Deploy DSPO.
Deploy multiple DSPAs.
Inspect the DSPA status field and confirm that the correct status messages and reasons are propagated to it.
Try with a working DB, then with an invalid endpoint.
Scale down the default MariaDB before it comes up and inspect the behavior of DSPO, then scale it back up again.

Checklist

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work.

@HumairAK HumairAK requested review from gregsheremeta and removed request for VaniHaripriya February 13, 2024 23:07
@dsp-developers
Contributor

A new image has been built to help with testing out this PR: quay.io/opendatahub/data-science-pipelines-operator:pr-571
An OCP cluster where you are logged in as cluster admin is required.

To use this image run the following:

cd $(mktemp -d)
git clone git@github.com:opendatahub-io/data-science-pipelines-operator.git
cd data-science-pipelines-operator/
git fetch origin pull/571/head
git checkout -b pullrequest c3221fced0eb9209b9338c1165cd8ede9e1fdc22
oc new-project opendatahub
make deploy IMG="quay.io/opendatahub/data-science-pipelines-operator:pr-571"

More instructions here on how to deploy and test a Data Science Pipelines Application.

@dsp-developers
Contributor

Change to PR detected. A new PR build was completed.
A new image has been built to help with testing out this PR: quay.io/opendatahub/data-science-pipelines-operator:pr-571

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

Member

@gmfrasca gmfrasca left a comment

/lgtm

Member

@DharmitD DharmitD left a comment

/Approve

Contributor

openshift-ci bot commented Feb 16, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: DharmitD

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit beaa241 into opendatahub-io:main Feb 16, 2024
6 checks passed
4 participants