
[OVNDBCluster] Check bound error on reconcileService #297

Merged

Conversation


@averdagu averdagu commented May 28, 2024

Related-Issue: OSPRH-6798


openshift-ci bot commented May 28, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/11f304f5a4a741fdb4a8e825bbe785dc

openstack-k8s-operators-content-provider FAILURE in 8m 22s
⚠️ ovn-operator-tempest-multinode SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@booxter left a comment

Mostly stylistic changes requested; some comments are missing, etc. Nothing major.

@averdagu averdagu marked this pull request as ready for review June 5, 2024 07:52
@openshift-ci openshift-ci bot requested review from booxter and viroel June 5, 2024 07:52

package common

// Helper function while can't use buildin min
Contributor:

builtin
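The snippet under review only shows the helper's doc comment. A minimal sketch of what such a helper looks like (hypothetical — `minInt` is a placeholder name, not necessarily the operator's real function; the helper is needed because Go only gained a builtin `min` in 1.21):

```go
package main

import "fmt"

// minInt returns the smaller of two ints. Hypothetical stand-in for the
// helper discussed above; before Go 1.21 there was no builtin min for
// ints, so projects carried a small helper like this in a common package.
func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

func main() {
	fmt.Println(minInt(3, 5)) // 3
}
```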


booxter commented Jun 7, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Jun 7, 2024
@@ -749,6 +751,12 @@ func (r *OVNDBClusterReconciler) reconcileServices(
if err != nil {
return ctrl.Result{}, err
}
// It can be possible that not all pods are ready, so DNSData won't
Contributor:

Trying to understand why we can't just move this length check below at L723 and avoid/remove the minLength calculator?

Contributor:

You mean "above"?

I suggested to still attempt to create DNS records for running pods, even if not all of them are running, yet.

The fact you ask suggests this should be documented in a comment.

Contributor:

Yes, sorry, I meant "above". I'm not sure how much it helps to create DNSData for a partial IP list, but I just wanted to understand the reasoning, as this would lead to unnecessary respawns of the DNSMasq pod, and we could avoid the hacks around the min checks. But if we still want to go with this, I am OK with it.

Contributor (author):

So, one scenario that comes to my mind is where there are not enough PVs available for all the ovndb replicas (for example, 5 replicas but only 3 PVs available). With the current approach the cluster will still work, as 3 replicas will be available.
If we just return whenever status.Replicas > len(podList), then in the case mentioned above no ovsdbserver would be accessible through the CNAME.

I still need to look into the DNSMasq respawning behavior.

Contributor:

It will respawn once new pods show up, yes. That's sadly how dnsmasq is implemented right now.

Contributor (author):

Yeah, I was not sure about the reconcile behavior when the DNSData doesn't change.
My concern was that in the scenario where replicas > len(podList), the reconcile would be retriggered until the condition is fulfilled, and in every reconcile loop (with the current code) the DNSData creation would be called; I was afraid that would trigger a recreation of the dnsmasq pod every time. But that will only happen if the DNSData carries different information.

I don't think there's much overhead in not waiting until all pods are ready. Having said that, I'm okay with either of the two solutions.
WDYT?
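The concern above resolves because an update is only issued when the desired records actually differ from what is stored, so repeated reconciles with identical DNSData are a no-op. A hedged sketch of that idea (`needsUpdate` and the use of `reflect.DeepEqual` are illustrative assumptions, not the operator's actual comparison):

```go
package main

import (
	"fmt"
	"reflect"
)

// needsUpdate sketches the behavior discussed in this thread: rebuilding
// DNSData on every reconcile loop only churns the dnsmasq pod if the
// desired records differ from the existing ones, so an equality check
// makes repeated reconciles with unchanged data a no-op.
func needsUpdate(existing, desired map[string]string) bool {
	return !reflect.DeepEqual(existing, desired)
}

func main() {
	existing := map[string]string{"ovsdbserver-sb-0": "10.0.0.1"}
	desired := map[string]string{"ovsdbserver-sb-0": "10.0.0.1"}
	fmt.Println(needsUpdate(existing, desired)) // false: identical, no respawn

	desired["ovsdbserver-sb-1"] = "10.0.0.2" // a new pod came up
	fmt.Println(needsUpdate(existing, desired)) // true: records changed
}
```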

Contributor:

I think everyone involved is ok with either way. :) This is not very consequential. But if you keep creating DNSData while not all pods are ready, please add a comment explaining the thinking behind this.

Contributor (author):

Comments added
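The approach settled on in this thread — build DNS records from whatever pods are already up instead of returning early when replicas > len(podList) — can be sketched as follows. This is a hypothetical illustration: `partialDNSRecords` is a placeholder name, not the operator's real helper.

```go
package main

import "fmt"

// partialDNSRecords sketches the resolution of this thread: rather than
// bailing out while some replicas are still pending (e.g. too few PVs),
// create DNS records for the pods that are already running so the
// cluster stays reachable via CNAME. The min check caps the slice so we
// never index past the requested replica count.
func partialDNSRecords(replicas int, podIPs []string) []string {
	n := len(podIPs)
	if replicas < n {
		n = replicas
	}
	return podIPs[:n]
}

func main() {
	// 5 replicas requested, but only 3 pods are up: records are still
	// created for those 3, instead of waiting for all 5.
	ips := []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}
	fmt.Println(partialDNSRecords(5, ips))
}
```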


booxter commented Jun 12, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Jun 12, 2024
@karelyatin

lgtm, can you rebase it on the main branch?

@karelyatin left a comment

/lgtm


openshift-ci bot commented Jun 13, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: averdagu, karelyatin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit d45e9a0 into openstack-k8s-operators:main Jun 13, 2024
6 checks passed