Stop the OVN database during the cleanup script #320

Open. ralonsoh wants to merge 2 commits into base: main


ralonsoh
Contributor

@ralonsoh ralonsoh commented Jul 5, 2024

Closes-Issue: OSPRH-8118

@openshift-ci openshift-ci bot requested review from abays and olliewalsh July 5, 2024 13:52
Contributor

openshift-ci bot commented Jul 5, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ralonsoh
Once this PR has been reviewed and has the lgtm label, please assign dprince for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Contributor

openshift-ci bot commented Jul 5, 2024

Hi @ralonsoh. Thanks for your PR.

I'm waiting for an openstack-k8s-operators member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@olliewalsh
Contributor

/ok-to-test

@olliewalsh
Contributor

olliewalsh commented Jul 5, 2024

> Closes-Issue: OSPRH-8118

How about just removing --single-child from the dumb-init args so it sends SIGTERM to the entire process group?

@karelyatin karelyatin requested a review from booxter July 5, 2024 14:09

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/570d19d95a2e44c0ad8896e7ebaeb63c

✔️ openstack-k8s-operators-content-provider SUCCESS in 40m 51s
ovn-operator-tempest-multinode RETRY_LIMIT in 23m 03s

if [ "$DB_TYPE" == "nb" ]; then
    /usr/share/ovn/scripts/ovn-ctl stop_nb_ovsdb
elif [ "$DB_TYPE" == "sb" ]; then
    /usr/share/ovn/scripts/ovn-ctl stop_sb_ovsdb
Contributor

You can just use stop_${DB_TYPE}_ovsdb, I think.
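
A minimal sketch of what that suggestion could look like. The `stop_ovsdb` function name and the `OVN_CTL` override variable are hypothetical, introduced here only for illustration; the `ovn-ctl` path and the `stop_nb_ovsdb` / `stop_sb_ovsdb` subcommands are the ones from the diff above.

```shell
#!/bin/bash
# Sketch only: stop_ovsdb and OVN_CTL are illustrative names, not part of the patch.
# OVN_CTL defaults to the ovn-ctl path used in the diff.
OVN_CTL="${OVN_CTL:-/usr/share/ovn/scripts/ovn-ctl}"

# stop_ovsdb DB_TYPE
# Builds the ovn-ctl subcommand name from the DB type instead of branching on it.
stop_ovsdb() {
    local db_type="$1"
    case "$db_type" in
        nb|sb)
            "$OVN_CTL" "stop_${db_type}_ovsdb"
            ;;
        *)
            # Reject anything else so a typo does not become a bogus subcommand.
            echo "unknown DB_TYPE: ${db_type}" >&2
            return 1
            ;;
    esac
}
```

In the cleanup script this would be called as `stop_ovsdb "$DB_TYPE"`. The `case` guard keeps the validation that the original `if`/`elif` chain provided implicitly.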

@booxter
Contributor

booxter commented Jul 5, 2024

@olliewalsh the problem with (just) removing --single-child is that dumb-init won't wait for children-of-children to exit; it will only wait for the start script to exit. (And the start script will exit immediately, because it sets no trap for SIGTERM that would, e.g., wait for the dbserver to exit.) An alternative solution could be to implement a SIGTERM handler in the start script.
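
To illustrate the alternative mentioned above, here is a rough sketch of a SIGTERM handler that forwards the signal to a child process and waits for it to exit before the script itself returns. It is not the operator's actual start script: `run_with_term_handler` is a hypothetical helper name, and the command it runs is whatever the caller passes in.

```shell
#!/bin/bash
# Sketch only: run_with_term_handler is a hypothetical helper name.
# Usage: run_with_term_handler COMMAND [ARGS...]
run_with_term_handler() {
    local child_pid rc=0

    # Start the long-running command in the background and remember its PID.
    "$@" &
    child_pid=$!

    # On SIGTERM, forward the signal to the child instead of exiting at once.
    trap "kill -TERM $child_pid 2>/dev/null || true" TERM

    # A trapped signal makes `wait` return early, so keep waiting until the
    # child has really exited.
    wait "$child_pid" || rc=$?
    while kill -0 "$child_pid" 2>/dev/null; do
        rc=0
        wait "$child_pid" || rc=$?
    done

    # Restore the default SIGTERM behavior and report the child's status.
    trap - TERM
    return "$rc"
}
```

With a handler like this, the script that dumb-init waits on stays alive until the child is gone, which is the behavior that removing --single-child alone does not provide.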

@booxter
Contributor

booxter commented Jul 5, 2024

@olliewalsh btw FYI the signal handling improvement is tracked here: https://issues.redhat.com/browse/OSPRH-8212

@ralonsoh
Contributor Author

ralonsoh commented Jul 8, 2024

> @olliewalsh btw FYI the signal handling improvement is tracked here: https://issues.redhat.com/browse/OSPRH-8212

If we are going to implement this signal handler in the start script, is this patch relevant anymore?

@ralonsoh
Contributor Author

ralonsoh commented Jul 8, 2024

/test ovn-operator-build-deploy-kuttl

@booxter
Contributor

booxter commented Jul 8, 2024

> If we are going to implement this signal handler in the start script, is this patch relevant anymore?

I think it depends on how the signal handler is implemented. The handler could do this job as well, or it could take care only of the ovsdb-tool db initialization (which is the original scenario that spurred these Jira issues).

But yes, I think everything could be handled inside the signal handler.

Contributor

openshift-ci bot commented Jul 15, 2024

@ralonsoh: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/ovn-operator-build-deploy-kuttl | 690a938 | link | true | /test ovn-operator-build-deploy-kuttl |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@ralonsoh
Contributor Author

This patch is hitting several issues:

  • https://issues.redhat.com/browse/OSPRH-7626. This bug describes what happens during the deletion of the OVNDBCluster pods.
  • This patch always stops ovsdb-server, regardless of its status. We could first check whether the DB is still running.
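
One possible shape for the "check first" idea, as a sketch only. It assumes the server writes a pidfile; the pidfile path is left as a parameter because the real location is not stated in this thread, and `is_running` and `stop_if_running` are hypothetical helper names.

```shell
#!/bin/bash
# Sketch only: is_running and stop_if_running are illustrative names, and the
# pidfile location is an assumption supplied by the caller.

# is_running PIDFILE
# Succeeds if PIDFILE exists and names a process that is still alive.
is_running() {
    local pidfile="$1" pid
    [ -r "$pidfile" ] || return 1
    pid="$(cat "$pidfile")"
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# stop_if_running PIDFILE STOP_COMMAND [ARGS...]
# Runs the stop command only when the process behind PIDFILE is alive.
stop_if_running() {
    local pidfile="$1"
    shift
    if is_running "$pidfile"; then
        "$@"
    else
        echo "already stopped"
    fi
}
```

A call could look like `stop_if_running "$PIDFILE" /usr/share/ovn/scripts/ovn-ctl "stop_${DB_TYPE}_ovsdb"`, where `PIDFILE` is a placeholder for wherever the deployment keeps the ovsdb-server pidfile.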

3 participants