Remove configJob for ovn-controller #195

Closed

@booxter booxter commented Jan 4, 2024

Make the configuration step part of the service's startup path. This removes a hostMount dependency and opens a path toward switching to a StatefulSet, replacing all (necessary) volumes with PVCs, and dropping privileges.

This makes it possible to get rid of the hostMount shared between the job pod
and the main ovsdb-server pod container (the mount that lets the vsctl
command reach the database socket).

Getting rid of hostMounts is needed to be able to eventually stop
running ovn-controller pods as privileged containers.
This is to prove that this is possible, now that configJob is squashed
into the main ovn-controller pod.

openshift-ci bot commented Jan 4, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: booxter

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Jan 4, 2024

booxter commented Jan 4, 2024

/test ovn-operator-build-deploy-kuttl


booxter commented Jan 5, 2024

/hold


booxter commented Jan 5, 2024

The rationale for this change is:

  • the job runs vsctl against the ovsdb-server that serves the vswitchd process running in a separate pod;
  • so the job needs access to the vswitchd db socket (AF_UNIX);
  • so the job has to share a hostMount with the ovn-controller pod;
  • so the job has to run privileged, and the ovn-controller pod has to run privileged too (hostMounts are not available to unprivileged pods).
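To make the chain above concrete, here is a minimal sketch of the kind of command such a job would issue. The socket path, the external-ids values, and both helper names are assumptions for illustration, not taken from the operator's code. The point is that a `unix:` target is a filesystem path, so it only works if the socket file is visible inside the job pod.

```python
import os


def vsctl_external_ids_cmd(db_socket_path, external_ids):
    """Build an ovs-vsctl argv that sets external-ids over an AF_UNIX socket.

    Hypothetical helper: the real job may assemble its command differently.
    """
    cmd = ["ovs-vsctl", f"--db=unix:{db_socket_path}", "set", "open_vswitch", "."]
    for key, value in sorted(external_ids.items()):
        cmd.append(f"external-ids:{key}={value}")
    return cmd


def socket_reachable(db_socket_path):
    # An AF_UNIX socket is a filesystem object: the path has to exist in the
    # caller's own mount namespace, hence the shared mount between the pods.
    return os.path.exists(db_socket_path)


# Assumed path and values, for illustration only.
cmd = vsctl_external_ids_cmd(
    "/run/openvswitch/db.sock",
    {"ovn-remote": "tcp:ovsdbserver-sb:6642", "ovn-encap-type": "geneve"},
)
print(" ".join(cmd))
```

If `socket_reachable` returns False inside the job pod, the command cannot connect no matter how the database itself is configured.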

The drawback of this change is:

  • when a configuration change happens (external-ids or even ovn-remote), the ovn-controller pod restarts;
  • since the pod contains the vswitchd container, the dataplane is disrupted on each change.

The latter could be handled by splitting vswitchd/ovsdb-server from ovn-controller into a separate pod with its own lifecycle. (This is helpful regardless of privileges.) But then ovn-controller has to talk to the vswitchd database somehow, which again means an AF_UNIX socket and therefore a hostMount. So we are back to square one.

I don't have an immediate solution to this contradiction.

UPD: to clarify, it's not the database socket that can't run over TCP; it's the OpenFlow (OF) connection socket that implies the unix: scheme (in ofctl_run). This doesn't change the logic of the argument above, but it matters if we consider running the db and OF connections through different mechanisms.
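As a rough illustration of why the scheme matters, the sketch below classifies an OVS-style connection target by whether it needs a filesystem path shared between pods. The function name and the example targets are hypothetical; only the scheme prefixes (`unix:`, `punix:`, `tcp:`, `ptcp:`, `ssl:`, `pssl:`) follow the usual OVS connection-method syntax.

```python
def needs_shared_filesystem(target):
    """Return True if the connection target is a filesystem-backed socket.

    Hypothetical helper for illustration: unix:/punix: targets name a socket
    file, while tcp:/ssl: (and their passive forms) go over the network.
    """
    scheme, sep, _rest = target.partition(":")
    if not sep:
        raise ValueError(f"no scheme in connection target: {target!r}")
    if scheme in ("unix", "punix"):
        return True
    if scheme in ("tcp", "ptcp", "ssl", "pssl"):
        return False
    raise ValueError(f"unknown connection scheme: {scheme!r}")


# Assumed example targets: the db connection could move to the network,
# while a unix: OpenFlow target still pins both ends to a shared mount.
db_target = "ssl:192.0.2.10:6640"
of_target = "unix:/run/openvswitch/br-int.mgmt"
print(needs_shared_filesystem(db_target), needs_shared_filesystem(of_target))
```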

@booxter booxter marked this pull request as draft January 5, 2024 16:19

booxter commented Jan 8, 2024

This PR is invalid. The job was introduced for a reason: to untangle reconfiguration from ovn-controller pod restarts. The proper path forward is to switch the job to connect to the OVNController ovsdb-server via an AF_INET socket. This is technically possible but requires deploying SSL for the ovsdb-server endpoint. (Then we can switch all of its clients to use the certificates: ovn-controller, ovs-vswitchd, and the configJob.)
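A minimal sketch of what the SSL variant could look like on the command line, assuming the standard OVS options `--remote=pssl:...` on ovsdb-server and `--db=ssl:...` plus `--private-key`, `--certificate`, and `--ca-cert` on ovs-vsctl. The host name, port, certificate paths, and helper names are all placeholders, not values from this repository.

```python
def ovsdb_server_ssl_args(port, private_key, certificate, ca_cert):
    """Extra ovsdb-server arguments for a passive SSL listener (assumed layout)."""
    return [
        f"--remote=pssl:{port}",
        f"--private-key={private_key}",
        f"--certificate={certificate}",
        f"--ca-cert={ca_cert}",
    ]


def vsctl_ssl_cmd(host, port, private_key, certificate, ca_cert, vsctl_args):
    """Build an ovs-vsctl argv that reaches the database over SSL, not AF_UNIX."""
    return [
        "ovs-vsctl",
        f"--db=ssl:{host}:{port}",
        f"--private-key={private_key}",
        f"--certificate={certificate}",
        f"--ca-cert={ca_cert}",
        *vsctl_args,
    ]


# Placeholder endpoint and certificate paths, for illustration only.
pki = ("/etc/pki/tls/private/ovsdb.key", "/etc/pki/tls/certs/ovsdb.crt",
       "/etc/pki/tls/certs/ca.crt")
server_args = ovsdb_server_ssl_args(6640, *pki)
cmd = vsctl_ssl_cmd(
    "ovs-db.example.svc", 6640, *pki,
    ["set", "open_vswitch", ".", "external-ids:ovn-encap-type=geneve"],
)
print(" ".join(cmd))
```

With a network endpoint like this, the job pod would no longer need a socket file from the ovn-controller pod, which is what removes the hostMount from the picture.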

@booxter booxter closed this Jan 8, 2024