OTA-1029: Add a CVO Log Level API #1492

DavidHurta · 2023-10-09T14:48:12Z

This enhancement describes the API changes needed to provide a simple way of dynamically changing the verbosity level of Cluster Version Operator's logs.

This pull request references https://issues.redhat.com/browse/OTA-1029

openshift-ci-robot · 2023-10-09T14:48:15Z

@Davoska: This pull request references OTA-1029 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set.

In response to this:

> This enhancement describes the API changes needed to provide a simple way of dynamically changing the Cluster Version Operator's log level.
>
> This pull request references https://issues.redhat.com/browse/OTA-1029

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci · 2023-10-09T14:48:24Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

DavidHurta · 2023-10-09T14:49:21Z

/test all

DavidHurta · 2023-10-09T15:09:05Z

/test all

petr-muller · 2023-10-10T10:29:51Z

/cc

DavidHurta · 2023-11-01T15:17:59Z

PTAL, reviewers @LalatenduMohanty @wking @petr-muller

DavidHurta · 2023-11-01T15:21:18Z

PTAL, API approver @deads2k

petr-muller

LGTM, minor comment inline

petr-muller

/lgtm

openshift-ci · 2024-10-01T12:20:58Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: petr-muller
Once this PR has been reviewed and has the lgtm label, please assign sdodson for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

enhancements/update/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

JoelSpeed · 2024-10-07T14:02:15Z

enhancements/update/cvo-log-level-api.md

+A hosted CVO is located in the management cluster and accesses the hosted API
+server. As the new CR will be part of the OCP payload, it will be applied to the
+hosted cluster.


Hmm, that seems dissonant to me. If the CVO acts within the management cluster, why is the configuration for the CVO not also at the management cluster level? Have you spoken to anyone from HCP about this EP yet?

This seems to be the current solution for a number of hosted operators placed in the management cluster (cluster-image-registry-operator, dns-operator, cluster-network-operator, ...). These operators run in the management cluster but access the hosted API server. Some of them can potentially also access the management cluster API server and any configuration resources there. Let me reach out to the HyperShift team.

The main issue with putting CRDs in the management cluster is that we have to handle multiple versions of OCP on the same management cluster.

We could add this to the HostedCluster API and have that be the source of truth, while simply reconciling (and preventing changes to) the CR within the hosted cluster

The other alternative to consider is making this CRD namespaced.

> We could add this to the HostedCluster API and have that be the source of truth, while simply reconciling (and preventing changes to) the CR within the hosted cluster

I really like the idea of this design. There are no compatibility issues regarding the management cluster. It uses an existing pattern for the flow of information from the HostedCluster API to the hosted cluster. The hosted CVO is none the wiser and does not even need to access the management cluster API server. The complexity seems much lower. We only need to come up with the HyperShift API change.

The ClusterConfiguration API seems like a nice place; however, its purpose seems to be oriented around the configuration API (github.com/openshift/api/config/v1), meaning configuration for OCP components rather than just operators.
I don't see any applicable existing APIs that we could use to reference the new proposed ClusterVersionOperator API. We could introduce a new OperatorConfiguration API where various APIs from the github.com/openshift/api/operator/v1 package could be referenced, including the newly proposed ClusterVersionOperator API. What do you think?

Something like:

    // OperatorConfiguration specifies configuration for individual OCP operators in the
    // cluster, represented as embedded resources that correspond to the openshift
    // operator API.
    type OperatorConfiguration struct {
        // ClusterVersionOperator specifies the configuration for the Cluster Version Operator in the HostedCluster.
        // +optional
        ClusterVersionOperator *operatorv1.ClusterVersionOperatorSpec `json:"clusterVersionOperator,omitempty"`
    }

    type HostedClusterSpec struct {
        ...
        // OperatorConfiguration specifies configuration for individual OCP operators in the
        // cluster, represented as embedded resources that correspond to the openshift
        // operator API.
        //
        // +kubebuilder:validation:Optional
        // +optional
        OperatorConfiguration *OperatorConfiguration `json:"operatorConfiguration,omitempty"`
    }

@csrwng, @enxebre, please let me know what you think 🙌 I can also follow up on Slack or in the office hours.

> The other alternative to consider is making this CRD namespaced.

That doesn't make sense from an OCP standpoint, since you'll likely need different fields for a namespaced version of this.

> We could add this to the HostedCluster API and have that be the source of truth, while simply reconciling (and preventing changes to) the CR within the hosted cluster

If the CVO is running in the management cluster, why would we want to have this CRD in the workload cluster at all? Would it not be better to have a separate mode of operation so that, in an HCP, the CVO gets its config from CLI flags instead of the CRD, and HyperShift operators can configure it directly based on some HyperShift-specific API (be that a new version of this CRD or part of an existing HCP API)?

The reasons why I am a fan of the CRD being in the hosted cluster are:

- We simplify and streamline the flow of configuration.
  - As a maintainer of the CVO, I simply have to support the CVO reconciling one new configuration CR in the cluster. It does not matter whether the CVO is a hosted CVO or runs in a standalone cluster. This means only one code path, less development, fewer opportunities for bugs, easier maintenance, etc.
  - HyperShift does not have to translate a new API into flags/environment variables/a config file (with the CVO then parsing the flags back into its internal representation of the configuration). It simply propagates one API into another verbatim. Minimal risk of breaking things, simpler logic, and more future-proof. Adding new fields to the ClusterVersionOperator CRD requires minimal effort on the HyperShift side and the CVO parsing side.
- We do not introduce a new CRD in the management cluster.
- No disruptions to a running hosted CVO.

It does have its cons: a new validating admission policy, a new CRD, and a CR in the hosted cluster. However, because the configuration itself is unlikely to be heavily used (it is just a configuration CR for the CVO), the operational impact seems minimal (one new CRD, one CR, and an admission policy in the hosted cluster). So it is a one-time constant cost to a hosted cluster that results in simpler development and testing.

However, you have a great point, and I do truly appreciate you pushing for not creating new resources unnecessarily and just because it seems easier (my words). It resulted in me thinking of more alternatives, which is a great thing.

> the CVO gets its config from CLI flags instead of the CRD

It's easy to map values such as "Normal" to --v=2 or "Debug" to --v=4 in the hosted CVO deployment. I think we can do that for the currently proposed ClusterVersionOperator CRD, and I will probably propose to do that this week. But I do worry that, in a distant potential future, we could have this discussion again when new complex fields are introduced to the ClusterVersionOperator CRD. However, the CVO can also just start supporting configuration files/env vars.
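For illustration, a minimal Go sketch of that mapping. Only the "Normal" and "Debug" values come from this thread; the function name `verbosityFor`, the "Trace" and "TraceAll" values, and treating an empty level as "Normal" are assumptions modeled on the convention other OpenShift operators appear to follow, not something this PR defines.

```go
package main

import "fmt"

// verbosityFor maps an operator log level name to a klog-style verbosity.
// "Normal" -> 2 and "Debug" -> 4 follow the comment above; the other cases
// are assumptions added for illustration only.
func verbosityFor(level string) (int, error) {
	switch level {
	case "", "Normal": // assumption: an unset level behaves like "Normal"
		return 2, nil
	case "Debug":
		return 4, nil
	case "Trace": // assumption
		return 6, nil
	case "TraceAll": // assumption
		return 8, nil
	default:
		return 0, fmt.Errorf("unknown log level %q", level)
	}
}

func main() {
	for _, level := range []string{"Normal", "Debug", "Trace", "TraceAll"} {
		v, err := verbosityFor(level)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		// Render the flag that would be set on the hosted CVO deployment.
		fmt.Printf("%s -> --v=%d\n", level, v)
	}
}
```

Rejecting unknown values instead of silently defaulting keeps a typo in the API from quietly changing the log verbosity.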

@JoelSpeed, may I ask, out of curiosity, what do you see as the main downside of the CR being in the hosted cluster? At the end of the day, I see it as a trade-off between development effort and operational costs. However, you are a more experienced engineer, so I am probably missing some aspects.

This thread is converging with another, so you should review my comment here as well.

But what I'm trying to avoid is a dissonance in UX. By this I mean exposing APIs to the workload cluster admin that they cannot use. You are intending to write an API that you will have to guard against users updating, because "this isn't the right way to update this field"; they have to "go over there and eventually it will get updated here".

It would be better not to expose that API to the end user at all. It is useless to them as an API and becomes an implementation detail for us.

Another consideration is whether there are differing requirements for CVOs in HCP vs standalone. Often our projects don't work quite the same way in HCP as they do on a standalone cluster, and, I can imagine a future where you want a feature either only on HCP, or only on standalone.

If the API to control CVO in an HCP is baked into an HCP specific type, we choose exactly what subset of the configuration we want to expose. If you directly embed the ClusterVersionOperator Spec, every time HCP updates the o/api dependency, you end up with potentially new fields being added to the HCP API, potentially, unintentionally. Which then might lead to awkward situations again where you're exposing API fields that the user cannot use.

This is one of the reasons that HyperShift uses the NodePool abstraction: to prevent users from having to worry about lots of fields in underlying APIs that we do not consider supported on HCP. It's a similar story here, IMO.
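The difference between embedding the upstream spec and exposing an HCP-specific subset can be sketched in Go as follows. All types and fields here are hypothetical stand-ins: `upstreamCVOSpec` plays the role of `operatorv1.ClusterVersionOperatorSpec`, `FutureField` models a field added by a later o/api bump, and `hostedCVOConfig` is an illustrative HCP-only type, not a proposed API.

```go
package main

import "fmt"

// upstreamCVOSpec is a hypothetical stand-in for the upstream operator spec.
// FutureField models a field that appears after a dependency update.
type upstreamCVOSpec struct {
	OperatorLogLevel string
	FutureField      string
}

// hostedCVOConfig is a hypothetical HCP-specific type that exposes only the
// subset of configuration that is deliberately supported on HCP.
type hostedCVOConfig struct {
	LogLevel string
}

// toUpstream copies only the supported fields. A new upstream field stays
// unset until someone explicitly adds it to hostedCVOConfig and to this function.
func toUpstream(c hostedCVOConfig) upstreamCVOSpec {
	return upstreamCVOSpec{OperatorLogLevel: c.LogLevel}
}

func main() {
	spec := toUpstream(hostedCVOConfig{LogLevel: "Debug"})
	fmt.Printf("%+v\n", spec)
}
```

With direct embedding, `FutureField` would become settable through the HCP API as soon as the dependency is bumped; with the explicit translation, it only becomes visible after a deliberate change.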

Thank you so much for the insights, I appreciate them. I was missing the dissonance in UX. I now see the benefits of your suggestions more clearly.

The enhancement now proposes to use a configuration file for a hosted CVO. No new CRDs in a management cluster. No new CRDs in a hosted cluster.
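As a rough sketch of what reading such a configuration file could look like on the CVO side: the JSON format, the `operatorLogLevel` key, the `cvoFileConfig` type, and the fallback to "Normal" are all assumptions for illustration; the enhancement text defines the actual file format and fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cvoFileConfig is a hypothetical shape for the hosted CVO configuration file.
type cvoFileConfig struct {
	OperatorLogLevel string `json:"operatorLogLevel"`
}

// parseConfig decodes the file contents and falls back to "Normal"
// (an assumed default) when the log level is not set.
func parseConfig(data []byte) (cvoFileConfig, error) {
	cfg := cvoFileConfig{}
	if err := json.Unmarshal(data, &cfg); err != nil {
		return cvoFileConfig{}, fmt.Errorf("parsing CVO configuration: %w", err)
	}
	if cfg.OperatorLogLevel == "" {
		cfg.OperatorLogLevel = "Normal"
	}
	return cfg, nil
}

func main() {
	// In a real deployment the bytes would come from a mounted file.
	cfg, err := parseConfig([]byte(`{"operatorLogLevel": "Debug"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("log level:", cfg.OperatorLogLevel)
}
```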

openshift-ci · 2024-10-15T15:33:28Z

New changes are detected. LGTM label has been removed.

enxebre · 2024-11-04T14:57:30Z

enhancements/update/cvo-log-level-api.md

+
+This enhancement proposes to create a new `CustomResourceDefinition` (CRD) 
+called `clusterversionoperators.operator.openshift.io`. The new type will be part 
+of the [`github.com/openshift/api/operator/v1alpha1`][github.com/openshift/api/operator/v1alpha1]


this link is broken.

Weird. The link works for me. At least after the hyperlink is rendered. It utilizes reference style links [1].

[1] https://www.markdownguide.org/basic-syntax/#reference-style-links

enxebre · 2024-11-04T15:03:52Z

enhancements/update/cvo-log-level-api.md

+`github.com/openshift/api/operator/v1` package can be referenced, including a 
+hosted CVO configuration API. The following changes are proposed for the 
+[HyperShift API][api/hypershift].
+


This API change needs to be behind a feature gate initially.
See https://github.com/openshift/hypershift/blob/main/api/hypershift/v1beta1/nodepool_types.go#L438-L441 for an example

Yes, yes, thanks for noticing. The ClusterVersionOperatorConfiguration feature gate is now proposed to be used here as well. I am not 100% sure whether a different new feature gate should be created just for the OperatorConfiguration field.

openshift-ci · 2024-11-04T17:39:11Z

@DavidHurta: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci-robot added the jira/valid-reference label Oct 9, 2023

openshift-ci bot added the do-not-merge/work-in-progress label Oct 9, 2023

DavidHurta force-pushed the cvo-logging branch from 923db57 to 9ad307f October 9, 2023 15:07

wking reviewed Oct 9, 2023

wking reviewed Oct 9, 2023

wking reviewed Oct 9, 2023

openshift-ci bot requested a review from petr-muller October 10, 2023 10:29

DavidHurta force-pushed the cvo-logging branch 5 times, most recently from 1056f05 to 8380ff6 October 12, 2023 14:51

DavidHurta requested a review from wking October 12, 2023 14:54

DavidHurta marked this pull request as ready for review October 17, 2023 11:06

openshift-ci bot removed the do-not-merge/work-in-progress label Oct 17, 2023

openshift-ci bot requested review from LalatenduMohanty and PratikMahajan October 17, 2023 11:08

DavidHurta commented Nov 1, 2023

wking reviewed Nov 6, 2023

DavidHurta force-pushed the cvo-logging branch from 8380ff6 to 3e5a661 November 7, 2023 16:22

LalatenduMohanty suggested changes Nov 7, 2023

DavidHurta force-pushed the cvo-logging branch from 3e5a661 to a99c2f5 November 20, 2023 16:09

DavidHurta commented Nov 20, 2023

DavidHurta requested review from petr-muller, wking and dhellmann September 27, 2024 16:02

petr-muller reviewed Sep 30, 2024

Update header and address feedback

2664f1e

petr-muller approved these changes Oct 1, 2024

openshift-ci bot assigned petr-muller Oct 1, 2024

openshift-ci bot added the lgtm label Oct 1, 2024

JoelSpeed reviewed Oct 7, 2024

csrwng reviewed Oct 10, 2024

enxebre reviewed Oct 10, 2024

enxebre reviewed Oct 10, 2024

Update and address feedback

e7cd06d

openshift-ci bot removed the lgtm label Oct 15, 2024

Address HyperShift

3bd86f4

DavidHurta force-pushed the cvo-logging branch from 195f97f to 3bd86f4 October 17, 2024 13:38

DavidHurta requested review from petr-muller, JoelSpeed, csrwng and enxebre October 17, 2024 14:13

JoelSpeed reviewed Oct 18, 2024

Make the hosted CVO use a configuration file

5ee948e

DavidHurta requested a review from JoelSpeed November 4, 2024 10:44

enxebre reviewed Nov 4, 2024

Add missing FeatureGate for HyperShift

5b225a3

DavidHurta requested a review from enxebre November 4, 2024 18:06
