Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OTA-1339: UpdateStatus: Initial working draft #2012

Open
wants to merge 26 commits into
base: master
Choose a base branch
from

Conversation

petr-muller
Copy link
Member

@petr-muller petr-muller commented Aug 26, 2024

Initial draft of the API, not expecting API reviewers to review yet, just OTA

Still kinda WIP but it starts being partially reviewable. I still need to incorporate some feedback from Update Health API Draft and also revise this paying more attention to OpenShift API Conventions.

I'm keeping iterations as commits but will squash before (eventual) merge.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 26, 2024
Copy link
Contributor

openshift-ci bot commented Aug 26, 2024

Hello @petr-muller! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

Copy link
Contributor

openshift-ci bot commented Aug 26, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Aug 26, 2024
@petr-muller petr-muller reopened this Aug 29, 2024
@petr-muller petr-muller force-pushed the update-status-api branch 2 times, most recently from 7c1d5de to 5d0db69 Compare September 3, 2024 15:26
@petr-muller petr-muller force-pushed the update-status-api branch 2 times, most recently from ddca9e1 to 41b0a0c Compare October 7, 2024 16:17
@petr-muller petr-muller marked this pull request as ready for review October 7, 2024 16:17
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 7, 2024
@openshift-ci openshift-ci bot requested review from deads2k and knobunc October 7, 2024 16:19
@petr-muller petr-muller changed the title UpdateStatus: Initial working draft OTA-1339: UpdateStatus: Initial working draft Oct 7, 2024
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 7, 2024
@openshift-ci-robot
Copy link

openshift-ci-robot commented Oct 7, 2024

@petr-muller: This pull request references OTA-1339 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.18.0" version, but no target version was set.

In response to this:

Initial draft of the API, not expecting API reviewers to review yet, just OTA

Still kinda WIP but it starts being partially reviewable. I still need to incorporate some feedback from Update Health API Draft and also revise this paying more attention to OpenShift API Conventions.

I'm keeping iterations as commits but will squash before (eventual) merge.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

config/v1alpha1/register.go Outdated Show resolved Hide resolved
type ControlPlaneUpdateVersions struct {
// previous is the version of the control plane before the update
// +optional
Previous Version `json:"previous,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: omitempty doesn't do anything for structs (golang/go#11939, golang/go#51261). You should either drop omitempty here, or use *Version.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, thanks! I audited the rest of the file for this, and either made the field required/not-omitempty, or made it a pointer.

I think there was a guideline about not using pointers in one of the conventions, but kube allows it and can't find anything for openshift.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there was a guideline about not using pointers in one of the conventions...

The OpenShift guideline for that is tucked away in this policy doc. There's a subsequent section allowing pointers to structs when an API validation that requires the field to be unset, which doesn't seem to apply here. I'm just not sold on "inline zero values to help discoverability" vs. "use omitempty to hide things the admin doesn't care about; future admins can learn about new knobs via oc explain ...". Feel free to drop the pointers and the omitempty if the API approvers press you, and feel free to mark this thread resolved either way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I eventually found this guideline (the API conventions are linked from the top level conventions). The OpenShift guideline about avoiding pointers is about configuration APIs ("In configuration APIs specifically,...") which Status API is not. The point about pointers being harder to work with still stands.

I ended up using the pointer at a single place in the whole API, the PoolResource in ControlPlane. I reworked the ControlPlaneUpdateVersions (https://github.com/openshift/api/pull/2012/files#diff-382a9e9c3b33f1db2b3e759e6d6019ab4db0f3ea29c8448a281bcee9bc0f4c81R71-R81) part (also simplified the metadata, the type discriminated union felt like overkill etc) and now it is no longer a pointer.


const (
// installation denotes a boolean that indicates the update was initiated as an installation
InstallationMetadata VersionMetadataKey = "installation"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ControlPlaneUpdateVersions has separate previous and target. Do we need an explicit "I'm the install!" marker in version metadata? Or can that be inferred from "previous is empty"? I can't think of a situation where you'd be in the middle of a real A->B update but still care about whether A was an install version or not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this basically mimics what we do in oc prototype:

https://github.com/openshift/oc/blob/3692450b96d57ae3870e5f693e833a0701d0e2b0/pkg/cli/admin/upgrade/status/controlplane.go#L57-L61

if this was the only use case for version metadata then maybe the implicit 'previous empty implies installation' would be good enough but we have a good use case for metadata (partial) so I think it's worth being explicit, we have the mechanism for it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no problem with openshift/oc#4978e7d226e8cb2 deciding to use an explicit property internally to read more clearly. And I'm fine with you following that pattern here in the initial v1alpha1 approach. But the public-API form increases my personal threshold for whether the structure should allow inconsistent data, and "claims to be an installation but has a different previous" and "claims to be an update, but has no previous" are two possible inconsistent states for this structure. But either way, feel free to mark this thread resolved, to get v1alpha1 landed :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that point. Not having more metadata is also simpler which is always a better start. I'll change this.

@petr-muller petr-muller force-pushed the update-status-api branch 2 times, most recently from e438453 to 40182b1 Compare October 9, 2024 12:27
Adding a resource reference to this level improves the cohesion with
workerPools part of the API, and can be useful in standalone/hcp
scenarios where controlplane is represented by different CRDs.
We are expecting use cases for more flexible metadata about the versions
involved in the update, such as 'architecture' for pseudo-updates when
clusters are migrated from `x86_64` to `multi`. The original two boolean
flags were replaced with a generic list of name+type+value triplets.
Generated with:

```
./tests/hack/gen-minimal-test.sh update/v1alpha1 v1alpha1
```
- Use `+kubebuilder:validation:Required` consistently (not `+required`)
- Do not mention (assume) underlying controller name
- Integers use `int32` types with enforced bounds
- Constants are `PascalNames` by convention
- GoDoc start by JSON name by convention
- Other GoDoc description tweaks
Make the `ResourceRef` type follow [OpenShift API conventions](https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md#use-resource-name-rather-than-kind-in-object-references).

I also considered to follow the ["Use Specific Types for Object Reference"](https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md#use-specific-types-for-object-references-and-omit-ref-suffix) more strictly and introduce speficic types instead using `ResourceRef` everywhere but consistency felt much worse (control plane would need ClusterVersion/HostedCluster reference, status insights would need just names, pool references would need MCP/NodePool but not in control plane...). Consistency is also a value, and because we use resource references at different places, it felt better to stay consistent everywhere.
CI enforces the presence of listType annotations on slices. List of resources is not a set, because resources are not scalars. It is not simple map because resource does not have a suitable key field. So it must be atomic I guess?
OpenShift API convention forbids bool usage and enforces this with CI. While I could argue for an exception (reasoning behind the convention does not apply in this case, as it would be easy to migrate from bools to strings if necessary), we can also simplify the API and make the metadata a simple key->value map, similar to `labels`. Value is optional and with empty value the presence of the key has a boolean=true semantics. It simplifies the API and possibly even improves the usability for clients.
Copy link
Contributor

openshift-ci bot commented Dec 17, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: petr-muller
Once this PR has been reviewed and has the lgtm label, please assign sjenning for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@JoelSpeed
Copy link
Contributor

Looking at the results of the lint check here, a large number of these I know are auto-fixable, try make lint-fix on the PR and it should bring a lot of that stuff in line

Initially `UpdateStatus` was made namespaces to accomodate HCP more easily, but recently it seems the org has established a practice where HCP resources have specialized variants. We have also identified some differences in how the API will need to behave in HCP, so it makes sense to make the API cluster-scoped in standalone, and in the future we will have a HCP-specific namespaced variant.
Copy link
Contributor

openshift-ci bot commented Dec 19, 2024

@petr-muller: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/lint 0ec7f29 link true /test lint

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@JoelSpeed
Copy link
Contributor

JoelSpeed commented Dec 19, 2024

Ahaa! This is something I just merged which is showing that there is a diff between make lint-fix and what's checked in (that's what the git diff is showing), so please try make lint-fix, and then make lint to see what's left to fix

@petr-muller
Copy link
Member Author

@JoelSpeed I'm still changing the content a bit, I planned to address the linter issues once I'm done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants