Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OTA-1301: WIP: pkg/cincinnati: Set DifferentChannel risk #1059

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

wking
Copy link
Member

@wking wking commented Jun 19, 2024

The CVO currently had been assuming all the releases in an Update Service response belong to the currently-configured channel. With this commit, the CVO begins walking retrieved Releases and injecting a DifferentChannel conditional update risk if the current channel is not a member of the Release's channels set. This will allow the CVO to be pointed at an Update Service returning all known (or all supported) next-hop updates, regardless of their current channel, which would make it easier for cluster administrators to self-serve answers like "why don't I see updates to 4.y.z in stable-4.y?" with "oh, because that target is only in candidate-4.y and fast-4.y so far".

@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Jun 19, 2024

@wking: This pull request references OTA-1301 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

In response to this:

The CVO currently had been assuming all the releases in an Update Service response belong to the currently-configured channel. With this commit, the CVO begins walking retrieved Releases and injecting a DifferentChannel conditional update risk if the current channel is not a member of the Release's channels set. This will allow the CVO to be pointed at an Update Service returning all known (or all supported) next-hop updates, regardless of their current channel, which would make it easier for cluster administrators to self-serve answers like "why don't I see updates to 4.y.z in stable-4.y?" with "oh, because that target is only in candidate-4.y and fast-4.y so far".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 19, 2024
Copy link
Contributor

openshift-ci bot commented Jun 19, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 19, 2024
@wking
Copy link
Member Author

wking commented Jun 19, 2024

Testing a4435fc with Cluster Bot launch 4.17,openshift/cluster-version-operator#1059 aws (logs):

$ oc get -o json clusterversion version | jq .status.desired
{
  "image": "registry.build03.ci.openshift.org/ci-ln-slivwqk/release@sha256:9aa335b4a2f6da11e1ef58d0c8fb6f6d1caf4ac5956733facecfd3b87ae3074e",
  "version": "4.17.0-0.test-2024-06-19-010126-ci-ln-slivwqk-latest"
}

gave me the data to populate OTA-1301's demo graph-data, which I pushed to my graph-data fork. Then I updated my cluster to consume that demo update service:

$ oc patch clusterversion version --type json -p '[{"op": "add", "path": "/spec/upstream", "value": "https://raw.githubusercontent.com/wking/cincinnati-graph-data/demo/cincinnati-graph.json"}]'
$ oc adm upgrade channel stable-4.18

and I see:

$ oc adm upgrade --include-not-recommended
Cluster version is 4.17.0-0.test-2024-06-19-010126-ci-ln-slivwqk-latest

Upstream: https://raw.githubusercontent.com/wking/cincinnati-graph-data/demo/cincinnati-graph.json
Channel: stable-4.18 (available channels: candidate-4.18, fast-4.18, stable-4.18)

Recommended updates:

  VERSION     IMAGE
  4.18.0      quay.io/openshift-release-dev/ocp-release@sha256:0000000000000000000000000000000000000000000000000000000000000000

Supported but not recommended updates:

  Version: 4.18.2
  Image: quay.io/openshift-release-dev/ocp-release@sha256:2222222222222222222222222222222222222222222222222222222222222222
  Recommended: False
  Reason: MultipleReasons
  Message: A. https://bug.example.com/a
  
  4.18.2 channels (candidate-4.18) do not include this cluster's stable-4.18 https://example.com/FIXME

  Version: 4.18.1
  Image: quay.io/openshift-release-dev/ocp-release@sha256:1111111111111111111111111111111111111111111111111111111111111111
  Recommended: False
  Reason: DifferentChannel
  Message: 4.18.1 channels (candidate-4.18, fast-4.18) do not include this cluster's stable-4.18 https://example.com/FIXME

Hooray, looks good to me. Obviously need to figure out where we want to point the URL, instead of my current https://example.com/FIXME.

@wking wking added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 19, 2024
func differentChannelRisk(release configv1.Release, channel string) configv1.ConditionalUpdateRisk {
return configv1.ConditionalUpdateRisk{
Name: "DifferentChannel",
URL: "https://example.com/FIXME", // Maybe a KCS? New doc section that we think might be stable between 4.y? Other?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the bit that needs figuring out still

@wking wking force-pushed the different-channel-risk branch 2 times, most recently from 4533796 to 810b42c Compare June 19, 2024 20:22
The CVO currently had been assuming all the releases in an Update
Service response belong to the currently-configured channel.  With
this commit, the CVO begins walking retrieved Releases and injecting a
DifferentChannel conditional update risk if the current channel is not
a member of the Release's channels set.  This will allow the CVO to be
pointed at an Update Service returning all known (or all supported)
next-hop updates, regardless of their current channel, which would
make it easier for cluster administrators to self-serve answers like
"why don't I see updates to 4.y.z in stable-4.y?" with "oh, because
that target is only in candidate-4.y and fast-4.y so far".
@wking
Copy link
Member Author

wking commented Jun 19, 2024

TestRunGraph/mid-task_cancellation_with_work_in_queue_does_not_deadlock timeout seems like an unrelated race.

/test unit

Copy link
Contributor

openshift-ci bot commented Jun 20, 2024

@wking: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-hypershift da7ad73 link true /test e2e-hypershift
ci/prow/e2e-agnostic-ovn-upgrade-into-change da7ad73 link true /test e2e-agnostic-ovn-upgrade-into-change
ci/prow/e2e-agnostic-operator da7ad73 link true /test e2e-agnostic-operator

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants