[SURE-9137] ClusterValues don't apply changes if one of the clusters is missing templateValues #2943

Open
1 task done
skanakal opened this issue Oct 8, 2024 · 5 comments

skanakal commented Oct 8, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

If a GitRepo targets two or more clusters and its fleet.yaml references ${ .ClusterValues }, a templateValues entry missing from any one cluster's spec prevents updates from being deployed to all targeted clusters, including those where templateValues are properly configured.

Expected Behavior

  • The changes should be applied to the clusters where templateValues are defined.
  • The UI should show a clear error message.

Steps To Reproduce

  1. Install Rancher 2.9.2 with Fleet v0.10.3.
  2. Register two downstream clusters, ensuring that only one of them includes templateValues:
apiVersion: fleet.cattle.io/v1alpha1
kind: Cluster
metadata:
  annotations:
  labels:
    foo: bar
    management.cattle.io/cluster-display-name: rke2custom1
    management.cattle.io/cluster-name: c-m-qmc767s2
    objectset.rio.cattle.io/hash: 464bd091084175e4d5572051571f4dfb39bcf2fd
    provider.cattle.io: rke2
  name: rke2custom1
  namespace: fleet-default
spec:
  agentAffinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: fleet.cattle.io/agent
                operator: In
                values:
                  - 'true'
          weight: 1
  clientID: pl882vs458n4lqqrj8jc58jvkvq4xgqdfv9l7q7spnrhh7s8wjgj8v
  kubeConfigSecret: rke2custom1-kubeconfig
  kubeConfigSecretNamespace: fleet-default
  templateValues:
    generated:
      cluster_metadata:
        fqdn: server-1.example.com
        name: server-1
  3. Create a GitRepo from this example path: templateValues
  4. Check the GitRepo dashboard for resourceReady.

Environment

- Architecture: x86_64
- Fleet Version: fleet:104.0.3+up0.10.3
- Cluster:
  - Provider: custom
  - Options: 1
  - Kubernetes Version: v1.30.5+rke2r1

Logs

From the fleet-controller logs:

2024-10-08T11:59:49Z	DEBUG	bundle	Unchanged bundledeployment	{"controller": "bundle", "controllerGroup": "fleet.cattle.io", "controllerKind": "Bundle", "Bundle": {"name":"mcc-rke2custom1-managed-system-upgrade-controller","namespace":"fleet-default"}, "namespace": "fleet-default", "name": "mcc-rke2custom1-managed-system-upgrade-controller", "reconcileID": "04c5c324-f0f4-4f19-bc31-1e11a890da3e", "bundledeployment": {"apiVersion": "fleet.cattle.io/v1alpha1", "kind": "BundleDeployment", "namespace": "cluster-fleet-default-rke2custom1-43138de7906f", "name": "mcc-rke2custom1-managed-system-upgrade-controller"}, "operation": "unchanged"}
2024-10-08T11:59:49Z	DEBUG	bundle	Unchanged bundledeployment	{"controller": "bundle", "controllerGroup": "fleet.cattle.io", "controllerKind": "Bundle", "Bundle": {"name":"fleet-agent-rke2custom1","namespace":"fleet-default"}, "namespace": "fleet-default", "name": "fleet-agent-rke2custom1", "reconcileID": "d63cdb5d-544d-4356-b269-350b5564aa21", "bundledeployment": {"apiVersion": "fleet.cattle.io/v1alpha1", "kind": "BundleDeployment", "namespace": "cluster-fleet-default-rke2custom1-43138de7906f", "name": "fleet-agent-rke2custom1"}, "operation": "unchanged"}
2024-10-08T11:59:49Z	ERROR	Reconciler error	{"controller": "bundle", "controllerGroup": "fleet.cattle.io", "controllerKind": "Bundle", "Bundle": {"name":"templatevalues-templatevalues-5bfacaa9","namespace":"fleet-default"}, "namespace": "fleet-default", "name": "templatevalues-templatevalues-5bfacaa9", "reconcileID": "2a8aaea7-2194-46c2-a923-bf6f745b1a4a", "error": "failed to render helm values template: template: values:56:40: executing \"values\" at <.ClusterValues.generated.cluster_metadata.fqdn>: map has no entry for key \"generated\""}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.4/pkg/internal/controller/controller.go:324
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.4/pkg/internal/controller/controller.go:261
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.4/pkg/internal/controller/controller.go:222

Anything else?

Current behavior: [screenshot]

@skanakal skanakal added kind/bug JIRA Must shout labels Oct 8, 2024
@rancherbot rancherbot added this to Fleet Oct 8, 2024
@github-project-automation github-project-automation bot moved this to 🆕 New in Fleet Oct 8, 2024
@kkaempf kkaempf added this to the v2.9.4 milestone Oct 8, 2024
@kkaempf kkaempf moved this from 🆕 New to To Triage in Fleet Oct 8, 2024
Member

manno commented Oct 23, 2024

We should not fail all bundle deployments when one cluster is missing a label.

@manno manno modified the milestones: v2.9.4, v2.9.5 Oct 23, 2024
@p-se p-se self-assigned this Nov 7, 2024
@p-se p-se moved this from 📋 Backlog to 🏗 In progress in Fleet Nov 7, 2024
@manno manno modified the milestones: v2.9.5, v2.9.6 Dec 9, 2024
Contributor

p-se commented Dec 9, 2024

> We should not fail all bundle deployments when one cluster is missing a label.

Clarification: We have decided not to ignore template errors when they occur, but to make them visible in the Bundle and GitRepo statuses. The corresponding PR is #3114.

p-se added a commit to p-se/fleet that referenced this issue Dec 9, 2024
p-se added a commit to p-se/fleet that referenced this issue Dec 10, 2024
@manno manno moved this from 🏗 In progress to 👀 In review in Fleet Dec 11, 2024
p-se added a commit to p-se/fleet that referenced this issue Dec 16, 2024
p-se added a commit to p-se/fleet that referenced this issue Dec 18, 2024
p-se added a commit to p-se/fleet that referenced this issue Jan 6, 2025
p-se added a commit to p-se/fleet that referenced this issue Jan 8, 2025
p-se added a commit that referenced this issue Jan 9, 2025
* Import v1alpha1 package as fleet

* Show bundle errors in Bundle and GitRepo

Refers to #2943

* Add E2E tests

Refers to #2943
Contributor

p-se commented Jan 9, 2025

/backport v2.10.2

Contributor

p-se commented Jan 9, 2025

/backport v2.9.6

p-se added a commit to p-se/fleet that referenced this issue Jan 9, 2025
…#3114)

* Import v1alpha1 package as fleet

* Show bundle errors in Bundle and GitRepo

Refers to rancher#2943

* Add E2E tests

Refers to rancher#2943

(cherry picked from commit 235e8ef)
p-se added a commit to p-se/fleet that referenced this issue Jan 9, 2025
…#3114)

* Import v1alpha1 package as fleet

* Show bundle errors in Bundle and GitRepo

Refers to rancher#2943

* Add E2E tests

Refers to rancher#2943

(cherry picked from commit 235e8ef)
(cherry picked from commit 3417071)
manno pushed a commit that referenced this issue Jan 10, 2025
…3196)

* Import v1alpha1 package as fleet

* Show bundle errors in Bundle and GitRepo

Refers to #2943

* Add E2E tests

Refers to #2943

(cherry picked from commit 235e8ef)
(cherry picked from commit 3417071)
manno pushed a commit that referenced this issue Jan 10, 2025
…3193)

* Import v1alpha1 package as fleet

* Show bundle errors in Bundle and GitRepo

Refers to #2943

* Add E2E tests

Refers to #2943

(cherry picked from commit 235e8ef)
@manno manno modified the milestones: v2.9.6, v2.11.0 Jan 13, 2025
Contributor

weyfonk commented Jan 13, 2025

Additional QA

Problem

When a workload targets multiple clusters, and one of those clusters is missing a template value, the following happens:

  • the workload is not deployed to any of the target clusters
  • a reconcile error appears, but only in the fleet-controller pod logs; it is not visible in the Rancher UI.

Solution

Fleet now reflects targeting errors, such as those caused by missing template values on clusters, in the bundle and GitRepo statuses.
Fleet deliberately refrains from creating bundle deployments even for clusters without targeting issues. A bundle working with only a subset of its expected bundle deployments would likely cause inconsistencies in resource counts and possibly a cascade of other issues. This could be revisited in a future iteration.

Testing

Engineering Testing

Manual Testing

N/A

Automated Testing

End-to-end tests have been added to check for the presence of targeting errors in bundle and GitRepo statuses.

QA Testing Considerations

Suggestion: follow the reproduction steps above, and check that:

  • targeting errors appear in the Rancher UI
  • no bundle deployments are created

Regression Considerations

N/A

@weyfonk weyfonk moved this from 👀 In review to Needs QA review in Fleet Jan 13, 2025
Status: Needs QA review