Allow the operator to trigger a reconciliation if a node changes #1989

Merged
4 commits merged into FoundationDB:main from add-reconcile-trigger-nodes
Apr 18, 2024

Conversation

johscheuer
Member

Description

Fixes: #1629

Type of change

Please select one of the options below.

  • New feature (non-breaking change which adds functionality)

Discussion

Testing

Ran some manual tests. I still have to check how to cover this with an e2e test or a unit test.

Documentation

Will be updated before marking this PR ready.

Follow-up

@johscheuer johscheuer added the enhancement New feature or request label Apr 15, 2024
@foundationdb-ci

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 1a4b22f
  • Duration 2:19:31
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer marked this pull request as ready for review April 17, 2024 12:03
Member Author

@johscheuer johscheuer left a comment

I manually tested these changes by setting and removing a taint. Here is the output for setting a taint:


{"level":"debug","ts":"2024-04-17T12:00:35Z","logger":"controllers.FoundationDBCluster.NodeTaintChangedPredicate","msg":"Got an UpdateEvent","node":"dummy-node","taintsChanged":true,"oldTaints":null,"newTaints":[{"key":"testing","value":"boom","effect":"NoSchedule"}]}
{"level":"debug","ts":"2024-04-17T12:00:35Z","logger":"controllers.FoundationDBCluster","msg":"Processing findFoundationDBClusterForNode, found Pods on node that changed","node":"dummy-node","labelSelector":"foundationdb.org/fdb-cluster-name","podsOnNode":1}
{"level":"debug","ts":"2024-04-17T12:00:35Z","logger":"controllers.FoundationDBCluster","msg":"Processing findFoundationDBClusterForNode, found cluster that needs an update","node":"dummy-node","triggeringPod":"jdev-stateless-4","clusterName":"jdev"}
{"level":"debug","ts":"2024-04-17T12:00:35Z","logger":"controllers.FoundationDBCluster","msg":"Processing findFoundationDBClusterForNode, found Pods on node that changed","node":"dummy-node","labelSelector":"foundationdb.org/fdb-cluster-name","podsOnNode":1}
{"level":"debug","ts":"2024-04-17T12:00:35Z","logger":"controllers.FoundationDBCluster","msg":"Processing findFoundationDBClusterForNode, found cluster that needs an update","node":"dummy-node","triggeringPod":"jdev-stateless-4","clusterName":"jdev"}

edit: Note that the event is enqueued twice:

EnqueueRequestsFromMapFunc ...
For UpdateEvents which contain both a new and old object, the transformation function is run on both objects and both sets of Requests are enqueued.
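
To illustrate the note above, here is a rough, stdlib-only sketch of why the map function logs twice for a single node update. The `Pod` type, `clustersForNode`, and `requestsForUpdate` are hypothetical names for illustration, not the operator's or controller-runtime's code. The sketch also assumes that identical pending requests are collapsed into one, so the duplicate call should not lead to a duplicate reconciliation.

```go
package main

import "fmt"

// clusterLabel is the label selector shown in the log output above.
const clusterLabel = "foundationdb.org/fdb-cluster-name"

// Pod is a simplified, hypothetical stand-in for a Kubernetes Pod.
type Pod struct {
	Name     string
	NodeName string
	Labels   map[string]string
}

// clustersForNode plays the role of the map function: it returns the names of
// all clusters that have at least one Pod on the given node.
func clustersForNode(node string, pods []Pod) []string {
	var clusters []string
	seen := map[string]bool{}
	for _, pod := range pods {
		if pod.NodeName != node {
			continue
		}
		name, ok := pod.Labels[clusterLabel]
		if !ok || seen[name] {
			continue
		}
		seen[name] = true
		clusters = append(clusters, name)
	}
	return clusters
}

// requestsForUpdate mimics the handling of an update event: the map function
// runs once for the old and once for the new object. Identical requests are
// merged (assumption: the queue deduplicates pending items).
func requestsForUpdate(oldNode, newNode string, pods []Pod) (int, []string) {
	calls := 0
	var requests []string
	pending := map[string]bool{}
	for _, node := range []string{oldNode, newNode} {
		calls++
		for _, cluster := range clustersForNode(node, pods) {
			if !pending[cluster] {
				pending[cluster] = true
				requests = append(requests, cluster)
			}
		}
	}
	return calls, requests
}

func main() {
	pods := []Pod{{
		Name:     "jdev-stateless-4",
		NodeName: "dummy-node",
		Labels:   map[string]string{clusterLabel: "jdev"},
	}}
	calls, requests := requestsForUpdate("dummy-node", "dummy-node", pods)
	// Two map function calls, one pending request for the cluster.
	fmt.Println(calls, requests)
}
```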

@foundationdb-ci

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: fbb287d
  • Duration 2:18:16
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: fef99ff
  • Duration 2:27:18
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

Contributor

@nicmorales9 nicmorales9 left a comment

LGTM provided one comment is checked over

@johscheuer johscheuer mentioned this pull request Apr 18, 2024
@foundationdb-ci

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 762925a
  • Duration 2:33:34
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer
Member Author

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 762925a
  • Duration 2:33:34
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  2024/04/18 13:46:46 finished all tests, start deleting namespace pr-771-o9cge2jg
• [FAILED] [473.904 seconds]
Operator Upgrades one process is marked for removal and is stuck in terminating state [It] Upgrade from 7.1.57 to 7.3.33 [e2e, pr]
/codebuild/output/src2123839069/src/github.com/FoundationDB/fdb-kubernetes-operator/e2e/fixtures/upgrade_test_configuration.go:115

  [FAILED] Timed out after 300.001s.
  Expected
      <*int64 | 0x0>: nil
  not to be nil
  In [It] at: /codebuild/output/src2123839069/src/github.com/FoundationDB/fdb-kubernetes-operator/e2e/test_operator_upgrades/operator_upgrades_test.go:572 @ 04/18/24 13:46:31.206
------------------------------

The failure is unrelated to this change. It seems the ProcessGroup didn't get the expected condition before the test timed out.

@johscheuer johscheuer merged commit 8370709 into FoundationDB:main Apr 18, 2024
7 of 8 checks passed
@johscheuer johscheuer deleted the add-reconcile-trigger-nodes branch April 18, 2024 14:54
Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Add watch for nodes for the operator
3 participants