Initial changes for new maintenance mode integration #1967

Merged

johscheuer
Member

Description

This PR provides changes for a better integration with the maintenance mode for the operator. I'll provide more information in the docs.

fixes: #1656
fixes: #1655
fixes: #1654
fixes: #1650

Type of change

Please select one of the options below.

  • New feature (non-breaking change which adds functionality)

Discussion

Testing

Manual testing. I added a new e2e test for this change and a new HA test with maintenance mode enabled.
I will also add more unit tests to the ones I have already added.

Documentation

Will be updated before merging.

Follow-up

@foundationdb-ci

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: d83772a
  • Duration 1:56:27
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer marked this pull request as ready for review March 13, 2024 14:06
@foundationdb-ci

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: e002ed4
  • Duration 2:06:44
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 414ff78
  • Duration 2:14:05
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: c46dab3
  • Duration 1:48:05
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: f726c45
  • Duration 2:01:31
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

Contributor

@nicmorales9 nicmorales9 left a comment

First pass; I will need to do a second one tomorrow.

api/v1beta2/foundationdbcluster_types.go (outdated, resolved)
api/v1beta2/foundationdbcluster_types.go (outdated, resolved)
api/v1beta2/foundationdbcluster_types.go (outdated, resolved)
controllers/update_pods_test.go (outdated, resolved)
docs/manual/operations.md (outdated, resolved)
fdbclient/admin_client.go (outdated, resolved)
controllers/maintenance_mode_checker.go (outdated, resolved)
controllers/maintenance_mode_checker_test.go (resolved)
controllers/maintenance_mode_checker_test.go (resolved)
controllers/maintenance_mode_checker_test.go (outdated, resolved)
@foundationdb-ci

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: d16d183
  • Duration 1:54:27
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

Contributor

@nicmorales9 nicmorales9 left a comment

Minor comment nits, but the functionality looks good to me!

internal/maintenance/maintenance.go (outdated, resolved)
internal/maintenance/maintenance.go (outdated, resolved)
api/v1beta2/foundationdbcluster_types.go (outdated, resolved)
docs/manual/operations.md (outdated, resolved)
docs/manual/operations.md (outdated, resolved)
fdbclient/admin_client.go (outdated, resolved)
@foundationdb-ci

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 01b7097
  • Duration 2:02:42
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer force-pushed the improve-maintenance-mode-integration branch from 01b7097 to 03cbe3f Compare March 15, 2024 13:35
@johscheuer johscheuer added the enhancement New feature or request label Mar 15, 2024
@johscheuer
Member Author

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 01b7097
  • Duration 2:02:42
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
• [FAILED] [1089.850 seconds]
Operator HA Upgrades when no remote storage processes are restarted [It] Upgrade from 7.1.57 to 7.3.33 [e2e, pr]
/codebuild/output/src3830764168/src/github.com/FoundationDB/fdb-kubernetes-operator/e2e/fixtures/upgrade_test_configuration.go:115

  [FAILED] Unexpected error:
      <*fmt.wrapError | 0xc0081eeae0>: 
      timeout waiting for all clusters to be upgraded to 7.3.33, original error: timed out waiting for the condition
      {
          msg: "timeout waiting for all clusters to be upgraded to 7.3.33, original error: timed out waiting for the condition",
          err: <*errors.errorString | 0xc000329b50>{
              s: "timed out waiting for the condition",
          },
      }
  occurred
  In [It] at: /codebuild/output/src3830764168/src/github.com/FoundationDB/fdb-kubernetes-operator/e2e/fixtures/ha_fdb_cluster.go:313 @ 03/15/24 10:44:05.741
------------------------------

I'll take a look at this failure 👀

@foundationdb-ci

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 03cbe3f
  • Duration 2:26:09
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer
Member Author

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 03cbe3f
  • Duration 2:26:09
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

There was a single failure:

Summarizing 1 Failure:
  [FAIL] Operator when a Pod is unscheduled and another Pod is being replaced [It] should remove the targeted Pod [e2e, pr]
  /codebuild/output/src3316862412/src/github.com/FoundationDB/fdb-kubernetes-operator/e2e/fixtures/fdb_cluster.go:1140

Ran 38 of 48 Specs in 8299.577 seconds
FAIL! -- 37 Passed | 1 Failed | 9 Pending | 1 Skipped
--- FAIL: TestOperator (8302.44s)

It seems the replacement Pod took longer than expected to come up.

@johscheuer johscheuer merged commit 4116a24 into FoundationDB:main Mar 15, 2024
7 of 8 checks passed
@johscheuer johscheuer deleted the improve-maintenance-mode-integration branch March 15, 2024 16:17
johscheuer added a commit to johscheuer/fdb-kubernetes-operator that referenced this pull request Mar 15, 2024
* Initial changes for new maintenance mode integration
Labels
enhancement New feature or request
3 participants