Fix race condition in e2e test suite when checking if a pod is deleted #2092

johscheuer
Member

Description

Fix a race condition in the e2e test suite when checking whether a pod is deleted. The race can occur when a pod is deleted and the operator recreates it quickly enough that the replacement pod already exists by the time the test checks again, so the pod never appears to be gone. An additional check of the pod's UID fixes this: if the fetched pod has a new UID, we know that it is a different pod and the original one was deleted.

Type of change

Please select one of the options below.

  • Bug fix (non-breaking change which fixes an issue)

Discussion

Testing

Manually ran tests.

Documentation

Follow-up

@johscheuer johscheuer added the bug Something isn't working label Jul 2, 2024
@foundationdb-ci

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 81adfb7
  • Duration 4:07:25
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer johscheuer closed this Jul 2, 2024
@johscheuer johscheuer reopened this Jul 2, 2024
@foundationdb-ci

Result of fdb-kubernetes-operator-pr on Linux CentOS 7

  • Commit ID: 81adfb7
  • Duration 2:29:07
  • Result: ❌ FAILED
  • Error: Error while executing command: if $fail_test; then exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@johscheuer
Member Author

• [FAILED] [979.040 seconds]
Operator HA Upgrades when no remote storage processes are restarted [It] Upgrade from 7.1.57 to 7.3.33 [e2e, pr]
/codebuild/output/src4146627943/src/github.com/FoundationDB/fdb-kubernetes-operator/e2e/fixtures/upgrade_test_configuration.go:115

  [FAILED] Unexpected error:
      <*fmt.wrapError | 0xc003919a40>: 
      timeout waiting for all clusters to be upgraded to 7.3.33, original error: timed out waiting for the condition
      {
          msg: "timeout waiting for all clusters to be upgraded to 7.3.33, original error: timed out waiting for the condition",
          err: <*errors.errorString | 0x29134d0>{
              s: "timed out waiting for the condition",
          },
      }
  occurred
  In [It] at: /codebuild/output/src4146627943/src/github.com/FoundationDB/fdb-kubernetes-operator/e2e/fixtures/ha_fdb_cluster.go:314 @ 07/02/24 18:12:57.52
------------------------------

That test failure is unrelated. I'll dig into it.

@johscheuer johscheuer merged commit 6fc6f9f into FoundationDB:main Jul 3, 2024
14 of 15 checks passed
@johscheuer johscheuer deleted the fix-pod-deletion-check-race-condition branch July 3, 2024 13:17