tests: wait walreceiver on sks to be gone on 'immediate' ep restart. #9099

Merged
arssher merged 1 commit into main from immediate-restart-wait-sk-conn on Oct 1, 2024

Conversation

arssher
Contributor

@arssher arssher commented Sep 23, 2024

When an endpoint is stopped in immediate mode and started again, there is a chance that the old connection delivers some WAL to safekeepers after the second start has already checked whether sync-safekeepers is needed and thus grabbed the basebackup LSN. This makes the basebackup unusable, so compute panics. Avoid the flakiness by waiting for walreceivers on safekeepers to be gone in such cases. A better way would be to bump the term on safekeepers if sync-safekeepers is skipped, but that needs more infrastructure.

ref #9079
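
For readers unfamiliar with the pattern, here is a minimal sketch of the kind of wait described above. It is not the code from this PR: `wait_walreceivers_absent`, `FakeSafekeeper`, and `walreceiver_count` are hypothetical names, and the stand-in safekeeper only simulates an old connection that lingers for a few polls. The idea is simply to poll every safekeeper until none reports a live walreceiver, and to give up with an error after a timeout.

```python
import time
from typing import Callable, Iterable


def wait_walreceivers_absent(
    walreceiver_counts: Iterable[Callable[[], int]],
    timeout: float = 20.0,
    interval: float = 0.5,
) -> None:
    """Block until every probe reports zero walreceivers, else raise TimeoutError.

    Each probe is a hypothetical callable returning the number of walreceivers
    currently connected to one safekeeper.
    """
    probes = list(walreceiver_counts)
    deadline = time.monotonic() + timeout
    while True:
        remaining = [probe() for probe in probes]
        if all(n == 0 for n in remaining):
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"walreceivers still present: {remaining}")
        time.sleep(interval)


class FakeSafekeeper:
    """Stand-in for illustration: reports one walreceiver for the first N polls."""

    def __init__(self, polls_until_gone: int) -> None:
        self.polls_left = polls_until_gone

    def walreceiver_count(self) -> int:
        if self.polls_left > 0:
            self.polls_left -= 1
            return 1
        return 0


# The old connection on the second safekeeper lingers for two polls.
safekeepers = [FakeSafekeeper(0), FakeSafekeeper(2)]
wait_walreceivers_absent(
    [sk.walreceiver_count for sk in safekeepers], timeout=5.0, interval=0.01
)
```

In a test, a wait like this would sit between the immediate stop and the next start, so that the basebackup LSN is only taken once no stale connection can still deliver WAL.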


github-actions bot commented Sep 23, 2024

5065 tests run: 4907 passed, 0 failed, 158 skipped (full report)


Flaky tests (14)

Postgres 17

Postgres 16

Postgres 15

Code coverage* (full report)

  • functions: 32.0% (7490 of 23395 functions)
  • lines: 50.0% (60468 of 120876 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
8c59bcb at 2024-09-27T07:30:14.662Z :recycle:

Member

@koivunej koivunej left a comment

Just a typo and a question.

test_runner/fixtures/neon_fixtures.py (outdated, resolved)
test_runner/fixtures/neon_fixtures.py (outdated, resolved)
When an endpoint is stopped in immediate mode and started again, there is
a chance that the old connection delivers some WAL to safekeepers after the
second start has already checked whether sync-safekeepers is needed and thus
grabbed the basebackup LSN. This makes the basebackup unusable, so compute
panics. Avoid the flakiness by waiting for walreceivers on safekeepers to be
gone in such cases. A better way would be to bump the term on safekeepers if
sync-safekeepers is skipped, but that needs more infrastructure.

ref #9079
@arssher arssher force-pushed the immediate-restart-wait-sk-conn branch from 2362c42 to 8c59bcb Compare September 26, 2024 15:44
@hlinnaka
Contributor

hlinnaka commented Oct 1, 2024

Restarted the failed e2e test; that's surely not related to this PR.

@hlinnaka
Contributor

hlinnaka commented Oct 1, 2024

Is this ready to be merged?

@arssher arssher merged commit 17672c8 into main Oct 1, 2024
84 checks passed
@arssher arssher deleted the immediate-restart-wait-sk-conn branch October 1, 2024 17:54
@arssher
Contributor Author

arssher commented Oct 1, 2024

yes, merged


3 participants