Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: allow-list neon_local endpoint errors from storage controller #8123

Merged
merged 3 commits into from
Jun 21, 2024

Conversation

jcsp
Copy link
Collaborator

@jcsp jcsp commented Jun 20, 2024

Problem

For testing, the storage controller has a built-in hack that loads neon_local endpoint config from disk, and uses it to reconfigure endpoints when the attached pageserver changes.

Some tests that stop an endpoint while the storage controller is running could occasionally fail on log errors from the controller trying to use its special test-mode calls into neon local Endpoint.

Example: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8117/9592392425/index.html#/testresult/9d2bb8623d0d53f8

Summary of changes

  • Give NotifyError an explicit NeonLocal variant, to avoid munging these into generic 500s (I don't want to ignore 500s in general)
  • Allow-list errors related to the local notification hook.

The expectation is that tests using endpoints/workloads should be independently checking that those endpoints work: if neon_local generates an error inside the storage controller, that's ignorable.

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@jcsp jcsp added c/storage/pageserver Component: storage: pageserver a/tech_debt Area: related to tech debt labels Jun 20, 2024
Copy link

github-actions bot commented Jun 20, 2024

2910 tests run: 2793 passed, 0 failed, 117 skipped (full report)


Code coverage* (full report)

  • functions: 32.4% (6850 of 21130 functions)
  • lines: 49.9% (53395 of 107068 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
ae9d58d at 2024-06-21T17:02:56.144Z :recycle:

@jcsp jcsp force-pushed the jcsp/storcon-neon-local-logs branch from 9b8fe20 to 6f77cdd Compare June 21, 2024 08:33
@jcsp jcsp marked this pull request as ready for review June 21, 2024 10:43
@jcsp jcsp requested a review from a team as a code owner June 21, 2024 10:43
@jcsp jcsp requested a review from problame June 21, 2024 10:43
@jcsp jcsp enabled auto-merge (squash) June 21, 2024 16:15
@jcsp jcsp merged commit b74232e into main Jun 21, 2024
57 checks passed
@jcsp jcsp deleted the jcsp/storcon-neon-local-logs branch June 21, 2024 17:23
conradludgate pushed a commit that referenced this pull request Jun 27, 2024
…8123)

## Problem

For testing, the storage controller has a built-in hack that loads
neon_local endpoint config from disk, and uses it to reconfigure
endpoints when the attached pageserver changes.

Some tests that stop an endpoint while the storage controller is running
could occasionally fail on log errors from the controller trying to use
its special test-mode calls into neon local Endpoint.

Example:
https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8117/9592392425/index.html#/testresult/9d2bb8623d0d53f8

## Summary of changes

- Give NotifyError an explicit NeonLocal variant, to avoid munging these
into generic 500s (I don't want to ignore 500s in general)
- Allow-list errors related to the local notification hook.

The expectation is that tests using endpoints/workloads should be
independently checking that those endpoints work: if neon_local
generates an error inside the storage controller, that's ignorable.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a/tech_debt Area: related to tech debt c/storage/pageserver Component: storage: pageserver
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants