Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: allow-list neon_local endpoint errors from storage controller #8123

Merged
merged 3 commits into from
Jun 21, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 6 additions & 2 deletions storage_controller/src/compute_hook.rs
Original file line number Diff line number Diff line change
Expand Up @@ -146,6 +146,9 @@ pub(crate) enum NotifyError {
// A response indicates we will never succeed, such as 400 or 404
#[error("Non-retryable error {0}")]
Fatal(StatusCode),

#[error("neon_local error: {0}")]
NeonLocal(anyhow::Error),
}

enum MaybeSendResult {
Expand Down Expand Up @@ -278,7 +281,7 @@ impl ComputeHook {
async fn do_notify_local(
&self,
reconfigure_request: &ComputeHookNotifyRequest,
) -> anyhow::Result<()> {
) -> Result<(), NotifyError> {
// neon_local updates are not safe to call concurrently, use a lock to serialize
// all calls to this function
let _locked = self.neon_local_lock.lock().await;
Expand Down Expand Up @@ -321,7 +324,8 @@ impl ComputeHook {
tracing::info!("Reconfiguring endpoint {}", endpoint_name,);
endpoint
.reconfigure(compute_pageservers.clone(), *stripe_size)
.await?;
.await
.map_err(NotifyError::NeonLocal)?;
}
}

Expand Down
5 changes: 5 additions & 0 deletions test_runner/fixtures/pageserver/allowed_errors.py
Original file line number Diff line number Diff line change
Expand Up @@ -106,6 +106,11 @@ def scan_pageserver_log_for_errors(
".*startup_reconcile: Could not scan node.*",
# Tests run in dev mode
".*Starting in dev mode.*",
# Tests that stop endpoints & use the storage controller's neon_local notification
# mechanism might fail (neon_local's stopping and endpoint isn't atomic wrt the storage
# controller's attempts to notify the endpoint).
".*reconciler.*Local notification hook failed.*",
jcsp marked this conversation as resolved.
Show resolved Hide resolved
".*reconciler.*neon_local error.*",
]


Expand Down
Loading