pageserver: more detailed logs when calling re-attach (#9996)
## Problem

We saw a peculiar case where a pageserver apparently got a 0-tenant
response to `/re-attach` but we couldn't see the request landing on a
storage controller. It was hard to confirm retrospectively that the
pageserver was configured properly at the moment it sent the request.

## Summary of changes

- Log the URL to which we are sending the request
- Log the NodeId and metadata that we sent
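
As a rough illustration of the two changes above, here is a minimal, std-only sketch. Everything in it is a stand-in: `UpcallClient`, `RegisterMetadata`, its fields, and `response_log_line` are hypothetical names, the URL is a plain `String` rather than a parsed `Url`, and treating `register` as an `Option` is an assumption. It only shows why a `Debug` derive is needed for `{:?}` and how a read-only accessor lets the caller log the target URL.

```rust
// Hypothetical stand-in for the registration metadata; field names are illustrative.
// `Debug` is what allows `{:?}` formatting, `Clone` lets a copy be kept for logging.
#[allow(dead_code)] // fields are only read through the derived `Debug` impl
#[derive(Debug, Clone)]
struct RegisterMetadata {
    listen_http_addr: String,
    listen_http_port: u16,
}

// Hypothetical stand-in for the upcall client.
struct UpcallClient {
    base_url: String, // assumption: a plain string instead of a parsed URL type
    node_id: u64,
}

impl UpcallClient {
    /// Read-only accessor so callers can log where the request is going.
    fn base_url(&self) -> &str {
        &self.base_url
    }

    /// Builds the message logged once a re-attach response has arrived.
    fn response_log_line(
        &self,
        tenant_count: usize,
        register: &Option<RegisterMetadata>,
    ) -> String {
        format!(
            "Received re-attach response with {} tenants (node {}, register: {:?})",
            tenant_count, self.node_id, register
        )
    }
}

fn main() {
    let client = UpcallClient {
        base_url: "http://controller.example/".to_string(),
        node_id: 7,
    };
    let register = Some(RegisterMetadata {
        listen_http_addr: "10.0.0.5".to_string(),
        listen_http_port: 9898,
    });

    // Logged before the request is sent.
    println!("Calling {} API to re-attach tenants", client.base_url());
    // Logged after the response is received; a clone of `register` would have
    // been moved into the request itself.
    println!("{}", client.response_log_line(0, &register));
}
```

With both lines in the log, a 0-tenant response can be matched against the URL and node identity the pageserver actually used at the time.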
jcsp authored Dec 3, 2024
1 parent dcb6295 commit b04ab46
Showing 3 changed files with 12 additions and 6 deletions.
4 changes: 2 additions & 2 deletions libs/pageserver_api/src/controller_api.rs
@@ -48,7 +48,7 @@ pub struct TenantCreateResponse {
     pub shards: Vec<TenantCreateResponseShard>,
 }

-#[derive(Serialize, Deserialize)]
+#[derive(Serialize, Deserialize, Debug, Clone)]
 pub struct NodeRegisterRequest {
     pub node_id: NodeId,

@@ -75,7 +75,7 @@ pub struct TenantPolicyRequest {
     pub scheduling: Option<ShardSchedulingPolicy>,
 }

-#[derive(Clone, Serialize, Deserialize, PartialEq, Eq, Hash)]
+#[derive(Clone, Serialize, Deserialize, PartialEq, Eq, Hash, Debug)]
 pub struct AvailabilityZone(pub String);

 impl Display for AvailabilityZone {
12 changes: 9 additions & 3 deletions pageserver/src/controller_upcall_client.rs
@@ -115,6 +115,10 @@ impl ControllerUpcallClient {

         Ok(res)
     }
+
+    pub(crate) fn base_url(&self) -> &Url {
+        &self.base_url
+    }
 }

 impl ControlPlaneGenerationsApi for ControllerUpcallClient {
@@ -191,13 +195,15 @@ impl ControlPlaneGenerationsApi for ControllerUpcallClient {

         let request = ReAttachRequest {
             node_id: self.node_id,
-            register,
+            register: register.clone(),
         };

         let response: ReAttachResponse = self.retry_http_forever(&re_attach_path, request).await?;
         tracing::info!(
-            "Received re-attach response with {} tenants",
-            response.tenants.len()
+            "Received re-attach response with {} tenants (node {}, register: {:?})",
+            response.tenants.len(),
+            self.node_id,
+            register,
         );

         failpoint_support::sleep_millis_async!("control-plane-client-re-attach");
2 changes: 1 addition & 1 deletion pageserver/src/tenant/mgr.rs
@@ -347,7 +347,7 @@ async fn init_load_generations(
         );
         emergency_generations(tenant_confs)
     } else if let Some(client) = ControllerUpcallClient::new(conf, cancel) {
-        info!("Calling control plane API to re-attach tenants");
+        info!("Calling {} API to re-attach tenants", client.base_url());
         // If we are configured to use the control plane API, then it is the source of truth for what tenants to load.
         match client.re_attach(conf).await {
             Ok(tenants) => tenants

1 comment on commit b04ab46

@github-actions

7144 tests run: 6826 passed, 0 failed, 318 skipped (full report)


Flaky tests (4)

Postgres 17

Postgres 15

Postgres 14

Code coverage* (full report)

  • functions: 30.7% (8270 of 26918 functions)
  • lines: 47.7% (65208 of 136584 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
b04ab46 at 2024-12-03T20:27:56.527Z :recycle:
