Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pageserver: suppress error logs in shutdown/detach #4876

Merged
merged 2 commits into from
Aug 2, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions pageserver/src/tenant/tasks.rs
Original file line number Diff line number Diff line change
Expand Up @@ -103,6 +103,11 @@ async fn compaction_loop(tenant: Arc<Tenant>, cancel: CancellationToken) {
}
}

if cancel.is_cancelled() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now a bit racy. Adding the error is good, and we should filter against such values knstead when logging near lines L114R119.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nevermind I missed we turn this into an Ok at the outermost fn. This check is then just extra lines, correct?

Also, still too many info!, zero are needed when shutting down but this PR at least gets rid of one extra stacktrace.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree about the info! being a bit too verbose, I put it in here for consistency with the other cancel checks in the function.

The functional impact of this check is just to avoid calling into compaction_iteration (which emits an error log when state isn't active), rather than calling in there, logging error, returning and then dropping out of this loop further down when we next enter a timeout.

info!("received cancellation request");
break;
}

let started_at = Instant::now();

let sleep_duration = if period == Duration::ZERO {
Expand Down
8 changes: 7 additions & 1 deletion pageserver/src/tenant/timeline.rs
Original file line number Diff line number Diff line change
Expand Up @@ -697,6 +697,9 @@ impl Timeline {
Err(CompactionError::DownloadRequired(rls)) => {
anyhow::bail!("Compaction requires downloading multiple times (last was {} layers), possibly battling against eviction", rls.len())
}
Err(CompactionError::ShuttingDown) => {
return Ok(());
}
Err(CompactionError::Other(e)) => {
return Err(e);
}
Expand Down Expand Up @@ -778,7 +781,8 @@ impl Timeline {
let layer_removal_cs = Arc::new(self.layer_removal_cs.clone().lock_owned().await);
// Is the timeline being deleted?
if self.is_stopping() {
return Err(anyhow::anyhow!("timeline is Stopping").into());
trace!("Dropping out of compaction on timeline shutdown");
return Err(CompactionError::ShuttingDown);
}

let target_file_size = self.get_checkpoint_distance();
Expand Down Expand Up @@ -3235,6 +3239,8 @@ enum CompactionError {
/// This should not happen repeatedly, but will be retried once by top-level
/// `Timeline::compact`.
DownloadRequired(Vec<Arc<RemoteLayer>>),
/// The timeline or pageserver is shutting down
ShuttingDown,
/// Compaction cannot be done right now; page reconstruction and so on.
Other(anyhow::Error),
}
Expand Down