Skip to content

Commit

Permalink
cloud_storage/remote_partition: handle race retention+timequery
Browse files Browse the repository at this point in the history
log a warning if when trying to get a async_manifest_view_cursor for a
timequery the result is out_of_range.
such error can happen if retention is kicking in and deleting some or
all spillover manifests: in this case there is a window where the
manifest are reclaimable but still kept in memory, and the timequery
hits one of the manifest in the reclaimable range.

handling this as a warning and no result is acceptable because the kafka
client issuing the request can handle this failure, and otherwise
handling this edge cases would increase the complexity of the callstack
  • Loading branch information
andijcr committed Jan 26, 2024
1 parent 9a541f6 commit 4ca7281
Showing 1 changed file with 27 additions and 1 deletion.
28 changes: 27 additions & 1 deletion src/v/cloud_storage/remote_partition.cc
Original file line number Diff line number Diff line change
Expand Up @@ -526,6 +526,7 @@ class partition_record_batch_reader_impl final
cur.error() == error_outcome::out_of_range
&& ss::visit(
query,
[&](model::offset) { return false; },
[&](kafka::offset query_offset) {
// Special case queries below the start offset of the log.
// The start offset may have advanced while the request was
Expand All @@ -545,7 +546,32 @@ class partition_record_batch_reader_impl final
}
return false;
},
[](auto) { return false; })) {
[&](model::timestamp query_ts) {
// Special case, it can happen when a timequery falls below
// the clean offset. Caused when the query races with
// retention/gc. log a warning, since the kafka client can
// handle a failed query
auto const& spillovers = _partition->_manifest_view
->stm_manifest()
.get_spillover_map();
if (
spillovers.empty()
|| spillovers.get_max_timestamp_column()
.last_value()
.value_or(model::timestamp::max()())
>= query_ts()) {
vlog(
_ctxlog.debug,
"Manifest query raced with retention and the result "
"is below the clean/start offset for {}",
query_ts);
return true;
}

// query was not meant for archive region. fallthrough and
// log an error
return false;
})) {
// error was handled
co_return;
}
Expand Down

0 comments on commit 4ca7281

Please sign in to comment.