cloud_storage: use remote index in cloud timequery #13011

VladLazar · 2023-08-25T12:13:01Z

This PR updates the timequery read path to skip up to the first
index entry with a timestamp smaller than the searched one. The result
is that timequeries will hydrate/materialize a maximum of two chunks
(two because the chunk boundaries don't always line up with the entries
in the index).

If the index is not present, or was originally created in v1, the search
starts from the first chunk as it did previously.

Fixes #11801

Backports Required

Release Notes

Improvements

Timequeries (i.e. ListOffsets requests) that land in the cloud log now use an index to speed up the search
and reduce the number of bytes that must be hydrated to serve the query. On average, a timequery downloads about a quarter as much data as before (assuming the default segment and chunk sizes).

A lookup by timestamp is added to the remote index. It has the same semantics as the other lookup methods. If the index does not include the time index (i.e. it was created with serde version 1), the search comes up empty.
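
To make the lookup semantics above concrete, here is a minimal sketch, not the code from this PR. It assumes the index keeps entries sorted by timestamp and returns the last entry whose timestamp is strictly smaller than the searched one. The names `toy_index`, `index_entry`, `kafka_offset`, `file_pos` and `missing_timestamp` are hypothetical stand-ins; only the "initial time is missing, so the search comes up empty" rule is taken from the description.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-in for one sampled index entry.
struct index_entry {
    int64_t timestamp;
    int64_t kafka_offset;
    int64_t file_pos;
};

// Assumption: -1 marks "no timestamp", as discussed in the review thread.
constexpr int64_t missing_timestamp = -1;

struct toy_index {
    int64_t initial_time = missing_timestamp;
    std::vector<index_entry> entries; // assumed sorted by timestamp

    // Returns the last entry with a timestamp strictly smaller than
    // upper_bound, or nullopt if there is no time index or no such entry.
    std::optional<index_entry> find_timestamp(int64_t upper_bound) const {
        if (initial_time == missing_timestamp) {
            // Index without time information (e.g. written by an older
            // version): the search comes up empty.
            return std::nullopt;
        }
        auto it = std::lower_bound(
          entries.begin(),
          entries.end(),
          upper_bound,
          [](const index_entry& e, int64_t ts) { return e.timestamp < ts; });
        if (it == entries.begin()) {
            return std::nullopt;
        }
        return *std::prev(it);
    }
};
```

A caller would treat an empty result as "start from the beginning of the segment", which matches the fallback described in the PR summary.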

Lazin

LGTM if CI is happy

VladLazar · 2023-08-25T15:46:48Z

Changes in force-push:

fix TimeQueryKafkaTest

dotnwat

🔥 🚒

dotnwat · 2023-08-25T20:24:03Z

src/v/cloud_storage/remote_segment_index.cc

@@ -184,6 +184,58 @@ offset_index::find_kaf_offset(kafka::offset upper_bound) {
    return res;
 }

+std::optional<offset_index::find_result>
+offset_index::find_timestamp(model::timestamp upper_bound) {
+    if (_initial_time == model::timestamp::missing()) {


It's a bummer that -1 is used to indicate a missing timestamp. That -1 carries so much additional baggage in Kafka, for example by indicating something specific in time queries.

Yeah, it's not ideal. Not much we can do about it now though. The -1 is already encoded in a released Serde version.

src/v/cloud_storage/remote_segment_index.cc

dotnwat · 2023-08-25T20:31:41Z

src/v/cloud_storage/remote_segment.cc

@@ -315,18 +315,18 @@ remote_segment::offset_data_stream(
    offset_index::find_result pos;
    std::optional<uint16_t> prefetch_override = std::nullopt;
    if (first_timestamp) {


is it odd that first_timestamp wasn't used in this conditional before?

first_timestamp indicates that this is a time-query. By this point, we have already resolved to the correct segment for the time query. Since we didn't previously use the index, it makes sense that first_timestamp wasn't used (we'd just start from the beginning of the segment).

src/v/cloud_storage/remote_segment.cc

src/v/cloud_storage/remote_segment_index.h

src/v/cloud_storage/remote_segment.cc

This commit updates the timequery read path to skip up to the first index entry with a timestamp smaller than the searched one. The result is that timequeries will hydrate/materialize a maximum of two chunks (two because the chunk boundaries don't always line up with the entries in the index). If the index is not present, or was originally created in v1, the search starts from the first chunk as it did previously.
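
The fallback and the "at most two chunks" remark can be illustrated with a small sketch. It assumes chunks are fixed-size, aligned byte ranges of the segment file, which is a simplification made for illustration only. `start_position`, `pick_start` and `chunk_start` are hypothetical names, not functions from this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical description of where a read begins inside a segment.
struct start_position {
    int64_t kafka_offset;
    uint64_t file_pos;
};

// If the index lookup produced a hit, start there. Otherwise (no index,
// or an index without timestamps) start from the beginning of the
// segment, as before.
start_position pick_start(
  std::optional<start_position> index_hit, int64_t segment_base_offset) {
    return index_hit.value_or(start_position{segment_base_offset, 0});
}

// Assumption: chunks are aligned, fixed-size byte ranges. The chunk that
// must be hydrated first is the one containing file_pos.
uint64_t chunk_start(uint64_t file_pos, uint64_t chunk_size) {
    return (file_pos / chunk_size) * chunk_size;
}
```

Because an index entry can fall anywhere inside a chunk, the record being searched for may sit just past the end of that chunk, which is why a second chunk is sometimes needed.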

A shard-level metric is added to track the total number of chunks that were hydrated (i.e. downloaded).

This commit makes a couple of changes to the timequery tests:
1. Run timequery on more offsets (10 for each of the 12 segments).
2. Check that a maximum of two chunks are downloaded by any given timequery.
3. Use the admin API to get the precise boundary between the cloud and local log. Previously, it was estimated based on record size.

This commit refactors the handling of the index search result. If a new result type is introduced, its author will be reminded to handle it by the assertions.
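
One common way to get this kind of reminder in C++17 is to visit a `std::variant` and fail the build on an unhandled alternative. The sketch below shows that general pattern under stated assumptions: `hit`, `miss`, `search_result` and `start_pos` are hypothetical names, and the real code may use different result types or runtime assertions instead of a `static_assert`.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <variant>

// Hypothetical result alternatives for an index search.
struct hit {
    uint64_t file_pos;
};
struct miss {};
using search_result = std::variant<hit, miss>;

// Dependent false, so the static_assert only fires for unhandled types.
template<class>
inline constexpr bool always_false_v = false;

// Map each alternative to the byte position where reading should start.
uint64_t start_pos(const search_result& r) {
    return std::visit(
      [](const auto& v) -> uint64_t {
          using T = std::decay_t<decltype(v)>;
          if constexpr (std::is_same_v<T, hit>) {
              return v.file_pos;
          } else if constexpr (std::is_same_v<T, miss>) {
              return 0; // nothing found: start from the beginning
          } else {
              // Adding a new alternative to search_result without a
              // branch here stops compilation at this line.
              static_assert(
                always_false_v<T>, "unhandled index search result");
          }
      },
      r);
}
```

With this shape, adding a third alternative to `search_result` breaks the build until `start_pos` handles it.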

VladLazar · 2023-08-29T16:38:25Z

Failure is:

CI Failure (NodeCrash) in ShutdownTest.test_timely_shutdown_with_failures #12659

vbotbuildovich · 2023-08-29T16:47:10Z

/backport v23.2.x

vbotbuildovich · 2023-08-29T16:48:17Z

Failed to run cherry-pick command. I executed the commands below:

git checkout -b backport-pr-13011-v23.2.x-222 remotes/upstream/v23.2.x
git cherry-pick -x c1d03142dc993eb31d6ce67c7468fa83bdb31145 da8533ba5bd255fe41286e0d44013d60d93082d8 b78d7113406e05fade4975d76ee3efcc5b36dbf3 6d507cd0e3ee4849c531ff9e7bb3f2a90553b4e9 de5aa5345cd2449bb484be658574c3ab7a44d4ca

Workflow run logs.

VladLazar · 2023-08-30T09:36:44Z

/backport v23.2.x

vbotbuildovich · 2023-08-30T09:37:45Z

Failed to run cherry-pick command. I executed the commands below:

git checkout -b backport-pr-13011-v23.2.x-593 remotes/upstream/v23.2.x
git cherry-pick -x c1d03142dc993eb31d6ce67c7468fa83bdb31145 da8533ba5bd255fe41286e0d44013d60d93082d8 b78d7113406e05fade4975d76ee3efcc5b36dbf3 6d507cd0e3ee4849c531ff9e7bb3f2a90553b4e9 de5aa5345cd2449bb484be658574c3ab7a44d4ca

Workflow run logs.

cloud_storage: add lookup by timestamp to index

c1d0314

A lookup by timestamp is added to the remote index. It has the same semantics as the other lookup methods. If the index does not include the time index (i.e. it was created with serde version 1), the search comes up empty.

github-actions bot added the area/redpanda label Aug 25, 2023

VladLazar requested review from abhijat and Lazin August 25, 2023 12:15

VladLazar force-pushed the use-index-in-cloud-timequery branch from ea3677a to 78454bf August 25, 2023 12:17

Lazin previously approved these changes Aug 25, 2023


VladLazar dismissed Lazin’s stale review via 0e88306 August 25, 2023 15:46

VladLazar force-pushed the use-index-in-cloud-timequery branch from 78454bf to 0e88306 August 25, 2023 15:46

VladLazar requested a review from Lazin August 25, 2023 15:46

dotnwat reviewed Aug 25, 2023


abhijat reviewed Aug 28, 2023


src/v/cloud_storage/remote_segment_index.h (outdated, resolved)

abhijat reviewed Aug 28, 2023


src/v/cloud_storage/remote_segment.cc (outdated, resolved)

Vlad Lazar added 3 commits August 29, 2023 11:26

cloud_storage: add a metric for chunk hydrations

b78d711

A shard-level metric is added to track the total number of chunks that were hydrated (i.e. downloaded).

VladLazar force-pushed the use-index-in-cloud-timequery branch 2 times, most recently from 604d114 to c5df897 August 29, 2023 10:42

VladLazar requested review from dotnwat and abhijat August 29, 2023 10:43

VladLazar force-pushed the use-index-in-cloud-timequery branch from c5df897 to 6314700 August 29, 2023 13:01

cloud_storage: visit index search result

de5aa53

This commit refactors the handling of the index search result. If a new result type is introduced, its author will be reminded to handle it by the assertions.

VladLazar force-pushed the use-index-in-cloud-timequery branch from 6314700 to de5aa53 August 29, 2023 14:06

abhijat approved these changes Aug 29, 2023


piyushredpanda merged commit dab7b6a into redpanda-data:dev Aug 29, 2023

vbotbuildovich mentioned this pull request Aug 29, 2023

[v23.2.x] cloud_storage: use remote index in cloud timequery #13065

Closed

VladLazar mentioned this pull request Aug 30, 2023

[v23.2.x] cloud_storage: use remote index in cloud timequery #13105

Merged

7 tasks
