cloud_storage: remote labels #20778

Merged
merged 34 commits into from
Jul 8, 2024
Changes from 1 commit
Commits
44978ef
offline_log_viewer: support remote_labels in topic properties
andrwng Jun 20, 2024
7d38dba
cluster: add remote label as topic property
andrwng Jun 12, 2024
fab93f1
archival_stm: add remote path provider to archival stm
andrwng Jun 29, 2024
cda86a5
ntp_archiver: use path provider for naming
andrwng Jun 20, 2024
160a40e
cloud_storage: plumb path provider into async manifest view
andrwng Jun 20, 2024
291ed3f
cluster: use topic_manifest_downloader in topic recovery
andrwng Jun 17, 2024
b6d1e16
cluster: use topic_manifest_downloader in list-based recovery
andrwng Jun 25, 2024
41df563
archival_stm: use partition_manifest_downloader for snapshot recovery
andrwng Jun 18, 2024
e16c247
ntp_archiver: use partition_manifest_downlaoder for read replicas
andrwng Jun 17, 2024
13a1567
cluster: use partition_manifest_downloader in partition recovery
andrwng Jun 17, 2024
fe640c9
cluster: use partition_manifest_downloader in topic recovery validation
andrwng Jun 20, 2024
9d97ff6
cluster: use partition_manifest_downloader for unsafe reset from cloud
andrwng Jun 17, 2024
8cb5db2
cloud_storage: use partition_manifest_downloader in scrubber/anomaly …
andrwng Jun 20, 2024
ebf2fa1
remote_partition: use partition_manifest_downloader for manifest fina…
andrwng Jun 17, 2024
653c9ec
cloud_storage: remove try_download_partition_manifest
andrwng Jun 17, 2024
eaa7b95
cloud_storage: remove remote::partition_manifest_exists()
andrwng Jun 20, 2024
849aef7
cloud_storage: use path provider for spillovers in manifest view
andrwng Jun 19, 2024
5248547
remote_partition: use path provider for segment paths
andrwng Jun 18, 2024
3937454
cloud_storage: add utils for lifecycle marker paths
andrwng Jun 29, 2024
d10a60d
archival: use path provider to generate lifecycle marker paths
andrwng Jun 29, 2024
5d30320
archival: use the path provider throughout the purger
andrwng Jun 18, 2024
a0be06a
archival/segment_merger: use path provider for remote runs
andrwng Jun 18, 2024
d2681ba
admin: use path provider for anomalies report
andrwng Jun 19, 2024
1066344
features: add flag for remote labels
andrwng Jul 4, 2024
2d7e19a
config: property for disabling remote labels for tests
andrwng Jun 20, 2024
e7669a9
cloud_storage: clean up manifest path generation from manifests
andrwng Jun 20, 2024
46ed453
cloud_storage: clean up remote segment name generation
andrwng Jun 29, 2024
459106c
topic_manifest: remove feature table
andrwng Jun 22, 2024
902f522
rptest/services: make get_cluster_uuid() node optional
andrwng Jun 24, 2024
245b4f2
rptest: support remote labels in BucketView
andrwng Jun 24, 2024
df36562
rptest: option to change si_settings bucket
andrwng Jul 1, 2024
45c62a4
cluster: plug cluster uuid into new topics
andrwng Jun 29, 2024
b6ae520
config: enable remote labels by default
andrwng Jul 3, 2024
263d4b3
rptest: add remote_label_test
andrwng Jul 1, 2024
41 changes: 25 additions & 16 deletions src/v/archival/ntp_archiver_service.cc
@@ -22,6 +22,7 @@
#include "cloud_storage/async_manifest_view.h"
#include "cloud_storage/partition_manifest.h"
#include "cloud_storage/remote.h"
#include "cloud_storage/remote_path_provider.h"
#include "cloud_storage/remote_segment.h"
#include "cloud_storage/remote_segment_index.h"
#include "cloud_storage/spillover_manifest.h"
@@ -516,10 +517,10 @@ ss::future<> ntp_archiver::upload_topic_manifest() {
cfg_copy.replication_factor = replication_factor;
cloud_storage::topic_manifest tm(
cfg_copy, _rev, _feature_table.local());
auto key = tm.get_manifest_path();
auto key = tm.get_manifest_path(remote_path_provider());
vlog(ctxlog.debug, "Topic manifest object key is '{}'", key);
auto res = co_await _remote.upload_manifest(
_conf->bucket_name, tm, fib);
_conf->bucket_name, tm, key, fib);
if (res != cloud_storage::upload_result::success) {
vlog(ctxlog.warn, "Topic manifest upload failed: {}", key);
} else {
@@ -818,12 +819,12 @@ ss::future<> ntp_archiver::sync_manifest_until_term_change() {
vlog(
_rtclog.error,
"Failed to download manifest {}",
manifest().get_manifest_path());
manifest().get_manifest_path(remote_path_provider()));
Contributor

Why is remote_path_provider passed into every call? IIUC the remote_label is either set for the partition or not, so it'd be possible to pass remote_path_provider to the c-tor of the partition_manifest in the archival STM (which also "knows" about remote_label now).

Contributor Author

We chatted about this offline and agreed that making the path provider a member of the manifest probably isn't the way to go, given how we use partition_manifest in many places as just a struct that knows about serialization. For now, having a singular remote_path_provider and passing it around is tentatively the least evil.
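
A minimal sketch of the trade-off discussed in this thread, under loud assumptions: `path_provider_sketch` and `manifest_sketch` are hypothetical stand-ins, not the Redpanda classes, and the path layout below is invented purely for illustration. The idea it tries to show is that the manifest stays a plain struct with no provider member, and the provider is supplied at each call site.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for a path provider. An empty label stands for
// "no remote label"; the path layout here is made up for illustration.
struct path_provider_sketch {
    std::string label;

    std::string manifest_path(const std::string& ntp, int64_t rev) const {
        const std::string tail
          = "meta/" + ntp + "_" + std::to_string(rev) + "/manifest.bin";
        return label.empty() ? tail : label + "/" + tail;
    }
};

// Hypothetical stand-in for a manifest: only data, no knowledge of where
// it lives in the bucket. The caller passes the provider in.
struct manifest_sketch {
    std::string ntp;
    int64_t revision;

    std::string get_manifest_path(const path_provider_sketch& p) const {
        return p.manifest_path(ntp, revision);
    }
};
```

With this shape the same manifest value can be serialized or copied freely without carrying a provider along, at the cost of repeating the provider argument at every call site, which is the noise the reviewer points out.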

} else {
vlog(
_rtclog.debug,
"Successfuly downloaded manifest {}",
manifest().get_manifest_path());
manifest().get_manifest_path(remote_path_provider()));
}
co_await ss::sleep_abortable(_sync_manifest_timeout(), _as);
}
@@ -1110,15 +1111,16 @@ ss::future<cloud_storage::upload_result> ntp_archiver::upload_manifest(

auto upload_insync_offset = manifest().get_insync_offset();

auto path = manifest().get_manifest_path(remote_path_provider());
vlog(
_rtclog.debug,
"[{}] Uploading partition manifest, insync_offset={}, path={}",
upload_ctx,
upload_insync_offset,
manifest().get_manifest_path());
path());

auto result = co_await _remote.upload_manifest(
get_bucket_name(), manifest(), fib);
get_bucket_name(), manifest(), path, fib);

// now that manifest() is updated in cloud, updated the
// compacted_away_cloud_bytes metric
@@ -1162,7 +1164,7 @@ remote_segment_path ntp_archiver::segment_path_for_candidate(
.sname_format = cloud_storage::segment_name_format::v3,
};

return manifest().generate_segment_path(val);
return manifest().generate_segment_path(val, remote_path_provider());
}

static std::pair<ss::input_stream<char>, ss::input_stream<char>>
@@ -2134,9 +2136,7 @@ ntp_archiver::maybe_truncate_manifest() {
_conf->manifest_upload_timeout(),
_conf->upload_loop_initial_backoff(),
&rtc);
auto sname = cloud_storage::generate_local_segment_name(
meta.base_offset, meta.segment_term);
auto spath = m.generate_segment_path(meta);
auto spath = m.generate_segment_path(meta, remote_path_provider());
auto result = co_await _remote.segment_exists(
get_bucket_name(), spath, fib);
if (result == cloud_storage::download_result::notfound) {
@@ -2402,7 +2402,8 @@ ss::future<> ntp_archiver::garbage_collect_archive() {
continue;
}
if (meta.committed_offset < start_offset) {
const auto path = manifest.generate_segment_path(meta);
const auto path = manifest.generate_segment_path(
meta, remote_path_provider());
vlog(
_rtclog.info,
"Enqueuing spillover segment delete from cloud "
@@ -2437,7 +2438,8 @@ ss::future<> ntp_archiver::garbage_collect_archive() {
if (stop) {
break;
}
const auto path = cursor->manifest()->get_manifest_path();
const auto path = cursor->manifest()->get_manifest_path(
remote_path_provider());
vlog(
_rtclog.info,
"Enqueuing spillover manifest delete from cloud "
@@ -2625,8 +2627,9 @@ ss::future<> ntp_archiver::apply_spillover() {

retry_chain_node upload_rtc(
manifest_upload_timeout, manifest_upload_backoff, &_rtcnode);
const auto path = tail.get_manifest_path(remote_path_provider());
auto res = co_await _remote.upload_manifest(
get_bucket_name(), tail, upload_rtc);
get_bucket_name(), tail, path, upload_rtc);
if (res != cloud_storage::upload_result::success) {
vlog(_rtclog.error, "Failed to upload spillover manifest {}", res);
co_return;
@@ -2635,7 +2638,7 @@ ss::future<> ntp_archiver::apply_spillover() {
// Put manifest into cache to avoid roundtrip to the cloud storage
auto reservation = co_await _cache.reserve_space(len, 1);
co_await _cache.put(
tail.get_manifest_path()(),
tail.get_manifest_path(remote_path_provider())(),
str,
reservation,
_conf->upload_io_priority);
@@ -2672,7 +2675,7 @@ ss::future<> ntp_archiver::apply_spillover() {
vlog(
_rtclog.info,
"Uploaded spillover manifest: {}",
tail.get_manifest_path());
tail.get_manifest_path(remote_path_provider()));
}
}
}
@@ -2864,7 +2867,8 @@ ss::future<> ntp_archiver::garbage_collect() {

std::deque<cloud_storage_clients::object_key> objects_to_remove;
for (const auto& meta : to_remove) {
const auto path = manifest().generate_segment_path(meta);
const auto path = manifest().generate_segment_path(
meta, remote_path_provider());
vlog(_rtclog.info, "Deleting segment from cloud storage: {}", path);

objects_to_remove.emplace_back(path);
@@ -3257,6 +3261,11 @@ const storage::ntp_config& ntp_archiver::ntp_config() const {
return _parent.log()->config();
}

const cloud_storage::remote_path_provider&
ntp_archiver::remote_path_provider() const {
return _parent.archival_meta_stm()->path_provider();
Contributor

Same comment as before: it's always the same path provider for the partition and it lives in the archival STM, so why not connect the manifest and the path provider on that level instead of doing this in every call? This makes the code confusing because the reader might think that the path provider could change from upload to upload. Also, the async_manifest_view gets the path provider in its c-tor and partition_manifest doesn't, which is a bit confusing and not uniform.

Contributor Author

Pasting from other PR comments:

I think the shift in mindset is that the manifest view and the partition manifest are not equivalent to each other. The manifest view is much more full-featured and must be able to access paths of an STM manifest and spillover manifests, unlike the partition_manifest class which is focused on tracking member fields and serialization.
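
A rough sketch of the asymmetry described in this reply, assuming hypothetical names throughout (`provider_sketch`, `stm_sketch`, `view_sketch`) and an invented path layout; none of this is the actual Redpanda API. One object owns the provider, and a view-like class captures a reference to it once in its constructor because it needs paths for both the STM manifest and spillover manifests.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical provider; an empty label stands for "no remote label".
struct provider_sketch {
    std::string label;
    std::string prefix() const { return label.empty() ? "" : label + "/"; }
};

// Hypothetical owner of the provider, playing the role the thread
// assigns to the archival STM.
class stm_sketch {
public:
    explicit stm_sketch(std::string label)
      : _provider{std::move(label)} {}
    const provider_sketch& path_provider() const { return _provider; }

private:
    provider_sketch _provider;
};

// Hypothetical view: it must name several kinds of objects, so it keeps
// a reference to the provider for its whole lifetime. The referenced
// provider must outlive the view.
class view_sketch {
public:
    view_sketch(const provider_sketch& p, std::string ntp)
      : _provider(p)
      , _ntp(std::move(ntp)) {}

    std::string stm_manifest_path() const {
        return _provider.prefix() + _ntp + "/manifest.bin";
    }

    // Invented naming: spillover manifests get a base-offset suffix.
    std::string spillover_manifest_path(int64_t base_offset) const {
        return stm_manifest_path() + "." + std::to_string(base_offset);
    }

private:
    const provider_sketch& _provider;
    std::string _ntp;
};
```

In this sketch an accessor such as `stm_sketch::path_provider()` hands out the single provider instance, which mirrors the shape of the `ntp_archiver::remote_path_provider()` accessor added in this diff.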

}

void ntp_archiver::complete_transfer_leadership() {
vlog(
_rtclog.trace,
2 changes: 2 additions & 0 deletions src/v/archival/ntp_archiver_service.h
@@ -17,6 +17,7 @@
#include "cloud_storage/fwd.h"
#include "cloud_storage/partition_manifest.h"
#include "cloud_storage/remote.h"
#include "cloud_storage/remote_path_provider.h"
#include "cloud_storage/remote_segment_index.h"
#include "cloud_storage/types.h"
#include "cluster/fwd.h"
@@ -371,6 +372,7 @@ class ntp_archiver {
void complete_transfer_leadership();

const storage::ntp_config& ntp_config() const;
const cloud_storage::remote_path_provider& remote_path_provider() const;

/// If we have a projected manifest clean offset, then flush it to
/// the persistent stm clean offset.