-
Notifications
You must be signed in to change notification settings - Fork 592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cluster
: add tombstone_retention_ms
/delete.retention.ms
#23662
Conversation
9c73ed7
to
a6d89fd
Compare
@@ -100,7 +100,7 @@ consteval describe_configs_type property_config_type() { | |||
std::is_same_v<T, model::cleanup_policy_bitflags> || | |||
std::is_same_v<T, model::timestamp_type> || | |||
std::is_same_v<T, config::data_directory_path> || | |||
std::is_same_v<T, pandaproxy::schema_registry::subject_name_strategy> || | |||
std::is_same_v<T, pandaproxy::schema_registry::subject_name_strategy> || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed trailing whitespace
a6d89fd
to
f737a4d
Compare
new failures in https://buildkite.com/redpanda/redpanda/builds/55964#019269fb-8611-4e0e-ac13-76b3f62a8749:
new failures in https://buildkite.com/redpanda/redpanda/builds/55964#01926a00-34a4-458b-b616-878bdc7f3326:
non flaky failures in https://buildkite.com/redpanda/redpanda/builds/56007#01926cde-1430-41c7-90d6-85501446db06:
non flaky failures in https://buildkite.com/redpanda/redpanda/builds/56054#01926f61-8440-4115-98e5-c50e6578e0e9:
non flaky failures in https://buildkite.com/redpanda/redpanda/builds/56163#019272e8-8bbe-4c4d-a17e-60d80d5d2f06:
non flaky failures in https://buildkite.com/redpanda/redpanda/builds/56275#0192775c-3fce-4633-99a9-335b127ef068:
non flaky failures in https://buildkite.com/redpanda/redpanda/builds/57341#0192dba4-607d-4b39-ab20-96b0efab9dfd:
non flaky failures in https://buildkite.com/redpanda/redpanda/builds/57557#0192f9c1-b707-4035-acb6-613149a54e6d:
non flaky failures in https://buildkite.com/redpanda/redpanda/builds/57557#0192f9e1-6036-46a5-891a-36d5dbb760d4:
non flaky failures in https://buildkite.com/redpanda/redpanda/builds/57557#0192f9e1-6038-4fd8-b0c3-81d499a9bcfe:
|
f737a4d
to
ebf9304
Compare
Retry command for Build#56007please wait until all jobs are finished before running the slash command
|
ebf9304
to
4f52fec
Compare
Force push to:
|
Retry command for Build#56054please wait until all jobs are finished before running the slash command
|
4f52fec
to
f9cf04f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like there is a bisection issue building this commit:
Author: Willem Kaufmann <willem.kaufmann@redpanda.com>
Date: Wed Sep 4 13:09:21 2024 -0400
`cluster`: add `topic_multi_property_validation()`
notes:
src/v/cluster/topic_configuration.cc:59:14: error: field designator 'tombstone_retention_ms' does not refer to any field in type 'storage::ntp_config::default_overrides'
59 | .tombstone_retention_ms = properties.delete_retention_ms});
| ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
f9cf04f
to
7a6ad8c
Compare
Nice catch, corrected by moving changes to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is bisect still an issue on that same commit
git fetch upstream
git checkout 7b2b010eedf5c56f5887985f943c7ea9ddab39bd
git rebase 9e810f72f373ad3df8ff86fbef9a3d75b14aaa15
bazel build //...
src/v/cluster/topic_properties.cc:152:9: error: no member named 'tombstone_retention_ms' in 'storage::ntp_config::default_overrides'
152 | ret.tombstone_retention_ms = delete_retention_ms;
| ~~~ ^
7a6ad8c
to
ece7cc1
Compare
Yup, missed another line, my bad. Corrected by moving changes in |
Retry command for Build#56163please wait until all jobs are finished before running the slash command
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
ece7cc1
to
71c4da7
Compare
Force push to:
|
Retry command for Build#56370please wait until all jobs are finished before running the slash command
|
/ci-repeat 1 |
/ci-repeat 1 |
/ci-repeat 1 |
7f15d42
to
1410957
Compare
Force push to:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks pretty solid to me. just a few questions/comments
tristate<T> override_if_not_default( | ||
const tristate<T>& overrides, const std::optional<T>& def) { | ||
// Overrides is {disabled}, cluster default is {value} | ||
if (overrides.is_disabled() && def.has_value()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why does it matter if def.has_value here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the two cases you have commented on (// Overrides is {disabled}, cluster default is {value}
, // Overrides is {std::nullopt}, cluster default is {value}
), note that the following case in which (overrides.is_empty() || overrides.is_disabled()) && !def.has_value()
is excluded.
I.e, I'm treating the override and default as "equivalent" when the tristate
is either disabled or empty, and the optional is in the std::nullopt
state.
This makes a lot of logical sense for overrides.is_empty() && !def.has_value()
, but I recognize overrides.is_disabled() && !def.has_value()
is a bit more of a stretch. However, in the case of delete.retention.ms
, its default state is often disabled
, and the corresponding cluster property tombstone_retention_ms
's default state is often std::nullopt
. This, IMO, provides motivation for considering this state in override_if_not_default()
in the way it is currently written.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
override is considered equivalent to the default are
do you think we could add to the comment a truth table for all 6 cases (or maybe it's 9 if we count the cases where the default has a value and its actually equal to the tristate rather than just semantics equal). then we could use the retention.ms and truth table as an example for override_if_not_default usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added truth table in a code comment
// Either parse configuration or return nullopt | ||
inline std::optional<bool> | ||
get_bool_value(const config_map_t& config, std::string_view key) { | ||
if (auto it = config.find(key); it != config.end()) { | ||
try { | ||
// Ignore case. | ||
auto str_value = it->second; | ||
std::transform( | ||
str_value.begin(), | ||
str_value.end(), | ||
str_value.begin(), | ||
[](const auto& c) { return std::tolower(c); }); | ||
return string_switch<std::optional<bool>>(str_value) | ||
.match("true", true) | ||
.match("false", false); | ||
} catch (const std::runtime_error&) { | ||
throw boost::bad_lexical_cast(); | ||
}; | ||
} | ||
return std::nullopt; | ||
} | ||
|
||
inline model::shadow_indexing_mode | ||
get_shadow_indexing_mode(const config_map_t& config) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did these need to move to the header or could we have added a declaration to the header? or alternatively, could the implementation of delete_retention_ms_validator have been in the same .cc file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is preferred to leave the implementation of delete_retention_ms_validator
in validators.h
to keep parity with the other definitions.
So, we could leave this as is with the inline
definitions for get_bool_value()
and get_shadow_indexing_mode()
in the header, or I could change it to what you proposed with declarations in types.h
and implementations in types.cc
.
Please let me know what you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the general rule of thumb is to be reducing the amount of code in headers. its not any sort of hard and fast rule, but i don't see in this case why it needs to move to the header. is there a dependency issue that makes it hard to do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved declaration/implementation
1410957
to
c6eafb3
Compare
Retry command for Build#57341please wait until all jobs are finished before running the slash command
|
c6eafb3
to
1b9b437
Compare
Plumb the `delete_retention_ms` config through `topic_configuration`, and `topic_properties`. Also modify necessary compatibility sites. Also modify `tools/offline_log_viewer` to be compatible with the updated `serde`.
When applying `update_topic_properties_cmd`, there may be some checks that depend on multiple properties. The current motivating case is ensuring that `delete.retention.ms` is not enabled at the same time as any tiered storage properties. Add `topic_table::topic_multi_property_validation()` to perform these checks and abort the topic property update if the `properties` are found to be invalid.
The storage layer will now utilize a topic's configured `delete.retention.ms` parameter for housekeeping. At the `ntp_config` layer, this is referred to as `tombstone_retention_ms`.
1b9b437
to
2375c86
Compare
Force push to:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay looking at this. Looking pretty nice and complete, but have a few nits and questions about code placement themed around simplifying and future-proofing
#We cannot enable delete.retention.ms if any cloud storage properties are also enabled. | ||
test_cases = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this cover the entirety of the truth table from the earlier commit? Might be worth duplicating the truth table here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also nit: space between # and the first word
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it does not, and the test was not really intended to test the truth table for the overrides- more so tiered storage and delete.retention.ms
property interaction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Does it make sense to include a test case for just the delete.retention.ms
behavior without tiered storage involvement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test_alter_topic_config covers this already I think.
Similar in intention to the `override_if_not_default` overload for `std::optional<>`. This ensures that `tristate` topic properties may still display `DEFAULT_CONFIG` instead of `DYNAMIC_TOPIC_CONFIG` in the case that their value is `{disabled}` and the cluster default is `std::nullopt`. This is useful for topic properties that may want to default to a `{disabled}` state without indicating `DYNAMIC_TOPIC_CONFIG` as a source.
Adds configuration for the topic property `delete.retention.ms` through the `CreateTopic`, `AlterConfig`, and `IncrementalAlterConfig` Kafka APIs.
2375c86
to
877b01f
Compare
// no value => tristate.has_value() == false; | ||
// value present => tristate.has_value() == true; | ||
|
||
template<typename T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunate that this template function had to be moved to the header in order to use it in validators.h
.
Retry command for Build#57557please wait until all jobs are finished before running the slash command
|
Adding support for `tombstone_retention_ms`, which is a cluster property that is equivalent to Kafka's `log.cleaner.delete.retention.ms`, but is more precisely named, and represents the retention time for tombstone records in compacted topics in `redpanda`. This controls the default value for the topic level property `delete.retention.ms`.
877b01f
to
817afe9
Compare
Force push to:
|
As a follow up to #23231 and #23380, this PR adds both the cluster-level configuration property
tombstone_retention_ms
and the topic-level propertydelete.retention.ms
.The cluster level property
tombstone_retention_ms
controls the default value fordelete.retention.ms
, which is equivalent to the Kafka topic property.There is a tight coupling between the tiered storage properties (both at a topic and a cluster level) and these configurations. Namely,
tombstone_retention_ms
cannot be enabled at the same time as any ofcloud_storage_enabled
,cloud_storage_enable_remote_read
,cloud_storage_enable_remote_write
.delete.retention.ms
cannot be enabled at the same time as any ofredpanda.remote.read
,redpanda.remote.write
.These restrictions are in place to ensure the deletion of tombstone records does not occur for topics with tiered storage enabled, a current restriction for compacted topics in
redpanda
.Docs issue: https://github.com/redpanda-data/documentation-private/issues/2759
Backports Required
Release Notes
Features
delete.retention.ms
, as well as the cluster propertytombstone_retention_ms
. Configuring these allow for the removal of tombstone records in compacted topics with tiered storage disabled inredpanda
.