Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use initial revision to fence leader updates in partition leaders table #9300

Conversation

mmaslankaprv
Copy link
Member

@mmaslankaprv mmaslankaprv commented Mar 7, 2023

Using initial_revision_id to fence leadership updates instead of using
a revision coming from configuration id. In current implementation
leadership update might have been lost due to the fact that the
partition log wasn't fully up to date with the leader hence the revision
of configuration wasn't yet updated. Using initial revision that changes
only when topic is deleted and then created again but not when the
partition moves make the implementation correct as the fencing will only
happen for partition being part of different topic instances.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.1.x
  • v22.3.x
  • v22.2.x

Release Notes

Improvements

  • Make leadership metadata propagation faster.

@mmaslankaprv mmaslankaprv force-pushed the fix-metadata-dissemination-revision branch from ba378d0 to 07b46c7 Compare March 7, 2023 14:54
src/v/features/feature_table.h Outdated Show resolved Hide resolved
src/v/cluster/health_monitor_types.h Show resolved Hide resolved
src/v/cluster/metadata_dissemination_types.h Show resolved Hide resolved
src/v/cluster/metadata_dissemination_types.h Outdated Show resolved Hide resolved
@mmaslankaprv mmaslankaprv force-pushed the fix-metadata-dissemination-revision branch 2 times, most recently from 044ecf8 to 9a257f8 Compare March 8, 2023 10:55
@@ -86,6 +91,7 @@ class partition_leaders_table {
void update_partition_leader(
const model::ntp&,
model::revision_id,
model::initial_revision_id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if using model::initial_revision_id is correct here. AFAIR we introduced this to remember the revision of the topic that we restored from S3, which can be from a different cluster.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the initial_revision changes every time the whole topic is deleted, and this is what we want to fence the leadership table updates with

@@ -86,6 +91,7 @@ class partition_leaders_table {
void update_partition_leader(
const model::ntp&,
model::revision_id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need both revision_id and initial_revision_id fields? Perhaps letting health monitor and metadata dissemination service choose which revision to use for the update will be less invasive?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i moved this into partition_leaders_table to be able to precisely control the moment when we switch from revision to initial revision as we must be certain that the revision is not updated with revision_id, after we enabled the initial revision feature as it is always greater than equal to initial_revision

src/v/cluster/partition_leaders_table.cc Show resolved Hide resolved
Partition leaders table uses partition revision to fence leader metadata
update. Introduced feature enabling use of partition initial revision in
feature table.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
When propagating leadership information Redpanda include a partition
revision id to fence the updates coming from older instance of a
partition with the same ntp. The revision that is used in the leader
table is increased every time the partition configuration change. Added
an initial revision id to the leadership metadata. Initial revision
changes when partition is recreated but not when it is moved.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
…ates

Using `initial_revision_id` to fence leadership updates instead of using
a revision coming from configuration id. In current implementation
leadership update might have been lost due to the fact that the
partition log wasn't fully up to date with the leader hence the revision
of configuration wasn't yet updated. Using initial revision that changes
only when topic is deleted and then created again but not when the
partition moves make the implementation correct as the fencing will only
happen for partition being part of different topic instances.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
@mmaslankaprv mmaslankaprv marked this pull request as draft July 6, 2023 09:29
@mmaslankaprv mmaslankaprv force-pushed the fix-metadata-dissemination-revision branch from 9563860 to 9a0a9ea Compare July 6, 2023 09:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants