Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: duplicated schemas on rapid elections while continuous produce of records #938

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

eliax1996
Copy link
Contributor

@eliax1996 eliax1996 commented Aug 21, 2024

coordinator rewrite

  1. moving the coordinator in a separate thread
  2. adding a waiting time between when the master its elected and the master can act. This has been done to avoid rapid elections of master that may produce schemas with different ids.

Example of what could happpen without the delay:

|--------------------------------------|
|Node | Node1    | Node2    | Node3    |
|Role | Master   | Follower | Follower |
|--------------------------------------|


Node1 -> Send Message A{id=max(current_ids)} to kafka

where the max(current_ids) = 10

---------------------------------------

Node1 its disconnected, the message its still in the producer queue of Node1

---------------------------------------

Node2 its elected master

|--------------------------------------|
|Node | Node1    | Node2    | Node3    |
|Role | Follower | Master   | Follower |
|--------------------------------------|

----------------------------------------


Node2 produces a message B{id=max(current_ids)} to kafka

Because the message A isn't yet delivered to Node2, the max(current_ids) returns still 10.
And we have an ID clash.

The solution its simple, each master should wait a reasonable high number of milliseconds before acting as a master.
So that all the in-flight messages are delivered to kafka + the reasonable delay of the consumer for the master node before noticing that a message has been produced

@eliax1996 eliax1996 force-pushed the eliax1996/make-sure-master-wait-a-while-before-being-active-and-if-new-messages-are-arriving branch from 7b04f19 to 285c950 Compare August 22, 2024 16:25
@eliax1996 eliax1996 changed the title WIP: fix duplicated schemas on rapid elections while continuous produce of records fix: duplicated schemas on rapid elections while continuous produce of records Aug 22, 2024
@eliax1996 eliax1996 force-pushed the eliax1996/make-sure-master-wait-a-while-before-being-active-and-if-new-messages-are-arriving branch 2 times, most recently from 52fb34c to 054efb7 Compare August 22, 2024 16:50
@eliax1996
Copy link
Contributor Author

immagine

@eliax1996 eliax1996 force-pushed the eliax1996/make-sure-master-wait-a-while-before-being-active-and-if-new-messages-are-arriving branch 5 times, most recently from 1fa1052 to 7f657bd Compare August 23, 2024 10:14
Copy link

github-actions bot commented Aug 23, 2024

Coverage report

Click to see where and how coverage changed

FileStatementsMissingCoverageCoverage
(new stmts)
Lines missing
  src/karapace
  config.py
  schema_reader.py 377-378
  schema_registry.py 100
  schema_registry_apis.py 106-107, 688-700, 1335
  typing.py 111, 115
  src/karapace/backup
  api.py
  cli.py
  src/karapace/coordinator
  master_coordinator.py 109, 111-112, 168
  schema_coordinator.py 198, 213-214, 500
  src/karapace/sentry
  sentry_client.py
Project Total  

This report was generated by python-coverage-comment-action

@eliax1996 eliax1996 force-pushed the eliax1996/make-sure-master-wait-a-while-before-being-active-and-if-new-messages-are-arriving branch from 49a9fd2 to c4e58e8 Compare August 23, 2024 14:18
@jclarysse
Copy link
Contributor

Please also add the new config waiting_time_before_acting_as_master_ms to https://github.com/Aiven-Open/karapace/blob/main/README.rst#configuration-keys

@eliax1996 eliax1996 force-pushed the eliax1996/make-sure-master-wait-a-while-before-being-active-and-if-new-messages-are-arriving branch 6 times, most recently from 8183430 to 784a2fa Compare August 29, 2024 13:34
karapace/schema_reader.py Outdated Show resolved Hide resolved
@eliax1996 eliax1996 force-pushed the eliax1996/make-sure-master-wait-a-while-before-being-active-and-if-new-messages-are-arriving branch from 784a2fa to 3f84dff Compare September 18, 2024 14:15
@eliax1996 eliax1996 force-pushed the eliax1996/make-sure-master-wait-a-while-before-being-active-and-if-new-messages-are-arriving branch 7 times, most recently from 9f3d8bc to 25e6c16 Compare November 13, 2024 15:56
@@ -1307,7 +1335,7 @@ async def _forward_request_remote(
if auth_header is not None:
headers["Authorization"] = auth_header

with async_timeout.timeout(timeout):
async with async_timeout.timeout(timeout):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was wrong since a while

Comment on lines 553 to 533
LOG.info("Resetting generation status")
# this is called immediately after the election, we shouldn't reset this
# until a new node its elected aka the other path where a new node its elected
# otherwise this its called at each round and we keep not counting the 5 seconds
# required before the election.
# self._are_we_master = False
self.generation = OffsetCommitRequest.DEFAULT_GENERATION_ID
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was called because we were exiting the election loop in the _async_loop of the master_coordinator.py. We must keep running the thread/algorithm otherwise we are always electing a new node since that causes a rebalance and the rebalance causes a new election (the rebalance happen because we sent the reset_generation as a side effect of closing the heartbeat task)

@eliax1996 eliax1996 force-pushed the eliax1996/make-sure-master-wait-a-while-before-being-active-and-if-new-messages-are-arriving branch 2 times, most recently from a422f24 to d7cdd8a Compare November 14, 2024 07:39
Comment on lines 93 to 98
# why do we need to close?
# we just need to keep running even when the schema registry its ready
# otherwise we cause a rebalance and a new election. This should run until
# karapace is restarted
# if self._sc.ready():
# break
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this question its mainly for @jjaakola-aiven. I inherited the initial implementation from him. I think we shouldn't exit but I wait for him to reply here

@eliax1996 eliax1996 force-pushed the eliax1996/make-sure-master-wait-a-while-before-being-active-and-if-new-messages-are-arriving branch 3 times, most recently from 2385470 to 0e374a7 Compare November 19, 2024 13:19
@eliax1996 eliax1996 marked this pull request as ready for review November 19, 2024 13:19
@eliax1996 eliax1996 requested review from a team as code owners November 19, 2024 13:19
@eliax1996 eliax1996 force-pushed the eliax1996/make-sure-master-wait-a-while-before-being-active-and-if-new-messages-are-arriving branch 7 times, most recently from 8eb38e1 to d8067fd Compare November 21, 2024 08:55
1. moving the coordinator in a separate thread
2. adding a waiting time between when the master its elected and the master can act. This has been done to avoid rapid elections of master that may produce schemas with different ids.

Example of what could happpen without the delay:

|--------------------------------------|
|Node | Node1    | Node2    | Node3    |
|Role | Master   | Follower | Follower |
|--------------------------------------|

Node1 -> Send Message A{id=max(current_ids)} to kafka

where the max(current_ids) = 10

---------------------------------------

Node1 its disconnected, the message its still in the producer queue of Node1

---------------------------------------

Node2 its elected master

|--------------------------------------|
|Node | Node1    | Node2    | Node3    |
|Role | Follower | Master   | Follower |
|--------------------------------------|

----------------------------------------

Node2 produces a message B{id=max(current_ids)} to kafka

Because the message A isn't yet delivered to Node2, the max(current_ids) returns still 10.
And we have an ID clash.

The solution its simple, each master should wait a reasonable high number of milliseconds before acting as a master.
So that all the in-flight messages are delivered to kafka + the reasonable delay of the consumer for the master node before noticing that a message has been produced
@eliax1996 eliax1996 force-pushed the eliax1996/make-sure-master-wait-a-while-before-being-active-and-if-new-messages-are-arriving branch from d8067fd to 5cb351b Compare November 21, 2024 10:27
"/master_available",
callback=self.master_available,
# post because they cannot be cached, need to be sure always gathering the real value of that property
method="POST",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big fan of this because it violates semantics of POST. Maybe set a Cache-Control: no-store, no-cache, must-revalidate header instead?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants