Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve][broker] Unblock stuck Key_Shared subscription after consumer reconnect #21579

Conversation

nborisov
Copy link

Fixes #21199 and #15705

Motivation

Current delivery mechanism blocks recently joined consumers from getting messages until all the previously delivered but not acknowledged messages are either acknowledged by the consumer which received the messages or the consumer disconnected

This is implementation trade off which allows not to track all the sent messages. This approach allows to reduce memory consumption and increase performance

The same time such implementation leads to a situation when a single failed message could lead to all the consumers blockage (described at #21199)

There is a setting allowOutOfOrderDelivery which allows to mitigate the described issue but leads to ordering guarantees loss

For some scenarios both consumption stuck and out of order delivery is not an option. The same time it could be ok to have more memory consumption and some minor performance degradation

Modifications

  1. Sent but not acknowledged messages tracking added
  2. Messages tracking setting added

Tracking of delivered but not acknowledged messages allows not to send them to recently joined consumers and the same time preserve ordering

Making the new functionality switchable allows to preserve the current optimized memory consumption and performance for all the users with default settings. The same time users who is ready to memory and performance trade off getting opportunity to solve both consumers stuck problem and out of order delivery

Verifying this change

  • Make sure that the change passes the CI checks.

This change added tests and can be verified as follows:

  • Existing KeySharedSubscriptionTest extended with scenarios with new setting enabled
  • Added integration test for verification of preserving ordering with the new setting enabled
  • Added integration test for verification that recently joined consumers are not blocked when there are some unacknowledged messages delivered to existing consumers

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: nborisov#2

@github-actions github-actions bot added the doc-required Your PR changes impact docs and you will update later. label Nov 14, 2023
@nborisov
Copy link
Author

CI checks passed at forked repo PR but there are some problems with coverage uploading. should I perform some set up to fix it or just skipping these steps is ok?

Comment on lines +892 to +901
@FieldContext(
category = CATEGORY_POLICIES,
doc = "If enabled subscriptions stores keys of messages which allows consumer not "
+ "to stuck in case it goes to recently assigned. The setting allows to overcome "
+ "situation when new KeyShared consumers will not get any messages until a consumer "
+ "that did get messages disconnects or acks/nacks some messages "
+ "The trade of is the need to track all the not acked messages in subscription"
+ "which could potentially lead to performance degradation and higher memory consumption"
)
private boolean rememberNotAckedMessagesKey = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR introduces a new configuration. We need a PIP for it.
Could you follow the PIP guide to draft a PIP?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the link! Will do

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initiated the PIP #21615 and sent [Discussion] email to dev@pulsar.apache.org
Please notify me in case I need to do some other actions which I did not noticed
Thank you!

@eolivelli
Copy link
Contributor

@nborisov I am very interested in this work, I read the PIP, but I cannot find the discussion on dev@ can you please tell me the subject of our message ?

@nborisov
Copy link
Author

@eolivelli I've actually sent 2 messages:

the first one contained wrong PIP number ([Discuss] PIP-308 Unblock stuck Key_Shared subscription after consumer reconnect)
the second one was with the actual PIP number:

[Discuss] PIP-319 Unblock stuck Key_Shared subscription after consumer reconnect

I sent both to dev@pulsar.apache.org

Please let me know if I need to perform some additional actions or made a mistake in the process!

@Technoboy- Technoboy- added this to the 3.3.0 milestone Dec 22, 2023
@nborisov nborisov closed this Jan 11, 2024
@nborisov
Copy link
Author

Closed in favor of #21657

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc-required Your PR changes impact docs and you will update later.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Subscription consumption stuck on consumer reconnect
4 participants