Reduce producer lookups and connections in partitioned producers #11496
Labels
type/enhancement
The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages
Comments
Vanlightly added the type/enhancement label on Jul 29, 2021
@equanz I have created an MR with my C++ changes, which are very similar to your Java client changes and fall nicely inside your PIP (it's a subset of the PIP). There are some differences, such as allowing lazy producers to be configurable and how the lazy start is kicked off.
codelipenghui pushed a commit that referenced this issue on Aug 16, 2021

Fixes #11496 also matches part of PIP 79.

C++ implementation that closely matches the proposed Java client changes from reducing partitioned producer connections and lookups: PR 10279

### Motivation

Producers that send messages to partitioned topics start a producer per partition, even when using single partition routing. For topics that have the combination of a large number of producers and a large number of partitions, this can put strain on the brokers. With say 1000 partitions and single partition routing with non-keyed messages, 999 topic owner lookups and producer registrations are performed that could be avoided.

PIP 79 also describes this. I wrote this before realising that PIP 79 also covers this. This implementation can be reviewed and contrasted to the Java client implementation in #10279.

### Modifications

Allows partitioned producers to start producers for individual partitions lazily. Starting a producer involves a topic owner lookup to find out which broker is the owner of the partition, then registering the producer for that partition with the owner broker. For topics with many partitions and when using SinglePartition routing without keyed messages, all of these lookups and producer registrations are a waste except for the single chosen partition.

This change allows the user to control whether a producer on a partitioned topic uses this lazy start or not, via a new config in ProducerConfiguration. When ProducerConfiguration.setLazyStartPartitionedProducers(true) is set, the PartitionedProducerImpl.start() becomes a synchronous operation that only does housekeeping (no network operations). The producer of any given partition is started (which includes a topic owner lookup and registration) upon sending the first message to that partition. While the producer starts, messages are buffered.

The sendTimeout timer is only activated once a producer has been fully started, which should give enough time for any buffered messages to be sent. For very short send timeouts, this setting could cause send timeouts during the start phase. The default of 30s should however not cause this issue.
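The send timeout caveat can be illustrated with a small model. This is only a sketch under stated assumptions, not the client's timer logic: `FlushModel`, `timedOutCount`, and the idea that buffered messages are flushed one after another at a fixed per-message cost are all hypothetical. The one behaviour taken from the commit message is that the timer is armed when the producer finishes starting, so the start duration itself does not eat into the timeout.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model parameters (all times in milliseconds).
struct FlushModel {
    long startDurationMs;  // time spent on topic owner lookup + registration
    long sendTimeoutMs;    // configured send timeout
    long perMessageMs;     // assumed cost to flush one buffered message
};

// Counts how many buffered messages would miss the deadline in this model.
std::size_t timedOutCount(const FlushModel& m, std::size_t buffered) {
    assert(m.startDurationMs >= 0 && m.sendTimeoutMs > 0 && m.perMessageMs > 0);

    // The timer is armed when the start completes, so the deadline is
    // measured from that moment rather than from the first send() call.
    const long deadlineMs = m.startDurationMs + m.sendTimeoutMs;

    std::size_t timedOut = 0;
    for (std::size_t i = 0; i < buffered; ++i) {
        // Assumption: buffered messages are acknowledged sequentially.
        const long ackAtMs =
            m.startDurationMs + static_cast<long>(i + 1) * m.perMessageMs;
        if (ackAtMs > deadlineMs) {
            ++timedOut;
        }
    }
    return timedOut;
}
```

In this model a 30s timeout leaves ample room for a large buffer, while a timeout of a few tens of milliseconds combined with many pending messages produces timeouts regardless of how long the start itself took.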
codelipenghui pushed a commit that referenced this issue on Sep 10, 2021
Fixes #11496 also matches part of PIP 79. (Same commit message as the Aug 16, 2021 commit above; cherry picked from commit 9577b84.)
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this issue on Mar 18, 2022
…e#11570) Fixes apache#11496 also matches part of PIP 79. (Same commit message as the Aug 16, 2021 commit above.)
Is your enhancement request related to a problem? Please describe.
Producers that send messages to partitioned topics start a producer per partition, even when using single partition routing. For topics that have the combination of a large number of producers and a large number of partitions, this can put strain on the brokers. With say 1000 partitions and single partition routing with non-keyed messages, 999 topic owner lookups and producer registrations are performed that could be avoided.
Describe the solution you'd like
Option 1 - Strict Single Partition Routing
The problem is that, upon producer creation, we have no way of knowing which partitions will be involved, even when using Single Partition routing: user code can still send keyed messages, which may then involve more than one partition.
Solution: offer a strict single partition routing mode where we guarantee that all messages will only be sent to a single partition, keyed or not. This would allow us to only start a single producer on the creation of the partitioned producer.
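A minimal sketch of how a strict mode would differ from regular single partition routing. The names `Message`, `SinglePartitionRouter`, `StrictSinglePartitionRouter`, and `partitionsUsed` are hypothetical and are not the Pulsar client's router classes; `std::hash` stands in for whatever key hashing the client actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical message type: only the (optional) key matters for routing.
struct Message {
    bool hasKey;
    std::string key;
};

// Regular single partition routing: unkeyed messages go to one chosen
// partition, but keyed messages are hashed and may land on any partition.
class SinglePartitionRouter {
public:
    SinglePartitionRouter(std::size_t numPartitions, std::size_t chosen)
        : numPartitions_(numPartitions), chosen_(chosen) {
        assert(numPartitions_ > 0 && chosen_ < numPartitions_);
    }

    std::size_t route(const Message& msg) const {
        if (msg.hasKey) {
            return std::hash<std::string>()(msg.key) % numPartitions_;
        }
        return chosen_;
    }

private:
    std::size_t numPartitions_;
    std::size_t chosen_;
};

// Strict variant: the key is ignored for routing, so only the chosen
// partition is ever used and only one producer would need to be started.
class StrictSinglePartitionRouter {
public:
    StrictSinglePartitionRouter(std::size_t numPartitions, std::size_t chosen)
        : chosen_(chosen) {
        assert(numPartitions > 0 && chosen < numPartitions);
    }

    std::size_t route(const Message&) const { return chosen_; }

private:
    std::size_t chosen_;
};

// Returns the set of partitions a batch of messages would touch.
template <typename Router>
std::set<std::size_t> partitionsUsed(const Router& router,
                                     const std::vector<Message>& messages) {
    std::set<std::size_t> used;
    for (const Message& m : messages) {
        used.insert(router.route(m));
    }
    return used;
}
```

With the strict router, `partitionsUsed` always returns a single partition, which is what would make it safe to start only one producer at creation time. The trade-off is that keyed messages are no longer routed by their key.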
Option 2 - Lazy Producer Start
Allow for producers in the partitioned producer class to be started lazily, upon the first message being sent to their particular partition. This would be controlled via a new producer configuration, as this behaviour only benefits those who use single partition routing without keyed messages.
Messages will be buffered while the connection to the topic owner is carried out.
The downside is extra latency on the first messages published. The send timeout timer is only started once the producer is connected, so timeouts should not normally trigger. Only if the send timeout is set very low and the number of pending messages is high might we see send timeouts because of this change.
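The lazy start idea can be sketched as a small simulation. This is an illustration only, assuming hypothetical classes `PartitionProducer` and `PartitionedProducer` that count simulated lookups instead of talking to a broker; it is not the C++ client's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for the producer of one partition.
class PartitionProducer {
public:
    enum class State { NotStarted, Starting, Ready };

    State state() const { return state_; }
    std::size_t pendingCount() const { return pending_.size(); }
    std::size_t sentCount() const { return sent_.size(); }

    // Simulates kicking off the topic owner lookup and registration.
    void start() {
        if (state_ == State::NotStarted) {
            state_ = State::Starting;
        }
    }

    // Simulates the broker confirming registration: flush the buffer in order.
    void onStarted() {
        assert(state_ == State::Starting);
        state_ = State::Ready;
        while (!pending_.empty()) {
            sent_.push_back(pending_.front());
            pending_.pop_front();
        }
    }

    // Messages are buffered until the producer is ready.
    void send(const std::string& payload) {
        if (state_ == State::Ready) {
            sent_.push_back(payload);
        } else {
            pending_.push_back(payload);
        }
    }

private:
    State state_ = State::NotStarted;
    std::deque<std::string> pending_;
    std::vector<std::string> sent_;
};

// Hypothetical partitioned producer that starts partitions eagerly or lazily.
class PartitionedProducer {
public:
    PartitionedProducer(std::size_t numPartitions, bool lazyStart)
        : producers_(numPartitions), lookups_(0) {
        if (!lazyStart) {
            // Eager mode: one lookup + registration per partition up front.
            for (PartitionProducer& p : producers_) {
                startOne(p);
            }
        }
    }

    void send(std::size_t partition, const std::string& payload) {
        PartitionProducer& p = producers_.at(partition);
        if (p.state() == PartitionProducer::State::NotStarted) {
            startOne(p);  // Lazy mode: first message triggers the start.
        }
        p.send(payload);
    }

    void completeStart(std::size_t partition) {
        producers_.at(partition).onStarted();
    }

    std::size_t lookups() const { return lookups_; }
    const PartitionProducer& producerAt(std::size_t i) const {
        return producers_.at(i);
    }

private:
    void startOne(PartitionProducer& p) {
        p.start();
        ++lookups_;
    }

    std::vector<PartitionProducer> producers_;
    std::size_t lookups_;
};
```

With 1000 partitions, the eager variant performs 1000 simulated lookups at construction, while the lazy variant performs none until the first `send` and then only one for the partition actually used. Messages sent before `completeStart` stay in the buffer, which is where the extra first-message latency comes from.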
Describe alternatives you've considered
Just option 1 and 2.
I have an implementation of lazy producer start for the C++ client in case option 2 is preferred. I can contribute the work to both Java and C++ clients whether it is ultimately option 1 or 2 that we select.