[improve][broker] add switch for enable/disable distribute bundles evenly in LoadManager #16059
/pulsarbot run-failure-checks
Good work. We encountered this issue early on too.
@hangc0276 PTAL
Good job.
(cherry picked from commit b06e78e)
Discussing 2.10 cherry-pick here: https://lists.apache.org/thread/2mq3h9gpqv1b4zyyp2cddfltlqz3wtg0
Motivation
When we use `ModularLoadManager` as the `LoadManager` in a Pulsar cluster and the load balancer shedding strategy is `ThresholdShedder`, we found that unloaded bundles might be loaded by another broker whose resource usage is already above the average load. That broker then becomes overloaded in turn, which frequently causes bundles to be unloaded again.
In our cluster, load shedding ran for more than 10 hours and unloaded bundles 400+ times. The cluster also keeps unloading bundles when some topics receive a lot of traffic in a short time. Below is the unload metric.
![image](https://user-images.githubusercontent.com/4970972/173722942-669971c4-51b4-4d59-b954-99a34e4f517d.png)
![image](https://user-images.githubusercontent.com/4970972/173723061-2c4d2758-33e4-4e92-8e27-8c3528b9eaf1.png)
The key reason is that some brokers with lower load but more bundles cannot become candidates, because `LoadManager` is forced to distribute bundles evenly. Most brokers are filtered out by this strategy: only 1 or 2 of the 136 brokers remain as candidates, as shown below.
![image](https://user-images.githubusercontent.com/4970972/173603516-2efefae4-d22b-445c-9414-6744d22e0323.png)
It would be much better to be able to disable even bundle distribution in `LoadManager`, so that it can select a broker from those whose resource usage is below the average load. This prevents the least loaded broker from quickly becoming heavily loaded. We therefore recommend that users enable or disable even bundle distribution among all brokers according to their own scenarios.
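The effect of the switch on candidate selection can be sketched as follows. This is a simplified illustration, not Pulsar's actual code: the `Broker` class, the `candidates` and `select` methods, and the rule "keep only the brokers with the fewest bundles" are all assumptions made for the example, and the real filter in `ModularLoadManager` may work differently.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class BundleCandidateSketch {
    // Hypothetical broker snapshot: name, owned bundle count, resource usage in percent.
    public static final class Broker {
        public final String name;
        public final int bundles;
        public final double usagePercent;

        public Broker(String name, int bundles, double usagePercent) {
            this.name = name;
            this.bundles = bundles;
            this.usagePercent = usagePercent;
        }
    }

    // With even distribution enabled, only brokers holding the fewest bundles
    // survive the filter (an assumed simplification of the real strategy).
    // With it disabled, every broker stays a candidate.
    public static List<Broker> candidates(List<Broker> brokers, boolean distributeEvenly) {
        if (!distributeEvenly) {
            return new ArrayList<>(brokers);
        }
        int min = brokers.stream().mapToInt(b -> b.bundles).min().orElse(0);
        return brokers.stream().filter(b -> b.bundles == min).collect(Collectors.toList());
    }

    // Picks the candidate with the lowest resource usage.
    public static Broker select(List<Broker> brokers, boolean distributeEvenly) {
        return candidates(brokers, distributeEvenly).stream()
                .min(Comparator.comparingDouble((Broker b) -> b.usagePercent))
                .orElseThrow(() -> new IllegalStateException("no candidate broker"));
    }

    public static void main(String[] args) {
        List<Broker> brokers = new ArrayList<>();
        brokers.add(new Broker("broker-a", 10, 85.0)); // few bundles, heavily loaded
        brokers.add(new Broker("broker-b", 25, 30.0)); // many bundles, lightly loaded
        brokers.add(new Broker("broker-c", 12, 60.0));

        System.out.println(select(brokers, true).name);  // prints "broker-a"
        System.out.println(select(brokers, false).name); // prints "broker-b"
    }
}
```

In this toy data, the even-distribution filter leaves only the heavily loaded `broker-a` as a candidate, while disabling it lets the lightly loaded `broker-b` win, which mirrors the behavior described above.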
After disabling even bundle distribution in `LoadManager`, brokers with lower load but more bundles can become candidates. This reduced the number of unloads, and the cluster is stable with even load on each broker, as follows.
![image](https://user-images.githubusercontent.com/4970972/173613270-e79e7bdb-204e-4834-b98a-7f4980a54134.png)
![image](https://user-images.githubusercontent.com/4970972/173613418-9bd41b73-1b79-4cac-b6b6-3b7cc772f743.png)
Modifications
Add `loadBalancerDistributeBundlesEvenlyEnabled` for the load balancer in `ServiceConfiguration`, and keep it true by default.
`ModularLoadManager`
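As a rough illustration, turning the switch off might look like the fragment below. It assumes the broker configuration file exposes the setting under the same name as the `ServiceConfiguration` field, which is the usual convention but is not stated in this description.

```properties
# Assumed key name, taken from the ServiceConfiguration field.
# true (default): bundles are distributed evenly across brokers.
# false: brokers with more bundles but lower load can still be candidates.
loadBalancerDistributeBundlesEvenlyEnabled=false
```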
Verifying this change
This change is a trivial rework / code cleanup without any test coverage.
Does this pull request potentially affect one of the following parts:
If yes was chosen, please highlight the changes
Documentation
Check the box below and label this PR (if you have committer privilege).
Need to update docs?
no-need-doc
doc-not-needed