After a namespace bundle is unloaded, one of the available brokers must be selected to take over the bundle. All available broker candidates are filtered in `org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl.selectBrokerForAssignment`, which:
1. applies the namespace policy, if any, then
2. filters out brokers with too many topics, then
3. filters out brokers based on the anti-affinity-group setting, then
4. removes brokers that own more than the minimum number of bundles, then
5. filters out brokers that should be ignored (e.g. brokers running an old version), then
6. selects the best broker, i.e. the one with the lowest load.
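Roughly, that pipeline looks like the sketch below. The class, method, and parameter names here are illustrative only, not the actual Pulsar internals:

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Illustrative sketch of the filtering order above; the real logic in
// ModularLoadManagerImpl.selectBrokerForAssignment uses different helpers.
final class BrokerSelectionSketch {
    Optional<String> selectBroker(Set<String> availableBrokers,
                                  Map<String, Integer> bundleCounts,
                                  Map<String, Double> loadScores) {
        Set<String> candidates = new HashSet<>(availableBrokers);
        // Steps 1-3: namespace policy, topic-count and anti-affinity filters
        // (omitted here; they only remove clearly ineligible brokers).
        // Step 4: keep only the brokers owning the minimum number of bundles.
        int min = candidates.stream()
                .mapToInt(b -> bundleCounts.getOrDefault(b, 0))
                .min().orElse(0);
        candidates.removeIf(b -> bundleCounts.getOrDefault(b, 0) > min);
        // Step 5: drop brokers that should be ignored (e.g. old versions).
        // Step 6: among the survivors, pick the broker with the lowest load.
        return candidates.stream()
                .min(Comparator.comparingDouble(
                        b -> loadScores.getOrDefault(b, 0.0)));
    }
}
```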
The problem happens at step 4: it keeps only the broker candidates that have the fewest bundles. That is, broker selection may end up being driven entirely by the bundle count on each broker rather than by its actual load.
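Plugging hypothetical numbers into the sketch above makes the failure mode visible: the broker with the fewest bundles is also the most loaded, yet after step 4 it is the only candidate left for step 6 to choose from.

```java
import java.util.Map;
import java.util.Set;

// Usage example for the BrokerSelectionSketch class from the previous
// snippet; all broker names and numbers are made up for illustration.
final class Step4Demo {
    public static void main(String[] args) {
        // broker-b owns fewer bundles but carries far more load.
        Map<String, Integer> bundleCounts = Map.of("broker-a", 40, "broker-b", 12);
        Map<String, Double> loadScores = Map.of("broker-a", 0.30, "broker-b", 0.90);
        System.out.println(new BrokerSelectionSketch().selectBroker(
                Set.of("broker-a", "broker-b"), bundleCounts, loadScores));
        // Prints Optional[broker-b]: step 4 removed broker-a before step 6
        // ever got to compare loads.
    }
}
```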
For example, on this cluster (2.6.1 with 8 brokers), 8 topics (partitions) have significantly higher OUT throughput than the others. Ideally they should be spread evenly across the brokers rather than concentrated on a small subset of them. However, the bundle-count filtering rule converges these 8 topics onto 2 brokers (4 on each), which leaves those 2 brokers with much higher OUT throughput, but far fewer bundles, than the rest.
Although the `ThresholdShedder` from #6772 is able to detect that these 2 brokers are overloaded, the bundles cannot be moved to the remaining brokers, because only these 2 brokers "survive" step 4. The consequence is that these 8 topics are repeatedly shuffled between the two brokers during load balancing.
I can think of at least two solutions:

1. Simply remove step 4, because the other steps, especially the last one, should be enough to find a suitable broker, i.e. the broker with the lowest load.
2. Make the behavior configurable, e.g. by adding a new config called `loadBalancePreferBrokersWithLeastBundles` (see the sketch after this list).
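A minimal sketch of how such a flag might gate step 4. Neither the flag nor these names exist in Pulsar today; this only illustrates the proposal:

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only: the flag is the one proposed above,
// not an existing ServiceConfiguration field.
final class Step4Gate {
    static void applyIfEnabled(boolean preferBrokersWithLeastBundles,
                               Set<String> candidates,
                               Map<String, Integer> bundleCounts) {
        if (!preferBrokersWithLeastBundles) {
            return; // flag off: skip step 4, later steps pick purely by load
        }
        // Flag on: today's behavior, keep only the least-bundle brokers.
        int min = candidates.stream()
                .mapToInt(b -> bundleCounts.getOrDefault(b, 0))
                .min().orElse(0);
        candidates.removeIf(b -> bundleCounts.getOrDefault(b, 0) > min);
    }
}
```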
What do you think? Or is there another, more graceful, solution to this problem?
I think you can try to use `topic_count_equally_divide` for `defaultNamespaceBundleSplitAlgorithm`. The default policy is `range_equally_divide`; when there are few topics, `topic_count_equally_divide` works better than `range_equally_divide`.
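That setting lives in broker.conf; for example (the algorithm names are the stock ones shipped with Pulsar):

```properties
# Split bundles so each half holds roughly the same number of topics,
# which helps when a namespace has only a few (hot) topics.
defaultNamespaceBundleSplitAlgorithm=topic_count_equally_divide
```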
@codelipenghui Thanks for your reply! I just changed the configuration you suggested; let me observe it for a while, thanks~
@codelipenghui I tried the config you gave me, but the issue persists. However, after splitting the big topics into more partitions, the situation is better now. Thanks anyway~
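For reference, the partition count of an existing partitioned topic can be increased (never decreased) with `pulsar-admin`; the topic name and count below are hypothetical:

```shell
bin/pulsar-admin topics update-partitioned-topic \
  persistent://public/default/hot-topic --partitions 16
```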