This repository has been archived by the owner on Apr 1, 2024. It is now read-only.

ISSUE-8492: Unloaded namespace bundles may not be assigned to suitable brokers. #1660

Closed
sijie opened this issue Nov 9, 2020 · 0 comments

sijie commented Nov 9, 2020

Original Issue: apache#8492


After a namespace bundle is unloaded, one of the available brokers must be selected to own it so that it can continue serving traffic. All available broker candidates are filtered in org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl.selectBrokerForAssignment, which:

  1. applies the namespace policy, if any, then
  2. filters out brokers with too many topics, then
  3. filters out brokers based on the anti-affinity-group setting, then
  4. removes brokers that own more than the minimum number of bundles, then
  5. filters out brokers that should be ignored (e.g. brokers running an old version), then
  6. selects the broker with the lowest load.

The problem occurs at step 4, which keeps only the broker candidates that own the fewest bundles. As a result, broker selection may be based entirely on the number of bundles per broker rather than on the actual load.
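To make the interaction between step 4 and step 6 concrete, here is a minimal sketch. It is not the actual ModularLoadManagerImpl code: the `Broker` class, the method names `keepLeastBundles` and `pickLowestLoad`, and the single-number load value are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SelectionSketch {
    /** Hypothetical, simplified broker view: name, owned bundles, load in [0, 1]. */
    static final class Broker {
        final String name;
        final int bundles;
        final double load;

        Broker(String name, int bundles, double load) {
            this.name = name;
            this.bundles = bundles;
            this.load = load;
        }
    }

    /** Step 4 (simplified): keep only candidates owning the minimum number of bundles. */
    static List<Broker> keepLeastBundles(List<Broker> candidates) {
        int min = Integer.MAX_VALUE;
        for (Broker b : candidates) {
            min = Math.min(min, b.bundles);
        }
        List<Broker> kept = new ArrayList<>();
        for (Broker b : candidates) {
            if (b.bundles == min) {
                kept.add(b);
            }
        }
        return kept;
    }

    /** Step 6 (simplified): pick the candidate with the lowest load, or null if none. */
    static Broker pickLowestLoad(List<Broker> candidates) {
        Broker best = null;
        for (Broker b : candidates) {
            if (best == null || b.load < best.load) {
                best = b;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up numbers: two busy brokers with few bundles, one idle broker with many.
        List<Broker> candidates = Arrays.asList(
                new Broker("broker-a", 4, 0.90),
                new Broker("broker-b", 4, 0.85),
                new Broker("broker-c", 20, 0.20));

        // With step 4, the idle broker is discarded before load is ever compared.
        System.out.println("with step 4:    " + pickLowestLoad(keepLeastBundles(candidates)).name);
        // Without step 4, the lowest-load broker wins.
        System.out.println("without step 4: " + pickLowestLoad(candidates).name);
    }
}
```

In this toy input the least-loaded broker never reaches step 6, because it owns more bundles than the two busy brokers.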

[image]

For example, on this cluster (2.6.1 with 8 brokers), 8 topics (partitions) have significantly higher OUT throughput than the others. Ideally they would be spread evenly across the brokers rather than concentrated on a small subset of them. However, the filtering rule based on bundle count converges these 8 topics onto 2 brokers (4 on each), so these 2 brokers have much higher OUT throughput but far fewer bundles than the rest.

Although ThresholdShedding from apache#6772 can detect that these 2 brokers are overloaded, the topics cannot be moved to the remaining brokers, because only these 2 brokers "survive" step 4. As a consequence, these 8 topics are repeatedly moved between the two brokers during load balancing.
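The loop described above can be reproduced in a toy model. The sketch below is not Pulsar's shedding logic: it assumes that each round unloads one bundle from the busiest broker and reassigns it using only the least-bundles filter followed by a lowest-load pick. The broker names, throughput values, and the `shedOnce` helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SheddingLoopSketch {
    /** Toy cluster: broker name -> throughput of each bundle it owns (made-up values). */
    static Map<String, List<Integer>> cluster() {
        Map<String, List<Integer>> c = new LinkedHashMap<>();
        c.put("hot-1", repeat(100, 5));  // few bundles, heavy traffic
        c.put("hot-2", repeat(100, 3));
        c.put("cold-1", repeat(1, 20));  // many bundles, light traffic
        c.put("cold-2", repeat(1, 20));
        return c;
    }

    static List<Integer> repeat(int value, int n) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add(value);
        }
        return list;
    }

    /** Load of a broker in this model: total throughput of its bundles. */
    static int load(List<Integer> bundles) {
        int sum = 0;
        for (int t : bundles) {
            sum += t;
        }
        return sum;
    }

    /** Unloads one bundle from the busiest broker, reassigns it, returns the destination. */
    static String shedOnce(Map<String, List<Integer>> c) {
        String busiest = null;
        for (String b : c.keySet()) {
            if (busiest == null || load(c.get(b)) > load(c.get(busiest))) {
                busiest = b;
            }
        }
        List<Integer> source = c.get(busiest);
        int bundle = source.remove(source.size() - 1);  // removes by index

        // Step 4 (simplified): only brokers with the minimum bundle count survive.
        int min = Integer.MAX_VALUE;
        for (List<Integer> owned : c.values()) {
            min = Math.min(min, owned.size());
        }
        // Step 6 (simplified): lowest load among the survivors.
        String target = null;
        for (String b : c.keySet()) {
            if (c.get(b).size() != min) {
                continue;
            }
            if (target == null || load(c.get(b)) < load(c.get(target))) {
                target = b;
            }
        }
        c.get(target).add(bundle);
        return target;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> c = cluster();
        for (int round = 1; round <= 5; round++) {
            System.out.println("round " + round + " -> " + shedOnce(c));
        }
    }
}
```

Under these assumptions every unloaded bundle lands on one of the two hot brokers, and the cold brokers never receive one, however many rounds are run.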

I think there are at least two possible solutions:

  1. Simply remove step 4, because the other steps, especially the last one, should be enough to find a suitable broker, i.e. the broker with the lowest load.
  2. Make this configurable, for example by adding a new config called loadBalancePreferBrokersWithLeastBundles.
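A rough sketch of how the second option might look follows. The boolean parameter stands in for the proposed loadBalancePreferBrokersWithLeastBundles setting, which does not exist yet; the `Candidate` class and the `select` method are hypothetical and do not mirror the real selectBrokerForAssignment signature.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

final class ConfigurableSelectionSketch {
    /** Hypothetical candidate view: broker name, owned bundles, load in [0, 1]. */
    static final class Candidate {
        final String name;
        final int bundles;
        final double load;

        Candidate(String name, int bundles, double load) {
            this.name = name;
            this.bundles = bundles;
            this.load = load;
        }
    }

    /**
     * Picks the lowest-load broker. The least-bundles filter (step 4) runs only
     * when preferBrokersWithLeastBundles is true, mirroring the proposed flag.
     */
    static Optional<String> select(List<Candidate> candidates,
                                   boolean preferBrokersWithLeastBundles) {
        List<Candidate> pool = candidates;
        if (preferBrokersWithLeastBundles && !candidates.isEmpty()) {
            int min = candidates.stream().mapToInt(c -> c.bundles).min().getAsInt();
            pool = candidates.stream()
                    .filter(c -> c.bundles == min)
                    .collect(Collectors.toList());
        }
        return pool.stream()
                .min(Comparator.comparingDouble((Candidate c) -> c.load))
                .map(c -> c.name);
    }

    public static void main(String[] args) {
        // Made-up numbers for illustration only.
        List<Candidate> candidates = Arrays.asList(
                new Candidate("broker-a", 4, 0.90),
                new Candidate("broker-b", 4, 0.85),
                new Candidate("broker-c", 20, 0.20));
        System.out.println("flag on:  " + select(candidates, true).orElse("none"));
        System.out.println("flag off: " + select(candidates, false).orElse("none"));
    }
}
```

With the flag off, this sketch behaves like option 1 (step 4 removed); with it on, the current behaviour is kept. Which value should be the default is left open here.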

What do you think? Or are there other, more graceful solutions to this problem?
