In a fully connected mesh topology, topology update gossip can get chatty #116

Closed
murali-reddy opened this issue Sep 17, 2019 · 0 comments · Fixed by #117

On each connection add/delete/established event from a peer, the mesh router broadcasts a topology update to its peers. In a fully connected topology, that broadcast goes to every node in the mesh.

A received topology gossip is relayed further to the peers if it is a new update. While this should not be a concern in a stable topology, it can be problematic in some use cases.
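The relay rule described above can be illustrated with a toy model. This is a Python sketch for illustration only, not the mesh library's code: the `Router` class, its fields, the version counter used to decide whether an update is new, and the naive relay to every peer except the sender are all assumptions made for the example.

```python
class Router:
    """Hypothetical mesh router that relays only updates it has not seen yet."""

    def __init__(self, name):
        self.name = name
        self.peers = []      # directly connected Router objects
        self.versions = {}   # origin name -> highest version seen so far
        self.received = 0    # count of gossip messages received

    def connect(self, other):
        # Connections are bidirectional; ignore repeated connects.
        if other not in self.peers:
            self.peers.append(other)
            other.peers.append(self)

    def originate(self, version):
        # Record our own update first, then broadcast it to every peer.
        self.versions[self.name] = version
        for peer in list(self.peers):
            peer.receive(self.name, version, sender=self)

    def receive(self, origin, version, sender):
        self.received += 1
        if self.versions.get(origin, -1) >= version:
            return  # already known: do not relay
        self.versions[origin] = version
        for peer in list(self.peers):
            if peer is not sender:
                peer.receive(origin, version, sender=self)


def full_mesh(n):
    """Build n routers with a connection between every pair."""
    routers = [Router(f"node-{i}") for i in range(n)]
    for i, a in enumerate(routers):
        for b in routers[i + 1:]:
            a.connect(b)
    return routers


routers = full_mesh(4)
routers[0].originate(version=1)
print(sum(r.received for r in routers))  # 9, i.e. (4 - 1) ** 2
```

In this toy model, a single update in a fully connected mesh of n nodes produces (n - 1)^2 received messages: the origin sends n - 1, and each of the other n - 1 nodes relays once to its n - 2 remaining peers. The real library may behave differently, so treat the count as a property of the sketch only.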

For example:

  • when someone deploys a weave-net DaemonSet in an N-node cluster, each node connects to every other node, so the number of concurrent topology updates in the cluster can be on the order of N^2
  • in auto-scaling groups, nodes can be added or deleted frequently, which can result in a high rate of topology updates
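To make the N^2 estimate concrete, here is a small back-of-the-envelope calculation in Python. It assumes that each endpoint of a connection originates one topology update when the connection is established; the function names and the `events_per_endpoint` parameter are hypothetical, and the real number of events per connection may be higher.

```python
def full_mesh_connections(n):
    """Number of distinct connections in a fully connected mesh of n nodes."""
    return n * (n - 1) // 2


def originated_updates(n, events_per_endpoint=1):
    """Updates originated while the mesh forms.

    Assumption: both endpoints of every connection originate
    `events_per_endpoint` topology updates for that connection.
    """
    return 2 * full_mesh_connections(n) * events_per_endpoint


for n in (10, 50, 150):
    print(n, full_mesh_connections(n), originated_updates(n))
# 10 45 90
# 50 1225 2450
# 150 11175 22350
```

These are only the updates that get originated. Each one is also broadcast to the peers and possibly relayed, so the number of gossip messages on the wire is larger still.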

Considering #114 and #115, which result in high CPU usage, the combination with chatty topology gossip results in a cascading effect.

As the number of peers in the mesh increases, this significantly impacts scalability.

The following metrics were gathered with an instrumented mesh on a 150-node Kubernetes cluster running weave-net using mesh. rx gossip broadcast is the number of topology gossip messages received per second.

===================================================================
2019-09-17 7:22:0 Peers.garbageCollect(): 365
2019-09-17 7:22:0 routes.calculate()         -> routes.calculateBroadcast(): 59
2019-09-17 7:22:0 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 335
2019-09-17 7:22:0 routes.calculateUnicast(): 119
2019-09-17 7:22:0 connectionMaker.refresh(): 63
2019-09-17 7:22:0 rx gossip unicast: 0
2019-09-17 7:22:0 rx gossip broadcast: 325
2019-09-17 7:22:0 gossip broadcast - relay broadcasts: 345
2019-09-17 7:22:0 gossip broadcast - topology updates: 1
===================================================================
2019-09-17 7:22:1 Peers.garbageCollect(): 347
2019-09-17 7:22:1 routes.calculate()         -> routes.calculateBroadcast(): 68
2019-09-17 7:22:1 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 328
2019-09-17 7:22:1 routes.calculateUnicast(): 135
2019-09-17 7:22:1 connectionMaker.refresh(): 70
2019-09-17 7:22:1 rx gossip unicast: 0
2019-09-17 7:22:1 rx gossip broadcast: 316
2019-09-17 7:22:1 gossip broadcast - relay broadcasts: 324
2019-09-17 7:22:1 gossip broadcast - topology updates: 0
===================================================================
2019-09-17 7:22:2 Peers.garbageCollect(): 369
2019-09-17 7:22:2 routes.calculate()         -> routes.calculateBroadcast(): 61
2019-09-17 7:22:2 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 313
2019-09-17 7:22:2 routes.calculateUnicast(): 124
2019-09-17 7:22:2 connectionMaker.refresh(): 64
2019-09-17 7:22:2 rx gossip unicast: 0
2019-09-17 7:22:2 rx gossip broadcast: 315
2019-09-17 7:22:2 gossip broadcast - relay broadcasts: 343
2019-09-17 7:22:2 gossip broadcast - topology updates: 0
===================================================================
2019-09-17 7:22:3 Peers.garbageCollect(): 336
2019-09-17 7:22:3 routes.calculate()         -> routes.calculateBroadcast(): 75
2019-09-17 7:22:3 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 327
2019-09-17 7:22:3 routes.calculateUnicast(): 148
2019-09-17 7:22:3 connectionMaker.refresh(): 75
2019-09-17 7:22:3 rx gossip unicast: 0
2019-09-17 7:22:3 rx gossip broadcast: 322
2019-09-17 7:22:3 gossip broadcast - relay broadcasts: 326
2019-09-17 7:22:3 gossip broadcast - topology updates: 1
===================================================================
2019-09-17 7:22:4 Peers.garbageCollect(): 353
2019-09-17 7:22:4 routes.calculate()         -> routes.calculateBroadcast(): 69
2019-09-17 7:22:4 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 344
2019-09-17 7:22:4 routes.calculateUnicast(): 138
2019-09-17 7:22:4 connectionMaker.refresh(): 71
2019-09-17 7:22:4 rx gossip unicast: 0
2019-09-17 7:22:4 rx gossip broadcast: 339
2019-09-17 7:22:4 gossip broadcast - relay broadcasts: 337
2019-09-17 7:22:4 gossip broadcast - topology updates: 1
===================================================================
2019-09-17 7:22:5 Peers.garbageCollect(): 323
2019-09-17 7:22:5 routes.calculate()         -> routes.calculateBroadcast(): 68
2019-09-17 7:22:5 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 330
2019-09-17 7:22:5 routes.calculateUnicast(): 136
2019-09-17 7:22:5 connectionMaker.refresh(): 70
2019-09-17 7:22:5 rx gossip unicast: 0
2019-09-17 7:22:5 rx gossip broadcast: 328
2019-09-17 7:22:5 gossip broadcast - relay broadcasts: 311
2019-09-17 7:22:5 gossip broadcast - topology updates: 3
===================================================================
2019-09-17 7:22:6 Peers.garbageCollect(): 340
2019-09-17 7:22:6 routes.calculate()         -> routes.calculateBroadcast(): 78
2019-09-17 7:22:6 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 320
2019-09-17 7:22:6 routes.calculateUnicast(): 156
2019-09-17 7:22:6 connectionMaker.refresh(): 82
2019-09-17 7:22:6 rx gossip unicast: 0
2019-09-17 7:22:6 rx gossip broadcast: 321
2019-09-17 7:22:6 gossip broadcast - relay broadcasts: 322
2019-09-17 7:22:6 gossip broadcast - topology updates: 0
===================================================================
2019-09-17 7:22:7 Peers.garbageCollect(): 321
2019-09-17 7:22:7 routes.calculate()         -> routes.calculateBroadcast(): 85
2019-09-17 7:22:7 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 300
2019-09-17 7:22:7 routes.calculateUnicast(): 172
2019-09-17 7:22:7 connectionMaker.refresh(): 90
2019-09-17 7:22:7 rx gossip unicast: 0
2019-09-17 7:22:7 rx gossip broadcast: 296
2019-09-17 7:22:7 gossip broadcast - relay broadcasts: 309
2019-09-17 7:22:7 gossip broadcast - topology updates: 0
===================================================================
2019-09-17 7:22:8 Peers.garbageCollect(): 313
2019-09-17 7:22:8 routes.calculate()         -> routes.calculateBroadcast(): 81
2019-09-17 7:22:8 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 308
2019-09-17 7:22:8 routes.calculateUnicast(): 161
2019-09-17 7:22:8 connectionMaker.refresh(): 85
2019-09-17 7:22:8 rx gossip unicast: 0
2019-09-17 7:22:8 rx gossip broadcast: 309
2019-09-17 7:22:8 gossip broadcast - relay broadcasts: 291
2019-09-17 7:22:8 gossip broadcast - topology updates: 1
===================================================================
2019-09-17 7:22:9 Peers.garbageCollect(): 316
2019-09-17 7:22:9 routes.calculate()         -> routes.calculateBroadcast(): 84
2019-09-17 7:22:9 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 307
2019-09-17 7:22:9 routes.calculateUnicast(): 167
2019-09-17 7:22:9 connectionMaker.refresh(): 88
2019-09-17 7:22:9 rx gossip unicast: 0
2019-09-17 7:22:9 rx gossip broadcast: 302
2019-09-17 7:22:9 gossip broadcast - relay broadcasts: 306
2019-09-17 7:22:9 gossip broadcast - topology updates: 0
===================================================================
2019-09-17 7:22:10 Peers.garbageCollect(): 312
2019-09-17 7:22:10 routes.calculate()         -> routes.calculateBroadcast(): 83
2019-09-17 7:22:10 routes.lookupOrCalculate() -> routes.calculateBroadcast(): 278
2019-09-17 7:22:10 routes.calculateUnicast(): 166
2019-09-17 7:22:10 connectionMaker.refresh(): 85
2019-09-17 7:22:10 rx gossip unicast: 0
2019-09-17 7:22:10 rx gossip broadcast: 275
2019-09-17 7:22:10 gossip broadcast - relay broadcasts: 300
2019-09-17 7:22:10 gossip broadcast - topology updates: 2
===================================================================
murali-reddy added a commit that referenced this issue Sep 18, 2019
updates then coalesce into a single update to gossip

Fixes #116
bboreham pushed a commit that referenced this issue Sep 25, 2019
updates then coalesce into a single update to gossip

Fixes #116
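The commit message above mentions coalescing updates into a single update to gossip. A minimal Python sketch of that general idea follows. It is not the code from the referenced commits: the `TopologyUpdateCoalescer` class, its method names, and the choice to flush explicitly instead of on a timer are all assumptions made for illustration.

```python
class TopologyUpdateCoalescer:
    """Hypothetical helper that batches pending topology changes."""

    def __init__(self, send):
        self._send = send      # callable that takes a frozenset of peer names
        self._pending = set()  # peers changed since the last flush

    def note_change(self, peer_name):
        # Record the change; repeated changes to one peer collapse into one entry.
        self._pending.add(peer_name)

    def flush(self):
        """Send all pending changes as one update. Returns True if anything was sent."""
        if not self._pending:
            return False
        update, self._pending = frozenset(self._pending), set()
        self._send(update)
        return True


sent = []
coalescer = TopologyUpdateCoalescer(sent.append)
for name in ["a", "b", "a", "c"]:
    coalescer.note_change(name)
coalescer.flush()
coalescer.flush()  # nothing pending, so nothing is sent
print(len(sent), sorted(sent[0]))  # 1 ['a', 'b', 'c']
```

In this sketch, four change events result in a single outgoing update. A real implementation would presumably call something like `flush` periodically or after a short delay, which trades a little propagation latency for far fewer gossip messages.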
murali-reddy added a commit that referenced this issue Oct 2, 2019
murali-reddy added a commit that referenced this issue Oct 3, 2019
murali-reddy added a commit that referenced this issue Oct 24, 2019
murali-reddy added a commit that referenced this issue Nov 1, 2019