Lean iptables updates #324
Conversation
In general I think this is an important priority for Kilo. Another way I've considered solving this in the past is by using ipsets and atomically swapping out the IP addresses in the sets so that Kilo does not have to change rules so often. This means that whenever new Nodes or Peers join a cluster, this would instead require an "instantaneous" ipset operation rather than changing iptables rules. This would bring other performance benefits with it, as the evaluation of a firewall rule containing an ipset is faster than evaluating many independent iptables rules.
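A minimal sketch of that atomic-swap idea, shelling out to the standard `ipset` CLI from Go; the set name `kilo-peers` and the `hash:net` type are illustrative assumptions, not anything Kilo ships today:

```go
package main

import (
	"fmt"
	"os/exec"
)

// swapPeerSet atomically replaces the contents of the live ipset
// "kilo-peers" by building a temporary set and swapping it in.
// Assumes the live set already exists with the same type (hash:net).
func swapPeerSet(cidrs []string) error {
	// Build the new contents in a temporary set.
	if err := exec.Command("ipset", "create", "kilo-peers-tmp", "hash:net").Run(); err != nil {
		return fmt.Errorf("create tmp set: %w", err)
	}
	for _, cidr := range cidrs {
		if err := exec.Command("ipset", "add", "kilo-peers-tmp", cidr).Run(); err != nil {
			return fmt.Errorf("add %s: %w", cidr, err)
		}
	}
	// Atomically exchange the temporary set with the live one; iptables
	// rules referencing "kilo-peers" never observe a partial state.
	if err := exec.Command("ipset", "swap", "kilo-peers-tmp", "kilo-peers").Run(); err != nil {
		return fmt.Errorf("swap sets: %w", err)
	}
	// Clean up the now-stale old contents.
	return exec.Command("ipset", "destroy", "kilo-peers-tmp").Run()
}
```

A single static rule such as `-m set --match-set kilo-peers src -j ACCEPT` would then never have to change when Nodes or Peers come and go.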
I do really like the sound of that approach. However, how would you feel about us starting with the implementation in this PR, so that our immediate problem can be solved (we verified that it does indeed solve our problems on one of our test clusters 😁 ), and then have a stab at doing a PR using ipsets at some point later? 😬
pkg/iptables/iptables.go (Outdated)
```diff
@@ -105,7 +127,14 @@ func NewIPv6Rule(table, chain string, spec ...string) Rule {
 	return &rule{table, chain, spec, ProtocolIPv6}
 }
 
-func (r *rule) Add(client Client) error {
+func (r *rule) Prepend(client Client) error {
 	if err := client.Insert(r.table, r.chain, 1, r.spec...); err != nil {
```
@clive-jevons should we maybe implement & use an `InsertUnique` here?
I think the intention of the `Client` interface was only to shadow the existing API of the iptables client? As such the `InsertUnique` semantics should probably be part of the API here? Could be wrong though - some guidance by @squat or @leonnicolas would be good here, I think 😬
If you look a couple of lines down, the `Append()` function is implemented via `AppendUnique()` - hence the question. I think this is relevant when kilo is terminated abruptly without having a chance to clean up and then spins up again - without the check for `Exists()`, the `Insert()` would probably cause an error?
Very fair point 😉 I'd still be interested in the feedback / guidance from the maintainers on this 😬 But I absolutely agree with the necessity of the functionality - wherever it ends up being implemented 👍
You're right re the interface - it's implemented in an external dependency which doesn't mirror `AppendUnique()` for `Insert()`. So my suggestion here would be to implement this here while at the same time making a PR in the dependency to implement `InsertUnique()`, and later switch to it once it's there. 🤔
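The interim approach could look roughly like this (a sketch only; it assumes the `Client` interface exposes `Exists` and `Insert` with go-iptables-style signatures):

```go
// Prepend inserts the rule at position 1 of its chain unless an
// identical rule is already present, so an abrupt restart of kilo
// cannot pile up duplicate rules.
func (r *rule) Prepend(client Client) error {
	exists, err := client.Exists(r.table, r.chain, r.spec...)
	if err != nil {
		return fmt.Errorf("failed to check if rule exists: %w", err)
	}
	if exists {
		return nil
	}
	return client.Insert(r.table, r.chain, 1, r.spec...)
}
```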
❤️
Progress here: it's been merged to main. Once there is a new release we can make use of the new function.
Great work!
Hmm we can probably just pin to the latest commit on main rather than wait for a release, no?
Have done that and replaced `iptables.Insert()` with `iptables.InsertUnique()`.
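The resulting call site then becomes roughly this (a sketch; it assumes `InsertUnique` mirrors `Insert`'s signature, including the position argument):

```go
// Prepend now delegates the duplicate check to the iptables client:
// InsertUnique inserts at position 1 only if the rule is not already
// present in the chain.
func (r *rule) Prepend(client Client) error {
	return client.InsertUnique(r.table, r.chain, 1, r.spec...)
}
```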
The idea of using ipsets sounds great! I think this PR here might even be a nice step in that direction by introducing the distinction between the two iptables rule classes (static/append-only rules vs. rules that change with the set of nodes and peers). The PR at hand already introduces a carrier datastructure for them. So all in all I think this PR could even be considered a nice evolutionary step towards an ipsets-based implementation. WDYT?
We just ran a small test against a cluster running this PR (plus the metrics PR for observability) and this is what it looks like in Grafana 😍

[Grafana screenshot: two iptables-related charts showing the relevant metrics]

We also didn't observe any disconnects during autoscaling as far as we can tell.
@dajudge what's the status of this PR? Is it ready for a final review so we can merge and maybe include it in a Kilo 0.6.0? 😻

Yes. We've been running a patched version of kilo with this PR included since July. We're currently running it on over half a dozen Kubernetes clusters with dozens of nodes in total. We have not encountered any issues and it's even playing nicely with

From our perspective this PR is ready to go! 😍 🚀 👍

@squat <- any chance of getting this merged and into a release? 😬 🙏 Anything we can do to help move this forward?

@clive-jevons thanks for your patience!

Rebased / merge conflicts resolved. Running it on a test cluster now. Looks good at first glance. 👍

Thanks @dajudge
Thanks for being patient @dajudge @clive-jevons <3
Why
We were repeatedly experiencing connectivity issues when scaling larger clusters up/down and believe it's related to the way kilo applies changes to its iptables rules when nodes are added/removed: in many cases that resulted in many rules (or even entire chains) being dropped and recreated during scaling.
Idea
After taking a closer look at the iptables rules created by kilo we figured we have identified two different classes of the same:
- rules that are effectively static/append-only (e.g. the jump from `FORWARD` to `KILO-IPIP` and the `DROP` rule in the `INPUT` chain - also e.g. the `MASQUERADE` rule at the end of the `KILO-NAT` chain)
- rules that change whenever nodes or peers are added or removed

How
So we started preparing this patch, which puts the different rules in separate slices; chain rules are treated as order-sensitive/append-only, which does not matter a lot beyond being applied in the first wave of changes. A hypothetical sketch of the split follows below.
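All names in this sketch are illustrative, not the PR's actual types; it only shows the shape of the two-slice split described above:

```go
// ruleSet groups kilo's iptables rules into the two classes described
// in the Idea section.
type ruleSet struct {
	// Append-only rules, e.g. the jump from FORWARD to KILO-IPIP, the
	// DROP rule in INPUT, or the trailing MASQUERADE in KILO-NAT.
	// Applied once, in order, and never touched again afterwards.
	appendOnly []Rule
	// Rules derived from the current set of nodes/peers. Only this
	// slice is diffed and mutated when the cluster topology changes.
	dynamic []Rule
}
```

With that split, steady-state reconciliation only ever diffs the dynamic slice, which is what keeps the number of mutating iptables operations down.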
Results
We were able to confirm that this reduces the number of mutating iptables operations after the initial iptables configuration to the absolute minimum required.