Not all nodes BGP enabled since v3.4.0 #259

Closed
jhead-slg opened this issue Apr 22, 2022 · 11 comments

@jhead-slg

It seems that since v3.4.0 not all nodes are being BGP enabled by the CCM. The logic appears to have changed to only enable those nodes on which a LoadBalancer service has, at some point, run. I would appreciate some way to enable the previous behavior. For now we have reverted to v3.3.0.

@deitch
Contributor

deitch commented Apr 24, 2022

I wish there were a simple answer to this; there isn't.

For quite some time, pretty much since the beginning of CPEM (back when it was the CCM for Packet), we basically just enabled it on all nodes. Eventually we added the option for some filters. But we completely detached the "announcing of nodes able to handle requests" from the "deployment of the service". This is contrary to how a service of type=LoadBalancer is actually supposed to work (not to mention it made our CCM more complex).

When we finally adopted the complete cloud-provider API, we stopped all of our complex management; instead we just get notifications from Kubernetes itself saying, "you need a LoadBalancer for these nodes". We don't even think much about it; we just accept whatever Kubernetes tells us, which is, in theory at least, the "right way".

If you feel like digging deep, the API is here, with the most important part:

```go
EnsureLoadBalancer(ctx context.Context, clusterName string, service *v1.Service, nodes []*v1.Node)
```

Kubernetes tells the CCM: "Here is a service, here are the nodes that should announce it. Make it so."
CCM doesn't even know of all the nodes in the cluster, doesn't monitor them.

This is pretty much identical to how it works on other cloud providers with some form of elastic load balancer: Kubernetes tells it, "here are the nodes you should announce that service from."
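
To make that concrete, here is a rough sketch of what implementing that contract looks like. This is not CPEM's actual code - the package, type name, and print statement are placeholders for the real BGP/announcement logic - but the EnsureLoadBalancer signature is the one from k8s.io/cloud-provider:

```go
// Rough sketch of the upstream cloud-provider contract, not CPEM's actual
// implementation. Kubernetes hands the CCM a service plus the node list that
// should announce it; the CCM never enumerates the cluster on its own.
package ccmsketch

import (
	"context"
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// loadBalancers stands in for a provider's cloudprovider.LoadBalancer implementation.
type loadBalancers struct{}

// EnsureLoadBalancer is invoked by Kubernetes with the service and the exact
// set of nodes that should carry it. A BGP-based provider would enable BGP and
// announce the service IP only for these nodes.
func (l *loadBalancers) EnsureLoadBalancer(ctx context.Context, clusterName string, service *v1.Service, nodes []*v1.Node) (*v1.LoadBalancerStatus, error) {
	for _, n := range nodes {
		// Placeholder for the real work: ensure BGP is enabled on the device
		// backing this node and announce the service's external IP from it.
		fmt.Printf("would announce %s/%s from node %s\n", service.Namespace, service.Name, n.Name)
	}
	return &v1.LoadBalancerStatus{}, nil
}
```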

@deitch
Contributor

deitch commented Apr 24, 2022

The question is, why would you want a different behaviour?

@jhead-slg
Author

Well, this all makes sense.

Our use case is that we manually run BGP via Bird on the control plane machines to handle a BGP-enabled API address. So we were letting the CCM handle the enablement of BGP for each machine when it comes up.

I'm sure we can work around this on our side by building a DaemonSet to enable it, or just by bringing up a "fake" LoadBalancer service that triggers the enablement.
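
For the "fake" service route, something along these lines is what I have in mind (all names are placeholders, and note that nodes labeled node.kubernetes.io/exclude-from-external-load-balancers - which control-plane nodes often are - would not be handed to the CCM):

```yaml
# Minimal sketch of the "fake" Service workaround; names are placeholders.
# With type: LoadBalancer and externalTrafficPolicy: Cluster, Kubernetes
# passes the eligible nodes to the CCM's EnsureLoadBalancer call, which is
# what makes CPEM enable BGP on them, even though nothing selects real pods.
apiVersion: v1
kind: Service
metadata:
  name: bgp-enable-dummy
  namespace: kube-system
spec:
  type: LoadBalancer
  externalTrafficPolicy: Cluster
  selector:
    app: bgp-enable-dummy   # intentionally matches no pods
  ports:
    - name: placeholder
      port: 65535
      protocol: TCP
      targetPort: 65535
```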

@deitch
Contributor

deitch commented Apr 25, 2022

> So we were letting the CCM handle the enablement of BGP for each machine when it comes up.

That is interesting. It was a net side-effect ("unintended implicit API" always comes back to bite 😄 )

> building a DaemonSet to enable

That is one approach, definitely would work.

I will ping people inside EQXM, see if there is talk about an auto-enable. I haven't heard anything, but I can ask.

@deitch
Contributor

deitch commented Apr 26, 2022

@jhead-slg I filed a request and added it to the list under "should be able to set default device BGP enablement state by project." I will not promise timelines - it isn't my group, and I am not the product manager - but it is getting visibility.

Approaching from another side, how are you creating those servers? Can you enable it at that time?

@jhead-slg
Author

jhead-slg commented Apr 26, 2022

Thanks!

> Approaching from another side, how are you creating those servers? Can you enable it at that time?

They are being created with cluster-api. It is possible to embed a Metal API token into our KubeadmControlPlane definition; we already embed a read-only token to set some annotations on the machines. It's probably safe for us to make that a read-write token and do the BGP node enablement there alongside the Bird install and config that is already happening.
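
Roughly, that step would do something like the following with packngo. The function name is made up, and the exact packngo calls are my assumption and would need to be checked against the version we vendor:

```go
// Hypothetical sketch of enabling BGP on a device with the packngo client,
// as a bootstrap step next to the existing Bird install/config.
package bgpenable

import (
	"fmt"

	"github.com/packethost/packngo"
)

// EnableDeviceBGP creates an IPv4 BGP session for the given device, which is
// what "BGP enabling" a node amounts to at the Equinix Metal API level.
// Device ID lookup and retries are left out of this sketch.
func EnableDeviceBGP(apiToken, deviceID string) error {
	client := packngo.NewClientWithAuth("", apiToken, nil)
	_, _, err := client.BGPSessions.Create(deviceID, packngo.CreateBGPSessionRequest{
		AddressFamily: "ipv4",
	})
	if err != nil {
		return fmt.Errorf("failed to create BGP session for device %s: %w", deviceID, err)
	}
	return nil
}
```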

@deitch
Contributor

deitch commented Apr 26, 2022

> They are being created with cluster-api

It has been a while since I looked at the internals of how CAPP sets up the machines, but from the bit I do remember, aren't there options that can be set? It already has to have the token at the CAPP level to create the machine, so it could (at least in theory) also enable BGP?

@jhead-slg
Author

I'm sure it is possible to add something to the cluster-api-provider-packet codebase to flip BGP on as well. It appears there is already some work around this in a PR over there: kubernetes-sigs/cluster-api-provider-packet#320.

@deitch
Contributor

deitch commented Apr 26, 2022

Sure does. We should adopt that into CAPP even without kube-vip, if it is enabled via a flag.

@deitch
Contributor

deitch commented May 3, 2022

Are we good to close this, @jhead-slg?

@jhead-slg
Author

Yes.
