Confusing (wrong) description on supported replica of metrics server deployment #861

Closed
zqzten opened this issue Aug 3, 2022 · 17 comments · Fixed by #940
Labels
bug Something isn't working

Comments

@zqzten
Contributor

zqzten commented Aug 3, 2022

The High Availability section of the operate doc says that the KEDA metrics server does not support HA because of a limitation in the k8s custom metrics server.

However, the limitation mentioned is that k8s does not support more than one type of custom metrics server, not that it cannot run more than one replica of the same custom metrics server. So there's actually no limitation on the number of KEDA metrics server replicas, and it does support HA. The current doc seems to be wrong there.

zqzten added the bug label Aug 3, 2022
@tomkerkhove
Member

There is, though: only 1 replica will be active and used, so it does not provide HA.

@zqzten
Contributor Author

zqzten commented Aug 4, 2022

But the KEDA metrics server is an aggregated apiserver (which naturally supports HA), right? Say I have two replicas of it and one goes down; the other one can still handle requests from the HPA controller. I haven't looked through the code, so I'm not sure if I missed something.

@tomkerkhove
Member

Yes, but it's not going to actively load balance across multiple instances.

@tomkerkhove
Member

@zroubalik might correct me, but I'm pretty sure that's correct.

@zroubalik
Member

Yeah, that's the usual default way it works, i.e. only one instance handles the load. For the requests to be balanced between all the instances of the metrics-apiserver, you need to set --enable-aggregator-routing=true on the kube-apiserver.
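
For illustration only, here is a toy Python model of the two behaviours described in this thread: all traffic landing on one replica (with failover when it goes down) versus requests spread across replicas. It is a sketch under stated assumptions, not how kube-apiserver is implemented; the `Replica` class, the routing functions, and the replica names are hypothetical.

```python
from collections import Counter
from itertools import count


class Replica:
    """A hypothetical metrics server replica with a health flag."""

    def __init__(self, name):
        self.name = name
        self.healthy = True


def route_sticky(replicas):
    """Single-active model: the first healthy replica takes all traffic."""
    for replica in replicas:
        if replica.healthy:
            return replica
    raise RuntimeError("no healthy replica")


def make_round_robin():
    """Balanced model: rotate across the healthy replicas."""
    counter = count()

    def route(replicas):
        healthy = [r for r in replicas if r.healthy]
        if not healthy:
            raise RuntimeError("no healthy replica")
        return healthy[next(counter) % len(healthy)]

    return route


def simulate(route, replicas, requests):
    """Send `requests` requests through `route` and count hits per replica."""
    hits = Counter()
    for _ in range(requests):
        hits[route(replicas).name] += 1
    return hits


replicas = [Replica("metrics-a"), Replica("metrics-b")]
sticky_hits = simulate(route_sticky, replicas, 10)
balanced_hits = simulate(make_round_robin(), replicas, 10)

# Simulate the active replica going down: the standby takes over.
replicas[0].healthy = False
failover_hits = simulate(route_sticky, replicas, 10)

print(sticky_hits, balanced_hits, failover_hits)
```

In this model a second replica adds no throughput in the single-active case; it only gives requests somewhere to go after a failure, which matches the failover-only benefit discussed in this thread.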

@zroubalik
Member

The issue is that once we introduce metrics value caching on the metrics server side, we will have to somehow deal with sharing the cache between multiple replicas.

@zqzten
Contributor Author

zqzten commented Aug 4, 2022

The issue is that once we introduce metrics value caching on the metrics server side, we will have to somehow deal with sharing the cache between multiple replicas.

Yeah, that might be a problem. So the question becomes: if I set up the KEDA metrics server with HA (more than 1 replica), what possible issues would I run into?

@tomkerkhove
Member

The only "HA" that you get is that it will reduce failover time once the primary fails.

@zroubalik
Member

@zqzten no issues at the moment when specifying multiple replicas.

@zqzten
Contributor Author

zqzten commented Aug 4, 2022

Thanks! So shall we correct the doc to say that the KEDA metrics server does support HA with multiple replicas?

@zroubalik
Member

Yeah, that would be great. But we should be careful with the wording, so people don't get the impression that they could have multiple installations per cluster.

@tomkerkhove
Member

tomkerkhove commented Aug 4, 2022

I would prefer not to change it, given that it's not really HA IMO, but if everyone thinks it's best then that's that.

What about this?

While you can run more replicas of our metrics server, only one instance will be used and serve traffic.

You can run multiple replicas, but they will not improve the performance of KEDA; they can only reduce downtime during a failover.

@zroubalik
Member

LGTM

@zqzten
Contributor Author

zqzten commented Aug 5, 2022

While you can run more replicas of our metrics server, only one instance will be used and serve traffic.

You can run multiple replicas, but they will not improve the performance of KEDA; they can only reduce downtime during a failover.

That might not be accurate enough? According to what @zroubalik mentioned, we can achieve load-balanced HA of the KEDA metrics server by setting --enable-aggregator-routing=true on the kube-apiserver.

@zqzten
Contributor Author

zqzten commented Aug 5, 2022

Just found that the official metrics server also has this recommendation, FYI: https://github.com/kubernetes-sigs/metrics-server#high-availability

@zroubalik
Member

Yeah, we can also mention this property there. Are you willing to open a PR to fix this?

@zqzten
Contributor Author

zqzten commented Aug 16, 2022

Yeah, we can also mention this property there. Are you willing to open a PR to fix this?

Glad to. I'll open one in the next few days.


3 participants