Confusing (wrong) description on supported replica of metrics server deployment #861

zqzten · 2022-08-03T07:31:48Z

In the High Availability section of the operate doc, it says that the KEDA metrics server does not support HA due to a limitation in the k8s custom metrics server.

However, the limitation mentioned is that k8s does not support more than one type of custom metrics server, not that it rejects more than one replica of the same custom metrics server. So there is actually no limitation on the replica count of the KEDA metrics server, and it does support HA. The current doc seems to be wrong there.

tomkerkhove · 2022-08-03T08:30:27Z

There is, though: only 1 replica will be active and used, so it does not provide HA.

zqzten · 2022-08-04T07:31:58Z

But the KEDA metrics server is an aggregated apiserver (which naturally supports HA), right? Say I have two replicas of it and one goes down; the other one can still handle requests from the HPA controller. I haven't looked through the code, so I'm not sure if I missed something.

tomkerkhove · 2022-08-04T08:06:58Z

Yes, but it's not going to actively load balance across multiple instances.

tomkerkhove · 2022-08-04T08:07:14Z

@zroubalik might correct me, but I'm pretty sure that's correct.

zroubalik · 2022-08-04T08:29:54Z

Yeah, that's the usual default way it works, i.e. only one instance handles the load. For requests to be balanced between all instances of the metrics-apiserver, you need to set --enable-aggregator-routing=true on the kube-apiserver.
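
For illustration, here is a rough sketch of where that flag could go. It assumes a kubeadm-style control plane, where the kube-apiserver runs as a static Pod defined in /etc/kubernetes/manifests/kube-apiserver.yaml; other distributions and managed clusters configure the API server differently and may not expose this flag at all. Only the flag itself comes from this thread; the surrounding manifest is an abbreviated, assumed layout.

```yaml
# Sketch only: excerpt of a kube-apiserver static Pod manifest, assuming a
# kubeadm-style control plane (/etc/kubernetes/manifests/kube-apiserver.yaml).
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
    - name: kube-apiserver
      command:
        - kube-apiserver
        # ...keep the existing flags as they are...
        # Route aggregated API requests to endpoint IPs instead of the
        # Service cluster IP, so requests can reach every replica.
        - --enable-aggregator-routing=true
```

On such a setup the kubelet would typically restart the API server after the manifest changes, so this is worth trying on a non-production cluster first.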

zroubalik · 2022-08-04T08:49:13Z

The issue is that once we introduce metric value caching on the metrics server side, we will have to somehow deal with sharing the cache between multiple replicas.

zqzten · 2022-08-04T09:25:56Z

> The issue is that once we introduce metric value caching on the metrics server side, we will have to somehow deal with sharing the cache between multiple replicas.

Yeah, that might be a problem. So the question becomes: if I set up the KEDA metrics server with HA (more than 1 replica), what possible issues would I run into?

tomkerkhove · 2022-08-04T09:34:01Z

The only "HA" that you get is that it reduces downtime when failing over once the primary fails.

zroubalik · 2022-08-04T09:43:35Z

@zqzten There are no issues at the moment when specifying multiple replicas.

zqzten · 2022-08-04T09:50:00Z

Thanks! So shall we correct the doc to say that the KEDA metrics server does support HA with multiple replicas?

zroubalik · 2022-08-04T09:53:40Z

Yeah, that would be great. But we should be careful with the wording, so people don't get the impression that they could have multiple installations per cluster.

tomkerkhove · 2022-08-04T11:30:04Z

I would prefer not to change it given it's not really HA IMO, but if everyone thinks it's best then that's that.

What about this?

While you can run more replicas of our metrics server, only one instance will be used and serve traffic.

You can run multiple replicas, but they will not improve the performance of KEDA; they can only reduce downtime during a failover.

zroubalik · 2022-08-04T13:35:08Z

LGTM

zqzten · 2022-08-05T03:05:06Z

> While you can run more replicas of our metrics server, only one instance will be used and serve traffic.
>
> You can run multiple replicas, but they will not improve the performance of KEDA; they can only reduce downtime during a failover.

That might not be accurate enough? According to what @zroubalik mentioned, we can achieve load-balanced HA for the KEDA metrics server by setting --enable-aggregator-routing=true on the kube-apiserver.

zqzten · 2022-08-05T03:13:37Z

Just found that the official metrics server also has this recommendation, FYI: https://github.com/kubernetes-sigs/metrics-server#high-availability

zroubalik · 2022-08-15T10:32:33Z

Yeah, we can also mention this property there. Are you willing to open a PR to fix this?

zqzten · 2022-08-16T06:11:51Z

> Yeah, we can also mention this property there. Are you willing to open a PR to fix this?

Glad to. I'll open one in the next few days.

zqzten added the "bug" label Aug 3, 2022

zqzten mentioned this issue Sep 17, 2022

docs: clarify high availability descriptions #940

Merged

tomkerkhove closed this as completed in #940 Sep 26, 2022
