
Improve auto-sharding documentation #1559

Merged · 1 commit into kubernetes:master on Sep 21, 2021

Conversation

fpetkovski
Contributor

fpetkovski commented Aug 27, 2021

What this PR does / why we need it:
This PR improves the documentation on auto-sharding. It explains the pros and cons of using auto-sharding vs manual sharding.

How does this change affect the cardinality of KSM: (increases, decreases or does not change cardinality)
Does not affect it

Which issue(s) this PR fixes:
Fixes #1546

k8s-ci-robot added the labels cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.) and size/S (Denotes a PR that changes 10-29 lines, ignoring generated files.) on Aug 27, 2021
@@ -43,7 +43,7 @@ are deleted they are no longer visible on the `/metrics` endpoint.
- [kube-state-metrics vs. metrics-server](#kube-state-metrics-vs-metrics-server)
- [Scaling kube-state-metrics](#scaling-kube-state-metrics)
- [Resource recommendation](#resource-recommendation)
- [Horizontal scaling (sharding)](#horizontal-scaling-sharding)
Contributor Author

I think horizontal scaling can be a bit confusing since it might indicate that all KSM replicas are the same.

README.md Outdated

There is also an experimental feature, that allows kube-state-metrics to auto discover its nominal position if it is deployed in a StatefulSet, in order to automatically configure sharding. This is an experimental feature and may be broken or removed without notice.
KSM supports a feature which allows each shard to discover its nominal position when deployed in a StatefulSet, which is useful for automatically configuring sharding. This is an experimental feature and may be broken or removed without notice.
Contributor Author

@brancz do you have any background information on why this feature was marked as experimental? Is it simply because it never got promoted to stable?

Member

@kubernetes/kube-state-metrics-maintainers should we start marking this as non-experimental with the next release to get more users test it? Or do we have plans to remove it at some point?
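To make the feature discussed in this thread concrete: when kube-state-metrics runs in a StatefulSet, every pod name ends in an ordinal, and that ordinal can serve as the shard index. The Go sketch below only illustrates that ordinal-parsing idea; the function name and the hard-coded pod name are hypothetical, and this is not the project's actual auto-sharding code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// shardFromPodName extracts a nominal shard index from a StatefulSet pod name
// such as "kube-state-metrics-2". StatefulSet pods carry their ordinal as a
// dash-separated suffix, so the last token is taken as the shard index.
// Illustrative sketch only; not the kube-state-metrics implementation.
func shardFromPodName(podName string) (int, error) {
	i := strings.LastIndex(podName, "-")
	if i < 0 || i == len(podName)-1 {
		return 0, fmt.Errorf("pod name %q has no ordinal suffix", podName)
	}
	return strconv.Atoi(podName[i+1:])
}

func main() {
	// In a real deployment the pod name would be injected (e.g. via the
	// downward API) rather than hard-coded as it is here.
	shard, err := shardFromPodName("kube-state-metrics-2")
	if err != nil {
		panic(err)
	}
	// The total shard count would come from the StatefulSet's replica count.
	fmt.Printf("this replica acts as shard %d\n", shard)
}
```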

@fpetkovski
Contributor Author

/assign @mrueg

@mrueg
Member

mrueg commented Sep 13, 2021

Thanks for working on this @fpetkovski! I added some comments

@SuperQ as you asked about more information, is this doc change helpful for you?

README.md Outdated

There is also an experimental feature, that allows kube-state-metrics to auto discover its nominal position if it is deployed in a StatefulSet, in order to automatically configure sharding. This is an experimental feature and may be broken or removed without notice.
KSM supports a feature which allows each shard to discover its nominal position when deployed in a StatefulSet, which is useful for automatically configuring sharding. This is an experimental feature and may be broken or removed without notice.
Member

Do we want to introduce KSM as an abbreviation for the project or stick with "kube-state-metrics"?

Contributor Author

Good point, I used it as second nature 😄. I added the abbreviation next to the first reference of kube-state-metrics at the beginning of the readme. I think it is commonly used in the community, so IMO it makes sense to have it in the docs. Wdyt?

Contributor

If you change the sentence structure a bit, you could write it like this.

Automatic sharding allows each shard to discover its nominal position

This avoids needing to be overly self-referential about the fact that this is a kube-state-metrics feature.

Contributor Author

Thanks for the suggestion, applied

README.md Outdated

* `--shard` (zero indexed)
* `--total-shards`

Sharding is done by taking an md5 sum of the Kubernetes Object's UID and performing a modulo operation on it, with the total number of shards. The configured shard decides whether the object is handled by the respective instance of kube-state-metrics or not. Note that this means all instances of kube-state-metrics even if sharded will have the network traffic and the resource consumption for unmarshaling objects for all objects, not just the ones it is responsible for. To optimize this further, the Kubernetes API would need to support sharded list/watch capabilities. Overall memory consumption should be 1/n th of each shard compared to an unsharded setup. Typically, kube-state-metrics needs to be memory and latency optimized in order for it to return its metrics rather quickly to Prometheus.
Sharding is done by taking an md5 sum of the Kubernetes Object's UID and performing a modulo operation on it with the total number of shards. Each shard decides whether the object is handled by the respective instance of kube-state-metrics or not. Note that this means all instances of kube-state-metrics, even if sharded, will have the network traffic and the resource consumption for unmarshaling objects for all objects, not just the ones they are responsible for. To optimize this further, the Kubernetes API would need to support sharded list/watch capabilities. In the optimal case, memory consumption for each shard will be 1/n compared to an unsharded setup. Typically, kube-state-metrics needs to be memory and latency optimized in order for it to return its metrics rather quickly to Prometheus.
Member

Should we mention --use-apiserver-cache as a way to reduce load on etcd here?

Contributor Author

Good idea, added the suggestion
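For readers, here is a minimal Go sketch of the md5-modulo scheme quoted in the hunk above: hash the object's UID, reduce it modulo the total shard count, and keep the object only when the result matches the index passed via --shard (with --total-shards supplying the modulus). How the digest is reduced to an integer below is an assumption made for illustration, and the helper is hypothetical rather than the project's actual code.

```go
package main

import (
	"crypto/md5"
	"encoding/binary"
	"fmt"
)

// objectBelongsToShard reports whether an object, identified by its UID,
// falls on the shard configured via --shard (zero indexed) out of
// --total-shards. Reducing the digest to its first 8 bytes is an assumption
// for illustration; the PR text only specifies an md5 sum of the UID taken
// modulo the total number of shards.
func objectBelongsToShard(uid string, shard, totalShards uint64) bool {
	sum := md5.Sum([]byte(uid))
	h := binary.BigEndian.Uint64(sum[:8]) // first 8 bytes of the digest
	return h%totalShards == shard
}

func main() {
	uid := "9b3f7a2c-1d64-4e0a-8f1e-0c2d4e5f6a7b" // hypothetical object UID
	for shard := uint64(0); shard < 3; shard++ {
		fmt.Printf("--shard=%d --total-shards=3 handles %s: %v\n",
			shard, uid, objectBelongsToShard(uid, shard, 3))
	}
}
```

Exactly one shard evaluates to true for any given UID, which is why every replica still has to receive and unmarshal all objects before it can decide which ones to keep, as the quoted paragraph notes.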

fpetkovski force-pushed the document-autosharding branch 2 times, most recently from c7184e0 to 99dcf27, on September 13, 2021 09:38
Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>
Contributor

SuperQ left a comment

LGTM

@mrueg
Member

mrueg commented Sep 15, 2021

/approve

k8s-ci-robot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) on Sep 15, 2021
@mrueg
Member

mrueg commented Sep 21, 2021

/lgtm

Thanks for improving the doc, @fpetkovski!

k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged.) on Sep 21, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fpetkovski, mrueg, SuperQ

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot merged commit ef61220 into kubernetes:master on Sep 21, 2021

Successfully merging this pull request may close these issues.

Documentation on Sharding Features / Status of experimental AutoSharding Feature