Merge pull request #1 from kubernetes/master
pull from kube-state-metrics master
MIBc committed Dec 11, 2018
2 parents 5119063 + 33a7d11 commit 6a1b52e
Showing 72 changed files with 8,068 additions and 5,850 deletions.
23 changes: 13 additions & 10 deletions .travis.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2,19 +2,22 @@ sudo: required

language: go

go:
- "1.11.2"

install:
- mkdir -p $HOME/gopath/src/k8s.io
- mv $TRAVIS_BUILD_DIR $HOME/gopath/src/k8s.io/kube-state-metrics

jobs:
include:
- stage: Go fmt
script: make gofmtcheck
- stage: Check that all metrics are documented
script: make doccheck
- stage: Unit Test
script: make test-unit
- stage: Build
script: make build
- stage: E2e
script: make e2e
# Go fmt
- script: make gofmtcheck
# Check that all metrics are documented
- script: make doccheck
# Unit Test
- script: make test-unit
# Build
- script: make build
# E2e
- script: make e2e
18 changes: 17 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,16 @@
## Unreleased
## v1.5.0-alpha.0 / 2018-11-30

* [CHANGE] Disable gzip compression of kube-state-metrics responses by default. Can be re-enabled via `--enable-gzip-encoding`. See #563 for more details.
* [FEATURE] Add `kube_replicaset_owner` metric (#520).
* [FEATURE] Add `kube_pod_container_status_last_terminated_reason` metric (#535).
* [FEATURE] Add `stateful_set_status.{current,update}_revision` metric (#545).
* [FEATURE] Add pod disruption budget collector (#551).
* [FEATURE] Make kube-state-metrics usable as a library (#575).
* [FEATURE] Add `kube_service_spec_external_ip` metric and add `external_name` and `load_balancer_ip` label to `kube_service_info` metric (#571).
* [ENHANCEMENT] Add uid info in `kube_pod_info` metric (#508).
* [ENHANCEMENT] Update addon-resizer to 1.8.3 and increase resource limits (#552).
* [ENHANCEMENT] Improve metric caching and rendering performance (#498).
* [ENHANCEMENT] Adding CreateContainerConfigError as possible reason for container not starting (#578).

## v1.4.0 / 2018-08-22

Expand Down Expand Up @@ -29,6 +41,10 @@ After a testing period of 12 days, there were no additional bugs found or featur

## v1.3.0-rc.0 / 2018-03-23

* [CHANGE] Removed `--in-cluster` flag in [#371](https://github.com/kubernetes/kube-state-metrics/pull/371).
Users can no longer specify `--apiserver` with `--in-cluster=true`. To
emulate this behaviour in future releases, set the `KUBERNETES_SERVICE_HOST`
environment variable to the value of the `--apiserver` argument.
* [FEATURE] Allow specifying multiple namespaces.
* [FEATURE] Add `kube_pod_completion_time`, `kube_pod_spec_volumes_persistentvolumeclaims_info`, and `kube_pod_spec_volumes_persistentvolumeclaims_readonly` metrics to the Pod collector.
* [FEATURE] Add `kube_node_spec_taint` metric.
Expand Down
15 changes: 15 additions & 0 deletions Documentation/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ Any contribution to improving this documentation or adding sample usages will be
- [Metrics Stages](#metrics-stages)
- [Metrics Deprecation](#metrics-deprecation)
- [Exposed Metrics](#exposed-metrics)
- [Join Metrics](#join-metrics)

## Metrics Stages
Stages about metrics are grouped into three categories:
Expand Down Expand Up @@ -49,6 +50,7 @@ Per group of metrics there is one file for each metrics. See each file for speci
* [PersistentVolume Metrics](persistentvolume-metrics.md)
* [PersistentVolumeClaim Metrics](persistentvolumeclaim-metrics.md)
* [Pod Metrics](pod-metrics.md)
* [Pod Disruption Budget Metrics](poddisruptionbudget-metrics.md)
* [ReplicaSet Metrics](replicaset-metrics.md)
* [ReplicationController Metrics](replicationcontroller-metrics.md)
* [ResourceQuota Metrics](resourcequota-metrics.md)
Expand All @@ -59,3 +61,16 @@ Per group of metrics there is one file for each metrics. See each file for speci
* [Endpoint Metrics](endpoint-metrics.md)
* [Secret Metrics](secret-metrics.md)
* [ConfigMap Metrics](configmap-metrics.md)


## Join Metrics
When a label that is not provided by default is needed, a [Prometheus matching operator](https://prometheus.io/docs/prometheus/latest/querying/operators/#vector-matching)
can be used to extend the output of a single metric.

This example adds `label_release` to the set of default labels of the `kube_pod_status_ready` metric
and allows you to select or group the metrics by Helm release label:

```
kube_pod_status_ready * on (namespace, pod) group_left(label_release) kube_pod_labels
```

177 changes: 177 additions & 0 deletions Documentation/design/metrics-store-performance-optimization.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,177 @@
# Kube-State-Metrics - Performance Optimization Proposal


---

Author: Max Inden (IndenML@gmail.com)

Date: 23. July 2018

Target release: v1.5.0

---


## Glossary

- kube-state-metrics: “Simple service that listens to the Kubernetes API server
and generates metrics about the state of the objects”

- Time series: A single line in a /metrics response e.g.
“metric_name{label="value"} 1”


## Problem Statement

There have been repeated reports of two issues when running kube-state-metrics
on production Kubernetes clusters. First, kube-state-metrics takes a long time
(“10s - 20s”) to respond on its /metrics endpoint, leading to Prometheus
instances dropping the scrape request and marking the given time series as
stale. Second, kube-state-metrics uses a lot of memory and is therefore
out-of-memory killed when Kubernetes resource limits are set low.


## Goal

The goal of this proposal can be split into the following sub-goals ordered by
their priority:

1. Decrease response time on /metrics endpoint

2. Decrease overall runtime memory usage


## Status Quo

Instead of requesting the needed information from the Kubernetes API-Server on
demand (on scrape), kube-state-metrics uses the Kubernetes client-go cache tool
to keep a full in memory representation of all Kubernetes objects of a given
cluster. Using the cache speeds up the performance critical path of replying to
a scrape request, and reduces the load on the Kubernetes API-Server by only
sending deltas whenever they occur. Kube-state-metrics does not make use of all
properties and sub-objects of these Kubernetes objects that it stores in its
cache.

On a scrape request on the /metrics endpoint, e.g. by Prometheus,
kube-state-metrics calculates the configured time series on demand based on the
objects in its cache and converts them to the Prometheus string representation.


## Proposal

Instead of a full representation of all Kubernetes objects with all their
properties in memory via the Kubernetes client-go cache, use a map, addressable
by the Kubernetes object uuid, containing all time series of that object as a
single multi-line string.

```
var cache = map[uuid][]byte{}
```

Kube-state-metrics listens on add, update and delete events via Kubernetes
client-go reflectors. On add and update events kube-state-metrics generates all
time series related to the Kubernetes object based on the event’s payload,
concatenates the time series to a single byte slice and sets / replaces the byte
slice in the store at the uuid of the Kubernetes object. One can precompute the
length of a time series byte slice before allocation as the sum of the length of
the metric name, label keys and values as well as the metric value in string
representation. On delete events kube-state-metrics deletes the uuid entry of
the given Kubernetes object in the cache map.

On a scrape request on the /metrics endpoint, kube-state-metrics iterates over
the cache map and concatenates all time series string blobs into a single
string, which is finally passed on as a response.
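
The store described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual kube-state-metrics implementation: the names `UID`, `Sample`, `MetricsStore`, `render` and `Render` are hypothetical, label values are not escaped, and blobs are concatenated in sorted UID order only to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"sync"
)

// UID is a hypothetical stand-in for the Kubernetes object UID.
type UID string

// Sample is a hypothetical, simplified time series: a metric name,
// ordered label key/value pairs and a value.
type Sample struct {
	Name   string
	Labels [][2]string
	Value  float64
}

// render turns all samples of one object into a single multi-line blob.
// The capacity is precomputed from the lengths of names, labels and values,
// so the buffer is allocated once.
func render(samples []Sample) []byte {
	values := make([]string, len(samples))
	size := 0
	for i, s := range samples {
		values[i] = strconv.FormatFloat(s.Value, 'g', -1, 64)
		size += len(s.Name) + len("{} \n") + len(values[i])
		for _, l := range s.Labels {
			// Upper bound: counts one separator per label, including the last.
			size += len(l[0]) + len(l[1]) + len(`="",`)
		}
	}
	buf := make([]byte, 0, size)
	for i, s := range samples {
		buf = append(buf, s.Name...)
		buf = append(buf, '{')
		for j, l := range s.Labels {
			if j > 0 {
				buf = append(buf, ',')
			}
			buf = append(buf, l[0]...)
			buf = append(buf, '=', '"')
			buf = append(buf, l[1]...)
			buf = append(buf, '"')
		}
		buf = append(buf, '}', ' ')
		buf = append(buf, values[i]...)
		buf = append(buf, '\n')
	}
	return buf
}

// MetricsStore maps an object UID to its pre-rendered time series blob.
type MetricsStore struct {
	mu    sync.RWMutex
	blobs map[UID][]byte
}

func NewMetricsStore() *MetricsStore {
	return &MetricsStore{blobs: map[UID][]byte{}}
}

// Add handles add and update events: it sets or replaces the blob.
func (s *MetricsStore) Add(uid UID, samples []Sample) {
	blob := render(samples) // render before taking the lock
	s.mu.Lock()
	defer s.mu.Unlock()
	s.blobs[uid] = blob
}

// Delete handles delete events.
func (s *MetricsStore) Delete(uid UID) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.blobs, uid)
}

// Render concatenates all blobs into one scrape response body.
func (s *MetricsStore) Render() []byte {
	s.mu.RLock()
	defer s.mu.RUnlock()
	uids := make([]string, 0, len(s.blobs))
	total := 0
	for uid, blob := range s.blobs {
		uids = append(uids, string(uid))
		total += len(blob)
	}
	sort.Strings(uids)
	out := make([]byte, 0, total)
	for _, uid := range uids {
		out = append(out, s.blobs[UID(uid)]...)
	}
	return out
}

func main() {
	store := NewMetricsStore()
	store.Add("uid-1", []Sample{{
		Name:   "kube_pod_info",
		Labels: [][2]string{{"namespace", "default"}, {"pod", "p1"}},
		Value:  1,
	}})
	fmt.Print(string(store.Render()))
}
```

The scrape path only copies bytes that were rendered when the event arrived, which is where the proposal expects the response-time improvement to come from.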

```
+---------------+ +-----------+ +---------------+ +-------------------+
| pod_reflector | | pod_store | | pod_collector | | metrics_endpoint |
+---------------+ +-----------+ +---------------+ +-------------------+
-------------\ | | | |
| new pod p1 |-| | | |
|------------| | | | |
| | | |
| Add(p1) | | |
|-------------->| | |
| | ----------------------\ | |
| |-| generateMetrics(p1) | | |
| | |---------------------| | |
| | | |
| nil | | |
|<--------------| | |
| | | | ---------------\
| | | |-| GET /metrics |
| | | | |--------------|
| | | |
| | | Collect() |
| | |<--------------------------|
| | | |
| | GetAll() | |
| |<------------------------------| |
| | | |
| | []string{metrics} | |
| |------------------------------>| |
| | | |
| | | concat(metrics) |
| | |-------------------------->|
| | | |
```

<details>
<summary>Code to reproduce diagram</summary>

Build via [text-diagram](http://weidagang.github.io/text-diagram/)

```
object pod_reflector pod_store pod_collector metrics_endpoint
note left of pod_reflector: new pod p1
pod_reflector -> pod_store: Add(p1)
note right of pod_store: generateMetrics(p1)
pod_store -> pod_reflector: nil
note right of metrics_endpoint: GET /metrics
metrics_endpoint -> pod_collector: Collect()
pod_collector -> pod_store: GetAll()
pod_store -> pod_collector: []string{metrics}
pod_collector -> metrics_endpoint: concat(metrics)
```

</details>


## FAQ / Follow up improvements

- If kube-state-metrics only listens on add, update and delete events, how is it
aware of already existing Kubernetes objects created before kube-state-metrics
was started? Leveraging Kubernetes client-go, reflectors can initialize all
existing objects before any add, update or delete events. To ensure no events
are missed in the long run, periodic resyncs via Kubernetes client-go can be
triggered. This extra confidence is not a must and should be compared to its
costs, as Kubernetes client-go already gives decent guarantees on event
delivery.

- What about metadata (HELP and description) in the /metrics output? As a first
iteration they would be skipped until we have a better idea on the design.

- How can the cache map be concurrently accessed? The core Go map
implementation is not thread-safe. As a first iteration a simple mutex should
be sufficient. Go's sync.Map might be considered.

- To solve the problem of out-of-order events sent by the Kubernetes API-Server
to kube-state-metrics, the cache map can keep the Kubernetes resource version
alongside each blob of time series. On add and update events, first compare
the resource version of the event with the resource version in the cache.
Only move forward if the former is higher than the latter.

- In case the memory consumption of the time series string blobs is a problem
the following optimization can be considered: Among the time series strings,
multiple sub-strings will be heavily duplicated like the metric name. Instead
of saving unstructured strings inside the cache map, one can structure them,
using pointers to deduplicate e.g. metric names.

- ...

- Kube-state-metrics does not make use of all properties of all Kubernetes
objects. Instead of unmarshalling unused properties, their json struct tags or
their Protobuf representation could be removed.
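
The resource version check from the FAQ above could look roughly like the sketch below. It is an illustration only: `VersionedStore`, `entry` and `Upsert` are hypothetical names, and the code assumes that resource versions parse as unsigned integers. Kubernetes documents resource versions as opaque strings, so the numeric comparison is an assumption taken from the proposal text, not a guarantee of the API.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// entry keeps the rendered time series together with the resource version
// of the event that produced them.
type entry struct {
	resourceVersion uint64
	blob            []byte
}

// VersionedStore is a hypothetical cache map guarded by a simple mutex.
type VersionedStore struct {
	mu      sync.Mutex
	entries map[string]entry
}

func NewVersionedStore() *VersionedStore {
	return &VersionedStore{entries: map[string]entry{}}
}

// Upsert stores blob under uid only if resourceVersion is strictly higher
// than the one already cached. It reports whether the write was applied.
func (s *VersionedStore) Upsert(uid, resourceVersion string, blob []byte) (bool, error) {
	// Assumption: the resource version is a base-10 unsigned integer.
	rv, err := strconv.ParseUint(resourceVersion, 10, 64)
	if err != nil {
		return false, fmt.Errorf("parse resource version %q: %w", resourceVersion, err)
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	if cur, ok := s.entries[uid]; ok && rv <= cur.resourceVersion {
		return false, nil // stale or duplicate event, keep the cached blob
	}
	s.entries[uid] = entry{resourceVersion: rv, blob: blob}
	return true, nil
}

func main() {
	store := NewVersionedStore()
	for _, rv := range []string{"10", "9", "11"} {
		applied, err := store.Upsert("uid-1", rv, []byte("blob@"+rv))
		fmt.Println(rv, applied, err)
	}
}
```

Events that arrive with an equal or lower resource version are dropped silently here; whether they should be logged or counted is a design choice the proposal leaves open.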
10 changes: 10 additions & 0 deletions Documentation/poddisruptionbudget-metrics.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
# PodDisruptionBudget Metrics

| Metric name| Metric type | Labels/tags | Status |
| ---------- | ----------- | ----------- | ----------- |
| kube_poddisruptionbudget_created | Gauge | `poddisruptionbudget`=&lt;pdb-name&gt; <br> `namespace`=&lt;pdb-namespace&gt; | STABLE |
| kube_poddisruptionbudget_status_current_healthy | Gauge | `poddisruptionbudget`=&lt;pdb-name&gt; <br> `namespace`=&lt;pdb-namespace&gt; | STABLE |
| kube_poddisruptionbudget_status_desired_healthy | Gauge | `poddisruptionbudget`=&lt;pdb-name&gt; <br> `namespace`=&lt;pdb-namespace&gt; | STABLE |
| kube_poddisruptionbudget_status_pod_disruptions_allowed | Gauge | `poddisruptionbudget`=&lt;pdb-name&gt; <br> `namespace`=&lt;pdb-namespace&gt; | STABLE |
| kube_poddisruptionbudget_status_expected_pods | Gauge | `poddisruptionbudget`=&lt;pdb-name&gt; <br> `namespace`=&lt;pdb-namespace&gt; | STABLE |
| kube_poddisruptionbudget_status_observed_generation | Gauge | `poddisruptionbudget`=&lt;pdb-name&gt; <br> `namespace`=&lt;pdb-namespace&gt; | STABLE |
4 changes: 3 additions & 1 deletion Documentation/service-metrics.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,9 @@

| Metric name| Metric type | Labels/tags | Status |
| ---------- | ----------- | ----------- | ----------- |
| kube_service_info | Gauge | `service`=&lt;service-name&gt; <br> `namespace`=&lt;service-namespace&gt; <br> `cluster_ip`=&lt;service cluster ip&gt; | STABLE |
| kube_service_info | Gauge | `service`=&lt;service-name&gt; <br> `namespace`=&lt;service-namespace&gt; <br> `cluster_ip`=&lt;service cluster ip&gt; <br> `external_name`=&lt;service external name&gt; <br> `load_balancer_ip`=&lt;service load balancer ip&gt; | STABLE |
| kube_service_labels | Gauge | `service`=&lt;service-name&gt; <br> `namespace`=&lt;service-namespace&gt; <br> `label_SERVICE_LABEL`=&lt;SERVICE_LABEL&gt; | STABLE |
| kube_service_created | Gauge | `service`=&lt;service-name&gt; <br> `namespace`=&lt;service-namespace&gt; | STABLE |
| kube_service_spec_type | Gauge | `service`=&lt;service-name&gt; <br> `namespace`=&lt;service-namespace&gt; <br> `type`=&lt;ClusterIP\|NodePort\|LoadBalancer\|ExternalName&gt; | STABLE |
| kube_service_spec_external_ip | Gauge | `service`=&lt;service-name&gt; <br> `namespace`=&lt;service-namespace&gt; <br> `external_ip`=&lt;external-ip&gt; | STABLE |
| kube_service_status_load_balancer_ingress | Gauge | `service`=&lt;service-name&gt; <br> `namespace`=&lt;service-namespace&gt; <br> `ip`=&lt;load-balancer-ingress-ip&gt; <br> `hostname`=&lt;load-balancer-ingress-hostname&gt; | STABLE |
2 changes: 2 additions & 0 deletions Documentation/statefulset-metrics.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,3 +11,5 @@
| kube_statefulset_metadata_generation | Gauge | `statefulset`=&lt;statefulset-name&gt; <br> `namespace`=&lt;statefulset-namespace&gt; | STABLE |
| kube_statefulset_created | Gauge | `statefulset`=&lt;statefulset-name&gt; <br> `namespace`=&lt;statefulset-namespace&gt; | STABLE |
| kube_statefulset_labels | Gauge | `statefulset`=&lt;statefulset-name&gt; <br> `namespace`=&lt;statefulset-namespace&gt; <br> `label_STATEFULSET_LABEL`=&lt;STATEFULSET_LABEL&gt; | STABLE |
| kube_statefulset_status_current_revision | Gauge | `statefulset`=&lt;statefulset-name&gt; <br> `namespace`=&lt;statefulset-namespace&gt; <br> `revision`=&lt;statefulset-current-revision&gt; | STABLE |
| kube_statefulset_status_update_revision | Gauge | `statefulset`=&lt;statefulset-name&gt; <br> `namespace`=&lt;statefulset-namespace&gt; <br> `revision`=&lt;statefulset-update-revision&gt; | STABLE |
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ BuildDate = $(shell date -u +'%Y-%m-%dT%H:%M:%SZ')
Commit = $(shell git rev-parse --short HEAD)
ALL_ARCH = amd64 arm arm64 ppc64le s390x
PKG=k8s.io/kube-state-metrics/pkg
GO_VERSION=1.10.3
GO_VERSION=1.11.2

IMAGE = $(REGISTRY)/kube-state-metrics
MULTI_ARCH_IMG = $(IMAGE)-$(ARCH)
Expand Down
20 changes: 10 additions & 10 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,12 +17,10 @@ certain heuristics to display comprehensible messages. kube-state-metrics
exposes raw data unmodified from the Kubernetes API, this way users have all the
data they require and perform heuristics as they see fit.

The metrics are exported through the [Prometheus golang
client](https://github.com/prometheus/client_golang) on the HTTP endpoint `/metrics` on
the listening port (default 80). They are served either as plaintext or
protobuf depending on the `Accept` header. They are designed to be consumed
either by Prometheus itself or by a scraper that is compatible with scraping
a Prometheus client endpoint. You can also open `/metrics` in a browser to see
The metrics are exported on the HTTP endpoint `/metrics` on the listening port
(default 80). They are served as plaintext. They are designed to be consumed
either by Prometheus itself or by a scraper that is compatible with scraping a
Prometheus client endpoint. You can also open `/metrics` in a browser to see
the raw metrics.

## Table of Contents
Expand All @@ -35,12 +33,12 @@ the raw metrics.
- [Metrics Documentation](#metrics-documentation)
- [Kube-state-metrics self metrics](#kube-state-metrics-self-metrics)
- [Resource recommendation](#resource-recommendation)
- [kube-state-metrics vs. Heapster(metrics-server)](#kube-state-metrics-vs-heapster)
- [kube-state-metrics vs. Heapster(metrics-server)](#kube-state-metrics-vs-heapstermetrics-server)
- [Setup](#setup)
- [Building the Docker container](#building-the-docker-container)
- [Usage](#usage)
- [Kubernetes Deployment](#kubernetes-deployment)
- [Deployment](#deployment)
- [Development](#development)

### Versioning

Expand All @@ -55,9 +53,9 @@ All additional compatibility is only best effort, or happens to still/already be
#### Compatibility matrix
At most 5 kube-state-metrics releases will be recorded below.

| kube-state-metrics | client-go | **Kubernetes 1.8** | **Kubernetes 1.9** | **Kubernetes 1.10** | **Kubernetes 1.11** |
| kube-state-metrics | client-go | **Kubernetes 1.9** | **Kubernetes 1.10** | **Kubernetes 1.11** | **Kubernetes 1.12** |
|--------------------|-----------|--------------------|--------------------|--------------------|--------------------|
| **v1.1.0** | release-5.0 ||| | - |
| **v1.1.0** | release-5.0 ||| - | - |
| **v1.2.0** | v6.0.0 |||||
| **v1.3.0** | v6.0.0 |||||
| **v1.3.1** | v6.0.0 |||||
Expand Down Expand Up @@ -198,6 +196,8 @@ metrics right away.
kubectl create clusterrolebinding cluster-admin-binding --clusterrole=cluster-admin --user=$(gcloud info | grep Account | cut -d '[' -f 2 | cut -d ']' -f 1)
```

Note that your GCP identity is case sensitive but `gcloud info` as of Google Cloud SDK 221.0.0 is not. This means that if your IAM member contains capital letters, the above one-liner may not work for you. If you have 403 forbidden responses after running the above command and kubectl apply -f kubernetes, check the IAM member associated with your account at https://console.cloud.google.com/iam-admin/iam?project=PROJECT_ID. If it contains capital letters, you may need to set the --user flag in the command above to the case-sensitive role listed at https://console.cloud.google.com/iam-admin/iam?project=PROJECT_ID.

After running the above, if you see `Clusterrolebinding "cluster-admin-binding" created`, then you are able to continue with the setup of this service.

#### Development