Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Support scraping pods metrics that is still in scheduling status and has no assigned node #2217

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 17 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -250,7 +250,7 @@ The downside of using an auto-sharded setup comes from the rollout strategy supp

For pod metrics, they can be sharded per node with the following flag:

* `--node`
* `--node=$(NODE_NAME)`

Each kube-state-metrics pod uses FieldSelector (spec.nodeName) to watch/list pod metrics only on the same node.

Expand All @@ -276,6 +276,22 @@ spec:
fieldPath: spec.nodeName
```

To track metrics for unassigned pods, you need to add an additional deployment and set `--node=""`, as shown in the following example:
mrueg marked this conversation as resolved.
Show resolved Hide resolved

```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mickeyzzc can you add an empty line before the codeblock to fix the lint? Thanks!

apiVersion: apps/v1
kind: Deployment
spec:
template:
spec:
containers:
- image: registry.k8s.io/kube-state-metrics/kube-state-metrics:IMAGE_TAG
name: kube-state-metrics
args:
- --resources=pods
- --node=""
```

Other metrics can be sharded via [Horizontal sharding](#horizontal-sharding).

### Setup
Expand Down
58 changes: 58 additions & 0 deletions examples/daemonsetsharding/deployment-no-node-pods.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
apiVersion: apps/v1
mickeyzzc marked this conversation as resolved.
Show resolved Hide resolved
kind: Deployment
metadata:
labels:
app.kubernetes.io/component: exporter
app.kubernetes.io/name: kube-state-metrics-pods
app.kubernetes.io/version: 2.10.0
name: kube-state-metrics-pods
namespace: kube-system
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: kube-state-metrics
mickeyzzc marked this conversation as resolved.
Show resolved Hide resolved
template:
metadata:
labels:
app.kubernetes.io/component: exporter
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/version: 2.10.0
spec:
automountServiceAccountToken: true
containers:
- args:
- --resources=pods
- --node=""
image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.10.0
livenessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 5
timeoutSeconds: 5
name: kube-state-metrics
ports:
- containerPort: 8080
name: http-metrics
- containerPort: 8081
name: telemetry
readinessProbe:
httpGet:
path: /
port: 8081
initialDelaySeconds: 5
timeoutSeconds: 5
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 65534
seccompProfile:
type: RuntimeDefault
nodeSelector:
kubernetes.io/os: linux
serviceAccountName: kube-state-metrics
24 changes: 24 additions & 0 deletions jsonnet/kube-state-metrics/kube-state-metrics.libsonnet
Original file line number Diff line number Diff line change
Expand Up @@ -373,6 +373,30 @@
},
),

deploymentNoNodePods:
local c = ksm.deployment.spec.template.spec.containers[0] {
args: [
'--resources=pods',
'--node=""',
],
};
local shardksmname = ksm.name + "-pods";
std.mergePatch(ksm.deployment,
{
metadata: {
name: shardksmname,
labels: {'app.kubernetes.io/name': shardksmname}
},
spec: {
template: {
spec: {
containers: [c],
},
},
},
},
),

daemonset:
// extending the default container from above
local c0 = ksm.deployment.spec.template.spec.containers[0] {
Expand Down
5 changes: 3 additions & 2 deletions pkg/options/options.go
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,7 @@ func NewOptions() *Options {
MetricAllowlist: MetricSet{},
MetricDenylist: MetricSet{},
MetricOptInList: MetricSet{},
Node: NodeType{},
AnnotationsAllowList: LabelsAllowList{},
LabelsAllowList: LabelsAllowList{},
}
Expand Down Expand Up @@ -136,7 +137,7 @@ func (o *Options) AddFlags(cmd *cobra.Command) {
o.cmd.Flags().StringVar(&o.TLSConfig, "tls-config", "", "Path to the TLS configuration file")
o.cmd.Flags().StringVar(&o.TelemetryHost, "telemetry-host", "::", `Host to expose kube-state-metrics self metrics on.`)
o.cmd.Flags().StringVar(&o.Config, "config", "", "Path to the kube-state-metrics options config file")
o.cmd.Flags().StringVar((*string)(&o.Node), "node", "", "Name of the node that contains the kube-state-metrics pod. Most likely it should be passed via the downward API. This is used for daemonset sharding. Only available for resources (pod metrics) that support spec.nodeName fieldSelector. This is experimental.")
o.cmd.Flags().Var(&o.Node, "node", "Name of the node that contains the kube-state-metrics pod. Most likely it should be passed via the downward API. This is used for daemonset sharding. Only available for resources (pod metrics) that support spec.nodeName fieldSelector. This is experimental.")
o.cmd.Flags().Var(&o.AnnotationsAllowList, "metric-annotations-allowlist", "Comma-separated list of Kubernetes annotations keys that will be used in the resource' labels metric. By default the annotations metrics are not exposed. To include them, provide a list of resource names in their plural form and Kubernetes annotation keys you would like to allow for them (Example: '=namespaces=[kubernetes.io/team,...],pods=[kubernetes.io/team],...)'. A single '*' can be provided per resource instead to allow any annotations, but that has severe performance implications (Example: '=pods=[*]').")
o.cmd.Flags().Var(&o.LabelsAllowList, "metric-labels-allowlist", "Comma-separated list of additional Kubernetes label keys that will be used in the resource' labels metric. By default the labels metrics are not exposed. To include them, provide a list of resource names in their plural form and Kubernetes label keys you would like to allow for them (Example: '=namespaces=[k8s-label-1,k8s-label-n,...],pods=[app],...)'. A single '*' can be provided per resource instead to allow any labels, but that has severe performance implications (Example: '=pods=[*]'). Additionally, an asterisk (*) can be provided as a key, which will resolve to all resources, i.e., assuming '--resources=deployments,pods', '=*=[*]' will resolve to '=deployments=[*],pods=[*]'.")
o.cmd.Flags().Var(&o.MetricAllowlist, "metric-allowlist", "Comma-separated list of metrics to be exposed. This list comprises of exact metric names and/or regex patterns. The allowlist and denylist are mutually exclusive.")
Expand All @@ -161,7 +162,7 @@ func (o *Options) Usage() {
// Validate validates arguments
func (o *Options) Validate() error {
shardableResource := "pods"
if o.Node == "" {
if o.Node.String() == "" {
return nil
}
for _, x := range o.Resources.AsSlice() {
Expand Down
51 changes: 47 additions & 4 deletions pkg/options/types.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ package options

import (
"errors"
"regexp"
"sort"
"strings"

Expand Down Expand Up @@ -105,14 +106,56 @@ func (r *ResourceSet) Type() string {
}

// NodeType represents a nodeName to query from.
type NodeType string
type NodeType map[string]struct{}

// Set converts a comma-separated string of nodename into a slice and appends it to the NodeList
func (n *NodeType) Set(value string) error {
s := *n
mickeyzzc marked this conversation as resolved.
Show resolved Hide resolved
cols := strings.Split(value, ",")
for _, col := range cols {
col = strings.TrimSpace(col)
if len(col) != 0 {
s[col] = struct{}{}
}
}
return nil
}

// AsSlice returns the LabelsAllowList in the form of plain string slice.
func (n NodeType) AsSlice() []string {
cols := make([]string, 0, len(n))
for col := range n {
cols = append(cols, col)
}
return cols
}

func (n NodeType) String() string {
return strings.Join(n.AsSlice(), ",")
}

// Type returns a descriptive string about the NodeList type.
func (n *NodeType) Type() string {
return "string"
}

// GetNodeFieldSelector returns a nodename field selector.
func (n *NodeType) GetNodeFieldSelector() string {
if string(*n) != "" {
return fields.OneTermEqualSelector("spec.nodeName", string(*n)).String()
if nil == n || len(*n) == 0 {
klog.InfoS("Using node type is nil")
return EmptyFieldSelector()
}
return EmptyFieldSelector()
pattern := "[^a-zA-Z0-9_,-]+"
re := regexp.MustCompile(pattern)
result := re.ReplaceAllString(n.String(), "")
Copy link
Contributor

@CatherineF-dev CatherineF-dev Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QQ: why we need re.ReplaceAllString(n.String(), "")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove re.ReplaceAllString(n.String(), ""). If node name isn't correct, keep it as before. Then users can know node name isn't correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QQ: why we need re.ReplaceAllString(n.String(), "")

There are two considerations for regularization here:

  1. Handle the difference between "" and nil
  2. Correction of node name

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QQ: could you give two examples for these two cases?

We shouldn't change node name, which users are not aware of.

Copy link
Contributor

@CatherineF-dev CatherineF-dev Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping~

Keep node name as same from input, or return errors during validation. We should not hide errors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mickeyzzc can you address this comment?

klog.InfoS("Using node type", "node", result)
return fields.OneTermEqualSelector("spec.nodeName", result).String()

}

// NodeValue represents a nodeName to query from.
type NodeValue interface {
GetNodeFieldSelector() string
}

// EmptyFieldSelector returns an empty field selector.
Expand Down
71 changes: 56 additions & 15 deletions pkg/options/types_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -162,13 +162,30 @@ func TestNodeFieldSelector(t *testing.T) {
Wanted string
}{
{
mickeyzzc marked this conversation as resolved.
Show resolved Hide resolved
Desc: "empty node name",
Node: "",
Desc: "with node name",
Wanted: "",
},
{
Desc: "with node name",
Node: "k8s-node-1",
Node: nil,
Wanted: "",
},
{
Desc: "empty node name",
Node: NodeType(
map[string]struct{}{
"": {},
},
),
Wanted: "spec.nodeName=",
},
{
Desc: "with node name",
Node: NodeType(
map[string]struct{}{
"k8s-node-1": {},
},
),
Wanted: "spec.nodeName=k8s-node-1",
},
}
Expand All @@ -194,43 +211,67 @@ func TestMergeFieldSelectors(t *testing.T) {
Desc: "empty DeniedNamespaces",
Namespaces: NamespaceList{"default", "kube-system"},
DeniedNamespaces: NamespaceList{},
Node: "",
Wanted: "",
Node: NodeType(
map[string]struct{}{
"": {},
},
),
Wanted: "spec.nodeName=",
},
{
Desc: "all DeniedNamespaces",
Namespaces: DefaultNamespaces,
DeniedNamespaces: NamespaceList{"some-system"},
Node: "",
Wanted: "metadata.namespace!=some-system",
Node: NodeType(
map[string]struct{}{
"": {},
},
),
Wanted: "metadata.namespace!=some-system,spec.nodeName=",
},
{
Desc: "general case",
Namespaces: DefaultNamespaces,
DeniedNamespaces: NamespaceList{"case1-system", "case2-system"},
Node: "",
Wanted: "metadata.namespace!=case1-system,metadata.namespace!=case2-system",
Node: NodeType(
map[string]struct{}{
"": {},
},
),
Wanted: "metadata.namespace!=case1-system,metadata.namespace!=case2-system,spec.nodeName=",
},
{
Desc: "empty DeniedNamespaces",
Namespaces: NamespaceList{"default", "kube-system"},
DeniedNamespaces: NamespaceList{},
Node: "k8s-node-1",
Wanted: "spec.nodeName=k8s-node-1",
Node: NodeType(
map[string]struct{}{
"k8s-node-1": {},
},
),
Wanted: "spec.nodeName=k8s-node-1",
},
{
Desc: "all DeniedNamespaces",
Namespaces: DefaultNamespaces,
DeniedNamespaces: NamespaceList{"some-system"},
Node: "k8s-node-1",
Wanted: "metadata.namespace!=some-system,spec.nodeName=k8s-node-1",
Node: NodeType(
map[string]struct{}{
"k8s-node-1": {},
},
),
Wanted: "metadata.namespace!=some-system,spec.nodeName=k8s-node-1",
},
{
Desc: "general case",
Namespaces: DefaultNamespaces,
DeniedNamespaces: NamespaceList{"case1-system", "case2-system"},
Node: "k8s-node-1",
Wanted: "metadata.namespace!=case1-system,metadata.namespace!=case2-system,spec.nodeName=k8s-node-1",
Node: NodeType(
map[string]struct{}{
"k8s-node-1": {},
},
),
Wanted: "metadata.namespace!=case1-system,metadata.namespace!=case2-system,spec.nodeName=k8s-node-1",
},
}

Expand Down