Component(s)

exporter/prometheusremotewrite

Describe the issue you're reporting
I have installed the OpenTelemetry Collector on a Kubernetes cluster to monitor cluster metrics.
Below are the Helm chart values:
# Default values for opentelemetry-collector.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
nameOverride: ""
fullnameOverride: ""
# Valid values are "daemonset", "deployment", and "statefulset".
mode: daemonset
# Specify which namespace should be used to deploy the resources into
namespaceOverride: ""
# Handles basic configuration of components that
# also require k8s modifications to work correctly.
# .Values.config can be used to modify/add to a preset
# component configuration, but CANNOT be used to remove
# preset configuration. If you require removal of any
# sections of a preset configuration, you cannot use
# the preset. Instead, configure the component manually in
# .Values.config and use the other fields supplied in the
# values.yaml to configure k8s as necessary.
presets:
# Configures the collector to collect logs.
# Adds the filelog receiver to the logs pipeline
# and adds the necessary volumes and volume mounts.
# Best used with mode = daemonset.
# See https://opentelemetry.io/docs/kubernetes/collector/components/#filelog-receiver for details on the receiver.
logsCollection:
enabled: false
includeCollectorLogs: false
# Enabling this writes checkpoints to the /var/lib/otelcol/ host directory.
# Note this changes the collector's user to root so that it can write to the host directory.
storeCheckpoints: false
# The maximum bytes size of the recombined field.
# Once the size exceeds the limit, all received entries of the source will be combined and flushed.
maxRecombineLogSize: 102400
# Configures the collector to collect host metrics.
# Adds the hostmetrics receiver to the metrics pipeline
# and adds the necessary volumes and volume mounts.
# Best used with mode = daemonset.
# See https://opentelemetry.io/docs/kubernetes/collector/components/#host-metrics-receiver for details on the receiver.
hostMetrics:
enabled: false
# Configures the Kubernetes Processor to add Kubernetes metadata.
# Adds the k8sattributes processor to all the pipelines
# and adds the necessary rules to ClusterRole.
# Best used with mode = daemonset.
# See https://opentelemetry.io/docs/kubernetes/collector/components/#kubernetes-attributes-processor for details on the processor.
kubernetesAttributes:
enabled: false
# When enabled the processor will extract all labels for an associated pod and add them as resource attributes.
# The label's exact name will be the key.
extractAllPodLabels: true
# When enabled the processor will extract all annotations for an associated pod and add them as resource attributes.
# The annotation's exact name will be the key.
extractAllPodAnnotations: true
# Configures the collector to collect node, pod, and container metrics from the API server on a kubelet.
# Adds the kubeletstats receiver to the metrics pipeline
# and adds the necessary rules to ClusterRole.
# Best used with mode = daemonset.
# See https://opentelemetry.io/docs/kubernetes/collector/components/#kubeletstats-receiver for details on the receiver.
kubeletMetrics:
enabled: false
# Configures the collector to collect kubernetes events.
# Adds the k8sobject receiver to the logs pipeline
# and collects kubernetes events by default.
# Best used with mode = deployment or statefulset.
# See https://opentelemetry.io/docs/kubernetes/collector/components/#kubernetes-objects-receiver for details on the receiver.
kubernetesEvents:
enabled: false
# Configures the Kubernetes Cluster Receiver to collect cluster-level metrics.
# Adds the k8s_cluster receiver to the metrics pipeline
# and adds the necessary rules to ClusterRole.
# Best used with mode = deployment or statefulset.
# See https://opentelemetry.io/docs/kubernetes/collector/components/#kubernetes-cluster-receiver for details on the receiver.
clusterMetrics:
enabled: false
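# Side note on the setup below: with mode = daemonset, the manually configured
# k8s_cluster receiver further down runs on every node, so each cluster-level
# metric is reported once per node (which inflates counts). A sketch of the
# preset-based alternative, best run in a separate deployment-mode collector:
#   presets:
#     clusterMetrics:
#       enabled: true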
configMap:
# Specifies whether a configMap should be created (true by default)
create: true
# Specifies an existing ConfigMap to be mounted to the pod
# The ConfigMap MUST include the collector configuration via a key named 'relay' or the collector will not start.
existingName: ""
# Base collector configuration.
# Supports templating. To escape existing instances of {{ }}, use {{` <original content> `}}.
# For example, {{ env "KUBE_NODE_NAME" }} becomes {{` {{ env "KUBE_NODE_NAME" }} `}}.
config:
extensions:
health_check: {}
zpages:
endpoint: 0.0.0.0:55679
receivers:
otlp:
protocols:
grpc:
http:
cors:
allowed_origins:
- "http://*"
- "https://*"
k8s_cluster:
node_conditions_to_report:
- Ready
- MemoryPressure
allocatable_types_to_report:
- cpu
- memory
k8sobjects:
auth_type: serviceAccount
objects:
- name: pods
mode: pull
label_selector: environment in (production),tier in (frontend)
field_selector: status.phase=Running
interval: 15m
- name: events
mode: watch
group: events.k8s.io
namespaces: [default]
kubeletstats:
collection_interval: 10s
auth_type: serviceAccount
endpoint: '${env:K8S_NODE_IP}:10250'
insecure_skip_verify: true
metric_groups:
- node
- pod
- container
prometheus:
config:
scrape_configs:
- job_name: 'otel-collector'
scrape_interval: 5s
static_configs:
- targets:
- ${env:MY_POD_IP}:8888
exporters:
debug:
verbosity: detailed
prometheusremotewrite:
endpoint: http://txxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/api/v1/receive ## Thanos Receive endpoint
processors:
memory_limiter:
check_interval: 10s
limit_percentage: 50
spike_limit_percentage: 30
k8sattributes:
auth_type: "serviceAccount"
passthrough: false
filter:
node_from_env_var: KUBE_NODE_NAME
extract:
metadata:
- k8s.pod.name
- k8s.pod.uid
- k8s.deployment.name
- k8s.namespace.name
- k8s.node.name
- k8s.pod.start_time
- k8s.daemonset.name
- k8s.statefulset.name
pod_association:
- sources:
- from: resource_attribute
name: k8s.pod.ip
- sources:
- from: resource_attribute
name: k8s.pod.uid
- sources:
- from: connection
batch:
send_batch_size: 1000
send_batch_max_size: 1000
timeout: 10s
resource:
attributes:
- key: host.id
from_attribute: host.name
action: upsert
- key: k8s.cluster.name
value: aks-dev-chronos
action: insert
- key: service.instance.id
from_attribute: k8s.pod.uid
action: insert
resourcedetection:
detectors: [env]
transform:
trace_statements:
- context: span
statements:
- truncate_all(attributes, 4095)
- truncate_all(resource.attributes, 4095)
metricstransform:
transforms:
- include: duration
action: update
new_name: http.server.duration
service:
telemetry:
logs:
level: debug
extensions: [health_check, zpages]
pipelines:
metrics:
receivers:
- otlp
- prometheus
- k8s_cluster
# - kubeletstats
processors:
- memory_limiter
- resourcedetection
- resource
- k8sattributes
- batch
- metricstransform
exporters: [debug, prometheusremotewrite]
# exporters:
# azuremonitor:
# connection_string: InstrumentationKey=63921f2d-8871-4754-baba-7ba7fbe56451;IngestionEndpoint=https://westus-0.in.applicationinsights.azure.com/;LiveEndpoint=https://westus.livediagnostics.monitor.azure.com/;ApplicationId=8a81e786-4b23-4bc6-94f0-a43ba88dc954
# extensions:
# # The health_check extension is mandatory for this chart.
# # Without the health_check extension the collector will fail the readiness and liveness probes.
# # The health_check extension can be modified, but should never be removed.
# health_check:
# endpoint: ${env:MY_POD_IP}:13133
# processors:
# batch: {}
# # Default memory limiter configuration for the collector based on k8s resource limits.
# memory_limiter:
# # check_interval is the time between measurements of memory usage.
# check_interval: 5s
# # By default limit_mib is set to 80% of ".Values.resources.limits.memory"
# limit_percentage: 80
# # By default spike_limit_mib is set to 25% of ".Values.resources.limits.memory"
# spike_limit_percentage: 25
# receivers:
# jaeger:
# protocols:
# grpc:
# endpoint: ${env:MY_POD_IP}:14250
# thrift_http:
# endpoint: ${env:MY_POD_IP}:14268
# thrift_compact:
# endpoint: ${env:MY_POD_IP}:6831
# otlp:
# protocols:
# grpc:
# endpoint: ${env:MY_POD_IP}:4317
# http:
# endpoint: ${env:MY_POD_IP}:4318
# prometheus:
# config:
# scrape_configs:
# - job_name: opentelemetry-collector
# scrape_interval: 10s
# static_configs:
# - targets:
# - ${env:MY_POD_IP}:8888
# zipkin:
# endpoint: ${env:MY_POD_IP}:9411
# service:
# telemetry:
# logs:
# level: info
# metrics:
# address: ${env:MY_POD_IP}:8888
# extensions:
# - health_check
# pipelines:
# logs:
# exporters:
# - debug
# - azuremonitor
# processors:
# - memory_limiter
# - batch
# receivers:
# - otlp
# metrics:
# exporters:
# - debug
# - azuremonitor
# processors:
# - memory_limiter
# - batch
# receivers:
# - otlp
# - prometheus
# traces:
# exporters:
# - debug
# - azuremonitor
# processors:
# - memory_limiter
# - batch
# receivers:
# - otlp
# - jaeger
# - zipkin
image:
# If you want to use the core image `otel/opentelemetry-collector`, you also need to change `command.name` value to `otelcol`.
repository: "otel/opentelemetry-collector-contrib"
pullPolicy: IfNotPresent
# Overrides the image tag whose default is the chart appVersion.
tag: ""
# When digest is set to a non-empty value, images will be pulled by digest (regardless of tag value).
digest: ""
imagePullSecrets: []
# OpenTelemetry Collector executable
command:
name: "otelcol-contrib"
extraArgs: []
serviceAccount:
# Specifies whether a service account should be created
create: true
# Annotations to add to the service account
annotations: {}
# The name of the service account to use.
# If not set and create is true, a name is generated using the fullname template
name: ""
clusterRole:
# Specifies whether a clusterRole should be created
# Some presets also trigger the creation of a cluster role and cluster role binding.
# If using one of those presets, this field is no-op.
create: true
# Annotations to add to the clusterRole
# Can be used in combination with presets that create a cluster role.
annotations: {}
# The name of the clusterRole to use.
# If not set a name is generated using the fullname template
# Can be used in combination with presets that create a cluster role.
name: ""
# A set of rules as documented here : https://kubernetes.io/docs/reference/access-authn-authz/rbac/
# Can be used in combination with presets that create a cluster role to add additional rules.
rules:
- apiGroups:
- ""
resources:
- events
- namespaces
- namespaces/status
- nodes
- nodes/spec
- nodes/stats
- nodes/proxy
- pods
- pods/status
- replicationcontrollers
- replicationcontrollers/status
- resourcequotas
- services
verbs:
- get
- list
- watch
- apiGroups:
- apps
resources:
- daemonsets
- deployments
- replicasets
- statefulsets
verbs:
- get
- list
- watch
- apiGroups:
- extensions
resources:
- daemonsets
- deployments
- replicasets
verbs:
- get
- list
- watch
- apiGroups:
- batch
resources:
- jobs
- cronjobs
verbs:
- get
- list
- watch
- apiGroups:
- autoscaling
resources:
- horizontalpodautoscalers
verbs:
- get
- list
- watch
clusterRoleBinding:
# Annotations to add to the clusterRoleBinding
# Can be used in combination with presets that create a cluster role binding.
annotations: {}
# The name of the clusterRoleBinding to use.
# If not set a name is generated using the fullname template
# Can be used in combination with presets that create a cluster role binding.
name: ""
podSecurityContext: {}
securityContext: {}
nodeSelector: {}
tolerations: []
affinity: {}
topologySpreadConstraints: []
# Allows for pod scheduler prioritisation
priorityClassName: ""
extraEnvs:
- name: K8S_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: K8S_NODE_IP
valueFrom:
fieldRef:
fieldPath: status.hostIP
extraEnvsFrom: []
# This also supports template content, which will eventually be converted to yaml.
extraVolumes: []
# This also supports template content, which will eventually be converted to yaml.
extraVolumeMounts: []
# Configuration for ports
# nodePort is also allowed
ports:
otlp:
enabled: true
containerPort: 4317
servicePort: 4317
hostPort: 4317
protocol: TCP
# nodePort: 30317
appProtocol: grpc
otlp-http:
enabled: true
containerPort: 4318
servicePort: 4318
hostPort: 4318
protocol: TCP
jaeger-compact:
enabled: true
containerPort: 6831
servicePort: 6831
hostPort: 6831
protocol: UDP
jaeger-thrift:
enabled: true
containerPort: 14268
servicePort: 14268
hostPort: 14268
protocol: TCP
jaeger-grpc:
enabled: true
containerPort: 14250
servicePort: 14250
hostPort: 14250
protocol: TCP
zipkin:
enabled: true
containerPort: 9411
servicePort: 9411
hostPort: 9411
protocol: TCP
metrics:
# The metrics port is disabled by default. However you need to enable the port
# in order to use the ServiceMonitor (serviceMonitor.enabled) or PodMonitor (podMonitor.enabled).
enabled: false
containerPort: 8888
servicePort: 8888
protocol: TCP
# When enabled, the chart will set the GOMEMLIMIT env var to 80% of the configured resources.limits.memory.
# If no resources.limits.memory are defined then enabling does nothing.
# It is HIGHLY recommended to enable this setting and set a value for resources.limits.memory.
useGOMEMLIMIT: true
# Resource limits & requests.
# It is HIGHLY recommended to set resource limits.
resources: {}
# resources:
# limits:
# cpu: 250m
# memory: 512Mi
podAnnotations: {}
podLabels: {}
# Common labels to add to all otel-collector resources. Evaluated as a template.
additionalLabels: {}
# app.kubernetes.io/part-of: my-app
# Host networking requested for this pod. Use the host's network namespace.
hostNetwork: false
# Adding entries to Pod /etc/hosts with HostAliases
# https://kubernetes.io/docs/tasks/network/customize-hosts-file-for-pods/
hostAliases: []
# - ip: "1.2.3.4"
# hostnames:
# - "my.host.com"
# Pod DNS policy: ClusterFirst, ClusterFirstWithHostNet, Default, or None
dnsPolicy: ""
# Custom DNS config. Required when DNS policy is None.
dnsConfig: {}
# only used with deployment mode
replicaCount: 1
# only used with deployment mode
revisionHistoryLimit: 10
annotations: {}
# List of extra sidecars to add.
# This also supports template content, which will eventually be converted to yaml.
extraContainers: []
# extraContainers:
# - name: test
# command:
# - cp
# args:
# - /bin/sleep
# - /test/sleep
# image: busybox:latest
# volumeMounts:
# - name: test
# mountPath: /test
# List of init container specs, e.g. for copying a binary to be executed as a lifecycle hook.
# This also supports template content, which will eventually be converted to yaml.
# Another use of init containers is initializing filesystem permissions for the OTLP Collector user `10001`, in case you are using persistence and the volume produces a permission-denied error for the OTLP Collector container.
initContainers: []
# initContainers:
# - name: test
# image: busybox:latest
# command:
# - cp
# args:
# - /bin/sleep
# - /test/sleep
# volumeMounts:
# - name: test
# mountPath: /test
# - name: init-fs
# image: busybox:latest
# command:
# - sh
# - '-c'
# - 'chown -R 10001: /var/lib/storage/otc' # use the path given as per `extensions.file_storage.directory` & `extraVolumeMounts[x].mountPath`
# volumeMounts:
# - name: opentelemetry-collector-data # use the name of the volume used for persistence
# mountPath: /var/lib/storage/otc # use the path given as per `extensions.file_storage.directory` & `extraVolumeMounts[x].mountPath`
# Pod lifecycle policies.
lifecycleHooks: {}
# lifecycleHooks:
# preStop:
# exec:
# command:
# - /test/sleep
# - "5"
# liveness probe configuration
# Ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
##
livenessProbe:
# Number of seconds after the container has started before startup, liveness or readiness probes are initiated.
# initialDelaySeconds: 1
# How often in seconds to perform the probe.
# periodSeconds: 10
# Number of seconds after which the probe times out.
# timeoutSeconds: 1
# Minimum consecutive failures for the probe to be considered failed after having succeeded.
# failureThreshold: 1
# Duration in seconds the pod needs to terminate gracefully upon probe failure.
# terminationGracePeriodSeconds: 10
httpGet:
port: 13133
path: /
# readiness probe configuration
# Ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
##
readinessProbe:
# Number of seconds after the container has started before startup, liveness or readiness probes are initiated.
# initialDelaySeconds: 1
# How often (in seconds) to perform the probe.
# periodSeconds: 10
# Number of seconds after which the probe times out.
# timeoutSeconds: 1
# Minimum consecutive successes for the probe to be considered successful after having failed.
# successThreshold: 1
# Minimum consecutive failures for the probe to be considered failed after having succeeded.
# failureThreshold: 1
httpGet:
port: 13133
path: /
service:
# Enable the creation of a Service.
# By default, it's enabled on mode != daemonset.
# However, to enable it on mode = daemonset, its creation must be explicitly enabled
# enabled: true
type: ClusterIP
# type: LoadBalancer
# loadBalancerIP: 1.2.3.4
# loadBalancerSourceRanges: []
# By default, Service of type 'LoadBalancer' will be created setting 'externalTrafficPolicy: Cluster'
# unless other value is explicitly set.
# Possible values are Cluster or Local (https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip)
# externalTrafficPolicy: Cluster
annotations: {}
# By default, Service will be created setting 'internalTrafficPolicy: Local' on mode = daemonset
# unless other value is explicitly set.
# Setting 'internalTrafficPolicy: Cluster' on a daemonset is not recommended
# internalTrafficPolicy: Cluster
ingress:
enabled: false
# annotations: {}
# ingressClassName: nginx
# hosts:
# - host: collector.example.com
# paths:
# - path: /
# pathType: Prefix
# port: 4318
# tls:
# - secretName: collector-tls
# hosts:
# - collector.example.com
# Additional ingresses - only created if ingress.enabled is true
# Useful for when differently annotated ingress services are required
# Each additional ingress needs key "name" set to something unique
additionalIngresses: []
# - name: cloudwatch
# ingressClassName: nginx
# annotations: {}
# hosts:
# - host: collector.example.com
# paths:
# - path: /
# pathType: Prefix
# port: 4318
# tls:
# - secretName: collector-tls
# hosts:
# - collector.example.com
podMonitor:
# The pod monitor by default scrapes the metrics port.
# The metrics port needs to be enabled as well.
enabled: false
metricsEndpoints:
- port: metrics
# interval: 15s
# additional labels for the PodMonitor
extraLabels: {}
# release: kube-prometheus-stack
serviceMonitor:
# The service monitor by default scrapes the metrics port.
# The metrics port needs to be enabled as well.
enabled: false
metricsEndpoints:
- port: metrics
# interval: 15s
# additional labels for the ServiceMonitor
extraLabels: {}
# release: kube-prometheus-stack
# Used to set relabeling and metricRelabeling configs on the ServiceMonitor
# https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config
relabelings: []
metricRelabelings: []
# PodDisruptionBudget is used only if deployment enabled
podDisruptionBudget:
enabled: false
# minAvailable: 2
# maxUnavailable: 1
# autoscaling is used only if mode is "deployment" or "statefulset"
autoscaling:
enabled: false
minReplicas: 1
maxReplicas: 10
behavior: {}
targetCPUUtilizationPercentage: 80
# targetMemoryUtilizationPercentage: 80
rollout:
rollingUpdate: {}
# When 'mode: daemonset', maxSurge cannot be used when hostPort is set for any of the ports
# maxSurge: 25%
# maxUnavailable: 0
strategy: RollingUpdate
prometheusRule:
enabled: false
groups: []
# Create default rules for monitoring the collector
defaultRules:
enabled: false
# additional labels for the PrometheusRule
extraLabels: {}
statefulset:
# volumeClaimTemplates for a statefulset
volumeClaimTemplates: []
podManagementPolicy: "Parallel"
# Controls if and how PVCs created by the StatefulSet are deleted. Available in Kubernetes 1.23+.
persistentVolumeClaimRetentionPolicy:
enabled: false
whenDeleted: Retain
whenScaled: Retain
networkPolicy:
enabled: false
# Annotations to add to the NetworkPolicy
annotations: {}
# Configure the 'from' clause of the NetworkPolicy.
# By default this will restrict traffic to ports enabled for the Collector. If
# you wish to further restrict traffic to other hosts or specific namespaces,
# see the standard NetworkPolicy 'spec.ingress.from' definition for more info:
# https://kubernetes.io/docs/reference/kubernetes-api/policy-resources/network-policy-v1/
allowIngressFrom: []
# # Allow traffic from any pod in any namespace, but not external hosts
# - namespaceSelector: {}
# # Allow external access from a specific cidr block
# - ipBlock:
# cidr: 192.168.1.64/32
# # Allow access from pods in specific namespaces
# - namespaceSelector:
# matchExpressions:
# - key: kubernetes.io/metadata.name
# operator: In
# values:
# - "cats"
# - "dogs"
# Add additional ingress rules to specific ports
# Useful to allow external hosts/services to access specific ports
# An example is allowing an external prometheus server to scrape metrics
#
# See the standard NetworkPolicy 'spec.ingress' definition for more info:
# https://kubernetes.io/docs/reference/kubernetes-api/policy-resources/network-policy-v1/
extraIngressRules: []
# - ports:
# - port: metrics
# protocol: TCP
# from:
# - ipBlock:
# cidr: 192.168.1.64/32
# Restrict egress traffic from the OpenTelemetry collector pod
# See the standard NetworkPolicy 'spec.egress' definition for more info:
# https://kubernetes.io/docs/reference/kubernetes-api/policy-resources/network-policy-v1/
egressRules: []
# - to:
# - namespaceSelector: {}
# - ipBlock:
# cidr: 192.168.10.10/24
# ports:
# - port: 1234
# protocol: TCP
# Allow containers to share processes across pod namespace
shareProcessNamespace: false
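One caveat about the values above: the kubeletstats receiver is configured but commented out of the metrics pipeline, so the per-pod/container/node CPU and memory series never reach Thanos (the k8s_cluster metrics do). A minimal sketch of the overrides that would change that; resource_to_telemetry_conversion is a prometheusremotewrite exporter option that copies resource attributes such as k8s.pod.name onto every series as plain labels, so they can be used in PromQL without joining on target_info:

config:
  exporters:
    prometheusremotewrite:
      endpoint: http://txxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/api/v1/receive ## Thanos Receive endpoint
      # Flatten resource attributes into metric labels.
      resource_to_telemetry_conversion:
        enabled: true
  service:
    pipelines:
      metrics:
        receivers:
          - otlp
          - prometheus
          - k8s_cluster
          # Re-enabled: emits k8s.pod.*, container.*, and k8s.node.*
          # CPU/memory usage metrics scraped from each kubelet.
          - kubeletstats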
The collector was deployed without any issue, and data is exported to Thanos using the prometheusremotewrite exporter.
My question is how to query that data to get:

- the number of pods in the cluster,
- the number of deployments,
- CPU and memory usage of pods, containers, nodes, and so on.

Can you help me with the queries needed to show this data on Grafana dashboards? Some rough attempts of my own are below.
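These are untested sketches written as a Prometheus/Thanos Ruler recording-rules file (the expressions work as ad-hoc Grafana panel queries too). The metric names are assumptions based on the default OTLP-to-Prometheus name translation, where dots become underscores (k8s.pod.phase arrives as k8s_pod_phase); the label names assume resource attributes are flattened onto the series as sketched above, and since a daemonset runs k8s_cluster on every node, the counts may need deduplication (see the side note in the values):

groups:
  - name: otel-k8s-query-sketches
    rules:
      # Total pods known to the k8s_cluster receiver.
      # k8s.pod.phase values: 1=Pending, 2=Running, 3=Succeeded, 4=Failed, 5=Unknown.
      - record: cluster:pods:count
        expr: count(k8s_pod_phase)
      # Pods currently in the Running phase.
      - record: cluster:pods_running:count
        expr: count(k8s_pod_phase == 2)
      # One k8s.deployment.desired series exists per deployment.
      - record: cluster:deployments:count
        expr: count(k8s_deployment_desired)
      # Per-pod memory usage in bytes (kubeletstats receiver, once it is
      # back in the metrics pipeline).
      - record: namespace_pod:memory_usage_bytes:sum
        expr: sum by (k8s_namespace_name, k8s_pod_name) (k8s_pod_memory_usage)
      # Per-node CPU usage (named k8s.node.cpu.utilization on older
      # collector versions, k8s.node.cpu.usage on newer ones).
      - record: node:cpu_usage:sum
        expr: sum by (k8s_node_name) (k8s_node_cpu_usage)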
This seems like a Thanos question, rather than an issue with the PRW exporter. I would recommend asking in the Thanos Slack channel, since Slack is a better place for support questions: https://cloud-native.slack.com/archives/CK5RSSC10