Emit pod status metrics for pending pods from api server #139

Merged

Conversation

mitali-salvi
Description:
Pending pod metrics are not emitted from the pod store because cAdvisor runs on each individual node. Pending pods are not assigned to any node, so they never show up there. This change emits pod status metrics for pending pods from the API server instead.

Testing:

kubectl get pods -n prometheus    | grep Pending                                     
NAME                                                READY   STATUS    RESTARTS       AGE
prometheus-alertmanager-0                           0/1     Pending   0              98d
prometheus-server-8486b7c658-bzvxs                  0/2     Pending   0              98d
{
  "CloudWatchMetrics":
    [
      {
        "Namespace": "ContainerInsights",
        "Dimensions":
          [
            ["ClusterName", "Namespace", "PodName"],
            ["ClusterName"],
            ["ClusterName", "FullPodName", "Namespace", "PodName"],
          ],
        "Metrics":
          [
            { "Name": "pod_status_scheduled", "Unit": "Count" },
            { "Name": "pod_status_unknown", "Unit": "Count" },
            { "Name": "pod_status_pending", "Unit": "Count" },
            { "Name": "pod_status_running", "Unit": "Count" },
            { "Name": "pod_status_succeeded", "Unit": "Count" },
            { "Name": "pod_status_failed", "Unit": "Count" },
            { "Name": "pod_status_ready", "Unit": "Count" },
          ],
      },
    ],
  "ClusterName": "cloudwatchagent-operator",
  "FullPodName": "prometheus-alertmanager-0",
  "InstanceId": "i-01e21d00258711849",
  "InstanceType": "t3.small",
  "Namespace": "prometheus",
  "NodeName": "ip-192-168-35-27.ec2.internal",
  "PodName": "prometheus-alertmanager-0",
  "Sources": ["apiserver"],
  "Timestamp": "1699491151016",
  "Type": "Pod",
  "Version": "0",
  "kubernetes":
    {
      "host": "ip-192-168-35-27.ec2.internal",
      "labels":
        {
          "app.kubernetes.io/instance": "prometheus",
          "app.kubernetes.io/name": "alertmanager",
          "controller-revision-hash": "prometheus-alertmanager-5f59cbb8f9",
          "statefulset.kubernetes.io/pod-name": "prometheus-alertmanager-0",
        },
      "namespace_name": "prometheus",
      "pod_id": "ad964342-9ff9-4d4a-937a-8c8ad69cd038",
      "pod_name": "prometheus-alertmanager-0",
      "pod_owners":
        [
          {
            "owner_kind": "StatefulSet",
            "owner_name": "prometheus-alertmanager",
          },
        ],
    },
  "pod_status": "Pending",
  "pod_status_failed": 0,
  "pod_status_pending": 1,
  "pod_status_ready": 0,
  "pod_status_running": 0,
  "pod_status_scheduled": 0,
  "pod_status_succeeded": 0,
  "pod_status_unknown": 0,
}
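
The `pod_status_*` fields in the log above look like a one-hot encoding of the pod phase: exactly one of the phase counters is 1 and the rest are 0. The following is a minimal sketch of that idea, not the code from this PR. The function name `podStatusFields` is hypothetical, the fallback to `pod_status_unknown` for unrecognized phases is an assumption, and `pod_status_ready` and `pod_status_scheduled` are left out because they are not phase names.

```go
package main

import (
	"fmt"
	"strings"
)

// podStatusFields is a hypothetical helper that turns a pod phase string
// (for example "Pending") into one-hot pod_status_* counters, mirroring
// the field names seen in the log output.
func podStatusFields(phase string) map[string]int {
	fields := map[string]int{
		"pod_status_pending":   0,
		"pod_status_running":   0,
		"pod_status_succeeded": 0,
		"pod_status_failed":    0,
		"pod_status_unknown":   0,
	}
	key := "pod_status_" + strings.ToLower(phase)
	if _, ok := fields[key]; ok {
		fields[key] = 1
	} else {
		// Assumption: any unrecognized phase is counted as unknown.
		fields["pod_status_unknown"] = 1
	}
	return fields
}

func main() {
	fields := podStatusFields("Pending")
	fmt.Println("pod_status_pending:", fields["pod_status_pending"])
	fmt.Println("pod_status_running:", fields["pod_status_running"])
}
```

Because each pod contributes 1 to exactly one counter, summing a counter across pods gives the number of pods in that phase.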

[Two screenshots attached, taken 2023-11-08 at 19:51:03 and 19:51:48]

@mitali-salvi mitali-salvi self-assigned this Nov 9, 2023
internal/aws/k8s/k8sclient/pod_test.go (review thread resolved)
}

type Option func(*K8sAPIServer)

// NewK8sAPIServer creates a k8sApiServer which can generate cluster-level metrics
-func NewK8sAPIServer(cnp clusterNameProvider, logger *zap.Logger, leaderElection *LeaderElection, options ...Option) (*K8sAPIServer, error) {
+func NewK8sAPIServer(cnp clusterNameProvider, logger *zap.Logger, leaderElection *LeaderElection, addFullPodNameMetricLabel bool, includeEnhancedMetrics bool, options ...Option) (*K8sAPIServer, error) {

nit: Can we use the option pattern so the signature doesn't have to change everywhere?

Author

This will be taken care of as part of the code cleanup task.
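
For context on the review suggestion, here is a minimal sketch of how the two new boolean parameters could be passed through the existing `Option` mechanism instead of the constructor signature. It is an illustration under assumptions, not the cleanup that was later made: `server`, `newServer`, `WithFullPodNameMetricLabel`, and `WithEnhancedMetrics` are hypothetical names standing in for `K8sAPIServer` and its constructor.

```go
package main

import "fmt"

// server is a hypothetical stand-in for K8sAPIServer.
type server struct {
	addFullPodNameMetricLabel bool
	includeEnhancedMetrics    bool
}

// Option mutates the server during construction, mirroring
// `type Option func(*K8sAPIServer)` from the diff.
type Option func(*server)

// WithFullPodNameMetricLabel is a hypothetical option replacing the
// addFullPodNameMetricLabel positional parameter.
func WithFullPodNameMetricLabel(enabled bool) Option {
	return func(s *server) { s.addFullPodNameMetricLabel = enabled }
}

// WithEnhancedMetrics is a hypothetical option replacing the
// includeEnhancedMetrics positional parameter.
func WithEnhancedMetrics(enabled bool) Option {
	return func(s *server) { s.includeEnhancedMetrics = enabled }
}

// newServer applies each option in order; callers that pass no options
// get the zero-value defaults, so existing call sites keep compiling.
func newServer(options ...Option) *server {
	s := &server{}
	for _, opt := range options {
		opt(s)
	}
	return s
}

func main() {
	s := newServer(WithFullPodNameMetricLabel(true))
	fmt.Println(s.addFullPodNameMetricLabel, s.includeEnhancedMetrics)
}
```

The trade-off is that new settings become opt-in: call sites that do not care about them stay unchanged, at the cost of one small constructor function per setting.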

@mitali-salvi mitali-salvi merged commit ef547b1 into amazon-contributing:aws-cwa-dev Nov 9, 2023
59 of 71 checks passed
3 participants