Timezone problem with kube-state-metrics #279

Reamer · 2019-03-01T13:37:35Z

Hi
I updated my cluster yesterday with openshift-ansible, which included this commit: openshift/openshift-ansible@8c77207
This commit changed the timezone in api, controller and etcd.
The kube-state-metrics pod is still in the UTC timezone, and I get exactly this issue: kubernetes/kube-state-metrics#500

What can I do? Is it possible to set the timezone in the kube-state-metrics pod as well?

Openshift-Version:

    # oc version
    oc v3.11.0+b6db8e6-107
    kubernetes v1.11.0+d4cacc0
    features: Basic-Auth GSSAPI Kerberos SPNEGO

    Server https://s-cp-lb-01.cloud.mycompany.de:443
    openshift v3.11.0+d0c29df-98
    kubernetes v1.11.0+d4cacc0

If you need more information let me know.

Master static pods are always running with UTC timezone, it would be not same timezone with worker nodes. It causes potential issues of time dependent works. - Fix: https://bugzilla.redhat.com/show_bug.cgi?id=1674170

bysnupy · 2019-03-15T14:38:05Z

Hi @Reamer, sorry for the late response.

Could you elaborate on your issue?
And if you can provide steps to reproduce it, please let me know.

Reamer · 2019-03-15T19:41:48Z

Hi @bysnupy,

Steps to reproduce:

- Install OpenShift 3.11 with ansible-playbook on machines that are not in the UTC time zone. My machines are in the time zone Europe/Berlin.
  - The version of openshift-ansible must include your time zone change. Your changes are in branch release-3.11.
- The project openshift-monitoring should be created by default.
  - cluster-monitoring-operator will set up Prometheus (with configuration and rules), Grafana, node-exporter and kube-state-metrics.
- Also install the logging components with the ansible playbook.
  - This will create the curator cron job.

Once the kube-api pod runs in a local time zone, the time of the next cron job run is no longer reported in UTC, but kube-state-metrics still calculates in UTC. I now have the same issue that is described in kubernetes/kube-state-metrics#500

bysnupy · 2019-03-16T07:39:32Z

Good point, @Reamer.

I don't think this is a problem with the master control plane's timezone. Each pod whose features are affected by the timezone should have a way to configure it, but kube-state-metrics currently provides no way to set a specific timezone.
As a workaround, I suggest re-creating the curator CronJob: the issue depends on its last scheduled time, so re-creating it clears that history.

Personally, I think the kube-state-metrics team should take a look at this, as in the discussion you linked. I'm not familiar with kube-state-metrics, so sorry I can't be of more help.

sdodson · 2019-03-20T19:30:04Z

@brancz Can I get your opinion whether you think we should revert the changes that were made in the linked pull requests for openshift-ansible? We did this because people complained when the api/controllers/etcd processes moved from host services to static pods without access to /etc/localtime which meant their log timestamps were different from the rest of the system.

bysnupy · 2019-03-21T10:26:00Z

FYI @Reamer @brancz @sdodson

I've verified how the timezone influences CronJobs, as follows.

My conclusion: the CronJob start time depends on the control plane (controller) timezone, not on the kube-state-metrics timezone.
But the kube_cronjob_next_schedule_time value depends on the kube-state-metrics timezone.
Look at the test2 section; it's buggy.

test1>
- api, controller, etcd timezone: UTC
- kube-state-metrics: UTC
- CronJob Schedule: 5 9 * * *
- kube_cronjob_next_schedule_time:

    # TZ=UTC date -d @1553245500
    Fri Mar 22 09:05:00 UTC 2019

    # date -d @1553245500
    Fri Mar 22 18:05:00 JST 2019

test2> This combination is buggy: the next schedule time is returned as 18:50 UTC, even though the CronJob is scheduled at 18:50 JST. The clock time is the same but the timezone is different.
- api, controller, etcd timezone: JST (UTC+9)
- kube-state-metrics: UTC
- CronJob Schedule: 50 18 * * *
- kube_cronjob_next_schedule_time:

    # TZ=UTC date -d @1553194200
    Thu Mar 21 18:50:00 UTC 2019

    # date -d @1553194200
    Fri Mar 22 03:50:00 JST 2019

test3>
- api, controller, etcd timezone: JST (UTC+9)
- kube-state-metrics: JST (UTC+9)
- CronJob Schedule: 0 19 * * *
- kube_cronjob_next_schedule_time:

    # TZ=UTC date -d @1553248800
    Fri Mar 22 10:00:00 UTC 2019

    # date -d @1553248800
    Fri Mar 22 19:00:00 JST 2019
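
To make the test2 discrepancy concrete, here is a minimal Python sketch. It is not the kube-state-metrics implementation (which is written in Go): the helper `next_daily_run` is hypothetical, only handles daily `M H * * *` schedules, and counts from the scrape time rather than from the CronJob's last schedule time. It reads the same cron fields once as UTC and once as JST, using the timestamps from the tests above.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
# Fixed offset is enough here because Japan has no daylight saving time.
JST = timezone(timedelta(hours=9), "JST")


def next_daily_run(minute: int, hour: int, now_epoch: float, tz: timezone) -> int:
    """Next run of a daily 'M H * * *' schedule, with M and H read in tz.

    Hypothetical helper for illustration only; returns epoch seconds.
    """
    now = datetime.fromtimestamp(now_epoch, tz)
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed in this timezone.
        candidate += timedelta(days=1)
    return int(candidate.timestamp())


# Scrape time of the test2 query (Thu Mar 21 18:51:37 JST 2019).
NOW_TEST2 = 1553161897.961

as_utc = next_daily_run(50, 18, NOW_TEST2, UTC)  # schedule read as UTC
as_jst = next_daily_run(50, 18, NOW_TEST2, JST)  # schedule read as JST

print("read as UTC:", as_utc)
print("read as JST:", as_jst)
```

Reading `50 18 * * *` as UTC gives 1553194200, the value reported in test2. Reading it as JST gives 1553248200 (Fri Mar 22 18:50:00 JST 2019), which matches the timezone the controller used when it started the job.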

See the following test evidence.

test1>

  # for ctr in $(oc get pod -o name -n kube-system); do echo "$ctr : $(oc rsh -n kube-system $ctr date)"; done
  pod/master-api-all.ocp311.example.com : Thu Mar 21 08:57:01 UTC 2019
  pod/master-controllers-all.ocp311.example.com : Thu Mar 21 08:57:02 UTC 2019
  pod/master-etcd-all.ocp311.example.com : Thu Mar 21 08:57:03 UTC 2019

  # oc rsh -n openshift-monitoring -c kube-state-metrics deployment/kube-state-metrics date
  Thu Mar 21 08:57:17 UTC 2019

  # oc create -f - <<EOF
  apiVersion: batch/v1beta1
  kind: CronJob
  metadata:
    name: testcronjob
  spec:
    jobTemplate:
      spec:
        template:
          spec:
            containers:
            - command:
              - date
              image: busybox
              imagePullPolicy: Always
              name: test
            restartPolicy: OnFailure
    schedule: '5 9 * * *'
    successfulJobsHistoryLimit: 3
    suspend: false
  EOF

  # date
  Thu Mar 21 18:03:40 JST 2019
  # TZ=UTC date
  Thu Mar 21 09:04:33 UTC 2019

  # oc describe cj testcronjob 
  Name:                       testcronjob
  Namespace:                  test
  Labels:                     <none>
  Annotations:                <none>
  Schedule:                   5 9 * * *
  ...
  Last Schedule Time:  Thu, 21 Mar 2019 18:05:00 +0900
  Active Jobs:         <none>
  Events:
    Type    Reason            Age   From                Message
    ----    ------            ----  ----                -------
    Normal  SuccessfulCreate  23s   cronjob-controller  Created job testcronjob-1553159100
    Normal  SawCompletedJob   3s    cronjob-controller  Saw completed job: testcronjob-1553159100


  # oc exec -n openshift-monitoring -c prometheus prometheus-k8s-0 -- curl -s \
            'http://localhost:9090/api/v1/query?query=kube_cronjob_next_schedule_time' | python -m json.tool
  {
      "data": {
          "result": [
              {
                  "metric": {
                      "__name__": "kube_cronjob_next_schedule_time",
                      "cronjob": "testcronjob",
                      "endpoint": "https-main",
                      "instance": "10.128.1.88:8443",
                      "job": "kube-state-metrics",
                      "namespace": "test",
                      "pod": "kube-state-metrics-75b9b8dcc4-wmkrm",
                      "service": "kube-state-metrics"
                  },
                  "value": [
                      1553159458.714,
                      "1553245500"
                  ]
              }
          ],
          "resultType": "vector"
      },
      "status": "success"
  }

  # TZ=UTC date -d @1553245500
  Fri Mar 22 09:05:00 UTC 2019

After changing the timezone from UTC to JST for the control plane only:

test2>

  # for ctr in $(oc get pod -o name -n kube-system); do echo "$ctr : $(oc rsh -n kube-system $ctr date)"; done
  pod/master-api-all.ocp311.example.com : Thu Mar 21 18:42:43 JST 2019
  pod/master-controllers-all.ocp311.example.com : Thu Mar 21 18:42:47 JST 2019
  pod/master-etcd-all.ocp311.example.com : Thu Mar 21 18:42:49 JST 2019

  # oc rsh -n openshift-monitoring -c kube-state-metrics deployment/kube-state-metrics date
  Thu Mar 21 09:43:39 UTC 2019

  # date
  Thu Mar 21 18:44:13 JST 2019
  # TZ=UTC date
  Thu Mar 21 09:44:18 UTC 2019

  # oc edit cj/testcronjob
  ...
    schedule: 50 18 * * *
  ...

  # oc describe cj/testcronjob
  Name:                       testcronjob
  Namespace:                  test
  Labels:                     <none>
  Annotations:                <none>
  Schedule:                   50 18 * * *
  ...
  Last Schedule Time:  Thu, 21 Mar 2019 18:50:00 +0900
  Active Jobs:         <none>
  Events:
    Type    Reason            Age   From                Message
    ----    ------            ----  ----                -------
    Normal  SuccessfulCreate  45m   cronjob-controller  Created job testcronjob-1553159100
    Normal  SawCompletedJob   45m   cronjob-controller  Saw completed job: testcronjob-1553159100
    Normal  SuccessfulCreate  25s   cronjob-controller  Created job testcronjob-1553161800
    Normal  SawCompletedJob   5s    cronjob-controller  Saw completed job: testcronjob-1553161800

  # oc exec -n openshift-monitoring -c prometheus prometheus-k8s-0 -- curl -s \
             'http://localhost:9090/api/v1/query?query=kube_cronjob_next_schedule_time' | python -m json.tool
  {
      "data": {
          "result": [
              {
                  "metric": {
                      "__name__": "kube_cronjob_next_schedule_time",
                      "cronjob": "testcronjob",
                      "endpoint": "https-main",
                      "instance": "10.128.1.91:8443",
                      "job": "kube-state-metrics",
                      "namespace": "test",
                      "pod": "kube-state-metrics-75b9b8dcc4-wmkrm",
                      "service": "kube-state-metrics"
                  },
                  "value": [
                      1553161897.961,
                      "1553194200"
                  ]
              }
          ],
          "resultType": "vector"
      },
      "status": "success"
  }

  # TZ=UTC date -d @1553194200
  Thu Mar 21 18:50:00 UTC 2019

  # date -d @1553194200
  Fri Mar 22 03:50:00 JST 2019

After stopping cluster-monitoring-operator and prometheus-operator, I changed the timezone of kube-state-metrics to JST (UTC+9):

test3>

  # oc set env deployment/kube-state-metrics TZ=Asia/Tokyo -n openshift-monitoring
  deployment.extensions/kube-state-metrics updated

  # oc rsh -n openshift-monitoring -c kube-state-metrics deployment/kube-state-metrics date
  Thu Mar 21 18:57:44 JST 2019

  # oc edit cj/testcronjob
  ...
    schedule: 0 19 * * *
  ...

  # date
  Thu Mar 21 18:59:28 JST 2019
  # TZ=UTC date
  Thu Mar 21 09:59:34 UTC 2019

  # oc describe cj/testcronjob
  Name:                       testcronjob
  Namespace:                  test
  Labels:                     <none>
  Annotations:                <none>
  Schedule:                   0 19 * * *
  ...
  Last Schedule Time:  Thu, 21 Mar 2019 19:00:00 +0900
  Active Jobs:         <none>
  Events:
    Type    Reason            Age   From                Message
    ----    ------            ----  ----                -------
    Normal  SuccessfulCreate  55m   cronjob-controller  Created job testcronjob-1553159100
    Normal  SawCompletedJob   55m   cronjob-controller  Saw completed job: testcronjob-1553159100
    Normal  SuccessfulCreate  10m   cronjob-controller  Created job testcronjob-1553161800
    Normal  SawCompletedJob   10m   cronjob-controller  Saw completed job: testcronjob-1553161800
    Normal  SuccessfulCreate  23s   cronjob-controller  Created job testcronjob-1553162400
    Normal  SawCompletedJob   3s    cronjob-controller  Saw completed job: testcronjob-1553162400

  # oc exec -n openshift-monitoring -c prometheus prometheus-k8s-0 -- curl -s \
             'http://localhost:9090/api/v1/query?query=kube_cronjob_next_schedule_time' | python -m json.tool
  {
      "data": {
          "result": [
              {
                  "metric": {
                      "__name__": "kube_cronjob_next_schedule_time",
                      "cronjob": "testcronjob",
                      "endpoint": "https-main",
                      "instance": "10.128.1.120:8443",
                      "job": "kube-state-metrics",
                      "namespace": "test",
                      "pod": "kube-state-metrics-6484658f69-576sd",
                      "service": "kube-state-metrics"
                  },
                  "value": [
                      1553162486.08,
                      "1553248800"
                  ]
              }
          ],
          "resultType": "vector"
      },
      "status": "success"
  }

  # TZ=UTC date -d @1553248800
  Fri Mar 22 10:00:00 UTC 2019

  # date -d @1553248800
  Fri Mar 22 19:00:00 JST 2019
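
The manual `curl | python -m json.tool` plus `date -d @...` steps above can also be done in one go. The following is a small sketch, assuming a response body shaped like the ones shown in the tests (trimmed here to the labels it uses); `next_schedule_times` is a hypothetical helper name, and the embedded sample is the test2 result.

```python
import json
from datetime import datetime, timedelta, timezone

# Fixed offset is enough here because Japan has no daylight saving time.
JST = timezone(timedelta(hours=9), "JST")

# Trimmed copy of the test2 query result from the thread.
RESPONSE = """
{"status": "success",
 "data": {"resultType": "vector",
          "result": [{"metric": {"__name__": "kube_cronjob_next_schedule_time",
                                 "cronjob": "testcronjob",
                                 "namespace": "test"},
                      "value": [1553161897.961, "1553194200"]}]}}
"""


def next_schedule_times(body: str) -> dict:
    """Map 'namespace/cronjob' to the sample value as integer epoch seconds."""
    payload = json.loads(body)
    result = {}
    for sample in payload["data"]["result"]:
        labels = sample["metric"]
        # Each instant-vector sample is [scrape_time, "value as string"].
        _scrape_time, value = sample["value"]
        result[f'{labels["namespace"]}/{labels["cronjob"]}'] = int(float(value))
    return result


times = next_schedule_times(RESPONSE)
for name, epoch in times.items():
    utc = datetime.fromtimestamp(epoch, timezone.utc)
    jst = utc.astimezone(JST)
    fmt = "%Y-%m-%d %H:%M:%S %Z"
    print(name, "|", utc.strftime(fmt), "|", jst.strftime(fmt))
```

Printing both renderings side by side makes it easier to spot a case like test2, where the UTC column shows the clock time that was meant as JST.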

ThoTischner · 2019-03-21T12:54:59Z

We could implement a new ansible-playbook variable:
openshift_logging_kube_state_metrics_timezone: "Europe/Paris"
The default value is generated via facts on the master.

This value will be set as an env var or config parameter on the cluster-monitoring-operator deployment.

If the cluster-monitoring-operator detects this variable or config value, it will add a TZ env var to the kube-state-metrics deployment.

kube-state-metrics can then translate the Kubernetes metric times to UTC, or export the time values with a +x offset for its timezone so that Prometheus can convert them to UTC.
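
As a rough illustration of what a TZ env var does to a process's notion of local time, here is a sketch for a Python process on a Unix-like system. It says nothing about how kube-state-metrics itself handles TZ; `local_hour` is a hypothetical helper, `time.tzset()` is Unix-only, and the POSIX-style values `UTC0` and `JST-9` (meaning UTC+9) are used so that the example does not depend on an installed timezone database.

```python
import os
import time


def local_hour(epoch: int, tz: str) -> int:
    """Hour of day for epoch, as seen by this process after setting TZ=tz.

    Hypothetical helper; changes the process-wide timezone as a side effect.
    """
    os.environ["TZ"] = tz
    time.tzset()  # re-read TZ; available on Unix-like systems only
    return time.localtime(epoch).tm_hour


# kube_cronjob_next_schedule_time value from test3 in this thread.
NEXT_RUN = 1553248800

utc_hour = local_hour(NEXT_RUN, "UTC0")
jst_hour = local_hour(NEXT_RUN, "JST-9")

print("hour with TZ=UTC0: ", utc_hour)
print("hour with TZ=JST-9:", jst_hour)
```

The epoch value is the same in both cases; only the local rendering changes with TZ.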

sdodson · 2019-03-21T14:11:46Z

I think it'd be much better if everything were UTC than having components respecting different timezones.

bysnupy · 2019-03-21T14:40:29Z

Personally, I think a single timezone across the whole system is the ideal situation. But the real world consists of many timezones, and most people take their regional timezone for granted. When a component runs as a host process, this is not a problem, because the process always runs in the host's timezone. When it runs in a container, it is isolated from the host configuration. First of all, we should set a policy for components running in containers: for instance, either UTC is the only available timezone for the whole system, or we provide a way to control the timezone across the whole system. We should also consider the opinions of specialists from various areas; that could lead to the best result.

ThoTischner · 2019-03-21T17:08:16Z

> I think it'd be much better if everything were UTC than having components respecting different timezones.

Then we cannot configure the CronJob schedule time in our own timezone?

Reamer · 2019-03-21T17:30:25Z

> > I think it'd be much better if everything were UTC than having components respecting different timezones.
>
> Then we cannot configure the CronJob schedule time in our own timezone?

No, you can't. Think global.

ThoTischner · 2019-04-01T06:15:59Z

Anyway, how do we proceed with this issue? The kube-state-metrics times and all alarms that depend on them are off.

brancz · 2019-04-01T08:06:42Z

(sorry I was on vacation until just now)

In general, monitoring is always done against UTC only, for all the reasons already laid out in this thread. I'm also for UTC always and everywhere, it's a widely used best practice in SRE.

bysnupy · 2019-04-03T14:07:36Z

Based on the opinions in this thread, UTC is a better choice than each local timezone. If UTC becomes the standard on OpenShift clusters, then personally I think we should state this clearly in the documentation. I want to reduce confusion around timezones, such as CronJob scheduling times, timestamps in pod logs and so on. Could I get your thoughts? @sdodson @brancz

brancz · 2019-04-03T14:19:04Z

I feel like this should be brought up on a broader level (probably at least on aos-devel), but yes I agree with this.

jkroepke · 2019-04-04T11:05:33Z

In 2019, a timezone should not be an issue. What is the problem with setting the TZ environment variable? It could be done with an Ansible fact, as described above.

From my side, it should be documented that the timezone must be the same across the whole cluster. But the timezone should be managed by the user.

You might get an ideal solution, but it is not a real-world solution.

OpenShift is the enterprise version of Kubernetes. It is mainly used in on-premises datacenters. Supporting only UTC is bogus and breaks a lot of IT processes in (German) datacenters.

The worst case would be that Red Hat officially supports UTC only.

eparis · 2019-04-18T13:21:41Z

my opinion, for 3.x we should hostmount /etc/localtime into the kube-state-metrics container, just like we do with the api and etcd containers.

for 4.x we should use UTC everywhere. We should not continue down this path.

brancz · 2019-05-10T18:19:12Z

Opened #353 with the approach outlined by Eric.

Reamer · 2019-05-13T07:36:50Z

Hi @brancz,
I just updated cluster-monitoring-operator. Your change works. Thank you.

sdodson mentioned this issue Mar 19, 2019

[release-3.10] [release-3.11] Make same timezone with running hosts openshift/openshift-ansible#11373

Merged

brancz mentioned this issue May 10, 2019

Mount localtime from host #353

Merged

Reamer closed this as completed May 13, 2019
