Operator does not see updated services #635

brechtvhb · 2020-09-17T07:15:09Z

What happened:
When updating a service, the operator does not see that the HealthChecks label got added to the service so it does not get pushed to the healthchecks ui.

What you expected to happen:
I expect to see the updated service in the health checks ui.

How to reproduce it (as minimally and precisely as possible):
Add the "HealthChecks" label to the service and apply the changes.
Note: The service is in a different namespace (default) than the healthchecks operator / dashboard (healthchecks).

The logs don't show anything that gives me a clue. They end after the last successful PushService call.

[07:06:37 INF] [PushService] Notification result for infrachecker-api-service - status code: OK

Is it somehow possible to enable more verbose logs that might give a clue?

Environment:

.NET Core version: 3.1.8
Healthchecks version: latest
Operating system: AKS 18.6


CarlosLanderas · 2020-09-17T07:24:59Z

Have you installed the latest operator version, 3.1.1? It is the first version that supports Cluster scope.

brechtvhb · 2020-09-17T07:34:04Z

The docker image that was pulled was xabarilcoding/healthchecksui-k8s-operator:latest

I did the manual installation of this:
https://github.com/Xabaril/AspNetCore.Diagnostics.HealthChecks/tree/master/deploy/operator

First the CRD, then the rest.

CarlosLanderas · 2020-09-17T07:58:55Z

Could you paste the output for the following command to see the actual deployment image version that is being used?

kubectl describe pod healthchecks-ui-k8s-operator-675949bc7-xrgtp -n healthchecks

brechtvhb · 2020-09-17T08:02:50Z

healthchecks-ui-k8s-operator:
    Container ID:   docker://007c360eadddf327cb75b9e57299fdc5bea2b459b1192bcdff86d71f8c40debf
    Image:          xabarilcoding/healthchecksui-k8s-operator:latest
    Image ID:       docker-pullable://xabarilcoding/healthchecksui-k8s-operator@sha256:67cd48452b9e6f70fac7d6a902c1685bfc024ee58eefc166e4920a03a0d44be8
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Thu, 17 Sep 2020 09:27:12 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     500m
      memory:  200Mi
    Requests:
      cpu:        300m
      memory:     100Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from healthchecks-admin-token-tv2jp (ro)

CarlosLanderas · 2020-09-17T08:21:16Z

Thank you. And could you paste the output of this?

kubectl describe crd healthchecks.aspnetcore.ui

If you can, also run kubectl logs podname -n healthchecks and paste the full operator log.

brechtvhb · 2020-09-17T08:32:47Z

Name:         healthchecks.aspnetcore.ui
Namespace:
Labels:       <none>
Annotations:  API Version:  apiextensions.k8s.io/v1
Kind:         CustomResourceDefinition
Metadata:
  Creation Timestamp:  2020-09-16T11:49:59Z
  Generation:          1
  Managed Fields:
    API Version:  apiextensions.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:acceptedNames:
          f:kind:
          f:listKind:
          f:plural:
          f:shortNames:
          f:singular:
        f:conditions:
    Manager:      kube-apiserver
    Operation:    Update
    Time:         2020-09-16T11:49:59Z
    API Version:  apiextensions.k8s.io/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        f:conversion:
          .:
          f:strategy:
        f:group:
        f:names:
          f:kind:
          f:listKind:
          f:plural:
          f:shortNames:
          f:singular:
        f:preserveUnknownFields:
        f:scope:
        f:validation:
          .:
          f:openAPIV3Schema:
            .:
            f:properties:
              .:
              f:spec:
                .:
                f:properties:
                  .:
                  f:deployAnnotations:
                    .:
                    f:items:
                    f:type:
                  f:healthChecksPath:
                    .:
                    f:type:
                  f:healthChecksScheme:
                    .:
                    f:type:
                  f:image:
                    .:
                    f:type:
                  f:imagePullPolicy:
                    .:
                    f:type:
                  f:name:
                    .:
                    f:type:
                  f:portNumber:
                    .:
                    f:type:
                  f:scope:
                    .:
                    f:enum:
                    f:type:
                  f:serviceAnnotations:
                    .:
                    f:items:
                    f:type:
                  f:serviceType:
                    .:
                    f:enum:
                    f:type:
                  f:servicesLabel:
                    .:
                    f:type:
                  f:stylesheetContent:
                    .:
                    f:type:
                  f:uiApiPath:
                    .:
                    f:pattern:
                    f:type:
                  f:uiNoRelativePaths:
                    .:
                    f:type:
                  f:uiPath:
                    .:
                    f:pattern:
                    f:type:
                  f:uiResourcesPath:
                    .:
                    f:pattern:
                    f:type:
                  f:uiWebhooksPath:
                    .:
                    f:pattern:
                    f:type:
                  f:webhooks:
                    .:
                    f:items:
                    f:type:
                f:required:
        f:version:
        f:versions:
      f:status:
        f:storedVersions:
    Manager:         kubectl-client-side-apply
    Operation:       Update
    Time:            2020-09-16T11:49:59Z
  Resource Version:  6925778
  Self Link:         /apis/apiextensions.k8s.io/v1/customresourcedefinitions/healthchecks.aspnetcore.ui
  UID:               73070956-62d1-4756-8ed0-9dee7edf23c3
Spec:
  Conversion:
    Strategy:  None
  Group:       aspnetcore.ui
  Names:
    Kind:       HealthCheck
    List Kind:  HealthChecks
    Plural:     healthchecks
    Short Names:
      hc
    Singular:               healthcheck
  Preserve Unknown Fields:  true
  Scope:                    Namespaced
  Versions:
    Name:  v1
    Schema:
      openAPIV3Schema:
        Properties:
          Spec:
            Properties:
              Deploy Annotations:
                Items:
                  Properties:
                    Name:
                      Type:  string
                    Value:
                      Type:  string
                  Required:
                    name
                    value
                  Type:  object
                Type:    array
              Health Checks Path:
                Type:  string
              Health Checks Scheme:
                Type:  string
              Image:
                Type:  string
              Image Pull Policy:
                Type:  string
              Name:
                Type:  string
              Port Number:
                Type:  number
              Scope:
                Enum:
                  Cluster
                  Namespaced
                Type:  string
              Service Annotations:
                Items:
                  Properties:
                    Name:
                      Type:  string
                    Value:
                      Type:  string
                  Type:      object
                Type:        array
              Service Type:
                Enum:
                  ClusterIP
                  LoadBalancer
                  NodePort
                Type:  string
              Services Label:
                Type:  string
              Stylesheet Content:
                Type:  string
              Ui API Path:
                Pattern:  ^/
                Type:     string
              Ui No Relative Paths:
                Type:  boolean
              Ui Path:
                Pattern:  ^/
                Type:     string
              Ui Resources Path:
                Pattern:  ^/
                Type:     string
              Ui Webhooks Path:
                Pattern:  ^/
                Type:     string
              Webhooks:
                Items:
                  Properties:
                    Name:
                      Type:  string
                    Payload:
                      Type:  string
                    Restored Payload:
                      Type:  string
                    Uri:
                      Type:  string
                  Required:
                    name
                    uri
                    payload
                    restoredPayload
                  Type:  object
                Type:    array
            Required:
              name
              scope
              servicesLabel
    Served:   true
    Storage:  true
Status:
  Accepted Names:
    Kind:       HealthCheck
    List Kind:  HealthChecks
    Plural:     healthchecks
    Short Names:
      hc
    Singular:  healthcheck
  Conditions:
    Last Transition Time:  2020-09-16T11:49:59Z
    Message:               no conflicts found
    Reason:                NoConflicts
    Status:                True
    Type:                  NamesAccepted
    Last Transition Time:  2020-09-16T11:49:59Z
    Message:               the initial names have been accepted
    Reason:                InitialNamesAccepted
    Status:                True
    Type:                  Established
    Last Transition Time:  2020-09-16T11:49:59Z
    Message:               [spec.versions[0].schema.openAPIV3Schema.properties[spec].type: Required value: must not be empty for specified object fields, spec.versions[0].schema.openAPIV3Schema.type: Required value: must not be empty at the root]
    Reason:                Violations
    Status:                True
    Type:                  NonStructuralSchema
  Stored Versions:
    v1
Events:  <none>

Operator logs:

[07:27:13 INF] The operator is starting
[07:27:14 INF] Creating secret for hc resource - namespace healthchecks
[07:27:15 INF] Creating configmap for hc resource - namespace healthchecks
[07:27:15 INF] Creating deployment for hc resource - namespace healthchecks
[07:27:15 INF] Creating service for hc resource - namespace healthchecks
[07:27:15 INF] Service watcher started for namespace All
[07:27:15 INF] [PushService] Namespace healthchecks - Sending Type: Added - Service nominations-api-service with uri : http://192.168.219.158:80/health to ui endpoint: http://192.168.18.8:80
[07:27:15 INF] Start processing HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Sending HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] [PushService] Namespace healthchecks - Sending Type: Added - Service pdfprinter-api-service with uri : http://192.168.160.73:80/health to ui endpoint: http://192.168.18.8:80
[07:27:15 INF] Start processing HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Sending HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] [PushService] Namespace healthchecks - Sending Type: Added - Service trackontradeproxy-api-service with uri : http://192.168.47.140:80/health to ui endpoint: http://192.168.18.8:80
[07:27:15 INF] Start processing HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Sending HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Received HTTP response after 17.5138ms - OK
[07:27:15 INF] End processing HTTP request after 92.3408ms - OK
[07:27:15 INF] [PushService] Notification result for nominations-api-service - status code: OK
[07:27:15 INF] Received HTTP response after 16.6805ms - OK
[07:27:15 INF] End processing HTTP request after 16.8767ms - OK
[07:27:15 INF] [PushService] Notification result for pdfprinter-api-service - status code: OK
[07:27:15 INF] [PushService] Namespace healthchecks - Sending Type: Added - Service infrachecker-api-service with uri : http://192.168.92.164:80/health to ui endpoint: http://192.168.18.8:80
[07:27:15 INF] Start processing HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Sending HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Received HTTP response after 12.6991ms - OK
[07:27:15 INF] End processing HTTP request after 12.876ms - OK
[07:27:15 INF] [PushService] Notification result for trackontradeproxy-api-service - status code: OK
[07:27:15 INF] [PushService] Namespace healthchecks - Sending Type: Added - Service acs-api-service with uri : http://192.168.50.252:80/health to ui endpoint: http://192.168.18.8:80
[07:27:15 INF] Start processing HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Sending HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Received HTTP response after 3.1679ms - OK
[07:27:15 INF] End processing HTTP request after 3.5415ms - OK
[07:27:15 INF] [PushService] Notification result for acs-api-service - status code: OK
[07:27:15 INF] Received HTTP response after 6.9401ms - OK
[07:27:15 INF] End processing HTTP request after 7.0922ms - OK
[07:27:15 INF] [PushService] Notification result for infrachecker-api-service - status code: OK

CarlosLanderas · 2020-09-17T16:07:41Z

I'm trying to reproduce the issue without any luck. This is the output of my tests:

Test: Unlabel and label a service

kubectl label svc hcnamespaces HealthChecks- -n namespaceddemo

logs:

[PushService] Namespace demo - Sending Type: Deleted - Service hcnamespaces with uri : http://10.96.67.66:80/health to ui endpoint: http://10.108.108.106:80

kubectl label svc hcnamespaces HealthChecks= -n namespaceddemo

logs:

[PushService] Namespace demo - Sending Type: Added - Service hcnamespaces with uri : http://10.96.67.66:80/health to ui endpoint: http://10.108.108.106:80

Test: Create a new deployment, create a service and label after a while

kubectl create ns newns
kubectl create deployment newns-deploy --image carloslanderas/healthchecks-sample-app -n newns
kubectl expose deployment newns-deploy --target-port 80 --port 80 -n newns
kubectl label svc newns-deploy HealthChecksPath=/health -n newns
kubectl label svc newns-deploy HealthChecks= -n newns

[15:44:31 INF] [PushService] Namespace demo - Sending Type: Added - Service newns-deploy with uri : http://10.97.252.242:80/health to ui endpoint: http://10.108.108.106:80

kubectl label svc newns-deploy HealthChecks- -n newns
kubectl label svc newns-deploy HealthChecksPath- -n newns

logs:

[PushService] Namespace demo - Sending Type: Deleted - Service newns-deploy with uri : http://10.97.252.242:80/health to ui endpoint: http://10.108.108.106:80

Labelled service modification:

kubectl label svc newns-deploy HealthChecksPath=health -n newns
kubectl label svc newns-deploy HealthChecks=true -n newns
kubectl label svc newns-deploy HealthChecksPath=health2 -n newns --overwrite

logs:

[PushService] Namespace demo - Sending Type: Added - Service newns-deploy with uri : http://10.97.252.242:80/health to ui endpoint: http://10.108.108.106:80

[PushService] Namespace demo - Sending Type: Modified - Service newns-deploy with uri : http://10.97.252.242:80/health to ui endpoint: http://10.108.108.106:80
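
The Added / Deleted / Modified pattern in the tests above is consistent with a watch that is filtered on the HealthChecks label: a service that starts matching the filter shows up as Added, one that stops matching shows up as Deleted, and a change to a service that keeps matching shows up as Modified. The following is only a sketch of that rule, assuming the operator selects services by the presence of the label. The thread does not show the watcher code, and the function name `watch_event_type` is made up for illustration.

```python
from typing import Mapping, Optional

LABEL = "HealthChecks"  # label key used in the tests above


def watch_event_type(
    before: Optional[Mapping[str, str]],
    after: Optional[Mapping[str, str]],
    label: str = LABEL,
) -> Optional[str]:
    """Return the event a label-filtered watch would report for one change.

    `before` and `after` are the service's labels before and after the
    change; None means the service does not exist at that point.
    Returns None when the service never matches the filter.
    """
    matched_before = before is not None and label in before
    matched_after = after is not None and label in after
    if matched_after and not matched_before:
        return "Added"      # service started matching the label filter
    if matched_before and not matched_after:
        return "Deleted"    # label removed, or service deleted
    if matched_before and matched_after:
        return "Modified"   # still matching, something else changed
    return None             # unlabelled service: invisible to the watch


if __name__ == "__main__":
    print(watch_event_type({}, {"HealthChecks": ""}))
    print(watch_event_type({"HealthChecks": ""}, {}))
```

Under this rule, adding the label with kubectl apply should produce an Added event just like kubectl label does, so a missing event points at the watch itself rather than at how the label was applied.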

CarlosLanderas · 2020-09-17T16:17:09Z

Could you delete and reinstall all the definitions the operator requires, to rule out installation problems?

We have an install / uninstall tool you can download and execute in the terminal:

wget https://github.com/Xabaril/AspNetCore.Diagnostics.HealthChecks/blob/master/deploy/operator/installer/releases/operator-installer-win.exe && operator-installer-win.exe --delete

and then

operator-installer-win.exe

If you are using Linux, there is also a Linux installer.

This will uninstall and reinstall all the operator definitions.

brechtvhb · 2020-09-18T18:05:56Z

OK, will try

CarlosLanderas · 2020-09-23T13:30:29Z

Hello @brechtvhb, any updates?

brechtvhb · 2020-09-23T18:17:50Z

Tomorrow or Friday I should have the time to have a look at it.

brechtvhb · 2020-09-24T12:25:53Z

I still have the issue even when using the installer.
I think the service stops working after exactly one hour. Could the issue be authentication or token related? How is the operator authenticating to the AKS cluster?

CarlosLanderas · 2020-09-24T17:20:25Z

Very weird. I've just updated some labels and annotations after days of not using the operator, and it is responsive.

When running inside the cluster, it uses the in-cluster configuration:

https://github.com/Xabaril/AspNetCore.Diagnostics.HealthChecks/blob/master/src/HealthChecks.UI.K8s.Operator/Program.cs#L40

Here is the source implementation:

https://github.com/kubernetes-client/csharp/blob/6d5fefdbab6f354089301fe9f0a81694a6fe8491/src/KubernetesClient/KubernetesClientConfiguration.InCluster.cs

To be honest, I do not know what's going on as the operator logs dropped watch connections.

Could you show me the commands you use to add/update your services please?

brechtvhb · 2020-09-24T18:14:14Z

I use kubectl apply -f, through Azure DevOps.

Here's an example yaml:

apiVersion: v1
kind: Service
metadata:
  name: nominations-api-service
  namespace: default
  labels:
    HealthChecks: enabled  
spec:
  type: ClusterIP
  selector:
    app: nominations-api
  ports:
  - port: 80
    targetPort: 80

Could you add a log statement that tells if InClusterConfig or BuildConfig was used? Just to be sure InClusterConfig is being used.
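
As a rough illustration of the kind of log statement requested here: Kubernetes client libraries generally decide that they are running in-cluster by checking the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT environment variables and the service account files mounted under /var/run/secrets/kubernetes.io/serviceaccount (the directory listed under Mounts in the pod description earlier in this thread). The sketch below mirrors that check in Python. It is not the operator's code; the function names and the logger name are hypothetical.

```python
import logging
import os
from typing import Callable, Mapping

# Directory where Kubernetes mounts the pod's service account credentials.
SERVICE_ACCOUNT_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"

logger = logging.getLogger("operator-config")  # hypothetical logger name


def is_in_cluster(
    env: Mapping[str, str] = os.environ,
    file_exists: Callable[[str], bool] = os.path.isfile,
) -> bool:
    """Return True when the process looks like it is running inside a pod."""
    has_api_address = bool(env.get("KUBERNETES_SERVICE_HOST")) and bool(
        env.get("KUBERNETES_SERVICE_PORT")
    )
    has_credentials = file_exists(f"{SERVICE_ACCOUNT_DIR}/token") and file_exists(
        f"{SERVICE_ACCOUNT_DIR}/ca.crt"
    )
    return has_api_address and has_credentials


def describe_config_source(
    env: Mapping[str, str] = os.environ,
    file_exists: Callable[[str], bool] = os.path.isfile,
) -> str:
    """Log and return which configuration path would be taken."""
    source = "in-cluster" if is_in_cluster(env, file_exists) else "kubeconfig file"
    logger.info("Kubernetes client configuration source: %s", source)
    return source


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    describe_config_source()
```

The `env` and `file_exists` parameters exist only so the check can be exercised outside a cluster; inside a pod the defaults are enough.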

Kampfmoehre · 2020-11-16T17:05:04Z

I had the same problem, but I could resolve it by redeploying the operator (deleting the pod and waiting until a new pod has started).

brechtvhb · 2020-11-16T18:34:21Z

@Kampfmoehre Yes indeed, that's my workaround too until there's a permanent fix.

CarlosLanderas · 2020-11-16T18:47:10Z

This started happening when I updated the Kubernetes Client version. I do not know whether the open watch connection drops. In the coming days we are going to start migrating everything to .NET 5.0, and I'll test all of this then.

CarlosLanderas · 2020-11-21T02:11:59Z

Ok, I think I found the problem. This is a regression from the previous version.
It looks like the internal HttpClient is disconnecting due to inactivity. I can see this in the logs:

[02:10:10 ERR] The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing.

The previous operator version used to reconnect on watch errors, and I am going to bring that back. This will be available in the next version.
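
To make "reconnect on watch errors" concrete, here is a minimal sketch of the pattern in Python. It is not the operator's implementation: `open_watch` stands in for whatever call opens the service watch, `handle` for the code that pushes the service to the UI, and `max_reconnects` exists only so the example terminates (a real operator would loop indefinitely). All of these names are hypothetical.

```python
import time
from typing import Callable, Iterable, Tuple

Event = Tuple[str, str]  # (event type, service name), e.g. ("Added", "my-service")


def watch_with_reconnect(
    open_watch: Callable[[], Iterable[Event]],
    handle: Callable[[Event], None],
    max_reconnects: int = 5,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Consume watch events, reopening the watch whenever the stream fails or ends.

    Returns the number of reconnects that were performed.
    """
    reconnects = 0
    while True:
        try:
            for event in open_watch():
                handle(event)
        except (TimeoutError, ConnectionError) as error:
            # The stream broke, e.g. an idle HTTP connection timed out.
            print(f"watch failed: {error!r}")
        else:
            # The server closed the stream without an error.
            print("watch closed by server")
        if reconnects >= max_reconnects:
            return reconnects
        reconnects += 1
        sleep(delay_seconds)  # short pause before opening a new watch
```

One design point to keep in mind: a freshly opened watch may report services that were already handled as Added again, so the handler should tolerate repeated events for the same service.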

brechtvhb · 2020-11-21T15:32:36Z

Nice, thank you!

CarlosLanderas · 2021-06-07T10:25:44Z

kubernetes-client/csharp#533

We've suffered the same behaviour in a different project. Will try to fix this ASAP with a workaround.

kipusoep · 2021-10-14T10:00:15Z

Shouldn't this issue be opened again in that case?

brechtvhb changed the title ~~Operator dus not see updated services~~ Operator does not see updated services Sep 17, 2020

CarlosLanderas added the in-progress label Nov 22, 2020

CarlosLanderas closed this as completed Dec 8, 2020
