Operator does not see updated services #635

Closed
brechtvhb opened this issue Sep 17, 2020 · 21 comments

Comments

@brechtvhb

What happened:
When a service is updated, the operator does not notice that the HealthChecks label was added, so the service is not pushed to the HealthChecks UI.

What you expected to happen:
I expect to see the updated service in the health checks ui.

How to reproduce it (as minimally and precisely as possible):
Add the "HealthChecks" label to the service and apply the changes.
Note: The service is in a different namespace (default) than the healthchecks operator / dashboard (healthchecks).

The logs don't show anything that gives me a clue. They end after the last successful PushService call.

[07:06:37 INF] [PushService] Notification result for infrachecker-api-service - status code: OK

Is it somehow possible to enable more verbose logs that might give a clue?

Environment:

  • .NET Core version: 3.1.8
  • Healthchecks version: latest
  • Operating system: AKS 18.6
@brechtvhb changed the title from "Operator dus not see updated services" to "Operator does not see updated services" on Sep 17, 2020
@CarlosLanderas
Contributor

Have you installed the latest operator version, 3.1.1? It is the first version that supports Cluster scope.

@brechtvhb
Author

The docker image that was pulled was xabarilcoding/healthchecksui-k8s-operator:latest

I did the manual installation of this:
https://github.com/Xabaril/AspNetCore.Diagnostics.HealthChecks/tree/master/deploy/operator

First the CRD, then the rest.

@CarlosLanderas
Contributor

Could you paste the output for the following command to see the actual deployment image version that is being used?

kubectl describe pod healthchecks-ui-k8s-operator-675949bc7-xrgtp -n healthchecks

@brechtvhb
Author

healthchecks-ui-k8s-operator:
    Container ID:   docker://007c360eadddf327cb75b9e57299fdc5bea2b459b1192bcdff86d71f8c40debf
    Image:          xabarilcoding/healthchecksui-k8s-operator:latest
    Image ID:       docker-pullable://xabarilcoding/healthchecksui-k8s-operator@sha256:67cd48452b9e6f70fac7d6a902c1685bfc024ee58eefc166e4920a03a0d44be8
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Thu, 17 Sep 2020 09:27:12 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     500m
      memory:  200Mi
    Requests:
      cpu:        300m
      memory:     100Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from healthchecks-admin-token-tv2jp (ro)

@CarlosLanderas
Contributor

CarlosLanderas commented Sep 17, 2020

Thank you. And could you paste the output of this?

kubectl describe crd healthchecks.aspnetcore.ui

If you can, also run kubectl logs podname -n healthchecks and paste the full operator log.

@brechtvhb
Author

brechtvhb commented Sep 17, 2020

Name:         healthchecks.aspnetcore.ui
Namespace:
Labels:       <none>
Annotations:  API Version:  apiextensions.k8s.io/v1
Kind:         CustomResourceDefinition
Metadata:
  Creation Timestamp:  2020-09-16T11:49:59Z
  Generation:          1
  Managed Fields:
    API Version:  apiextensions.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:acceptedNames:
          f:kind:
          f:listKind:
          f:plural:
          f:shortNames:
          f:singular:
        f:conditions:
    Manager:      kube-apiserver
    Operation:    Update
    Time:         2020-09-16T11:49:59Z
    API Version:  apiextensions.k8s.io/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        f:conversion:
          .:
          f:strategy:
        f:group:
        f:names:
          f:kind:
          f:listKind:
          f:plural:
          f:shortNames:
          f:singular:
        f:preserveUnknownFields:
        f:scope:
        f:validation:
          .:
          f:openAPIV3Schema:
            .:
            f:properties:
              .:
              f:spec:
                .:
                f:properties:
                  .:
                  f:deployAnnotations:
                    .:
                    f:items:
                    f:type:
                  f:healthChecksPath:
                    .:
                    f:type:
                  f:healthChecksScheme:
                    .:
                    f:type:
                  f:image:
                    .:
                    f:type:
                  f:imagePullPolicy:
                    .:
                    f:type:
                  f:name:
                    .:
                    f:type:
                  f:portNumber:
                    .:
                    f:type:
                  f:scope:
                    .:
                    f:enum:
                    f:type:
                  f:serviceAnnotations:
                    .:
                    f:items:
                    f:type:
                  f:serviceType:
                    .:
                    f:enum:
                    f:type:
                  f:servicesLabel:
                    .:
                    f:type:
                  f:stylesheetContent:
                    .:
                    f:type:
                  f:uiApiPath:
                    .:
                    f:pattern:
                    f:type:
                  f:uiNoRelativePaths:
                    .:
                    f:type:
                  f:uiPath:
                    .:
                    f:pattern:
                    f:type:
                  f:uiResourcesPath:
                    .:
                    f:pattern:
                    f:type:
                  f:uiWebhooksPath:
                    .:
                    f:pattern:
                    f:type:
                  f:webhooks:
                    .:
                    f:items:
                    f:type:
                f:required:
        f:version:
        f:versions:
      f:status:
        f:storedVersions:
    Manager:         kubectl-client-side-apply
    Operation:       Update
    Time:            2020-09-16T11:49:59Z
  Resource Version:  6925778
  Self Link:         /apis/apiextensions.k8s.io/v1/customresourcedefinitions/healthchecks.aspnetcore.ui
  UID:               73070956-62d1-4756-8ed0-9dee7edf23c3
Spec:
  Conversion:
    Strategy:  None
  Group:       aspnetcore.ui
  Names:
    Kind:       HealthCheck
    List Kind:  HealthChecks
    Plural:     healthchecks
    Short Names:
      hc
    Singular:               healthcheck
  Preserve Unknown Fields:  true
  Scope:                    Namespaced
  Versions:
    Name:  v1
    Schema:
      openAPIV3Schema:
        Properties:
          Spec:
            Properties:
              Deploy Annotations:
                Items:
                  Properties:
                    Name:
                      Type:  string
                    Value:
                      Type:  string
                  Required:
                    name
                    value
                  Type:  object
                Type:    array
              Health Checks Path:
                Type:  string
              Health Checks Scheme:
                Type:  string
              Image:
                Type:  string
              Image Pull Policy:
                Type:  string
              Name:
                Type:  string
              Port Number:
                Type:  number
              Scope:
                Enum:
                  Cluster
                  Namespaced
                Type:  string
              Service Annotations:
                Items:
                  Properties:
                    Name:
                      Type:  string
                    Value:
                      Type:  string
                  Type:      object
                Type:        array
              Service Type:
                Enum:
                  ClusterIP
                  LoadBalancer
                  NodePort
                Type:  string
              Services Label:
                Type:  string
              Stylesheet Content:
                Type:  string
              Ui API Path:
                Pattern:  ^/
                Type:     string
              Ui No Relative Paths:
                Type:  boolean
              Ui Path:
                Pattern:  ^/
                Type:     string
              Ui Resources Path:
                Pattern:  ^/
                Type:     string
              Ui Webhooks Path:
                Pattern:  ^/
                Type:     string
              Webhooks:
                Items:
                  Properties:
                    Name:
                      Type:  string
                    Payload:
                      Type:  string
                    Restored Payload:
                      Type:  string
                    Uri:
                      Type:  string
                  Required:
                    name
                    uri
                    payload
                    restoredPayload
                  Type:  object
                Type:    array
            Required:
              name
              scope
              servicesLabel
    Served:   true
    Storage:  true
Status:
  Accepted Names:
    Kind:       HealthCheck
    List Kind:  HealthChecks
    Plural:     healthchecks
    Short Names:
      hc
    Singular:  healthcheck
  Conditions:
    Last Transition Time:  2020-09-16T11:49:59Z
    Message:               no conflicts found
    Reason:                NoConflicts
    Status:                True
    Type:                  NamesAccepted
    Last Transition Time:  2020-09-16T11:49:59Z
    Message:               the initial names have been accepted
    Reason:                InitialNamesAccepted
    Status:                True
    Type:                  Established
    Last Transition Time:  2020-09-16T11:49:59Z
    Message:               [spec.versions[0].schema.openAPIV3Schema.properties[spec].type: Required value: must not be empty for specified object fields, spec.versions[0].schema.openAPIV3Schema.type: Required value: must not be empty at the root]
    Reason:                Violations
    Status:                True
    Type:                  NonStructuralSchema
  Stored Versions:
    v1
Events:  <none>

Operator logs:

[07:27:13 INF] The operator is starting
[07:27:14 INF] Creating secret for hc resource - namespace healthchecks
[07:27:15 INF] Creating configmap for hc resource - namespace healthchecks
[07:27:15 INF] Creating deployment for hc resource - namespace healthchecks
[07:27:15 INF] Creating service for hc resource - namespace healthchecks
[07:27:15 INF] Service watcher started for namespace All
[07:27:15 INF] [PushService] Namespace healthchecks - Sending Type: Added - Service nominations-api-service with uri : http://192.168.219.158:80/health to ui endpoint: http://192.168.18.8:80
[07:27:15 INF] Start processing HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Sending HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] [PushService] Namespace healthchecks - Sending Type: Added - Service pdfprinter-api-service with uri : http://192.168.160.73:80/health to ui endpoint: http://192.168.18.8:80
[07:27:15 INF] Start processing HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Sending HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] [PushService] Namespace healthchecks - Sending Type: Added - Service trackontradeproxy-api-service with uri : http://192.168.47.140:80/health to ui endpoint: http://192.168.18.8:80
[07:27:15 INF] Start processing HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Sending HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Received HTTP response after 17.5138ms - OK
[07:27:15 INF] End processing HTTP request after 92.3408ms - OK
[07:27:15 INF] [PushService] Notification result for nominations-api-service - status code: OK
[07:27:15 INF] Received HTTP response after 16.6805ms - OK
[07:27:15 INF] End processing HTTP request after 16.8767ms - OK
[07:27:15 INF] [PushService] Notification result for pdfprinter-api-service - status code: OK
[07:27:15 INF] [PushService] Namespace healthchecks - Sending Type: Added - Service infrachecker-api-service with uri : http://192.168.92.164:80/health to ui endpoint: http://192.168.18.8:80
[07:27:15 INF] Start processing HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Sending HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Received HTTP response after 12.6991ms - OK
[07:27:15 INF] End processing HTTP request after 12.876ms - OK
[07:27:15 INF] [PushService] Notification result for trackontradeproxy-api-service - status code: OK
[07:27:15 INF] [PushService] Namespace healthchecks - Sending Type: Added - Service acs-api-service with uri : http://192.168.50.252:80/health to ui endpoint: http://192.168.18.8:80
[07:27:15 INF] Start processing HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Sending HTTP request POST http://192.168.18.8/healthchecks/push?key=4ac46c78-a423-47e2-ab75-454141a8b60b
[07:27:15 INF] Received HTTP response after 3.1679ms - OK
[07:27:15 INF] End processing HTTP request after 3.5415ms - OK
[07:27:15 INF] [PushService] Notification result for acs-api-service - status code: OK
[07:27:15 INF] Received HTTP response after 6.9401ms - OK
[07:27:15 INF] End processing HTTP request after 7.0922ms - OK
[07:27:15 INF] [PushService] Notification result for infrachecker-api-service - status code: OK

@CarlosLanderas
Contributor

CarlosLanderas commented Sep 17, 2020

I'm trying to reproduce the issue without any luck. This is the output of my tests:

Test: Unlabel and label a service

kubectl label svc hcnamespaces HealthChecks- -n namespaceddemo

logs:

[PushService] Namespace demo - Sending Type: Deleted - Service hcnamespaces with uri : http://10.96.67.66:80/health to ui endpoint: http://10.108.108.106:80

kubectl label svc hcnamespaces HealthChecks= -n namespaceddemo

logs:

[PushService] Namespace demo - Sending Type: Added - Service hcnamespaces with uri : http://10.96.67.66:80/health to ui endpoint: http://10.108.108.106:80

Test: Create a new deployment, create a service and label after a while

kubectl create ns newns
kubectl create deployment newns-deploy --image carloslanderas/healthchecks-sample-app -n newns
kubectl expose deployment newns-deploy --target-port 80 --port 80 -n newns
kubectl label svc newns-deploy HealthChecksPath=/health -n newns
kubectl label svc newns-deploy HealthChecks= -n newns

[15:44:31 INF] [PushService] Namespace demo - Sending Type: Added - Service newns-deploy with uri : http://10.97.252.242:80/health to ui endpoint: http://10.108.108.106:80


kubectl label svc newns-deploy HealthChecks- -n newns
kubectl label svc newns-deploy HealthChecksPath- -n newns

logs:

[PushService] Namespace demo - Sending Type: Deleted - Service newns-deploy with uri : http://10.97.252.242:80/health to ui endpoint: http://10.108.108.106:80

Labelled service modification:

kubectl label svc newns-deploy HealthChecksPath=health -n newns
kubectl label svc newns-deploy HealthChecks=true -n newns
kubectl label svc newns-deploy HealthChecksPath=health2 -n newns --overwrite

logs:

[PushService] Namespace demo - Sending Type: Added - Service newns-deploy with uri : http://10.97.252.242:80/health to ui endpoint: http://10.108.108.106:80

[PushService] Namespace demo - Sending Type: Modified - Service newns-deploy with uri : http://10.97.252.242:80/health to ui endpoint: http://10.108.108.106:80
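
The pattern in the tests above can be summarised in a small Python sketch. It assumes the operator watches services through a selector on the HealthChecks label, so that a service gaining the label shows up as Added and one losing it as Deleted; that is an assumption, since the operator code is not shown in this thread. The function name classify_change is hypothetical and this is not the operator's implementation.

```python
def classify_change(label, old_labels, new_labels):
    """Return the event type a label-selector watch would report.

    old_labels / new_labels are the service's label dicts before and
    after the change. Returns None when the service never matched.
    """
    matched_before = label in (old_labels or {})
    matches_now = label in (new_labels or {})
    if not matched_before and matches_now:
        return "Added"      # service started matching the selector
    if matched_before and not matches_now:
        return "Deleted"    # service stopped matching the selector
    if matched_before and matches_now:
        return "Modified"   # still matching, but something changed
    return None             # never matched: the watch reports nothing


if __name__ == "__main__":
    # Mirrors: kubectl label svc newns-deploy HealthChecks= -n newns
    print(classify_change("HealthChecks",
                          {"HealthChecksPath": "health"},
                          {"HealthChecksPath": "health", "HealthChecks": ""}))
```

If an update made with kubectl apply produces none of these events in the operator log, the watch itself is likely no longer delivering events, rather than the label being misread.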

@CarlosLanderas
Contributor

CarlosLanderas commented Sep 17, 2020

Could you delete and reinstall all the definitions the operator requires, to rule out problems there?

We have an install / uninstall tool you can download and execute in the terminal:

wget https://github.com/Xabaril/AspNetCore.Diagnostics.HealthChecks/blob/master/deploy/operator/installer/releases/operator-installer-win.exe && operator-installer-win.exe --delete

and then

operator-installer-win.exe

If you are using Linux, there is also a Linux installer.

This will uninstall and reinstall all the operator definitions.

@brechtvhb
Author

OK, will try

@CarlosLanderas
Contributor

Hello @brechtvhb, any updates?

@brechtvhb
Author

Tomorrow or Friday I should have the time to have a look at it.

@brechtvhb
Author

I still have the issue even when using the installer.
I think the service stops working after exactly one hour. Could the issue be authentication or token related? How is the operator authenticating to the AKS cluster?

@CarlosLanderas
Contributor

Very weird. I've just updated some labels and annotations after days of not using the operator, and it is responsive.

When running inside the cluster, it uses the in-cluster configuration:

https://github.com/Xabaril/AspNetCore.Diagnostics.HealthChecks/blob/master/src/HealthChecks.UI.K8s.Operator/Program.cs#L40

Here is the source implementation:

https://github.com/kubernetes-client/csharp/blob/6d5fefdbab6f354089301fe9f0a81694a6fe8491/src/KubernetesClient/KubernetesClientConfiguration.InCluster.cs

To be honest, I do not know what's going on as the operator logs dropped watch connections.

Could you show me the commands you use to add/update your services please?

@brechtvhb
Author

brechtvhb commented Sep 24, 2020

I use kubectl apply -f, through Azure DevOps.

Here's an example yaml:

apiVersion: v1
kind: Service
metadata:
  name: nominations-api-service
  namespace: default
  labels:
    HealthChecks: enabled  
spec:
  type: ClusterIP
  selector:
    app: nominations-api
  ports:
  - port: 80
    targetPort: 80

Could you add a log statement that tells if InClusterConfig or BuildConfig was used? Just to be sure InClusterConfig is being used.
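
For illustration, here is a minimal Python sketch of what such a log statement could look like. It assumes the common in-cluster convention: the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT environment variables are set and the service account token and CA certificate are mounted under /var/run/secrets/kubernetes.io/serviceaccount (the mount visible in the pod description earlier in this thread). The helper name detect_config_source and the log wording are hypothetical; this is not the operator's C# code.

```python
import logging
import os

# Directory where Kubernetes mounts the service account credentials.
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def detect_config_source(env=None, exists=os.path.exists):
    """Return "in-cluster" or "kubeconfig" (hypothetical helper).

    env and exists are injectable so the check can be exercised
    outside a cluster.
    """
    env = os.environ if env is None else env
    has_env = bool(env.get("KUBERNETES_SERVICE_HOST")) and bool(
        env.get("KUBERNETES_SERVICE_PORT"))
    has_files = exists(os.path.join(SA_DIR, "token")) and exists(
        os.path.join(SA_DIR, "ca.crt"))
    return "in-cluster" if has_env and has_files else "kubeconfig"


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    logging.info("Kubernetes client configuration source: %s",
                 detect_config_source())
```

Logging this once at startup would be enough to confirm which configuration path the operator took.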

@Kampfmoehre

I had the same problem, but I could resolve it by redeploying the operator (deleting the pod and waiting until a new pod has started).

@brechtvhb
Author

@Kampfmoehre Yes, indeed, that's my workaround too until there's a permanent fix.

@CarlosLanderas
Contributor

This started happening when I updated the Kubernetes client version. I do not know whether the open watch connection is being dropped. We are about to start migrating everything to .NET 5.0, and I'll test all of this then.

@CarlosLanderas
Contributor

CarlosLanderas commented Nov 21, 2020

Ok, I think I found the problem. This is a regression from the previous version.
It looks like the internal HttpClient is disconnecting due to inactivity. I can see this in the logs:

[02:10:10 ERR] The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing.

The previous operator version used to reconnect on watch errors, and I am going to bring that back. This will be available in the next version.
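
As a rough illustration of the reconnect-on-error idea, here is a minimal Python sketch. All names (watch_with_reconnect, open_watch, handle_event, max_restarts) are hypothetical, and the loop is bounded only so the sketch terminates; it is not the operator's C# code or the Kubernetes client API.

```python
import time


def watch_with_reconnect(open_watch, handle_event, max_restarts=5,
                         backoff_seconds=1.0, sleep=time.sleep):
    """Consume events from open_watch(), reopening it when it fails or ends.

    open_watch is a hypothetical callable returning an iterable of events;
    iterating it may raise TimeoutError or ConnectionError when the
    underlying connection goes idle or drops.
    Returns the number of times the watch was reopened.
    """
    restarts = 0
    while True:
        try:
            for event in open_watch():
                handle_event(event)
        except (TimeoutError, ConnectionError) as exc:
            # Log the failure instead of letting the watch die silently.
            print(f"watch failed, will reconnect: {exc!r}")
        # Reached when the watch ended, with or without an error.
        if restarts >= max_restarts:
            return restarts
        restarts += 1
        sleep(backoff_seconds)
```

A real implementation would loop indefinitely and would also need to re-list the services or resume from the last seen resourceVersion after reconnecting, so that changes made while the watch was down are not lost.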

@brechtvhb
Author

Nice, thank you!

@CarlosLanderas
Contributor

CarlosLanderas commented Jun 7, 2021

kubernetes-client/csharp#533

We've seen the same behaviour in a different project. Will try to fix this ASAP with a workaround.

@kipusoep

Shouldn't this issue be opened again in that case?
