Streams resetting resource version #231

Closed
elliottneilclark opened this issue Mar 1, 2023 · 4 comments · Fixed by #232

Comments

@elliottneilclark
Contributor

We have a k8s application that streams a watch of just about every kind of resource. When there's a lot of activity, we see a lot of log spam like this:

15:46:24.846 mfa=K8s.Client.Runner.Stream.Watch.process_object/2 [error] K8s.Client.Runner.Stream.Watch Erronous event received from watcher: Liveness probe failed: Get "http://10.244.0.16:9090/healthz/": dial tcp 10.244.0.16:9090: connect: connection refused - resetting the resource version
15:46:24.847 mfa=K8s.Client.Runner.Stream.Watch.process_object/2 [error] K8s.Client.Runner.Stream.Watch Erronous event received from watcher: Readiness probe failed: Get "http://10.244.0.16:9090/readyz/": dial tcp 10.244.0.16:9090: connect: connection refused - resetting the resource version
15:46:24.867 mfa=K8s.Client.Runner.Stream.Watch.process_object/2 [error] K8s.Client.Runner.Stream.Watch Erronous event received from watcher: Container image "ghcr.io/aquasecurity/trivy-operator:0.11.1" already present on machine - resetting the resource version
15:46:24.879 mfa=K8s.Client.Runner.Stream.Watch.process_object/2 [error] K8s.Client.Runner.Stream.Watch Erronous event received from watcher: Created container trivy-operator - resetting the resource version
15:46:24.950 mfa=K8s.Client.Runner.Stream.Watch.process_object/2 [error] K8s.Client.Runner.Stream.Watch Erronous event received from watcher: Started container trivy-operator - resetting the resource version

It seems like watches are resetting the resource version when they should just ignore that event line. The relevant code is here: https://github.com/coryodaniel/k8s/blob/develop/lib/k8s/client/runner/stream/watch.ex#L170

@mruoss
Collaborator

mruoss commented Mar 2, 2023

Oooh, you're probably right. It seems that in this case we should not reset the resource version (RV). Still, can you give me some more details on what you're doing? So you're watching pods and you get these as a pod is starting up?

As for the log spam

  • This should not be a Logger.error(). I'll probably change it to notice.
  • There are debates on whether libraries should log, but the logs do give us all insight. I have added a :library context to all log statements so you can configure them in config.exs:
    config :logger,
      compile_time_purge_matching: [
        [library: :k8s]
        # or only show errors [library: :k8s, level_lower_than: :error]
      ]

@mruoss
Collaborator

mruoss commented Mar 2, 2023

Ooooh you're watching Events, right? Ha! That's a nice edge case. The implementation is wrong, of course.
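
To illustrate the edge case: in the Kubernetes watch protocol, each line carries a "type" (ADDED, MODIFIED, DELETED, BOOKMARK or ERROR) and an "object". Only ERROR lines carry a Status object, but Event resources also have a "message" field, so a handler that keys on "message" alone would mistake them for errors. The following is a minimal sketch of a handler that keys on "type" instead. The WatchSketch module, the classify/1 function and the return tuples are hypothetical names for illustration, not the library's actual code or the fix that was merged.

```elixir
defmodule WatchSketch do
  # Hypothetical module for illustration only; not part of the k8s library.

  # Only a watch line of type "ERROR" (whose object is a Status) should
  # lead to a resource version reset.
  def classify(%{"type" => "ERROR", "object" => %{"message" => message}}),
    do: {:reset, message}

  # Regular changes are passed through, even when the object is an Event
  # resource that happens to have a "message" field.
  def classify(%{"type" => type, "object" => object})
      when type in ["ADDED", "MODIFIED", "DELETED"],
      do: {:emit, type, object}

  # Bookmarks only carry an updated resource version.
  def classify(%{"type" => "BOOKMARK", "object" => object}),
    do: {:bookmark, get_in(object, ["metadata", "resourceVersion"])}

  # Anything else is ignored rather than treated as an error.
  def classify(_other), do: :ignore
end
```

With this sketch, a line such as %{"type" => "ADDED", "object" => %{"kind" => "Event", "message" => "Started container trivy-operator"}} is classified as {:emit, "ADDED", object}, not as a reset.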

@elliottneilclark
Contributor Author

> Ooooh you're watching Events

Among other things, yeah. We start watching streams for ~80 different resource kinds (some that are on the cluster as CRDs and some that aren't). Essentially, we are streaming everything that is happening on the clusters. Event is the most interesting one.

@elliottneilclark
Contributor Author

> As for the log spam

I didn't mean to imply there is anything wrong with the amount that k8s is logging. I think it's totally fine. I actually prefer a noisy log with extra info. We run with just about all logging turned on, so I think of the whole logging stream as a firehose, and that wording slipped in.
