Occasional panic in kubernetes autodiscovery on pod deletion #23385

Closed
marqc opened this issue Jan 7, 2021 · 2 comments · Fixed by #23419
Labels: bug, Team:Integrations

Comments

marqc (Contributor) commented Jan 7, 2021

Filebeat v7.10.1 running on Kubernetes 1.19.4 (self-hosted) occasionally panics and restarts when a pod is deleted.

This happened 3 times in the last 48 hours on one instance (of 8 running). The termination log is:

2021-01-06T06:27:59.774+0100    INFO    log/harvester.go:325    File was removed: /var/log/containers/envoydemo-cron-1609910760-7xb8q_default_app-6988bc5f0b40a02b78ca61ab008e40f2bc1519a3f21c10401cef013df31be034.log. Closing because close_removed is enabled.
2021-01-06T06:28:19.538+0100    INFO    log/harvester.go:333    File is inactive: /var/log/containers/glapi-worker-1609910400-5p64z_default_leads-8a63b334c71bf990558fe170936954967cdaf329959aa2b8cd9e46254d643d37.log. Closing because close_inactive of 5m0s reached.
2021-01-06T06:28:23.359+0100    INFO    input/input.go:136      input ticker stopped
2021-01-06T06:28:30.527+0100    INFO    input/input.go:136      input ticker stopped
panic: interface conversion: interface {} is cache.DeletedFinalStateUnknown, not *v1.Pod

goroutine 53011 [running]:
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*pod).OnDelete.func1()
        /go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/pod.go:174 +0x7c
created by time.goFunc
        /usr/local/go/src/time/sleep.go:168 +0x44
botelastic bot added the needs_team label Jan 7, 2021
andresrc added the Team:Integrations label Jan 8, 2021
elasticmachine (Collaborator) commented

Pinging @elastic/integrations (Team:Integrations)

botelastic bot removed the needs_team label Jan 8, 2021
ChrsMark (Member) commented

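The root cause is most likely the unchecked type assertion at:

time.AfterFunc(p.config.CleanupTimeout, func() { p.emit(obj.(*kubernetes.Pod), "stop") })

When the delete event for a pod was missed, the delete handler receives a cache.DeletedFinalStateUnknown wrapper instead of the pod itself, so asserting obj directly to *kubernetes.Pod panics. This behaviour is documented in the Kubernetes client code as well as in the Beats code.

Most probably we need something like https://github.com/kubernetes/kubernetes/pull/34694/files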
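
A minimal sketch of the kind of guard this suggests, not the actual Beats fix. To keep it runnable without client-go, `Pod` and `DeletedFinalStateUnknown` are local stand-ins (the latter assumed to mirror the `Key`/`Obj` fields of `cache.DeletedFinalStateUnknown`), and `podFromDeleteEvent` is a hypothetical helper name:

```go
package main

import "fmt"

// Pod is a stand-in for *kubernetes.Pod (assumption, for illustration only).
type Pod struct {
	Name string
}

// DeletedFinalStateUnknown is a local stand-in assumed to mirror client-go's
// cache.DeletedFinalStateUnknown tombstone: Obj holds the last known state.
type DeletedFinalStateUnknown struct {
	Key string
	Obj interface{}
}

// podFromDeleteEvent (hypothetical helper) returns the pod carried by a
// delete event, unwrapping a tombstone if needed. It reports false instead
// of panicking when the object is not a pod.
func podFromDeleteEvent(obj interface{}) (*Pod, bool) {
	switch o := obj.(type) {
	case *Pod:
		return o, o != nil
	case DeletedFinalStateUnknown:
		// The delete was missed; fall back to the last known object.
		pod, ok := o.Obj.(*Pod)
		return pod, ok && pod != nil
	default:
		return nil, false
	}
}

func main() {
	events := []interface{}{
		&Pod{Name: "direct"},
		DeletedFinalStateUnknown{Key: "default/stale", Obj: &Pod{Name: "stale"}},
		"unexpected",
	}
	for _, ev := range events {
		if pod, ok := podFromDeleteEvent(ev); ok {
			fmt.Println("stop:", pod.Name)
		} else {
			fmt.Println("skipped unexpected object")
		}
	}
}
```

In OnDelete, the unwrapping would presumably happen before scheduling the time.AfterFunc callback, so that the delayed "stop" emit only ever sees a checked pod value.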
