kindling-agent was OOM-killed frequently #352

Closed
yifu1024 opened this issue Nov 9, 2022 · 14 comments · Fixed by #499
Labels
bug Something isn't working need more info

Comments

@yifu1024

yifu1024 commented Nov 9, 2022

Describe the bug
A clear and concise description of what the bug is.

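kindling-agent is OOM-killed frequently. Raising the memory limit to 5Gi did not help; it is still OOM-killed.
The problem occurs mainly on physical (bare-metal) machines.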

How to reproduce?
Steps to reproduce the behavior.

What did you expect to see?
A clear and concise description of what you expected to happen.

What did you see instead?
A clear and concise description of what you saw instead.

Screenshots
If applicable, add screenshots to help explain your problem.

What config did you use?
Config: (e.g. the yaml config file)

Logs
Please attach the logs by running the following command:

Nov  8 18:26:14  kernel: Memory cgroup out of memory: Kill process 211759 (kindling-collec) score 2200 or sacrifice child
Nov  8 18:30:39  kernel: Memory cgroup out of memory: Kill process 282735 (kindling-collec) score 2184 or sacrifice child


# kubectl get pod -n kindling |grep kindling-agent-xgr2k
kindling-agent-xgr2k       1/1     Running   100        14d

# kubectl describe pod -n kindling kindling-agent-xgr2k

    State:          Running
      Started:      Wed, 09 Nov 2022 09:27:52 +0800
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Tue, 08 Nov 2022 19:58:21 +0800
      Finished:     Wed, 09 Nov 2022 09:27:51 +0800
    Ready:          True
    Restart Count:  100
    Limits:
      memory:  5Gi
    Requests:
      memory:   2000Mi

# kubectl logs -n kindling kindling-agent-xgr2k|grep -v times_tota |more
kindling-falcolib-probe/
kindling-falcolib-probe/4.18.0-147.el8.x86_64.ko
kindling-falcolib-probe/3.10.0-862.el7.x86_64.ko
kindling-falcolib-probe/4.18.0-305.10.2.el8_4.x86_64.ko
kindling-falcolib-probe/4.19.91-21.2.al7.x86_64.o
kindling-falcolib-probe/4.18.0-147.el8.x86_64.o
kindling-falcolib-probe/3.10.0-1127.el7.x86_64.o
kindling-falcolib-probe/3.10.0-514.el7.x86_64.ko
kindling-falcolib-probe/4.19.91-21.2.al7.x86_64.ko
kindling-falcolib-probe/3.10.0-1160.el7.x86_64.o
kindling-falcolib-probe/5.10.23-5.al8.x86_64.o
kindling-falcolib-probe/4.19.91-25.8.al7.x86_64.o
kindling-falcolib-probe/5.7.8-1.el7.elrepo.x86_64.o
kindling-falcolib-probe/5.10.84-10.al8.x86_64.o
kindling-falcolib-probe/3.10.0-1062.el7.x86_64.ko
kindling-falcolib-probe/3.10.0-229.el7.x86_64.ko
kindling-falcolib-probe/4.18.0-80.el8.x86_64.ko
kindling-falcolib-probe/3.10.0-327.el7.x86_64.ko
kindling-falcolib-probe/4.19.67-16.al7.x86_64.o
kindling-falcolib-probe/4.19.34-11.al7.x86_64.ko
kindling-falcolib-probe/5.10.23-4.al8.x86_64.o
kindling-falcolib-probe/5.7.8-1.el8.elrepo.x86_64.o
kindling-falcolib-probe/4.19.1-1.el7.elrepo.x86_64.o
kindling-falcolib-probe/4.19.91-23.al7.x86_64.ko
kindling-falcolib-probe/4.18.0-240.el8.x86_64.o
kindling-falcolib-probe/4.19.91-22.2.al7.x86_64.o
kindling-falcolib-probe/5.10.23-5.al8.x86_64.ko
kindling-falcolib-probe/4.18.0-348.el8.x86_64.ko
kindling-falcolib-probe/4.18.0-193.el8.x86_64.ko
kindling-falcolib-probe/4.19.81-17.2.al7.x86_64.o
kindling-falcolib-probe/4.19.57-15.1.al7.x86_64.ko
kindling-falcolib-probe/4.19.91-19.1.al7.x86_64.o
kindling-falcolib-probe/4.19.91-25.7.al7.x86_64.ko
kindling-falcolib-probe/4.19.91-21.al7.x86_64.o
kindling-falcolib-probe/5.10.60-9.al8.x86_64.ko
kindling-falcolib-probe/4.19.24-9.al7.x86_64.ko
kindling-falcolib-probe/4.19.91-26.al7.x86_64.o
kindling-falcolib-probe/5.10.23-6.al8.x86_64.ko
kindling-falcolib-probe/4.19.91-24.al7.x86_64.ko
kindling-falcolib-probe/4.19.91-25.6.al7.x86_64.ko
kindling-falcolib-probe/4.19.91-22.1.al7.x86_64.o
kindling-falcolib-probe/4.18.0-193.el8.x86_64.o
kindling-falcolib-probe/4.19.91-22.1.al7.x86_64.ko
kindling-falcolib-probe/3.10.0-693.el7.x86_64.ko
kindling-falcolib-probe/3.10.0-957.el7.x86_64.o
kindling-falcolib-probe/5.10.60-9.al8.x86_64.o
kindling-falcolib-probe/4.19.91-24.al7.x86_64.o
kindling-falcolib-probe/4.18.0-240.el8.x86_64.ko
kindling-falcolib-probe/4.19.91-25.6.al7.x86_64.o
kindling-falcolib-probe/4.19.91-23.al7.x86_64.o
kindling-falcolib-probe/4.19.91-25.1.al7.x86_64.o
kindling-falcolib-probe/4.19.91-21.al7.x86_64.ko
kindling-falcolib-probe/4.19.81-17.2.al7.x86_64.ko
kindling-falcolib-probe/4.19.91-18.al7.x86_64.ko
kindling-falcolib-probe/4.19.43-13.2.al7.x86_64.o
kindling-falcolib-probe/3.10.0-1160.el7.x86_64.ko
kindling-falcolib-probe/4.19.91-26.al7.x86_64.ko
kindling-falcolib-probe/4.19.57-15.1.al7.x86_64.o
kindling-falcolib-probe/4.18.0-305.10.2.el8_4.x86_64.o
kindling-falcolib-probe/4.19.81-17.1.al7.x86_64.o
kindling-falcolib-probe/4.19.91-19.2.al7.x86_64.o
kindling-falcolib-probe/5.10.23-4.al8.x86_64.ko
kindling-falcolib-probe/4.19.81-17.1.al7.x86_64.ko
kindling-falcolib-probe/5.4.153-1.el7.elrepo.x86_64.o
kindling-falcolib-probe/4.19.43-13.2.al7.x86_64.ko
kindling-falcolib-probe/4.19.91-24.1.al7.x86_64.o
kindling-falcolib-probe/4.18.0-80.el8.x86_64.o
kindling-falcolib-probe/4.19.91-22.2.al7.x86_64.ko
kindling-falcolib-probe/4.19.91-25.1.al7.x86_64.ko
kindling-falcolib-probe/5.4.153-1.el7.elrepo.x86_64.ko
kindling-falcolib-probe/4.19.91-25.7.al7.x86_64.o
kindling-falcolib-probe/4.19.91-23.1.al7.x86_64.o
kindling-falcolib-probe/5.7.8-1.el7.elrepo.x86_64.ko
kindling-falcolib-probe/4.18.0-305.3.1.el8.x86_64.o
kindling-falcolib-probe/3.10.0-1062.el7.x86_64.o
kindling-falcolib-probe/3.10.0-957.el7.x86_64.ko
kindling-falcolib-probe/4.19.67-16.al7.x86_64.ko
kindling-falcolib-probe/5.10.23-6.al8.x86_64.o
kindling-falcolib-probe/5.10.84-10.al8.x86_64.ko
kindling-falcolib-probe/4.19.91-19.1.al7.x86_64.ko
kindling-falcolib-probe/4.19.91-24.1.al7.x86_64.ko
kindling-falcolib-probe/4.19.91-23.1.al7.x86_64.ko
kindling-falcolib-probe/4.18.0-348.el8.x86_64.o
kindling-falcolib-probe/4.18.0-305.3.1.el8.x86_64.ko
kindling-falcolib-probe/5.7.8-1.el8.elrepo.x86_64.ko
kindling-falcolib-probe/4.19.91-19.2.al7.x86_64.ko
kindling-falcolib-probe/3.10.0-123.el7.x86_64.ko
kindling-falcolib-probe/4.19.91-25.8.al7.x86_64.ko
kindling-falcolib-probe/4.19.91-18.al7.x86_64.o
kindling-falcolib-probe/3.10.0-1127.el7.x86_64.ko
kindling-falcolib-probe/4.19.1-1.el7.elrepo.x86_64.ko
* Mounting debugfs
* BPF probe located, it's now possible to start kindling
* Load probe succeeded, and will create /opt/kernel-support for kubernetes
2022/11/09 01:27:54 GitCommitInfo:; go1.17.7 linux/amd64
2022-11-09T01:27:54.666Z        INFO    component/telemetry.go:48       Log Initialize Success! ConsoleLevel: info,FileRotationLevel: info
FileRotationConfig:     LogFile: agent.log      MaxSize: 512m   backup: 5       MaxAge: 30day
2022-11-09T01:27:54.667Z        INFO    observability/telemetry.go:157  Initializing self-observability exporter whose type is stdout
2022-11-09T01:27:54.669Z        INFO    otelexporter/prometheus.go:18   Prometheus Server listening at port: [:9500]
2022/11/09 01:28:01 rcv buffer size for netlink socket is 2097152 bytes
2022/11/09 01:28:01 initialized conntrack with target_rate_limit=5000 messages/sec
2022-11-09T01:28:01.757Z        INFO    analyzer/manager.go:59  Starting analyzer [networkanalyzer]
2022-11-09T01:28:01.757Z        INFO    analyzer/manager.go:59  Starting analyzer [tcpmetricanalyzer]
2022-11-09T01:28:01.757Z        INFO    analyzer/manager.go:59  Starting analyzer [tcpconnectanalyzer]
2022-11-09T01:28:01.757Z        INFO    cgoreceiver/cgoreceiver.go:60   Start CgoReceiver
2022-11-09T01:28:20.044Z        INFO    cgoreceiver/cgoreceiver.go:179  The subscribed events are: [{net syscall_exit-writev} {net syscall_exit-readv} {net syscall_exit-write} {net syscall_exit-read} {net syscall_exit-sendto} {net syscall_exit-recvfrom} {net syscall_exit-sendmsg} {net syscall_exit-recvmsg} { kprobe-tcp_close} { kprobe-tcp_rcv_established} { kprobe-tcp_drop} { kprobe-tcp_retransmit_skb} { syscall_exit-connect} { kretprobe-tcp_connect} { kprobe-tcp_set_state}]
sub event name:syscall_exit-writev  &&  category:net
sub event name:syscall_exit-readv  &&  category:net
sub event name:syscall_exit-write  &&  category:net
sub event name:syscall_exit-read  &&  category:net
sub event name:syscall_exit-sendto  &&  category:net
sub event name:syscall_exit-recvfrom  &&  category:net
sub event name:syscall_exit-sendmsg  &&  category:net
sub event name:syscall_exit-recvmsg  &&  category:net
sub event name:kprobe-tcp_close  &&  category:
sub event name:kprobe-tcp_rcv_established  &&  category:
sub event name:kprobe-tcp_drop  &&  category:
sub event name:kprobe-tcp_retransmit_skb  &&  category:
sub event name:syscall_exit-connect  &&  category:
sub event name:kretprobe-tcp_connect  &&  category:
sub event name:kprobe-tcp_set_state  &&  category:
CPU 0 configuration change detected.
CPU 0 configuration change detected.
CPU 0 configuration change detected.
CPU 0 configuration change detected.
CPU 0 configuration change detected.

......

2022-11-09T03:07:30.223Z        INFO    internal/connect_monitor.go:149 Receive another unexpected tcp_connect event    {"connKey": "src: 127.0.0.1:58040, dst: 127.0.0.1:8401"}
2022-11-09T03:08:03.569Z        INFO    internal/connect_monitor.go:149 Receive another unexpected tcp_connect event    {"connKey": "src: 127.0.0.1:52186, dst: 127.0.0.1:8401"}
2022-11-09T03:08:03.686Z        INFO    internal/connect_monitor.go:149 Receive another unexpected tcp_connect event    {"connKey": "src: 127.0.0.1:52284, dst: 127.0.0.1:8401"}
2022-11-09T03:08:06.674Z        INFO    internal/connect_monitor.go:149 Receive another unexpected tcp_connect event    {"connKey": "src: 127.0.0.1:54298, dst: 127.0.0.1:8401"}

Environment (please complete the following information)

  • Kindling agent version v0.4.1
  • Kindling-falcon-lib version
  • Node OS version CentOS Linux release 7.6.1810 (Core)
  • Node Kernel version 3.10.0-1160.el7.x86_64
  • Kubernetes version v1.20.10
  • Prometheus version
  • Grafana version

Additional context
Add any other context about the problem here, like application protocol.

@yifu1024 yifu1024 added the bug Something isn't working label Nov 9, 2022
@dxsup dxsup changed the title from "kindling-agent 频繁 OOM" ("kindling-agent frequently OOMs") to "kindling-agent was OOM frequently" on Nov 9, 2022
@dxsup
Member

dxsup commented Nov 9, 2022

Could you provide the log file? It contains some self-monitoring metrics that would help with troubleshooting.

@yifu1024
Author

yifu1024 commented Nov 9, 2022

Could you provide the log file? It contains some self-monitoring metrics that would help with troubleshooting.

Do you mean the kindling-agent log?

@dxsup dxsup changed the title from "kindling-agent was OOM frequently" to "kindling-agent was OOM-killed frequently" on Nov 9, 2022
@dxsup
Member

dxsup commented Nov 9, 2022

We need the content of standard output; the self-monitoring logs are written only to standard output.

@dxsup
Member

dxsup commented Nov 14, 2022

How many applications are running on this node, and what is the order of magnitude of the request volume?

@yifu1024
Author

We need the content of standard output; the self-monitoring logs are written only to standard output.
Do you mean agent.log?
agent.log
2022-11-15T05:17:44.010Z INFO component/telemetry.go:48 Log Initialize Success! ConsoleLevel: info,FileRotationLevel: info
FileRotationConfig: LogFile: agent.log MaxSize: 512m backup: 5 MaxAge: 30day
2022-11-15T05:17:44.011Z INFO observability/telemetry.go:157 Initializing self-observability exporter whose type is stdout
2022-11-15T05:17:44.016Z INFO otelexporter/prometheus.go:18 Prometheus Server listening at port: [:9500]
2022-11-15T05:17:51.572Z INFO analyzer/manager.go:59 Starting analyzer [networkanalyzer]
2022-11-15T05:17:51.572Z INFO analyzer/manager.go:59 Starting analyzer [tcpmetricanalyzer]
2022-11-15T05:17:51.572Z INFO analyzer/manager.go:59 Starting analyzer [tcpconnectanalyzer]
2022-11-15T05:17:51.572Z INFO cgoreceiver/cgoreceiver.go:60 Start CgoReceiver
2022-11-15T05:18:11.981Z INFO cgoreceiver/cgoreceiver.go:179 The subscribed events are: [{net syscall_exit-writev} {net syscall_exit-readv} {net syscall_exit-write} {net syscall_exit-read} {net syscall_exit-sendto} {net syscall_exit-recvfrom} {net syscall_exit-sendmsg} {net syscall_exit-recvmsg} { kprobe-tcp_close} { kprobe-tcp_rcv_established} { kprobe-tcp_drop} { kprobe-tcp_retransmit_skb} { syscall_exit-connect} { kretprobe-tcp_connect} { kprobe-tcp_set_state}]
2022-11-15T05:19:06.369Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 10.244.10.54:48752, dst: 20.201.165.2:8080"}
2022-11-15T05:19:06.464Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 10.244.10.54:48800, dst: 20.201.165.2:8080"}
2022-11-15T05:19:07.023Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 10.244.10.54:48934, dst: 20.201.165.2:8080"}
2022-11-15T05:19:07.997Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 10.244.10.54:50012, dst: 20.201.165.2:8080"}
2022-11-15T05:19:09.879Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 10.244.10.54:50476, dst: 20.201.165.2:8080"}
2022-11-15T05:19:57.764Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 10.244.10.54:48750, dst: 20.201.165.2:8080"}
2022-11-15T05:19:57.951Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 10.244.10.54:48794, dst: 20.201.165.2:8080"}
2022-11-15T05:19:57.992Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 10.244.10.54:48808, dst: 20.201.165.2:8080"}
2022-11-15T05:19:58.067Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 10.244.10.54:48818, dst: 20.201.165.2:8080"}
2022-11-15T05:19:58.089Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 10.244.10.54:48866, dst: 20.201.165.2:8080"}
2022-11-15T05:19:59.644Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 10.244.10.54:49690, dst: 20.201.165.2:8080"}
2022-11-15T05:20:49.708Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 10.244.10.54:48862, dst: 20.201.165.2:8080"}
2022-11-15T05:20:59.261Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 127.0.0.1:51752, dst: 127.0.0.1:6379"}
2022-11-15T05:21:59.243Z WARN otel@v1.2.0/handler.go:106 Opentelemetry-go encountered an error: {"error": "negative value is out of range for this instrument"}
2022-11-15T05:22:14.281Z WARN otel@v1.2.0/handler.go:106 Opentelemetry-go encountered an error: {"error": "negative value is out of range for this instrument"}
2022-11-15T05:22:34.342Z WARN otel@v1.2.0/handler.go:106 Opentelemetry-go encountered an error: {"error": "negative value is out of range for this instrument"}
2022-11-15T05:22:45.134Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 10.244.10.54:49252, dst: 20.201.165.2:8080"}
2022-11-15T05:22:54.380Z WARN otel@v1.2.0/handler.go:106 Opentelemetry-go encountered an error: {"error": "negative value is out of range for this instrument"}
2022-11-15T05:23:40.163Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 10.244.10.54:50385, dst: 20.201.165.2:8080"}
2022-11-15T05:24:04.188Z WARN otel@v1.2.0/handler.go:106 Opentelemetry-go encountered an error: {"error": "negative value is out of range for this instrument"}
2022-11-15T05:24:15.313Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 127.0.0.1:37310, dst: 127.0.0.1:8401"}
2022-11-15T05:24:33.077Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 10.244.10.54:48830, dst: 20.201.165.2:8080"}
2022-11-15T05:25:21.147Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 127.0.0.1:44274, dst: 127.0.0.1:8401"}
2022-11-15T05:25:39.255Z WARN otel@v1.2.0/handler.go:106 Opentelemetry-go encountered an error: {"error": "negative value is out of range for this instrument"}
2022-11-15T05:25:44.172Z WARN otel@v1.2.0/handler.go:106 Opentelemetry-go encountered an error: {"error": "negative value is out of range for this instrument"}
2022-11-15T05:25:54.357Z WARN otel@v1.2.0/handler.go:106 Opentelemetry-go encountered an error: {"error": "negative value is out of range for this instrument"}
2022-11-15T05:26:24.314Z WARN otel@v1.2.0/handler.go:106 Opentelemetry-go encountered an error: {"error": "negative value is out of range for this instrument"}
2022-11-15T05:27:21.138Z INFO internal/connect_monitor.go:149 Receive another unexpected tcp_connect event {"connKey": "src: 127.0.0.1:51440, dst: 127.0.0.1:8401"}
2022-11-15T05:28:24.217Z WARN otel@v1.2.0/handler.go:106 Opentelemetry-go encountered an error: {"error": "negative value is out of range for this instrument"}
2022-11-15T05:29:34.421Z WARN otel@v1.2.0/handler.go:106 Opentelemetry-go encountered an error: {"error": "negative value is out of range for this instrument"}
2022-11-15T05:29:49.424Z WARN otel@v1.2.0/handler.go:106 Opentelemetry-go encountered an error: {"error": "negative value is out of range for this instrument"}
2022-11-15T05:31:24.308Z WARN otel@v1.2.0/handler.go:106 Opentelemetry-go encountered an error: {"error": "negative value is out of range for this instrument"}

@yifu1024
Author

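How many applications are running on this node, and what is the order of magnitude of the request volume?

There are currently 124 pods on the node.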

@dxsup
Member

dxsup commented Nov 15, 2022

I would like to see the self-monitoring metrics. They are written to standard output, not to agent.log; that is, the output of kubectl logs -n kindling kindling-agent-xgr2k.

@yifu1024
Author

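I would like to see the self-monitoring metrics. They are written to standard output, not to agent.log; that is, the output of kubectl logs -n kindling kindling-agent-xgr2k.

The logs were provided in the opening post; there were just too many of them, so I removed some.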

kubectl logs -n kindling kindling-agent-xgr2k|grep -v times_tota |more

@dxsup
Member

dxsup commented Nov 15, 2022

😂 That happened to remove exactly the part we need. Could you write the output to a file and send it?
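
A minimal sketch of capturing the complete standard output to a file, without the `grep -v` filter used in the opening post. The helper names `save_agent_log` and `save_previous_agent_log` are illustrative assumptions, not part of Kindling. `--previous` is a standard `kubectl logs` flag that fetches the output of the last terminated container, which may help after an OOM kill.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kindling): dump the agent's complete
# stdout, including the self-monitoring metrics, to a file.
save_agent_log() {
  pod="$1"   # e.g. kindling-agent-xgr2k
  out="$2"   # e.g. kindling-agent-xgr2k.log
  # No grep filter here, so the metric lines are preserved.
  kubectl logs -n kindling "$pod" > "$out"
}

# Same idea for the previous (terminated) container instance.
save_previous_agent_log() {
  pod="$1"
  out="$2"
  kubectl logs -n kindling "$pod" --previous > "$out"
}
```

Example call: `save_agent_log kindling-agent-xgr2k kindling-agent-xgr2k.log`.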

@yifu1024
Author

kindling-agent-xgr2k.log

@yehaifeng

I found that the kindling pods restarted because of a readiness probe failure.

10m         Normal    Created                  pod/kindling-agent-npgbt   Created container kindling-agent
10m         Normal    Started                  pod/kindling-agent-npgbt   Started container kindling-agent
33m         Warning   Unhealthy                pod/kindling-agent-npgbt   Readiness probe failed: cat: /opt/kernel-support: No such file or directory
168m        Warning   Unhealthy                pod/kindling-agent-npgbt   Readiness probe errored: rpc error: code = NotFound desc =container is not created or running: checking if PID of 576cd8f9bf694a2be52ca2c89fef0c7ccec805d36793decdf05330be8d445442 is running failed: container process not found

@dxsup
Member

dxsup commented Nov 21, 2022

I found that the kindling pods restarted because of a readiness probe failure.

10m         Normal    Created                  pod/kindling-agent-npgbt   Created container kindling-agent
10m         Normal    Started                  pod/kindling-agent-npgbt   Started container kindling-agent
33m         Warning   Unhealthy                pod/kindling-agent-npgbt   Readiness probe failed: cat: /opt/kernel-support: No such file or directory
168m        Warning   Unhealthy                pod/kindling-agent-npgbt   Readiness probe errored: rpc error: code = NotFound desc =container is not created or running: checking if PID of 576cd8f9bf694a2be52ca2c89fef0c7ccec805d36793decdf05330be8d445442 is running failed: container process not found

It won't restart if only the readiness probe fails. It restarts because the kernel is not supported. You should compile your own probe for your kernel version.
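
One way to check whether a prebuilt probe exists for a node's kernel is to compare the output of `uname -r` against the probe file names the agent prints at startup (the `kindling-falcolib-probe/<kernel>.ko` and `.o` entries in the log in the opening post). A rough sketch, assuming the agent's output has been saved to a file; the function name `has_probe` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the saved agent output lists a probe
# (.ko or .o file) for the given kernel release.
has_probe() {
  kernel="$1"    # kernel release, e.g. the output of `uname -r` on the node
  logfile="$2"   # file containing the agent's startup output
  # -F: fixed-string match, -x: whole line must match, -q: quiet.
  grep -Fxq -e "kindling-falcolib-probe/${kernel}.ko" \
            -e "kindling-falcolib-probe/${kernel}.o" "$logfile"
}
```

Example call, run with the kernel release of the affected node: `has_probe "$(uname -r)" kindling-agent.log && echo "probe found" || echo "no prebuilt probe listed"`.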

@yanhongchang
Contributor

I have the same problem: the kindling agent is always OOM-killed. Is there a memory leak?
k8s version: 1.20.8
kindling version:0.5.0
kindling-agent-zccw5.log
截图_1671001346267

@dxsup
Member

dxsup commented Dec 14, 2022

Thanks for the feedback. We are still working on this issue. You can reset the environment variable GOGC to 100 to decrease the frequency of OOM kills.
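
For reference, one way to apply this is with `kubectl set env`. This is a sketch under assumptions: the DaemonSet name `kindling-agent` is inferred from the pod names in this thread and may differ in your deployment, and the helper names are hypothetical.

```shell
#!/bin/sh
# Assumption: the agent runs as a DaemonSet named "kindling-agent"
# in the "kindling" namespace; adjust both to match your cluster.
set_agent_gogc() {
  gogc="$1"   # GC target percentage; 100 is the Go runtime default
  kubectl set env daemonset/kindling-agent -n kindling "GOGC=${gogc}"
}

# Print the environment variables currently defined on the DaemonSet.
show_agent_env() {
  kubectl set env daemonset/kindling-agent -n kindling --list
}
```

Example call: `set_agent_gogc 100`. A lower GOGC value makes the Go garbage collector run more often, trading some CPU time for a smaller heap.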

4 participants