We recently migrated from different services to the ECK-managed Elastic Agent for our monitoring. Everything works as it should, except that our Kubernetes state metrics stop working after ~30 minutes.
After some digging I found the following error logs from the Agent:
[elastic_agent.metricbeat][error] Failed to list light metricsets for module kubernetes: getting metricsets for module 'kubernetes': loading light module 'kubernetes' definition: loading module configuration from '/usr/share/elastic-agent/data/elastic-agent-0e1a73/install/metricbeat-8.5.3-linux-x86_64/module/kubernetes/module.yml': config file ("/usr/share/elastic-agent/data/elastic-agent-0e1a73/install/metricbeat-8.5.3-linux-x86_64/module/kubernetes/module.yml") must be owned by the user identifier (uid=0) or root
along with nearly identical messages for a bunch more modules not being loaded properly, like rabbitmq, windows, golang, haproxy etc.
It appears that the error coincides with the time at which state-metrics stop working.
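To see which modules are affected without reading every line, the ownership errors can be pulled out of the Agent logs with a small script. This is only a sketch: the regular expression is based on the single message quoted above, and `parse_ownership_error` and `affected_modules` are hypothetical helper names, not part of any Elastic tooling.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Assumption: every ownership error follows the wording of the message quoted above.
OWNERSHIP_ERROR = re.compile(
    r'Failed to list light metricsets for module (?P<module>\S+):'
    r'.*config file \("(?P<path>[^"]+)"\) must be owned by'
)


def parse_ownership_error(line: str) -> Optional[Tuple[str, str]]:
    """Return (module, config path) if the line is an ownership error, else None."""
    match = OWNERSHIP_ERROR.search(line)
    if match is None:
        return None
    return match.group("module"), match.group("path")


def affected_modules(lines: Iterable[str]) -> List[str]:
    """Return the sorted, de-duplicated module names found in the given log lines."""
    modules = set()
    for line in lines:
        parsed = parse_ownership_error(line)
        if parsed is not None:
            modules.add(parsed[0])
    return sorted(modules)
```

Feeding it the pod logs line by line (for example, the saved output of `kubectl logs`) gives a short list of module names to compare against the time the state metrics stop.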
Next I opened a shell in the pod to see why. It turns out that the installed packages in the /usr/share/elastic-agent/data/elastic-agent-[hash]/install folder are owned by elastic-agent rather than root:
root@elastic-agent-agent-5h22z:/usr/share/elastic-agent/data/elastic-agent-0e1a73/install# ll
total 77
drwxr-xr-x 10 elastic-agent elastic-agent 10 Dec 6 00:08 ./
drwxrwx--- 5 root root 6 Dec 6 00:08 ../
drwxr-xr-x 2 elastic-agent elastic-agent 8 Dec 5 04:55 apm-server-8.5.3-linux-x86_64/
drwxr-xr-x 2 elastic-agent elastic-agent 9 Dec 6 00:08 cloudbeat-8.5.3-linux-x86_64/
drwxr-xr-x 2 elastic-agent elastic-agent 6 Dec 6 00:08 endpoint-security-8.5.3-linux-x86_64/
drwxr-xr-x 6 elastic-agent elastic-agent 14 Jan 2 12:59 filebeat-8.5.3-linux-x86_64/
drwxr-xr-x 2 elastic-agent elastic-agent 3 Dec 5 05:52 fleet-server-8.5.3-linux-x86_64/
drwxr-xr-x 4 elastic-agent elastic-agent 12 Dec 6 00:08 heartbeat-8.5.3-linux-x86_64/
drwxr-xr-x 6 elastic-agent elastic-agent 14 Jan 2 12:59 metricbeat-8.5.3-linux-x86_64/
drwxr-xr-x 3 elastic-agent elastic-agent 13 Dec 6 00:08 osquerybeat-8.5.3-linux-x86_64/
This is set by default in the official containers to make them compatible with non-root users; however, it conflicts with the safety requirements of metricbeat.
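To find every file that would trip this check, something like the following could be run inside the pod. It is a rough sketch based on my reading of the error message (the file owner must be root or the user the process runs as). The extra test for group/other write bits is an assumption about the Beats strict permission check, and the function names and the uid 1000 in the usage note are made up for illustration.

```python
import os
import stat
from typing import Iterator, Optional, Tuple


def ownership_problem(file_uid: int, file_mode: int, process_uid: int) -> Optional[str]:
    """Return a reason string if the file would fail the check, else None."""
    # Rule taken from the error message: owner must be root (uid 0)
    # or the user the process runs as.
    if file_uid != 0 and file_uid != process_uid:
        return f"owned by uid={file_uid}, expected uid={process_uid} or root"
    # Assumption: files writable by group or others are rejected as well.
    perms = stat.S_IMODE(file_mode)
    if perms & 0o022:
        return f"writable by group or others (mode {perms:04o})"
    return None


def scan_module_configs(install_dir: str, process_uid: int) -> Iterator[Tuple[str, str]]:
    """Yield (path, reason) for each YAML file under install_dir that fails the check."""
    for dirpath, _dirnames, filenames in os.walk(install_dir):
        for name in filenames:
            if not name.endswith((".yml", ".yaml")):
                continue
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            reason = ownership_problem(info.st_uid, info.st_mode, process_uid)
            if reason is not None:
                yield path, reason
```

For the situation in this issue, where the process runs as root and the files belong to elastic-agent, `ownership_problem(1000, 0o100644, 0)` returns a reason string (1000 is a placeholder uid), while the same file owned by uid 0 passes.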
Could this be related to the fact that Elastic Agent recently changed to require running it with root? #6147
Even though we do not currently use the hostPath, it still will not run with a non-root user due to some other permission issue.
@gertjanvg I'm working to try and replicate your issue. I'm taking ownership as I'm wondering if it's related to #6239 , and the other issues referenced in that issue. I'll update when I can with more information.
@gertjanvg I've been unable to replicate this issue with either ECK version 2.5.0 or 2.6.0, using stack version 8.5.3 and a manifest very similar to yours.
In both cases, I have had Elastic agents in the daemonset pulling kubernetes metrics for about 24 hours with no errors.
Could you possibly provide more details about your Fleet/Kibana/Elasticsearch configuration, and potentially try again with a newer ECK version (2.6.1) and Elastic Stack version 8.6.2? Thanks.
Closing this as I've been unable to reproduce this with the newest version of ECK, and further information has not been provided. If new information comes to light, please feel free to re-open this issue. Thanks.
Bug Report
Environment
Elasticsearch version: 8.5.3
Elastic Agent version: 8.5.3
ECK version: 2.5.0
Kubernetes version
Agent configuration