8.7.1: K8s Stand-alone Kubernetes Deployment - Filestream input with ID '' Already Exists #2701
Hmm, is there another instance of filestream in the agent policy? It may also be the case that the Kubernetes variable substitution is failing, but I don't recall ever seeing that happen. If you can upload the archive output by
Hi @cmacknz, please find attached the diagnostics bundle; I've removed our API keys and Elasticsearch domains from it. I had to revert to the variable-based filestream id as shown in the standalone config, as I had changed this for troubleshooting. I also had Pod OOMs while running the diagnostics tool; I pushed the limit to 2 GB and it was all working fine. I might also add that I've been having OOMs in regular production usage and had to raise the memory limit from 700 to 1024 MB to get it stable. Anyway, hope these logs assist. It does appear that the variables are being resolved in the computed config, but I'm still receiving the errors. Thanks.

Edit: Masking more environment stuff
Thanks, we're aware the memory usage went up in 8.6.x as part of an incremental architecture change, with another change coming that should drop it back down.

You should be able to fix this by duplicating the `id` into each entry of the `streams` list:

```yaml
- data_stream:
    namespace: default
  id: container-log-${kubernetes.pod.name}-${kubernetes.container.id}
  meta:
    package:
      name: kubernetes
      version: 1.29.2
  streams:
    - data_stream:
        dataset: kubernetes.container_logs
        type: logs
      id: container-log-${kubernetes.pod.name}-${kubernetes.container.id} # <-- Added unique stream ID
      parsers:
        - container: null
      paths:
        - /var/log/containers/*${kubernetes.container.id}.log
      processors:
        - add_fields:
            fields:
              name: oureksclustername
              url: https://ourekscluster.yl4.ap-southeast-2.eks.amazonaws.com
            target: orchestrator.cluster
      prospector.scanner.symlinks: true
  type: filestream
  use_output: default
```

For a full explanation see #2573 (comment). Once we confirm this is the fix we'll leave this issue open to adjust the reference template and document this explanation.
Hi @cmacknz, I swear I had tried this before; I may have moved the `id` key to the `data_stream` section instead of duplicating it, as I remember having a syntax error. Either way, the above appears to have resolved the issue for me. I'll leave the issue open as requested. Many thanks,
Closing, looks like the `id` is set to be unique in the latest example YAML.
Hi @cmacknz, I'll just add that the `id` is not set in the `streams` array as you asked me to add it in the standalone config, only in the top-level `data_stream` section, which was the cause of my problem. Cheers,
Ah you are correct, I misread it. I'll fix this, thanks. |
I've used the example K8s manifest to deploy Elastic Agent on our AWS EKS cluster. I'm running the matching version of kube-state-metrics (v2.3.0) for our AWS EKS instances. I followed the documentation in the EKS section to comment out modules that are unavailable in AWS EKS such as the control plane and audit logs.
Recently, I upgraded to 8.7.0 and noticed a huge increase in log volume. A large portion of this was coming from Elastic Agent's own logs. Some of this was resolved in 8.7.1 due to fixes in the logging, and since then setting the logging level to warning has reduced my events per minute from about 100k to about 2k. Most of the remaining 2k Elastic Agent logs are the following:
```
filestream input with ID '' already exists, this will lead to data duplication, please use a different ID
```

I then added manual IDs to every data_stream in my configmap, and discovered it's related to PR #742. However, being on a much newer version, this should already be fixed. The standalone config map does have the id added from that PR, so I'm not sure why we're seeing this. In combination with this, I have also been receiving the warning:
```
DEPRECATED: Log input. Use Filestream input instead.
```

Is there a working `filestream` configuration that could be used for a standalone deployment that doesn't result in large volumes of the warnings described above? Just trying to clear this up because space is always at a premium on Elasticsearch.