k8s-autodiscover for elastic-agent fails #1992
Comments
@mdelapenya do we have a way of getting logs from the failing Pod? Without them it's almost impossible to know why the Pod is failing: the Agent may be unable to enroll, it may have panicked, or something else. The only way to troubleshoot this is to reproduce it locally by running the suite, which is quite time-consuming. |
Thank you for the update @mdelapenya, that will help a lot with the debugging efforts. |
BTW it's still possible to reproduce this locally; I'm updating the steps to reproduce in the description. |
Running the suite locally with I see the following in the failing Agent's Pod:
This may be related to elastic/beats#29811 or its related issues. |
Let me try to run the tests after latest changes |
Test logs
State of the elastic-agent pod:
Pod logs
|
Thank you @mdelapenya, it looks like a panic when Agent tries to open a connection to ES. Do you have more extensive output of the error, so that we can see the full stack trace? |
Unfortunately no. That is the entire output of the elastic-agent pod. Pods
This situation is easy to reproduce:
# sync code
git pull upstream main
# run tests
TAGS="elastic-agent" TIMEOUT_FACTOR=3 LOG_LEVEL=TRACE DEVELOPER_MODE=true ELASTIC_APM_ACTIVE=false PROVIDER=docker make -C e2e/_suites/kubernetes-autodiscover functional-test
When you see a lot of retries, hit Ctrl+C to abort the execution, then access the kind cluster and the pods. The test logs contain the kubeconfig file name, the namespace, the pod name, etc. |
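Once the kubeconfig file, namespace and pod name have been read from the test logs, the diagnostics could be collected roughly as follows. This is only a sketch: the function name `collect_pod_diagnostics` and the overridable `KUBECTL` variable are hypothetical helpers, not part of the test framework, and all three arguments are placeholders for the values found in the logs.

```shell
# Hypothetical helper, not part of the e2e framework.
# KUBECTL can be overridden, e.g. KUBECTL=echo for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

collect_pod_diagnostics() {
  # $1: kubeconfig file, $2: namespace, $3: pod name (all taken from the test logs)
  local kubeconfig="$1" namespace="$2" pod="$3"

  # Pod status, including restart count
  $KUBECTL --kubeconfig "$kubeconfig" -n "$namespace" get pod "$pod" -o wide
  # Events and container state (for example the last termination reason)
  $KUBECTL --kubeconfig "$kubeconfig" -n "$namespace" describe pod "$pod"
  # Logs of the currently running container
  $KUBECTL --kubeconfig "$kubeconfig" -n "$namespace" logs "$pod"
  # Logs of the previous container instance; fails harmlessly if there was no restart
  $KUBECTL --kubeconfig "$kubeconfig" -n "$namespace" logs "$pod" --previous || true
}
```

It would be called as `collect_pod_diagnostics <kubeconfig> <namespace> <pod>`. The `--previous` logs are the interesting part when the Agent crashes and the container is restarted, since the current container may not have logged anything yet.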
This is weird, because now I'm only seeing the following:
It seems that the new images do not have this package installed and at the same time when I |
Solving the
Pushed a fix #2141, let's see if that solves the issue. |
The fix went green, @mdelapenya. |
Steps to reproduce
Run this command from the root dir of the test framework:
TAGS="elastic-agent" TIMEOUT_FACTOR=3 LOG_LEVEL=TRACE DEVELOPER_MODE=true ELASTIC_APM_ACTIVE=false PROVIDER=docker make -C e2e/_suites/kubernetes-autodiscover functional-test
Expected behaviour: the elastic-agent container is up and running
Current behaviour: the elastic-agent container cannot be found (see logs)
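To check the current behaviour by hand, one option is to list the container names declared in the pod spec and look for the expected one. The sketch below assumes the container is expected inside a pod of the kind cluster. `pod_has_container` and the `KUBECTL` variable are hypothetical names introduced for illustration, and the kubeconfig, namespace and pod arguments are placeholders to be taken from the test logs.

```shell
# Hypothetical helper, not part of the e2e framework.
KUBECTL="${KUBECTL:-kubectl}"

pod_has_container() {
  # $1: kubeconfig file, $2: namespace, $3: pod name, $4: container name to look for
  local kubeconfig="$1" namespace="$2" pod="$3" container="$4" names name

  # Space-separated list of the container names in the pod spec
  names="$($KUBECTL --kubeconfig "$kubeconfig" -n "$namespace" get pod "$pod" \
    -o jsonpath='{.spec.containers[*].name}')" || return 2

  for name in $names; do
    [ "$name" = "$container" ] && return 0
  done
  return 1
}
```

The function returns 0 when the container is declared, 1 when it is not, and 2 when the pod could not be queried at all, which keeps "pod missing" distinguishable from "container missing".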
CI log
First error build: https://beats-ci.elastic.co/job/e2e-tests/job/e2e-testing-k8s-autodiscovery-daily-mbp/job/main/10/ (6 days ago)
Last successful build: https://beats-ci.elastic.co/job/e2e-tests/job/e2e-testing-k8s-autodiscovery-daily-mbp/job/main/9/ (6 days ago)
It seems the elastic-agent container is not there