Cherry-pick #20305 to 7.9: [Autodiscovery] Ignore ErrInputNotFinished errors in autodiscover config checks #20338

Merged: 2 commits into elastic:7.9 on Jul 30, 2020

@ChrsMark (Member) commented Jul 30, 2020

Cherry-pick of PR #20305 to 7.9 branch. Original message:

What does this PR do?

This PR ignores ErrInputNotFinished errors that occur during the autodiscover stop/start process. This is required to avoid stopped configs that never come back: the start event fails with this error on the first attempt if the previous state has not been cleaned up yet.

Currently, configs that fail with ErrInputNotFinished are skipped during the autodiscover config check. However, this is a state error, not a config error, so these configs need to be added to the list of configs at
a.configs[eventID][hash] = &reload.ConfigWithMeta{
so that the autodiscover retry mechanism handles them properly at
retry = err != nil
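
The difference between a config error and a state error can be sketched as follows. This is a minimal, self-contained Go illustration and not the Beats implementation: `errInputNotFinished`, `checkConfig`, `collect`, and `reload` are hypothetical stand-ins, and the maps only simulate inputs whose previous state has not been cleaned up yet. It shows why keeping the config in the list lets a later retry start it.

```go
package main

import (
	"errors"
	"fmt"
)

// errInputNotFinished is a hypothetical stand-in for the ErrInputNotFinished
// state error named in this PR; the real error lives in Beats.
var errInputNotFinished = errors.New("input not finished: previous state not cleaned up yet")

// checkConfig simulates a config check that can fail with a genuine config
// error (empty ID) or with the transient state error (input still busy).
func checkConfig(id string, busy map[string]bool) error {
	if id == "" {
		return errors.New("invalid config: empty id")
	}
	if busy[id] {
		return errInputNotFinished
	}
	return nil
}

// collect returns the configs to hand to the reloader. Genuine config errors
// are skipped; state errors are kept so that a later retry can start them.
func collect(ids []string, busy map[string]bool) []string {
	var kept []string
	for _, id := range ids {
		err := checkConfig(id, busy)
		if err != nil && !errors.Is(err, errInputNotFinished) {
			continue // config error: retrying would not help
		}
		kept = append(kept, id)
	}
	return kept
}

// reload tries to start every kept config and reports whether a retry is
// needed, mirroring the "retry = err != nil" idea quoted above.
func reload(kept []string, busy, running map[string]bool) (retry bool) {
	var err error
	for _, id := range kept {
		if busy[id] {
			err = errInputNotFinished // still shutting down: try again later
			continue
		}
		running[id] = true
	}
	retry = err != nil
	return retry
}

func main() {
	busy := map[string]bool{"mytarget3": true} // old input not cleaned up yet
	running := map[string]bool{}

	kept := collect([]string{"mytarget3", ""}, busy)
	fmt.Println("kept:", kept)
	fmt.Println("retry needed:", reload(kept, busy, running))

	delete(busy, "mytarget3") // previous state has now been cleaned up
	fmt.Println("retry needed:", reload(kept, busy, running))
	fmt.Println("running:", running["mytarget3"])
}
```

If `collect` dropped the config on the state error instead, the second `reload` call would have nothing to start, which corresponds to the stopped config that never comes back.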

Why is it important?

It resolves a persistent issue with updated Pods, where Filebeat stops collecting logs after a Pod is updated.

How to test this PR locally

  1. Deploy Filebeat on k8s using the following autodiscover config (also set a valid output so that logs are shipped to ES):
filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      templates:
        - condition:
            equals:
              kubernetes.pod.name: "mytarget3"
          config:
            - type: container
              paths:
                - /var/log/containers/*${data.kubernetes.container.id}.log
  2. While Filebeat is up and running, deploy a target Pod to be autodiscovered so that Filebeat collects its logs:
---
apiVersion: v1
kind: Pod
metadata:
  name: mytarget3
  labels:
    app: test
spec:
  containers:
    - name: test
      image: ubuntu:latest
      command:
        - bash
        - -c
        - |
          #!/bin/bash
          echo "$(date): started the process"

          while :
          do
            echo "$(date): sleeping 5 seconds"
            sleep 5
          done
  3. Update the target Pod's manifest by adding an extra label such as team: qa
  4. Apply the Pod's update with kubectl apply -f <manifest_filename>.yml
  5. Verify that, after a while, Filebeat is still collecting logs from the updated Pod.

@ChrsMark added the [zube]: In Review, backport, and Team:Platforms labels Jul 30, 2020
@botelastic bot added the needs_team label Jul 30, 2020
@elasticmachine (Collaborator)

Pinging @elastic/integrations-platforms (Team:Platforms)

@botelastic bot removed the needs_team label Jul 30, 2020
@ChrsMark requested a review from a team July 30, 2020 07:23
@elasticmachine (Collaborator)

❕ Build Aborted

There is a new build on-going so the previous on-going builds have been aborted.

Build stats

  • Build Cause: [Pull request #20338 opened]

  • Reason: Aborted from #2

  • Start Time: 2020-07-30T07:13:30.847+0000

  • Duration: 10 min 13 sec

  • Commit: 9d8ab56

Log output

[2020-07-30T07:23:16.906Z] Stage "Libbeat" skipped due to earlier failure(s)
[2020-07-30T07:23:16.916Z] Stage "Metricbeat x-pack" skipped due to earlier failure(s)
[2020-07-30T07:23:16.926Z] Stage "Packetbeat" skipped due to earlier failure(s)
[2020-07-30T07:23:16.934Z] Stage "dockerlogbeat" skipped due to earlier failure(s)
[2020-07-30T07:23:16.943Z] Stage "Winlogbeat" skipped due to earlier failure(s)
[2020-07-30T07:23:16.952Z] Stage "Functionbeat" skipped due to earlier failure(s)
[2020-07-30T07:23:16.962Z] Stage "Journalbeat" skipped due to earlier failure(s)
[2020-07-30T07:23:16.971Z] Stage "Generators" skipped due to earlier failure(s)
[2020-07-30T07:23:18.136Z] Failed in branch Elastic Agent x-pack
[2020-07-30T07:23:18.146Z] Failed in branch Elastic Agent x-pack Windows
[2020-07-30T07:23:18.156Z] Failed in branch Elastic Agent Mac OS X
[2020-07-30T07:23:18.165Z] Failed in branch Filebeat oss
[2020-07-30T07:23:18.174Z] Failed in branch Filebeat x-pack
[2020-07-30T07:23:18.184Z] Failed in branch Filebeat Mac OS X
[2020-07-30T07:23:18.192Z] Failed in branch Filebeat x-pack Mac OS X
[2020-07-30T07:23:18.200Z] Failed in branch Filebeat Windows
[2020-07-30T07:23:18.209Z] Failed in branch Filebeat x-pack Windows
[2020-07-30T07:23:18.218Z] Failed in branch Auditbeat oss Linux
[2020-07-30T07:23:18.226Z] Failed in branch Auditbeat crosscompile
[2020-07-30T07:23:18.235Z] Failed in branch Auditbeat oss Mac OS X
[2020-07-30T07:23:18.244Z] Failed in branch Auditbeat oss Windows
[2020-07-30T07:23:18.254Z] Failed in branch Auditbeat x-pack
[2020-07-30T07:23:18.264Z] Failed in branch Auditbeat x-pack Mac OS X
[2020-07-30T07:23:18.276Z] Failed in branch Auditbeat x-pack Windows
[2020-07-30T07:23:18.286Z] Failed in branch Libbeat x-pack
[2020-07-30T07:23:18.296Z] Failed in branch Metricbeat OSS Unit tests
[2020-07-30T07:23:18.304Z] Failed in branch Metricbeat OSS Integration tests
[2020-07-30T07:23:18.313Z] Failed in branch Metricbeat Python integration tests
[2020-07-30T07:23:18.322Z] Failed in branch Metricbeat crosscompile
[2020-07-30T07:23:18.330Z] Failed in branch Metricbeat Mac OS X
[2020-07-30T07:23:18.340Z] Failed in branch Metricbeat x-pack Mac OS X
[2020-07-30T07:23:18.350Z] Failed in branch Metricbeat Windows
[2020-07-30T07:23:18.360Z] Failed in branch Metricbeat x-pack Windows
[2020-07-30T07:23:18.372Z] Failed in branch Winlogbeat Windows x-pack
[2020-07-30T07:23:18.381Z] Failed in branch Kubernetes
[2020-07-30T07:23:18.995Z] Stage "Heartbeat" skipped due to earlier failure(s)
[2020-07-30T07:23:19.006Z] Stage "Libbeat" skipped due to earlier failure(s)
[2020-07-30T07:23:19.016Z] Stage "Metricbeat x-pack" skipped due to earlier failure(s)
[2020-07-30T07:23:19.025Z] Stage "Packetbeat" skipped due to earlier failure(s)
[2020-07-30T07:23:19.034Z] Stage "Winlogbeat" skipped due to earlier failure(s)
[2020-07-30T07:23:19.043Z] Stage "Functionbeat" skipped due to earlier failure(s)
[2020-07-30T07:23:19.052Z] Stage "Generators" skipped due to earlier failure(s)
[2020-07-30T07:23:19.479Z] Failed in branch dockerlogbeat
[2020-07-30T07:23:19.487Z] Failed in branch Journalbeat
[2020-07-30T07:23:19.986Z] Stage "Heartbeat" skipped due to earlier failure(s)
[2020-07-30T07:23:19.995Z] Stage "Libbeat" skipped due to earlier failure(s)
[2020-07-30T07:23:20.006Z] Stage "Packetbeat" skipped due to earlier failure(s)
[2020-07-30T07:23:20.015Z] Stage "Functionbeat" skipped due to earlier failure(s)
[2020-07-30T07:23:20.024Z] Stage "Generators" skipped due to earlier failure(s)
[2020-07-30T07:23:20.139Z] Failed in branch Metricbeat x-pack
[2020-07-30T07:23:20.148Z] Failed in branch Winlogbeat
[2020-07-30T07:23:20.636Z] Failed in branch Heartbeat
[2020-07-30T07:23:20.646Z] Failed in branch Libbeat
[2020-07-30T07:23:20.654Z] Failed in branch Packetbeat
[2020-07-30T07:23:20.662Z] Failed in branch Functionbeat
[2020-07-30T07:23:20.663Z] Stage "Generators" skipped due to earlier failure(s)
[2020-07-30T07:23:20.872Z] Failed in branch Generators
[2020-07-30T07:23:22.596Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20338/src/github.com/elastic/beats
[2020-07-30T07:23:22.935Z] + find . -type f -name TEST*.xml -path */build/* -delete
[2020-07-30T07:23:22.959Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20338/src/github.com/elastic/beats/Lint
[2020-07-30T07:23:23.424Z] + cat
[2020-07-30T07:23:23.424Z] + /usr/local/bin/runbld ./runbld-script
[2020-07-30T07:23:23.424Z] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
[2020-07-30T07:23:30.024Z] runbld>>> runbld started
[2020-07-30T07:23:30.024Z] runbld>>> 1.6.12/f45d832f2ba0aa2722ab4ec1fda8ad140f027f8b
[2020-07-30T07:23:31.940Z] runbld>>> The following profiles matched the job 'Beats/beats/PR-20338' in order of occurrence in the config (last value wins).
[2020-07-30T07:23:33.327Z] runbld>>> Debug logging enabled.
[2020-07-30T07:23:33.327Z] runbld>>> Storing result
[2020-07-30T07:23:33.589Z] runbld>>> Store result: created {:total 2, :successful 2, :failed 0} 1
[2020-07-30T07:23:33.589Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1587637540455/t/20200730072333-5AFD2B60
[2020-07-30T07:23:33.589Z] runbld>>> Adding system facts.
[2020-07-30T07:23:34.979Z] runbld>>> Adding vcs info for the latest commit:  9d8ab5668fcf73427f08f20d64f6448cd915c63e
[2020-07-30T07:23:34.979Z] runbld>>> >>>>>>>>>>>> SCRIPT EXECUTION BEGIN >>>>>>>>>>>>
[2020-07-30T07:23:34.979Z] runbld>>> Adding /usr/lib/jvm/java-8-openjdk-amd64/bin to the path.
[2020-07-30T07:23:34.979Z] + echo 'Processing JUnit reports with runbld...'
[2020-07-30T07:23:34.979Z] Processing JUnit reports with runbld...
[2020-07-30T07:23:35.241Z] runbld>>> <<<<<<<<<<<< SCRIPT EXECUTION END <<<<<<<<<<<<
[2020-07-30T07:23:35.241Z] runbld>>> DURATION: 33ms
[2020-07-30T07:23:35.241Z] runbld>>> STDOUT: 40 bytes
[2020-07-30T07:23:35.241Z] runbld>>> STDERR: 49 bytes
[2020-07-30T07:23:35.241Z] runbld>>> WRAPPED PROCESS: SUCCESS (0)
[2020-07-30T07:23:35.241Z] runbld>>> Searching for build metadata in /var/lib/jenkins/workspace/Beats_beats_PR-20338/src/github.com/elastic/beats
[2020-07-30T07:23:36.186Z] runbld>>> Storing build metadata: 
[2020-07-30T07:23:36.186Z] runbld>>> Adding test report.
[2020-07-30T07:23:36.186Z] runbld>>> Searching for junit test output files with the pattern: TEST-.*\.xml$ in: /var/lib/jenkins/workspace/Beats_beats_PR-20338/src/github.com/elastic/beats
[2020-07-30T07:23:37.130Z] runbld>>> Found 0 test output files
[2020-07-30T07:23:37.130Z] runbld>>> Test output logs contained: Errors: 0 Failures: 0 Tests: 0 Skipped: 0
[2020-07-30T07:23:37.130Z] runbld>>> Storing result
[2020-07-30T07:23:37.392Z] runbld>>> Store result: updated {:total 2, :successful 2, :failed 0} 2
[2020-07-30T07:23:37.392Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1587637540455/t/20200730072333-5AFD2B60
[2020-07-30T07:23:37.392Z] runbld>>> Email notification disabled by environment variable.
[2020-07-30T07:23:37.392Z] runbld>>> Slack notification disabled by environment variable.
[2020-07-30T07:23:42.993Z] Running on Jenkins in /var/lib/jenkins/workspace/Beats_beats_PR-20338
[2020-07-30T07:23:43.163Z] [INFO] getVaultSecret: Getting secrets
[2020-07-30T07:23:43.237Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2020-07-30T07:23:44.418Z] + chmod 755 generate-build-data.sh
[2020-07-30T07:23:44.418Z] + ./generate-build-data.sh https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-20338/ https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-20338/runs/1 ABORTED 613299
[2020-07-30T07:23:44.418Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-20338/runs/1/steps/?limit=10000 -o steps-info.json
[2020-07-30T07:23:44.668Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-20338/runs/1/tests/?status=FAILED -o tests-errors.json
[2020-07-30T07:23:44.919Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-20338/runs/1/log/ -o pipeline-log.txt

@ChrsMark requested a review from jsoriano July 30, 2020 07:34
@ChrsMark merged commit 0255c57 into elastic:7.9 Jul 30, 2020
@zube bot removed the [zube]: Done label Oct 28, 2020
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023