Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Filebeat] Prevent mutable script params reference from leaking into pipeline #23534

Merged

Conversation

andrewkroh
Copy link
Member

What does this PR do?

A reference to a list contained in the script processor's params was
leaking into the rest of the pipeline and then being modified by later
processors. Now a copy of the List is written into the event.

For concurrent processing this resulted in ConcurrentModificationExceptions,
but I assume it led to incorrect event.type values in single worker deployments
because the params object was modified.

Why is it important?

This fixes processing issues when multiple workers or Filebeats are used.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

A reference to a list contained in the `script` processor's `params` was
leaking into the rest of the pipeline and then being modified by later
processors. Now a copy of the List is written into the event.

For concurrent processing this resulted in `ConcurrentModificationException`s,
but I assume it led to incorrect `event.type` values in single worker deployments
because the `params` object was modified.
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jan 15, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jan 15, 2021
@andrewkroh andrewkroh added the needs_backport PR is waiting to be backported to other branches. label Jan 15, 2021
@andrewstucki
Copy link

Just wondering, did we try this out with multiple workers and a large-ish log input to make sure no concurrent modification exceptions were thrown?

@andrewkroh
Copy link
Member Author

I ran a test using 16 workers and saw no error.message in events and no exceptions in the ES log.

filebeat.modules:
- module: suricata
  eve:
    enabled: true
    var.paths: [/Users/akroh/Downloads/eve.json-2020120917]

filebeat.overwrite_pipelines: true

output.elasticsearch:
  hosts: ["localhost:9200"]
  bulk_max_size: 1024
  worker: 16
  compression_level: 6
  username: elastic
  password: changeme

@elasticmachine
Copy link
Collaborator

elasticmachine commented Jan 15, 2021

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Build Cause: Pull request #23534 updated

    • Start Time: 2021-01-15T16:28:48.712+0000
  • Duration: 44 min 43 sec

  • Commit: cb5e1e8

Test stats 🧪

Test Results
Failed 0
Passed 5129
Skipped 574
Total 5703

💚 Flaky test report

Tests succeeded.

Expand to view the summary

Test stats 🧪

Test Results
Failed 0
Passed 5129
Skipped 574
Total 5703

@octavioranieri-zz
Copy link

FYI, below the tests ingesting a 4.7GB eve.json file with 16 workers, 0 failures for ~10kk events:

image

@andrewkroh andrewkroh merged commit c5e2cb3 into elastic:master Jan 15, 2021
andrewkroh added a commit to andrewkroh/beats that referenced this pull request Jan 15, 2021
…lastic#23534)

A reference to a list contained in the `script` processor's `params` was
leaking into the rest of the pipeline and then being modified by later
processors. Now a copy of the List is written into the event.

For concurrent processing this resulted in `ConcurrentModificationException`s,
but I assume it led to incorrect `event.type` values in single worker deployments
because the `params` object was modified.

(cherry picked from commit c5e2cb3)
@andrewkroh andrewkroh added v7.12.0 and removed needs_backport PR is waiting to be backported to other branches. labels Jan 15, 2021
andrewkroh added a commit that referenced this pull request Jan 25, 2021
…eference from leaking into pipeline (#23535)

A reference to a list contained in the `script` processor's `params` was
leaking into the rest of the pipeline and then being modified by later
processors. Now a copy of the List is written into the event.

For concurrent processing this resulted in `ConcurrentModificationException`s,
but I assume it led to incorrect `event.type` values in single worker deployments
because the `params` object was modified.

(cherry picked from commit c5e2cb3)
andrewkroh added a commit to andrewkroh/beats that referenced this pull request Jan 25, 2021
…lastic#23534)

A reference to a list contained in the `script` processor's `params` was
leaking into the rest of the pipeline and then being modified by later
processors. Now a copy of the List is written into the event.

For concurrent processing this resulted in `ConcurrentModificationException`s,
but I assume it led to incorrect `event.type` values in single worker deployments
because the `params` object was modified.

(cherry picked from commit c5e2cb3)
andrewkroh added a commit that referenced this pull request Jan 25, 2021
…23534) (#23654)

A reference to a list contained in the `script` processor's `params` was
leaking into the rest of the pipeline and then being modified by later
processors. Now a copy of the List is written into the event.

For concurrent processing this resulted in `ConcurrentModificationException`s,
but I assume it led to incorrect `event.type` values in single worker deployments
because the `params` object was modified.

(cherry picked from commit c5e2cb3)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants