Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Packetbeat] Refactor packetbeat for use with Elastic Agent #22134

Merged
merged 9 commits into from
Nov 3, 2020

Conversation

andrewstucki
Copy link

@andrewstucki andrewstucki commented Oct 23, 2020

What does this PR do?

This PR refactors packetbeat to support for agent-based configuration. In order to facilitate control by agent, it does a few things:

  1. Gets rid of any global variables that are initialized based on the configuration state
  2. Adds handling that overrides any configuration if packetbeat.inputs are provided.
  3. Makes the index that packetbeat pushes events to overrideable on each input (flows or protocols)
  4. Uses centralized configuration management reloading to restart the sniffer and all worker goroutines when configuration changes

Follow ups

We'll likely want to do a few things in the future, I can create issues if we want:

  1. Right now, due to the way a single sniffer gets initialized and then delegates all of the protocol parsing to internally registered protocol handlers, each configuration reload, if even a single input changes, the sniffer needs to be stopped and started--so effectively all inputs are tied together. Not sure if it would make sense to move to some sort of worker routine per protocol or not, but it would get rid of some of the custom reload logic I had to add.
  2. Once again, due to the single sniffer, the sniffing interface can really only be configured once. Right now I just choose the "default" interface based off of the OS, but ideally this would be customizable. That said, what it might mean is that we need to make all packetbeat integrations a single package or change packetbeat so we run a sniffer per input.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

@elasticmachine
Copy link
Collaborator

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label Team:Ingest Management labels Oct 23, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ingest-management (Team:Ingest Management)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Oct 23, 2020
@elasticmachine
Copy link
Collaborator

elasticmachine commented Oct 23, 2020

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Build Cause: [Pull request #22134 updated]

  • Start Time: 2020-10-29T15:29:18.030+0000

  • Duration: 76 min 42 sec

Test stats 🧪

Test Results
Failed 0
Passed 16382
Skipped 1344
Total 17726

@andrewstucki andrewstucki requested a review from a team October 23, 2020 22:18
Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We must decide, if packetbeat should be packaged by default or not. @ph @mostlyjason

packetbeat/beater/packetbeat.go Show resolved Hide resolved
@andrewstucki andrewstucki changed the title [Packetbeat] Add packetbeat support to Elastic Agent [Packetbeat] Refactor packetbeat for use with Elastic Agent Oct 26, 2020
@ruflin
Copy link
Member

ruflin commented Oct 26, 2020

Would it be possible to support inputs on the top level instead of under the packetbeat prefix? Like this Agent could just forward.

In general, I think this PR is update to the team to get reviewed and merged as it does not tackle Agent yet, thanks for taking it out.

@andrewstucki
Copy link
Author

@ruflin quick question about that--right now the agent configuration files do a lot more than just moving some keys around. That includes:

  1. injecting agent metadata
  2. adding stream dataset info into events
  3. flattening streams input hashes into each individual input itself
  4. and injecting per-index configuration for each input

having the inputs be at the top level doesn't mitigate any of the above, so agent still wouldn't be able to forward configuration from Kibana unmodified. We'd either still need to rely heavily on the current spec transpilation logic or add some processors into each integration package and update the configuration deserialization code in each beat to support deserializing (and using) some shared notion of a Streams configuration structure.

I stripped out any additions to the transpiler that would have been required to massage this into standard packetbeat configuration, but I would think that any of the above would likely be beyond the scope of getting packetbeat functioning with agent.

So, just want to clarify--just want me to un-nest inputs and keep the additional configuration we'd need in the spec file in place, correct?

@mostlyjason
Copy link

mostlyjason commented Oct 26, 2020

We must decide, if packetbeat should be packaged by default or not. @ph @mostlyjason

Just checked our telemetry and packetbeat is used by a small percentage of clusters. Would save a lot of bandwidth if it was downloaded on demand. This requires an internet connection or proxy though.

CHANGELOG.next.asciidoc Outdated Show resolved Hide resolved
packetbeat/beater/processor.go Outdated Show resolved Hide resolved
packetbeat/beater/processor.go Outdated Show resolved Hide resolved
packetbeat/beater/reloader.go Outdated Show resolved Hide resolved
packetbeat/config/agent.go Outdated Show resolved Hide resolved
packetbeat/config/agent.go Outdated Show resolved Hide resolved
packetbeat/config/agent.go Outdated Show resolved Hide resolved
@ruflin
Copy link
Member

ruflin commented Oct 28, 2020

For the config and inputs: My favorite solution would be that Elastic Agent just forwards the policy to packetbeat and packetbeat adds all the processors and data needed. The processors are to add important fields but packetbeat actually exactly knows which fields have to be added without Agent sending it. It would be nice if you could check on your end, how big the effort would be to directly go this step. This is what we do with apm-server and plan to also do for filebeat / metricbeat. If we don't go this step directly, we need to migrate to it at one stage.

Perhaps you can provide some examples on what config is put into the agent and what the config looks like when sent to packetbeat as yaml here. This could simplify the discussion and making sure we talk about exactly the same thing.

@andrewstucki
Copy link
Author

andrewstucki commented Oct 29, 2020

Ok, so I wound up going and swapping the configuration to what I mentioned here #22227 (comment)

The changes involve some additional normalization code to inject processors into each stream. On the plus side, the reloading code becomes much simpler since it just wraps cfgfile.RunnerList. For the configuration, I currently enforce a single input up to 100 inputs of type packet to be passed in (corresponding to 100 sniffers/integrations). If packetbeat receives any additional inputs, it fails re-configuration. This is to punt for the time being on:

1. figuring out the nitty-gritty details of running multiple packet sniffers on a system and
2. keep the current goroutine synchronization functional (the error channels will need some work if we move to multiple sniffers)

A demo package I've been using to test all of this stuff is here:

andrewstucki/integrations@28455c5

@ruflin this approach pretty much completely removes all transpiler rules in the packetbeat spec apart from filtering and injecting agent version information (see updated #22145). We'll still need to figure out how to inject additional configuration at the input level in the configuration if we want to use that to control the sniffer behavior, but I figure that can wait. Let me know what you think.

UPDATE:

Added support for up to 100 sniffers/integrations, appears to work like a charm with minimal configuration. The thought would be then that we could use the above format for integrating packetbeat with whatever package we want. For example, MySQL could have a packet input with a stream type set for mysql while another package could add icmp or flow support--each instance of a package (integration) would then create its own sniffer and just specify the streams (protocols or flow) it wanted to capture.

@elasticmachine
Copy link
Collaborator

elasticmachine commented Oct 29, 2020

💚 Flaky test report

Tests succeeded.

Expand to view the summary

Test stats 🧪

Test Results
Failed 0
Passed 16382
Skipped 1344
Total 17726

@ruflin
Copy link
Member

ruflin commented Oct 30, 2020

@andrewstucki Read your comment above last after going through all the other comments 🤦‍♂️ We should probably have all the config discussions in a single place :-) Great to hear you solved the sniffer limitation issue.

How do we get all the conversations into a single place and can drive it there to a conclusion?

@andrewstucki
Copy link
Author

@ruflin I'm fine with doing the rest of the config discussion here and using the added unit test file packetbeat/config/agent_test.go to finish driving the discussion (no matter where we discuss, any spec file changes result in changes to the normalization code here, so talking about it here keeps us from having to context-switch). That test pretty shows the agent spec file configuration format of a single input (after it's been injected with agent metadata).

The issue I created and linked to (#22227) is still valid though and can be a follow up to the initial implementation, as there's still work to be done to figure out how to configure shared, sniffer-specific options.

Copy link
Contributor

@michalpristas michalpristas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems to work as expected together with #22145

@andrewstucki andrewstucki requested a review from a team November 2, 2020 20:03
Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

return nil, err
}

watcher := procs.ProcessesWatcher{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the process watcher be something that benefits from being shared across each processor?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the process watcher has traditionally been a global. I more or less kept it that way--making it a singleton for the entirety of the agent-level "input" (meaning 1 sniffer + 1 process watcher). I would imagine eventually we could potentially scope this based on a protocol-by-protocol basis (moving from "input" level, to "stream" level), but we'd probably need to see if that had a performance impact.

In general I tried to keep the traditionally global configuration options scoped to a single input run (1 sniffer, etc.) to make things a bit more flexible for the future and avoid introducing three different levels of configuration (global, input-level, stream-level) when configured via agent. Instead we just have input-level and stream-level, so nothing is shared across multiple inputs.

Hope that answers the question/makes sense.

@andrewstucki andrewstucki merged commit 8c05a41 into elastic:master Nov 3, 2020
@andrewstucki andrewstucki deleted the packetbeat-config-rebase branch November 3, 2020 18:07
andrewstucki pushed a commit to andrewstucki/beats that referenced this pull request Nov 11, 2020
…22134)

* Refactor packetbeat to support agent-based configuration

* Add documentation changes and a Changelog entry

* Update reference template

* Fix funny merge

* Incorporate feedback

* use streams instead of inputs

* support multiple sniffers

* fix shutdown_timeout behavior

(cherry picked from commit 8c05a41)
andrewstucki pushed a commit that referenced this pull request Nov 11, 2020
…ith Elastic Agent (#22546)

* [Packetbeat] Refactor packetbeat for use with Elastic Agent (#22134)

* Refactor packetbeat to support agent-based configuration

* Add documentation changes and a Changelog entry

* Update reference template

* Fix funny merge

* Incorporate feedback

* use streams instead of inputs

* support multiple sniffers

* fix shutdown_timeout behavior

(cherry picked from commit 8c05a41)

* Fix up changelog
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants