Capture stdout/stderr of spawned components #1702

Merged
merged 55 commits into from
Nov 28, 2022
Changes from 54 commits
Commits
837db32
[v2] Add v2 component specification and validation. (#502)
blakerouse Jun 3, 2022
3b9a723
Add component spec command to validate component specifications. (#510)
blakerouse Jun 3, 2022
7b2ec07
Merge branch 'main' into feature-arch-v2
blakerouse Jun 14, 2022
7bb4acd
[v2] Calculate the expected runtime components from policy (#550)
blakerouse Jun 15, 2022
2679c82
Initial Flat Structure (#544)
michalpristas Jun 21, 2022
7f6b42a
Generate checksum file for components (#604)
michalpristas Jun 24, 2022
8812fc9
V2 Runtime Component Manager (#645)
blakerouse Jul 13, 2022
5acdc40
[v2] Use the v2 components runtime as the core of the Elastic Agent (…
blakerouse Jul 26, 2022
e141de5
[v2] Delete unused code from refactor (#777)
blakerouse Jul 26, 2022
ff667df
[v2] Delete more unused code from v2 transition (#790)
blakerouse Jul 27, 2022
9b68ea4
[v2] Merge July 27th main into v2 feature branch (#789)
blakerouse Jul 28, 2022
2cc2338
[v2] Fix inspect command (#805)
blakerouse Aug 2, 2022
2705093
Expand check-in payload for V2 (#916)
aleksmaus Aug 16, 2022
d56e3f5
[v2] Update protocol to use new UnitExpectedConfig. (#850)
blakerouse Aug 18, 2022
9bba975
Fix action dispatching that was using ActionType instead of InputType…
aleksmaus Aug 24, 2022
43ad01d
Fix bootstrapping a Fleet Server with v2. (#1010)
blakerouse Aug 29, 2022
dee2403
Query just related files on build (#1045)
michalpristas Aug 30, 2022
712b300
Update main to 8.5.0 (#793) (#1050)
mergify[bot] Aug 30, 2022
5f1e54f
Create archive directory if it doesn't exist. (#1058)
cmacknz Aug 31, 2022
6ff50ac
fixed docker build (#1105)
michalpristas Sep 7, 2022
f95c9ed
V2 command work dir (#1061)
blakerouse Sep 7, 2022
5051218
[v2] Move queue management to dispatcher (#1109)
michel-laterman Sep 15, 2022
e91e3ce
Fix [V2]: Elastic Agent Install is broken. (#1331)
aleksmaus Sep 29, 2022
e6a038e
Fix agent shutdown on SIGINT (#1258)
aleksmaus Sep 29, 2022
9d1cea3
[v2] Re-enable diagnostics for Elastic Agent and all components (#1140)
blakerouse Sep 29, 2022
90523cc
Check and create downloads dir before using (#1410)
michalpristas Oct 5, 2022
bfc490a
[v2] Add upgrade action retry (#1219)
michel-laterman Oct 14, 2022
ec83c2c
V1 metrics monitoring for V2 (#1487)
michalpristas Oct 19, 2022
0602091
[v2] Merge main on Oct. 18 (#1557)
blakerouse Oct 20, 2022
9420497
Add input name alias for cloudbeat integrations (#1596)
fearful-symmetry Oct 24, 2022
96e071e
Change the stater to include a local flag. (#1308)
michel-laterman Oct 24, 2022
fc3eba3
Service runtime V2 (#1529)
aleksmaus Oct 25, 2022
bd36958
Sync components with state during container start (#1653)
michalpristas Nov 3, 2022
4b17703
Subprocess reader start.
blakerouse Nov 4, 2022
2a84a70
Implement io.Writer to handle reading stdout/stderr for spawned compo…
blakerouse Nov 7, 2022
3ecddd6
Don't inject logging args to beats components. Always have beats log …
blakerouse Nov 7, 2022
d7e7d16
Update to v0.2.15 of elastic-agent-libs.
blakerouse Nov 7, 2022
73b3d2e
[V2] Enable support for shippers (#1527)
blakerouse Nov 8, 2022
2ebabb2
More work on the logging.
blakerouse Nov 8, 2022
21e5966
Merge branch 'feature-arch-v2' into v2-subprocess-io-writer
blakerouse Nov 8, 2022
5fda6fe
More fixes.
blakerouse Nov 8, 2022
692aaa6
Change back to streams.
blakerouse Nov 8, 2022
c877db0
Fix go.mod.
blakerouse Nov 8, 2022
99b5427
Fix import.
blakerouse Nov 8, 2022
81b01ee
Merge branch 'main' into v2-subprocess-io-writer
blakerouse Nov 9, 2022
4903c1b
Fix issues with merge of main.
blakerouse Nov 9, 2022
4d7e0fb
remove log helper.
blakerouse Nov 9, 2022
b6e2216
Add NewWithoutConfig.
blakerouse Nov 9, 2022
06e4a39
Merge branch 'main' into v2-subprocess-io-writer
blakerouse Nov 22, 2022
925c5b6
Fix the spawned filestream to ingest logs into elasticsearch for moni…
blakerouse Nov 23, 2022
b3f0751
Merge branch 'main' into v2-subprocess-io-writer
blakerouse Nov 23, 2022
c3f45d4
Add changelog entry.
blakerouse Nov 23, 2022
639144c
Remove debug print.
blakerouse Nov 28, 2022
79292d5
Merge branch 'main' into v2-subprocess-io-writer
blakerouse Nov 28, 2022
6d3db5c
Update 1669236059-Capture-stdout-stderr-of-all-spawned-components-to-…
blakerouse Nov 28, 2022
1 change: 0 additions & 1 deletion .gitignore
@@ -45,7 +45,6 @@ fleet.enc.lock
# Files generated with the bump version automations
*.bck


# agent
build/
elastic-agent
4 changes: 2 additions & 2 deletions NOTICE.txt
@@ -1273,11 +1273,11 @@ SOFTWARE

--------------------------------------------------------------------------------
Dependency : github.com/elastic/elastic-agent-libs
-Version: v0.2.6
+Version: v0.2.15
Licence type (autodetected): Apache-2.0
--------------------------------------------------------------------------------

-Contents of probable licence file $GOMODCACHE/github.com/elastic/elastic-agent-libs@v0.2.6/LICENSE:
+Contents of probable licence file $GOMODCACHE/github.com/elastic/elastic-agent-libs@v0.2.15/LICENSE:

Apache License
Version 2.0, January 2004
@@ -0,0 +1,31 @@
# Kind can be one of:
# - breaking-change: a change to previously-documented behavior
# - deprecation: functionality that is being removed in a later release
# - bug-fix: fixes a problem in a previous version
# - enhancement: extends functionality but does not break or fix existing behavior
# - feature: new functionality
# - known-issue: problems that we are aware of in a given version
# - security: impacts on the security of a product or a user’s deployment.
# - upgrade: important information for someone upgrading from a prior version
# - other: does not fit into any of the other categories
kind: feature

# Change summary; a 80ish characters long description of the change.
summary: Capture stdout/stderr of all spawned components to simplify logging
Member

I think we should mention that the default log level for Beat sub-processes is no longer debug.

Contributor Author

That is not going to fit in the 80ish character limit. Should I add that as another changelog entry? Or just make it really long?

Contributor Author

Just got it to fit in the same one.

Member

You can use the description field if you need it to be longer than 80 characters.


# Long description; in case the summary is not enough to describe the change
# this field accommodate a description without length limits.
#description:

# Affected component; a word indicating the component this changeset affects.
component:

# PR number; optional; the PR number that added the changeset.
# If not present is automatically filled by the tooling finding the PR where this changelog fragment has been added.
# NOTE: the tooling supports backports, so it's able to fill the original PR number instead of the backport PR number.
# Please provide it if you are adding a fragment for a different PR.
pr: 1702

# Issue number; optional; the GitHub issue related to this changeset (either closes or is part of).
# If not present is automatically filled by the tooling with the issue linked to the PR number.
issue: 221
2 changes: 1 addition & 1 deletion go.mod
@@ -14,7 +14,7 @@ require (
github.com/elastic/e2e-testing v1.99.2-0.20220117192005-d3365c99b9c4
github.com/elastic/elastic-agent-autodiscover v0.2.1
github.com/elastic/elastic-agent-client/v7 v7.0.0-20220804181728-b0328d2fe484
-github.com/elastic/elastic-agent-libs v0.2.6
+github.com/elastic/elastic-agent-libs v0.2.15
github.com/elastic/elastic-agent-system-metrics v0.4.4
github.com/elastic/go-licenser v0.4.0
github.com/elastic/go-sysinfo v1.8.1
4 changes: 2 additions & 2 deletions go.sum
@@ -387,8 +387,8 @@ github.com/elastic/elastic-agent-autodiscover v0.2.1/go.mod h1:gPnzzfdYNdgznAb+i
github.com/elastic/elastic-agent-client/v7 v7.0.0-20220804181728-b0328d2fe484 h1:uJIMfLgCenJvxsVmEjBjYGxt0JddCgw2IxgoNfcIXOk=
github.com/elastic/elastic-agent-client/v7 v7.0.0-20220804181728-b0328d2fe484/go.mod h1:fkvyUfFwyAG5OnMF0h+FV9sC0Xn9YLITwQpSuwungQs=
github.com/elastic/elastic-agent-libs v0.2.5/go.mod h1:chO3rtcLyGlKi9S0iGVZhYCzDfdDsAQYBc+ui588AFE=
-github.com/elastic/elastic-agent-libs v0.2.6 h1:DpcUcCVYZ7lNtHLUlyT1u/GtGAh49wpL15DTH7+8O5o=
-github.com/elastic/elastic-agent-libs v0.2.6/go.mod h1:chO3rtcLyGlKi9S0iGVZhYCzDfdDsAQYBc+ui588AFE=
+github.com/elastic/elastic-agent-libs v0.2.15 h1:hdAbrZZ2mCPcQLRCE3E8xw3mHKl8HFMt36w7jan/XGo=
+github.com/elastic/elastic-agent-libs v0.2.15/go.mod h1:0J9lzJh+BjttIiVjYDLncKYCEWUUHiiqnuI64y6C6ss=
github.com/elastic/elastic-agent-system-metrics v0.4.4 h1:Br3S+TlBhijrLysOvbHscFhgQ00X/trDT5VEnOau0E0=
github.com/elastic/elastic-agent-system-metrics v0.4.4/go.mod h1:tF/f9Off38nfzTZHIVQ++FkXrDm9keFhFpJ+3pQ00iI=
github.com/elastic/elastic-package v0.32.1/go.mod h1:l1fEnF52XRBL6a5h6uAemtdViz2bjtjUtgdQcuRhEAY=
178 changes: 57 additions & 121 deletions internal/pkg/agent/application/monitoring/v1_monitor.go
@@ -58,7 +58,7 @@ var (
supportedBeatsComponents = []string{"filebeat", "metricbeat", "apm-server", "fleet-server", "auditbeat", "cloudbeat", "heartbeat", "osquerybeat", "packetbeat"}
)

-// Beats monitor is providing V1 monitoring support.
+// BeatsMonitor is providing V1 monitoring support for metrics and logs for endpoint-security only.
type BeatsMonitor struct {
enabled bool // feature flag disabling whole v1 monitoring story
config *monitoringConfig
@@ -178,21 +178,10 @@ func (b *BeatsMonitor) EnrichArgs(unit, binary string, args []string) []string {
}
}

loggingPath := loggingPath(unit, b.operatingSystem)
if loggingPath != "" {
if !b.config.C.LogMetrics {
appendix = append(appendix,
"-E", "logging.files.path="+filepath.Dir(loggingPath),
"-E", "logging.files.name="+filepath.Base(loggingPath),
"-E", "logging.files.keepfiles=7",
"-E", "logging.files.permission=0640",
"-E", "logging.files.interval=1h",
"-E", "logging.metrics.enabled=false",
)

if !b.config.C.LogMetrics {
appendix = append(appendix,
"-E", "logging.metrics.enabled=false",
)
}
}

return append(args, appendix...)
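
The hunk above interleaves removed and added lines, so the resulting behavior is easy to misread. Going by the hunk header (10 lines remain) and the commit titled "Don't inject logging args to beats components", the only logging argument still appended appears to be `logging.metrics.enabled=false`, and only when metrics logging is off. A simplified sketch of that tail end, where `enrichArgs` and the sample arguments are hypothetical stand-ins and everything the real `EnrichArgs` does earlier is omitted:

```go
package main

import "fmt"

// enrichArgs is a hypothetical, simplified stand-in for the end of
// EnrichArgs after this change. The file-logging flags (logging.files.*)
// are no longer injected; only the metrics toggle remains.
func enrichArgs(args []string, logMetrics bool) []string {
	var appendix []string
	if !logMetrics {
		appendix = append(appendix,
			"-E", "logging.metrics.enabled=false",
		)
	}
	return append(args, appendix...)
}

func main() {
	base := []string{"-c", "example.yml"} // placeholder arguments, not from the PR

	fmt.Println(enrichArgs(base, false)) // metrics logging off: flag appended
	fmt.Println(enrichArgs(base, true))  // metrics logging on: args unchanged
}
```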
@@ -291,24 +280,21 @@ func (b *BeatsMonitor) injectMonitoringOutput(source, dest map[string]interface{

func (b *BeatsMonitor) injectLogsInput(cfg map[string]interface{}, componentIDToBinary map[string]string, monitoringOutput string) error {
monitoringNamespace := b.monitoringNamespace()
//fixedAgentName := strings.ReplaceAll(agentName, "-", "_")
logsDrop := filepath.Dir(loggingPath("unit", b.operatingSystem))

streams := []interface{}{
map[string]interface{}{
idKey: "filestream-monitoring-agent",
// "data_stream" is not used when creating an Input on Filebeat
"data_stream": map[string]interface{}{
"type": "filestream",
"dataset": "elastic_agent",
"namespace": monitoringNamespace,
},
idKey: "filestream-monitoring-agent",
"type": "filestream",
"paths": []interface{}{
filepath.Join(logsDrop, agentName+"-*.ndjson"),
filepath.Join(logsDrop, agentName+"-watcher-*.ndjson"),
},
"index": fmt.Sprintf("logs-elastic_agent-%s", monitoringNamespace),
"data_stream": map[string]interface{}{
"type": "logs",
"dataset": "elastic_agent",
"namespace": monitoringNamespace,
},
"close": map[string]interface{}{
"on_state_change": map[string]interface{}{
"inactive": "5m",
@@ -325,133 +311,86 @@ func (b *BeatsMonitor) injectLogsInput(cfg map[string]interface{}, componentIDTo
},
},
"processors": []interface{}{
// copy original dataset so we can drop the dataset field
map[string]interface{}{
"add_fields": map[string]interface{}{
"target": "data_stream",
"fields": map[string]interface{}{
"type": "logs",
"dataset": "elastic_agent",
"namespace": monitoringNamespace,
},
},
},
map[string]interface{}{
"add_fields": map[string]interface{}{
"target": "event",
"fields": map[string]interface{}{
"dataset": "elastic_agent",
},
},
},
map[string]interface{}{
"add_fields": map[string]interface{}{
"target": "elastic_agent",
"fields": map[string]interface{}{
"id": b.agentInfo.AgentID(),
"version": b.agentInfo.Version(),
"snapshot": b.agentInfo.Snapshot(),
},
},
},
map[string]interface{}{
"add_fields": map[string]interface{}{
"target": "agent",
"fields": map[string]interface{}{
"id": b.agentInfo.AgentID(),
"copy_fields": map[string]interface{}{
"fields": []interface{}{
map[string]interface{}{
"from": "data_stream.dataset",
"to": "data_stream.dataset_original",
},
},
},
},
// drop the dataset field so following copy_field can copy to it
map[string]interface{}{
"drop_fields": map[string]interface{}{
"fields": []interface{}{
"ecs.version", //coming from logger, already added by libbeat
"data_stream.dataset",
},
"ignore_missing": true,
},
}},
},
}
for unit, binaryName := range componentIDToBinary {
Contributor

I'm just wondering: have you checked overall CPU consumption with the move towards stdout/stderr monitoring?
It seems like capturing stdout/stderr should be a bit better, right?

Contributor Author

I don't know about the overall load change; I would say it's a net wash, given that the components no longer need to worry about rotation, syncing files, or deleting old logs.

I think components no longer having to worry about logging, rotation, etc. is a win!

if !isSupportedBinary(binaryName) {
continue
}

fixedBinaryName := strings.ReplaceAll(binaryName, "-", "_")
name := strings.ReplaceAll(unit, "-", "_") // conform with index naming policy
logFile := loggingPath(unit, b.operatingSystem)
streams = append(streams, map[string]interface{}{
idKey: "filestream-monitoring-" + name,
"data_stream": map[string]interface{}{
// "data_stream" is not used when creating an Input on Filebeat
"type": "filestream",
"dataset": fmt.Sprintf("elastic_agent.%s", fixedBinaryName),
"namespace": monitoringNamespace,
},
"type": "filestream",
"index": fmt.Sprintf("logs-elastic_agent.%s-%s", fixedBinaryName, monitoringNamespace),
"paths": []interface{}{logFile, logFile + "*"},
"close": map[string]interface{}{
"on_state_change": map[string]interface{}{
"inactive": "5m",
},
},
"parsers": []interface{}{
map[string]interface{}{
"ndjson": map[string]interface{}{
"message_key": "message",
"overwrite_keys": true,
"add_error_key": true,
"target": "",
},
},
},
"processors": []interface{}{
// copy component.dataset as the real dataset
map[string]interface{}{
"add_fields": map[string]interface{}{
"target": "data_stream",
"fields": map[string]interface{}{
"type": "logs",
"dataset": fmt.Sprintf("elastic_agent.%s", fixedBinaryName),
"namespace": monitoringNamespace,
"copy_fields": map[string]interface{}{
"fields": []interface{}{
map[string]interface{}{
"from": "component.dataset",
"to": "data_stream.dataset",
},
},
"fail_on_error": false,
"ignore_missing": true,
},
},
// possible it's a log message from agent itself (doesn't have component.dataset)
map[string]interface{}{
"add_fields": map[string]interface{}{
"target": "event",
"fields": map[string]interface{}{
"dataset": fmt.Sprintf("elastic_agent.%s", fixedBinaryName),
"copy_fields": map[string]interface{}{
"fields": []interface{}{
map[string]interface{}{
"from": "data_stream.dataset_original",
"to": "data_stream.dataset",
},
},
"fail_on_error": false,
},
},
// drop the original dataset copied and the event.dataset (as it will be updated)
map[string]interface{}{
"add_fields": map[string]interface{}{
"target": "elastic_agent",
"fields": map[string]interface{}{
"id": b.agentInfo.AgentID(),
"version": b.agentInfo.Version(),
"snapshot": b.agentInfo.Snapshot(),
"drop_fields": map[string]interface{}{
"fields": []interface{}{
"data_stream.dataset_original",
"event.dataset",
},
},
},
// update event.dataset with the now used data_stream.dataset
map[string]interface{}{
"add_fields": map[string]interface{}{
"target": "agent",
"fields": map[string]interface{}{
"id": b.agentInfo.AgentID(),
"copy_fields": map[string]interface{}{
"fields": []interface{}{
map[string]interface{}{
"from": "data_stream.dataset",
"to": "event.dataset",
},
},
},
},
// coming from logger, added by agent (drop)
map[string]interface{}{
"drop_fields": map[string]interface{}{
"fields": []interface{}{
"ecs.version", //coming from logger, already added by libbeat
"ecs.version",
},
"ignore_missing": true,
},
},
},
})
// adjust destination data_stream based on the data_stream fields
map[string]interface{}{
"add_formatted_index": map[string]interface{}{
Member

Are the destination datastreams all the same as they were before this change? Are there any new datastreams created here?

I haven't tested this locally yet, I just want to do an early check that we haven't introduced anything that could be considered a breaking change.

Contributor Author

The resulting datastreams are different in that they are now grouped by spawned component. This is done on purpose and is a design decision of V2. This allows logs from each component, which now has its own status, to be grouped properly, so you can correlate the status of a component with the exact logs of that component.

Member

👍

Member

This also needs to be in the changelog. I agree it makes more sense and better fits how datastreams are supposed to be used under agent.

"index": "%{[data_stream.type]}-%{[data_stream.dataset]}-%{[data_stream.namespace]}",
Contributor

nice approach

},
}},
},
}

inputs := []interface{}{
@@ -460,10 +399,7 @@ func (b *BeatsMonitor) injectLogsInput(cfg map[string]interface{}, componentIDTo
"name": "filestream-monitoring-agent",
"type": "filestream",
useOutputKey: monitoringOutput,
"data_stream": map[string]interface{}{
"namespace": monitoringNamespace,
},
"streams": streams,
"streams": streams,
},
}
inputsNode, found := cfg[inputsKey]
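
The processor chain added in `injectLogsInput` decides, per log event, which data stream it lands in. As the review thread notes, this groups logs by spawned component. The sketch below simulates that ordering on a flat map to show the intended outcome. It is a model, not Filebeat code: `copyField` assumes `copy_fields` leaves the event untouched when the source is missing or the target already exists (with `fail_on_error: false`), and the `component.dataset` value is an illustrative assumption.

```go
package main

import "fmt"

// event models a log event as a flat map keyed by dotted field names.
type event map[string]string

// copyField copies from -> to only when from exists and to is not yet set.
// This is an assumed model of copy_fields with fail_on_error=false.
func copyField(e event, from, to string) {
	v, ok := e[from]
	if _, exists := e[to]; ok && !exists {
		e[to] = v
	}
}

// dropFields removes the named fields; missing fields are ignored.
func dropFields(e event, names ...string) {
	for _, n := range names {
		delete(e, n)
	}
}

// resolveIndex applies the processors in the order shown in the diff and
// returns the index produced by the add_formatted_index pattern.
func resolveIndex(e event) string {
	copyField(e, "data_stream.dataset", "data_stream.dataset_original") // remember stream default
	dropFields(e, "data_stream.dataset")                                // free the target field
	copyField(e, "component.dataset", "data_stream.dataset")            // prefer component dataset
	copyField(e, "data_stream.dataset_original", "data_stream.dataset") // fall back for agent logs
	dropFields(e, "data_stream.dataset_original", "event.dataset")
	copyField(e, "data_stream.dataset", "event.dataset") // keep event.dataset in sync
	dropFields(e, "ecs.version")
	return fmt.Sprintf("%s-%s-%s",
		e["data_stream.type"], e["data_stream.dataset"], e["data_stream.namespace"])
}

func main() {
	agentLog := event{
		"data_stream.type":      "logs",
		"data_stream.dataset":   "elastic_agent",
		"data_stream.namespace": "default",
	}
	componentLog := event{
		"data_stream.type":      "logs",
		"data_stream.dataset":   "elastic_agent",
		"data_stream.namespace": "default",
		"component.dataset":     "elastic_agent.filebeat", // assumed example value
	}

	fmt.Println(resolveIndex(agentLog))     // logs-elastic_agent-default
	fmt.Println(resolveIndex(componentLog)) // logs-elastic_agent.filebeat-default
}
```

Under these assumptions, a single filestream input can serve both the agent's own logs and every component's logs, with the destination chosen per event rather than per stream.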