-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add dynamic instrumentation system-probe module #28639
Conversation
Go Package Import DifferencesBaseline: bb0760f
|
Regression DetectorRegression Detector ResultsRun ID: f2c9f802-3461-4a47-85f6-920bba96c017 Metrics dashboard Target profiles Baseline: bb0760f Performance changes are noted in the perf column of each table:
No significant changes in experiment optimization goalsConfidence level: 90.00% There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.
|
perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
---|---|---|---|---|---|---|
➖ | uds_dogstatsd_to_api_cpu | % cpu utilization | +2.26 | [+1.47, +3.05] | 1 | Logs |
➖ | pycheck_lots_of_tags | % cpu utilization | +1.22 | [-1.21, +3.64] | 1 | Logs |
➖ | tcp_dd_logs_filter_exclude | ingress throughput | +0.00 | [-0.01, +0.01] | 1 | Logs |
➖ | uds_dogstatsd_to_api | ingress throughput | -0.00 | [-0.00, +0.00] | 1 | Logs |
➖ | idle | memory utilization | -0.12 | [-0.17, -0.07] | 1 | Logs |
➖ | file_tree | memory utilization | -0.46 | [-0.51, -0.41] | 1 | Logs |
➖ | basic_py_check | % cpu utilization | -0.62 | [-3.30, +2.05] | 1 | Logs |
➖ | otel_to_otel_logs | ingress throughput | -0.73 | [-1.54, +0.08] | 1 | Logs |
➖ | tcp_syslog_to_blackhole | ingress throughput | -0.93 | [-13.82, +11.96] | 1 | Logs |
Bounds Checks
perf | experiment | bounds_check_name | replicates_passed |
---|---|---|---|
✅ | idle | memory_usage | 10/10 |
Explanation
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:
-
Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
-
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
-
Its configuration does not mark it "erratic".
Test changes on VMUse this command from test-infra-definitions to manually test this PR changes on a VM: inv create-vm --pipeline-id=43775673 --os-family=ubuntu Note: This applies to commit 3250fb9 |
6d5766e
to
247ef95
Compare
events, err := ebpf.NewMap(&ebpf.MapSpec{ | ||
Name: "events", | ||
Type: ebpf.RingBuf, | ||
MaxEntries: 1 << 24, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably want to choose a better default size since that memory is pre-allocated. It is also best to make this configurable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that this was going to be one of the knobs we change once we have actual benchmarks (Alex is working on this) since we don't want to optimize too early. But still, how would you suggest we find a better default size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We provide ring buffer utilization metrics via datadog.system_probe.ebpf.perf.usage_pct
. This can be used to estimate an appropriate size given some workload.
08d3f20
to
2776133
Compare
Signed-off-by: grant.seltzerrichman <grant.seltzerrichman@datadoghq.com>
Signed-off-by: grant.seltzerrichman <grant.seltzerrichman@datadoghq.com>
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
Signed-off-by: vagrant <vagrant@dev-new-ubuntu-22>
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
4dff90a
to
7f711ad
Compare
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, some minor comments but nothing that definitely needs to be addressed in this PR.
for i := range configEvent { | ||
err = AnalyzeBinary(configEvent[i]) | ||
if err != nil { | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you could wrap the error with more information about the binary that failed,
return err | |
return fmt.Errorf("inspection of PID %d (path=%s) failed: %w", configEvent[i].PID, configEvent[i].BinaryPath, err) |
} | ||
} | ||
|
||
func (cm *FileWatchingConfigManager) updateProcessInfo(procs ditypes.DIProcs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure but I think that the process monitor can be calling this function concurrently, and I think functions such as cm.configTracker.UpdateProcesses
aren't thread safe. Maybe it would be worth it to add a lock, although even without it this will probably work ok (there are other places where we use the process monitor without locks).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any issue with adding a lock, good call 👍
return loadFunctionDefinitions(dwarfData, targetFunctions) | ||
} | ||
|
||
var dwarfMap = make(map[string]*dwarf.Data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't see any cleanup for this map, probably not necessary for a first implementation but I think it should be addressed later to avoid excessive memory usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point, though i'm not entirely sure at what frequency we would do the cleanup, perhaps putting a limit of amount of binaries. I'd like to hold off until we have proper benchmarking in place.
SEC("uprobe/{{.GetBPFFuncName}}") | ||
int {{.GetBPFFuncName}}(struct pt_regs *ctx) | ||
{ | ||
bpf_printk("{{.GetBPFFuncName}} probe in {{.ServiceName}} has triggered"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Use log_debug macros here to avoid the overhead of bpf_printk.
event = bpf_ringbuf_reserve(&events, sizeof(struct event), 0); | ||
if (!event) { | ||
bpf_printk("No space available on ringbuffer, dropping event"); | ||
return 0; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upcoming PRs can replace this block with a helper telemetry macros. These macros record helper error telemetry per program per error, and expose these as metrics.
We might have to make some changes for this infrastructure to work when the number of probes are not known statically.
__u32 key = 0; | ||
zero_string = bpf_map_lookup_elem(&zeroval, &key); | ||
if (!zero_string) { | ||
bpf_printk("couldn't lookup zero value in zeroval array map, dropping event for {{.GetBPFFuncName}}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this information is important then I suggest emitting a metric here, or replace the bpf_printk call with a log_debug call instead.
bpf_probe_read(&event->base.probe_id, sizeof(event->base.probe_id), zero_string); | ||
bpf_probe_read(&event->base.program_counters, sizeof(event->base.program_counters), zero_string); | ||
bpf_probe_read(&event->output, sizeof(event->output), zero_string); | ||
bpf_probe_read(&event->base.probe_id, {{ .ID | len }}, "{{.ID}}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use bpf_probe_read_kernel
to read kernel space addresses.
bpf_probe_read(&event->base.program_counters[0], sizeof(__u64), ¤tPC); | ||
|
||
__u64 bp = ctx->regs[29]; | ||
bpf_probe_read(&bp, sizeof(__u64), (void*)bp); // dereference bp to get current stack frame |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use bpf_probe_read_user
to read userspace memory.
|
||
#include "ktypes.h" | ||
|
||
// NOTE: Be careful when adding fields, alignment should always be to 8 bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the struct have the attribute
__attribute__((aligned(8)))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's interesting, I never thought to explicitly set the attribute, but will now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does alignment need to be to 8 bytes, btw?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The protocol for reading from the output buffer is written such that it expects types at specific offsets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is no longer true, now that we have the type generation right?
bpf_probe_read(&event->base.probe_id, sizeof(event->base.probe_id), zero_string); | ||
bpf_probe_read(&event->base.program_counters, sizeof(event->base.program_counters), zero_string); | ||
bpf_probe_read(&event->output, sizeof(event->output), zero_string); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it does not look like verifier complexity is an issue here, you may use bpf_memset to zero-out the memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We actually did run into issues trying to use bpf_memset, though that may have had to do with how we invoked clang differently. But we use this bpf_probe_read method of zeroing because of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In terms of instructions, bpf_probe_read
is going to be much smaller than bpf_memset
for such a large structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you get away with a single read? bpf_probe_read(event, sizeof(event), zero_string)
?
event := ditypes.DIEvent{} | ||
|
||
if len(record) < ditypes.SizeofBaseEvent { | ||
log.Info("malformed event record") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a louder log level is warranted here? If this is not actionable then I would suggest removing the log, or setting it to trace to avoid flooding the logs with this message.
// ParseParams extracts just the parsed parameters from the full event record | ||
func ParseParams(record []byte) ([]*ditypes.Param, error) { | ||
if len(record) < 392 { | ||
log.Info("malformed event record") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since an error is being returned here, you can remove the log entry to avoid flooding the logs
tasks/system_probe.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to add ./pkg/dynamicinstrumentation/...
to TEST_PACKAGES_LIST
.
Currently your tests are not running in the CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not see any functional tests exercising the eBPF code.
- The eBPF code is not being tested on our distribution matrix
- Since the eBPF code is generated at runtime, it is hard to review the final output without being able to run a test to generate it atleast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you performed any load tests / dogfooding on staging clusters to see if there is increase in CPU or memory usage?
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should also add pkg/dynamicinstrumentation/**/*
here so that the tests are triggered on changes to dynamic instrumentation code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion regarding the use of bpf_probe_read_user
: This helper can fail with EFAULT, if the page it is reading from is paged out. This can happen quiet a lot.
Since it seems that the correctness of this feature depends on the eBPF program reporting the right values, you should consider either early-returning if this helper fails, or initializing the slots with a known value to recognize a failure when consuming in userspace.
if err != nil { | ||
return nil, fmt.Errorf("invalid dynamic instrumentation module configuration: %w", err) | ||
} | ||
|
||
m, err := dynamicinstrumentation.NewModule(config) | ||
m, err := dimod.NewModule(config) | ||
if errors.Is(err, ebpf.ErrNotImplemented) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can other errors propagate here? If so, why are they ignored?
struct { | ||
__uint(type, BPF_MAP_TYPE_RINGBUF); | ||
__uint(max_entries, 1 << 24); | ||
} events SEC(".maps"); | ||
|
||
struct { | ||
__uint(type, BPF_MAP_TYPE_ARRAY); | ||
__uint(key_size, sizeof(__u32)); | ||
__uint(value_size, sizeof(char[PARAM_BUFFER_SIZE])); | ||
__uint(max_entries, 1); | ||
} zeroval SEC(".maps"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: There are some formatting issues.
} | ||
|
||
func applyCaptureDepth(params []ditypes.Parameter, maxDepth int) []ditypes.Parameter { | ||
log.Info("Applying capture depth: ", maxDepth) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log.Info("Applying capture depth: ", maxDepth) | |
log.Tracef("Applying capture depth: %d", maxDepth) |
// Name={{.Name}} ID={{.ID}} TotalSize={{.TotalSize}} Kind={{.Kind}} | ||
// Write the kind and size to output buffer | ||
param_type = {{.Kind}}; | ||
bpf_probe_read_kernel(&event->output[outputOffset], 1, ¶m_type); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bpf_probe_read_kernel(&event->output[outputOffset], 1, ¶m_type); | |
bpf_probe_read_kernel(&event->output[outputOffset], sizeof(param_type), ¶m_type); |
param_type = {{.Kind}}; | ||
bpf_probe_read_kernel(&event->output[outputOffset], 1, ¶m_type); | ||
param_size = {{.TotalSize}}; | ||
bpf_probe_read_kernel(&event->output[outputOffset+1], 2, ¶m_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bpf_probe_read_kernel(&event->output[outputOffset+1], 2, ¶m_size); | |
bpf_probe_read_kernel(&event->output[outputOffset+1], sizeof(param_size), ¶m_size); |
err = json.Unmarshal([]byte(configEventParams[2].ValueStr), &conf) | ||
if err != nil { | ||
diagnostics.Diagnostics.SetError(procInfo.ServiceName, procInfo.RuntimeID, configPath.ProbeUUID.String(), "ATTACH_ERROR", err.Error()) | ||
log.Infof("could not unmarshal configuration, cannot apply: %s (Probe-ID: %s)\n", err, configPath.ProbeUUID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log.Infof("could not unmarshal configuration, cannot apply: %s (Probe-ID: %s)\n", err, configPath.ProbeUUID) | |
log.Infof("could not unmarshal configuration, cannot apply: %v (Probe-ID: %s)", err, configPath.ProbeUUID) |
|
||
runtimeID, err := uuid.ParseBytes([]byte(configEventParams[0].ValueStr)) | ||
if err != nil { | ||
log.Infof("Runtime ID \"%s\" is not a UUID: %s)\n", runtimeID, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log.Infof("Runtime ID \"%s\" is not a UUID: %s)\n", runtimeID, err) | |
log.Infof("Runtime ID \"%s\" is not a UUID: %s)", runtimeID, err) |
} | ||
|
||
func applyConfigUpdate(procInfo *ditypes.ProcessInfo, probe *ditypes.Probe) { | ||
log.Info("Applying config update", probe) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log.Info("Applying config update", probe) | |
log.Tracef("Applying config update %v", probe) |
log.Info("Applying config update", probe) | ||
err := AnalyzeBinary(procInfo) | ||
if err != nil { | ||
log.Infof("couldn't inspect binary: %s\n", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log.Infof("couldn't inspect binary: %s\n", err) | |
log.Errorf("couldn't inspect binary: %v", err) |
CaptureParameters = true // CaptureParameters is the default value for if probes should capture parameter values | ||
ArgumentsMaxSize = 10000 // ArgumentsMaxSize is the default size in bytes of the output buffer used for param values | ||
StringMaxSize = 512 // StringMaxSize is the default size in bytes of a single string | ||
MaxReferenceDepth uint8 = 4 // MaxReferenceDepth is the default depth that DI will traverse datatypes for capturing values | ||
MaxFieldCount = 20 // MaxFieldCount is the default limit for how many fields DI will capture in a single data type | ||
SliceMaxSize = 1800 // SliceMaxSize is the default limit in bytes of a slice | ||
SliceMaxLength = 100 // SliceMaxLength is the default limit in number of elements of a slice |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Consider having limits used in eBPF to be power of twos. They help in constraining the verifier without branches.
// DynamicInstrumentation is a system probe module which allows you to add instrumentation into | ||
// running Go services without restarts. | ||
var DynamicInstrumentation = module.Factory{ | ||
Name: config.DynamicInstrumentationModule, | ||
ConfigNamespaces: []string{}, | ||
Fn: func(agentConfiguration *sysconfigtypes.Config, _ module.FactoryDependencies) (module.Module, error) { | ||
config, err := dynamicinstrumentation.NewConfig(agentConfiguration) | ||
config, err := dimod.NewConfig(agentConfiguration) | ||
if err != nil { | ||
return nil, fmt.Errorf("invalid dynamic instrumentation module configuration: %w", err) | ||
} | ||
|
||
m, err := dynamicinstrumentation.NewModule(config) | ||
m, err := dimod.NewModule(config) | ||
if errors.Is(err, ebpf.ErrNotImplemented) { | ||
return nil, module.ErrNotEnabled | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the logic that checks for DD_DYNAMIC_INSTRUMENTATION_ENABLED
for datadog-agent?
I saw where we are checking the env var on the target process, but I can't find where the agent config value is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On this line the env var is bound to the config value: https://github.com/DataDog/datadog-agent/blob/main/pkg/config/setup/system_probe.go#L169
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it being enabled is checked here: https://github.com/DataDog/datadog-agent/blob/main/cmd/system-probe/api/module/loader.go#L69
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved, since most of the feedback can be addresses in subsequent PRs.
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving since the module will not run by default, we can address the outstanding comments in future PRs.
/merge |
🚂 MergeQueue: pull request added to the queue The median merge time in Use |
What does this PR do?
Adds support for Go in the Dynamic Instrumentation product. The code must live in system-probe as it uses bpf which requires elevated privileges.
pkg/di
where all logic for dynamic instrumentation lives.cmd/system-probe/modules
) which executes the code inpkg/di
Motivation
To allow DataDog users to instrument their Go services without having to update their code.
Possible Drawbacks / Trade-offs
Work must continue to ensure changes in the di module do not over utilize resources or causes crashes that would affect other modules in the system-probe.
Describe how to test/QA your changes
DI can be run offline with a static probe configuration file or online via the datadog UI. I can provide a doc for reviewers on how to run and configure DI within the system-probe. Additionally I'd love feedback on where the most appropriate place for that doc would be.