Add dynamic instrumentation system-probe module #28639

grantseltzer · 2024-08-21T19:08:12Z

What does this PR do?

Adds support for Go in the Dynamic Instrumentation product. The code must live in system-probe because it uses eBPF, which requires elevated privileges.

Adds a new set of packages under pkg/dynamicinstrumentation, where all logic for dynamic instrumentation lives.
Adds a new system-probe module (cmd/system-probe/modules) that executes the code in pkg/dynamicinstrumentation.

Motivation

To allow Datadog users to instrument their Go services without having to update their code.

Possible Drawbacks / Trade-offs

Work must continue to ensure that changes in the DI module do not overutilize resources or cause crashes that would affect other modules in system-probe.

Describe how to test/QA your changes

DI can be run offline with a static probe configuration file, or online via the Datadog UI. I can provide a doc for reviewers on how to run and configure DI within system-probe. Additionally, I'd love feedback on the most appropriate place for that doc.

cit-pr-commenter · 2024-08-22T16:00:47Z

Go Package Import Differences

Baseline: bb0760f
Comparison: 3250fb9

binary	os	arch	change
system-probe	linux	amd64	+11, -0

+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/codegen
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/diagnostics
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/diconfig
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/ditypes
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/ebpf
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/eventparser
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/module
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/proctracker
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/ratelimiter
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/uploader
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/util

system-probe	linux	arm64	+11, -0

+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/codegen
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/diagnostics
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/diconfig
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/ditypes
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/ebpf
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/eventparser
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/module
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/proctracker
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/ratelimiter
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/uploader
+github.com/DataDog/datadog-agent/pkg/dynamicinstrumentation/util

pr-commenter · 2024-08-22T16:45:44Z

Regression Detector

Regression Detector Results

Run ID: f2c9f802-3461-4a47-85f6-920bba96c017

Baseline: bb0760f
Comparison: 3250fb9

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

No significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI	trials	links
➖	uds_dogstatsd_to_api_cpu	% cpu utilization	+2.26	[+1.47, +3.05]	1	Logs
➖	pycheck_lots_of_tags	% cpu utilization	+1.22	[-1.21, +3.64]	1	Logs
➖	tcp_dd_logs_filter_exclude	ingress throughput	+0.00	[-0.01, +0.01]	1	Logs
➖	uds_dogstatsd_to_api	ingress throughput	-0.00	[-0.00, +0.00]	1	Logs
➖	idle	memory utilization	-0.12	[-0.17, -0.07]	1	Logs
➖	file_tree	memory utilization	-0.46	[-0.51, -0.41]	1	Logs
➖	basic_py_check	% cpu utilization	-0.62	[-3.30, +2.05]	1	Logs
➖	otel_to_otel_logs	ingress throughput	-0.73	[-1.54, +0.08]	1	Logs
➖	tcp_syslog_to_blackhole	ingress throughput	-0.93	[-13.82, +11.96]	1	Logs

Bounds Checks

perf	experiment	bounds_check_name	replicates_passed
✅	idle	memory_usage	10/10

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".

pr-commenter · 2024-08-24T23:46:05Z

Test changes on VM

Use this command from test-infra-definitions to manually test this PR's changes on a VM:

inv create-vm --pipeline-id=43775673 --os-family=ubuntu

Note: This applies to commit 3250fb9

brycekahle · 2024-08-29T20:22:26Z

pkg/dynamicinstrumentation/ebpf/ebpf.go

+	events, err := ebpf.NewMap(&ebpf.MapSpec{
+		Name:       "events",
+		Type:       ebpf.RingBuf,
+		MaxEntries: 1 << 24,


You probably want to choose a better default size since that memory is pre-allocated. It is also best to make this configurable.

I was thinking that this was going to be one of the knobs we change once we have actual benchmarks (Alex is working on this) since we don't want to optimize too early. But still, how would you suggest we find a better default size?

We provide ring buffer utilization metrics via datadog.system_probe.ebpf.perf.usage_pct. This can be used to estimate an appropriate size given some workload.
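One way to make the size configurable while keeping it valid is sketched below. The kernel only accepts a BPF ring buffer size that is a power of two and a multiple of the page size, so a user-supplied value has to be rounded up. The name `ringBufferSize`, the 1 GiB cap, and the fallback to the current default are assumptions for illustration, not code from this PR.

```go
package main

import (
	"fmt"
	"os"
)

const (
	// defaultRingBufferSize mirrors the 1 << 24 (16 MiB) value in the diff above.
	defaultRingBufferSize = 1 << 24
	// maxRingBufferSize is a hypothetical upper bound to avoid runaway allocations.
	maxRingBufferSize = 1 << 30
)

// ringBufferSize rounds a requested size up to the next power-of-two
// multiple of the page size. A non-positive request selects the default.
func ringBufferSize(requested, pageSize int) uint32 {
	if requested <= 0 {
		return defaultRingBufferSize
	}
	if pageSize <= 0 {
		pageSize = 4096 // assumed fallback; real code should use os.Getpagesize()
	}
	// Page sizes are powers of two, so doubling keeps both constraints satisfied.
	size := pageSize
	for size < requested && size < maxRingBufferSize {
		size <<= 1
	}
	return uint32(size)
}

func main() {
	pageSize := os.Getpagesize()
	fmt.Println(ringBufferSize(0, pageSize))       // default
	fmt.Println(ringBufferSize(100_000, pageSize)) // rounded up
}
```

The result could then be passed as `MaxEntries` in the map spec, and the utilization metric mentioned above would indicate whether the configured value is too large or too small for a given workload.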

gjulianm

Looks good to me, some minor comments but nothing that definitely needs to be addressed in this PR.

gjulianm · 2024-09-04T08:06:19Z

pkg/dynamicinstrumentation/diconfig/binary_inspection.go

+	for i := range configEvent {
+		err = AnalyzeBinary(configEvent[i])
+		if err != nil {
+			return err


nit: you could wrap the error with more information about the binary that failed:

Suggested change

return err

return fmt.Errorf("inspection of PID %d (path=%s) failed: %w", configEvent[i].PID, configEvent[i].BinaryPath, err)
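A short sketch of why wrapping with `%w` is useful: the caller gets the PID and path in the message and can still match the underlying error with `errors.Is`. The types and the `errNoDebugInfo` sentinel below are simplified stand-ins invented for this example, not the PR's actual definitions.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoDebugInfo is a hypothetical sentinel error used only in this sketch.
var errNoDebugInfo = errors.New("no debug info")

// processInfo is a simplified stand-in for the PR's process type.
type processInfo struct {
	PID        uint32
	BinaryPath string
}

// analyzeBinary is a placeholder for the real inspection step.
func analyzeBinary(p processInfo) error {
	if p.BinaryPath == "" {
		return errNoDebugInfo
	}
	return nil
}

// inspectAll wraps the first failure with the PID and path of the binary.
func inspectAll(procs []processInfo) error {
	for i := range procs {
		if err := analyzeBinary(procs[i]); err != nil {
			return fmt.Errorf("inspection of PID %d (path=%s) failed: %w",
				procs[i].PID, procs[i].BinaryPath, err)
		}
	}
	return nil
}

// isNoDebugInfo shows that the wrapped cause is still matchable.
func isNoDebugInfo(err error) bool {
	return errors.Is(err, errNoDebugInfo)
}

func main() {
	err := inspectAll([]processInfo{{PID: 42}})
	fmt.Println(err)
	fmt.Println(isNoDebugInfo(err))
}
```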

gjulianm · 2024-09-04T09:10:24Z

pkg/dynamicinstrumentation/diconfig/file_config_manager.go

+	}
+}
+
+func (cm *FileWatchingConfigManager) updateProcessInfo(procs ditypes.DIProcs) {


I'm not 100% sure but I think that the process monitor can be calling this function concurrently, and I think functions such as cm.configTracker.UpdateProcesses aren't thread safe. Maybe it would be worth it to add a lock, although even without it this will probably work ok (there are other places where we use the process monitor without locks).

I don't see any issue with adding a lock, good call 👍
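A minimal sketch of the locking discussed in this thread, assuming a simplified manager that keeps process state in a map. The `configManager` type and its fields are hypothetical and much smaller than `FileWatchingConfigManager`; only the mutex pattern is the point.

```go
package main

import (
	"fmt"
	"sync"
)

// configManager is a hypothetical, reduced stand-in for the PR's manager type.
type configManager struct {
	mu    sync.Mutex
	procs map[uint32]string // PID -> service name
}

// updateProcessInfo replaces the tracked process set. The mutex serializes
// callers, so concurrent callbacks from a process monitor cannot race.
func (cm *configManager) updateProcessInfo(procs map[uint32]string) {
	cm.mu.Lock()
	defer cm.mu.Unlock()

	cm.procs = make(map[uint32]string, len(procs))
	for pid, svc := range procs {
		cm.procs[pid] = svc
	}
}

// count returns the number of tracked processes under the same lock.
func (cm *configManager) count() int {
	cm.mu.Lock()
	defer cm.mu.Unlock()
	return len(cm.procs)
}

func main() {
	cm := &configManager{}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(pid uint32) {
			defer wg.Done()
			cm.updateProcessInfo(map[uint32]string{pid: "svc"})
		}(uint32(i))
	}
	wg.Wait()
	fmt.Println(cm.count())
}
```

Running a sketch like this with `go run -race` is a cheap way to confirm that the lock covers every access to the shared state.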

gjulianm · 2024-09-04T09:45:26Z

pkg/dynamicinstrumentation/diconfig/dwarf.go

+	return loadFunctionDefinitions(dwarfData, targetFunctions)
+}
+
+var dwarfMap = make(map[string]*dwarf.Data)


I didn't see any cleanup for this map, probably not necessary for a first implementation but I think it should be addressed later to avoid excessive memory usage.

This is a good point, though I'm not entirely sure how often we would do the cleanup; perhaps we could put a limit on the number of binaries. I'd like to hold off until we have proper benchmarking in place.
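The "limit on the number of binaries" idea could look roughly like the sketch below: a small cache keyed by binary path that evicts the oldest entry once a cap is reached. The `debugInfo` type stands in for `*dwarf.Data`, and the cache type, its methods, and the first-in-first-out policy are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// debugInfo is a hypothetical stand-in for parsed DWARF data.
type debugInfo struct {
	path string
}

// boundedCache keeps at most limit entries and evicts in insertion order.
type boundedCache struct {
	mu    sync.Mutex
	limit int
	order []string // keys, oldest first
	items map[string]*debugInfo
}

func newBoundedCache(limit int) *boundedCache {
	if limit < 1 {
		limit = 1
	}
	return &boundedCache{limit: limit, items: make(map[string]*debugInfo)}
}

// get returns the cached entry for a binary path, if present.
func (c *boundedCache) get(key string) (*debugInfo, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.items[key]
	return v, ok
}

// put stores an entry and evicts the oldest ones beyond the limit.
func (c *boundedCache) put(key string, v *debugInfo) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, exists := c.items[key]; !exists {
		c.order = append(c.order, key)
	}
	c.items[key] = v
	for len(c.order) > c.limit {
		oldest := c.order[0]
		c.order = c.order[1:]
		delete(c.items, oldest)
	}
}

func main() {
	cache := newBoundedCache(2)
	for _, p := range []string{"/usr/bin/a", "/usr/bin/b", "/usr/bin/c"} {
		cache.put(p, &debugInfo{path: p})
	}
	_, ok := cache.get("/usr/bin/a")
	fmt.Println("oldest entry still cached:", ok)
}
```

An alternative that avoids picking a limit is to drop an entry when the process tracker reports that no tracked process uses the binary any more; which policy fits better probably depends on the benchmarks mentioned above.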

usamasaqib · 2024-09-04T08:16:32Z

pkg/dynamicinstrumentation/codegen/c/dynamicinstrumentation.c

+SEC("uprobe/{{.GetBPFFuncName}}")
+int {{.GetBPFFuncName}}(struct pt_regs *ctx)
+{
+    bpf_printk("{{.GetBPFFuncName}} probe in {{.ServiceName}} has triggered");


Suggestion: Use log_debug macros here to avoid the overhead of bpf_printk.

usamasaqib · 2024-09-04T08:20:10Z

pkg/dynamicinstrumentation/codegen/c/dynamicinstrumentation.c

+    event = bpf_ringbuf_reserve(&events, sizeof(struct event), 0);
+    if (!event) {
+        bpf_printk("No space available on ringbuffer, dropping event");
+        return 0;
+    }


Upcoming PRs can replace this block with helper telemetry macros. These macros record helper error telemetry per program and per error, and expose it as metrics.

We might have to make some changes for this infrastructure to work when the number of probes is not known statically.

usamasaqib · 2024-09-04T08:21:24Z

pkg/dynamicinstrumentation/codegen/c/dynamicinstrumentation.c

+    __u32 key = 0;
+    zero_string = bpf_map_lookup_elem(&zeroval, &key);
+    if (!zero_string) {
+        bpf_printk("couldn't lookup zero value in zeroval array map, dropping event for {{.GetBPFFuncName}}");


If this information is important, then I suggest emitting a metric here or replacing the bpf_printk call with a log_debug call.

usamasaqib · 2024-09-04T08:30:06Z

pkg/dynamicinstrumentation/codegen/c/dynamicinstrumentation.c

+    bpf_probe_read(&event->base.probe_id, sizeof(event->base.probe_id), zero_string);
+    bpf_probe_read(&event->base.program_counters, sizeof(event->base.program_counters), zero_string);
+    bpf_probe_read(&event->output, sizeof(event->output), zero_string);
+    bpf_probe_read(&event->base.probe_id, {{ .ID | len }}, "{{.ID}}");


Use bpf_probe_read_kernel to read kernel space addresses.

usamasaqib · 2024-09-04T08:33:01Z

pkg/dynamicinstrumentation/codegen/c/dynamicinstrumentation.c

+    bpf_probe_read(&event->base.program_counters[0], sizeof(__u64), &currentPC);
+
+    __u64 bp = ctx->regs[29];
+    bpf_probe_read(&bp, sizeof(__u64), (void*)bp); // dereference bp to get current stack frame


Use bpf_probe_read_user to read userspace memory.

usamasaqib · 2024-09-04T08:38:30Z

pkg/dynamicinstrumentation/codegen/c/types.h

+
+#include "ktypes.h"
+
+// NOTE: Be careful when adding fields, alignment should always be to 8 bytes


Should the struct have the attribute __attribute__((aligned(8)))?

It's interesting, I never thought to explicitly set the attribute, but will now.

Why does alignment need to be to 8 bytes, btw?

The protocol for reading from the output buffer is written such that it expects types at specific offsets

That is no longer true now that we have the type generation, right?

usamasaqib · 2024-09-04T08:55:54Z

pkg/dynamicinstrumentation/codegen/c/dynamicinstrumentation.c

+    bpf_probe_read(&event->base.probe_id, sizeof(event->base.probe_id), zero_string);
+    bpf_probe_read(&event->base.program_counters, sizeof(event->base.program_counters), zero_string);
+    bpf_probe_read(&event->output, sizeof(event->output), zero_string);


Since it does not look like verifier complexity is an issue here, you may use bpf_memset to zero-out the memory.

We actually did run into issues trying to use bpf_memset, though that may have had to do with how we invoked clang differently. But we use this bpf_probe_read method of zeroing because of that.

In terms of instructions, bpf_probe_read is going to be much smaller than bpf_memset for such a large structure.

Could you get away with a single read? bpf_probe_read(event, sizeof(event), zero_string)?

usamasaqib · 2024-09-04T09:00:16Z

pkg/dynamicinstrumentation/eventparser/event_parser.go

+	event := ditypes.DIEvent{}
+
+	if len(record) < ditypes.SizeofBaseEvent {
+		log.Info("malformed event record")


Maybe a louder log level is warranted here? If this is not actionable then I would suggest removing the log, or setting it to trace to avoid flooding the logs with this message.

usamasaqib · 2024-09-04T09:02:03Z

pkg/dynamicinstrumentation/eventparser/event_parser.go

+// ParseParams extracts just the parsed parameters from the full event record
+func ParseParams(record []byte) ([]*ditypes.Param, error) {
+	if len(record) < 392 {
+		log.Info("malformed event record")


Since an error is being returned here, you can remove the log entry to avoid flooding the logs
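A sketch of the "return an error instead of logging" approach: the length check produces a descriptive error and leaves the decision to log, count, or drop to the caller. The constant name `minRecordSize` and the sentinel `errMalformedRecord` are hypothetical; the value 392 is simply the literal from the diff above.

```go
package main

import (
	"errors"
	"fmt"
)

// minRecordSize reuses the literal 392 from the diff; the name is an assumption.
const minRecordSize = 392

// errMalformedRecord is a hypothetical sentinel so callers can match this case.
var errMalformedRecord = errors.New("malformed event record")

// checkRecord validates the record length without logging anything itself.
func checkRecord(record []byte) error {
	if len(record) < minRecordSize {
		return fmt.Errorf("%w: got %d bytes, need at least %d",
			errMalformedRecord, len(record), minRecordSize)
	}
	return nil
}

func main() {
	if err := checkRecord(make([]byte, 10)); err != nil {
		// The caller chooses the log level or rate-limits here.
		fmt.Println(err)
	}
}
```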

usamasaqib · 2024-09-04T09:29:01Z

tasks/system_probe.py

You need to add ./pkg/dynamicinstrumentation/... to TEST_PACKAGES_LIST.

Currently your tests are not running in the CI.

usamasaqib

I did not see any functional tests exercising the eBPF code.

The eBPF code is not being tested on our distribution matrix.
Since the eBPF code is generated at runtime, it is hard to review the final output without at least being able to run a test that generates it.

usamasaqib

Have you performed any load tests / dogfooding on staging clusters to see if there is increase in CPU or memory usage?

usamasaqib

You should also add pkg/dynamicinstrumentation/**/* here so that the tests are triggered on changes to dynamic instrumentation code.

usamasaqib

Suggestion regarding the use of bpf_probe_read_user: This helper can fail with EFAULT if the page it is reading from is paged out. This can happen quite a lot.

Since it seems that the correctness of this feature depends on the eBPF program reporting the right values, you should consider either early-returning if this helper fails, or initializing the slots with a known value to recognize a failure when consuming in userspace.
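The second option, pre-filling slots with a known value, needs a matching check on the userspace side. The sketch below shows one possible shape for that check. The fill byte `0xFF` and the function name are purely hypothetical; nothing in this PR defines such a sentinel.

```go
package main

import "fmt"

// unreadFill is a hypothetical byte the eBPF program would write into a slot
// before attempting the user-memory read.
const unreadFill = 0xFF

// isUnreadSlot reports whether a slot still holds only the fill pattern,
// which would suggest that the read helper failed and left it untouched.
func isUnreadSlot(slot []byte) bool {
	if len(slot) == 0 {
		return false
	}
	for _, b := range slot {
		if b != unreadFill {
			return false
		}
	}
	return true
}

func main() {
	failed := []byte{0xFF, 0xFF, 0xFF, 0xFF}
	captured := []byte{0x2A, 0x00, 0x00, 0x00}
	fmt.Println(isUnreadSlot(failed), isUnreadSlot(captured))
}
```

A fill pattern cannot distinguish a failed read from a real value that happens to consist of the same bytes, so early-returning on helper failure, or recording an explicit status flag per parameter, is the more robust variant.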

usamasaqib · 2024-09-05T07:03:12Z

cmd/system-probe/modules/dynamic_instrumentation.go

 		if err != nil {
 			return nil, fmt.Errorf("invalid dynamic instrumentation module configuration: %w", err)
 		}

-		m, err := dynamicinstrumentation.NewModule(config)
+		m, err := dimod.NewModule(config)
 		if errors.Is(err, ebpf.ErrNotImplemented) {


Can other errors propagate here? If so, why are they ignored?
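If other errors can reach this point, one conventional shape is to translate only the "not implemented" case and return everything else wrapped, as sketched below. The error variables and `diModule` type are local stand-ins for `ebpf.ErrNotImplemented`, `module.ErrNotEnabled`, and the real module type; the actual factory may already do something equivalent outside the quoted hunk.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	// Stand-ins for ebpf.ErrNotImplemented and module.ErrNotEnabled.
	errNotImplemented = errors.New("not implemented")
	errNotEnabled     = errors.New("module not enabled")
	// errLoadFailed is a hypothetical example of any other failure.
	errLoadFailed = errors.New("loading eBPF object failed")
)

// diModule is a hypothetical placeholder for the module type.
type diModule struct{}

// createModule maps "not implemented" to "not enabled" and propagates
// every other error instead of dropping it.
func createModule(newModule func() (*diModule, error)) (*diModule, error) {
	m, err := newModule()
	if errors.Is(err, errNotImplemented) {
		return nil, errNotEnabled
	}
	if err != nil {
		return nil, fmt.Errorf("creating dynamic instrumentation module: %w", err)
	}
	return m, nil
}

func main() {
	_, err := createModule(func() (*diModule, error) {
		return nil, errLoadFailed
	})
	fmt.Println(err)
}
```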

usamasaqib · 2024-09-05T07:03:53Z

pkg/dynamicinstrumentation/codegen/c/dynamicinstrumentation.c

+struct {
+	__uint(type, BPF_MAP_TYPE_RINGBUF);
+	__uint(max_entries, 1 << 24);
+} events SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+    __uint(key_size, sizeof(__u32));
+    __uint(value_size, sizeof(char[PARAM_BUFFER_SIZE]));
+    __uint(max_entries, 1);
+} zeroval SEC(".maps");


Nit: There are some formatting issues.

usamasaqib · 2024-09-05T07:10:28Z

pkg/dynamicinstrumentation/codegen/output_offsets.go

+}
+
+func applyCaptureDepth(params []ditypes.Parameter, maxDepth int) []ditypes.Parameter {
+	log.Info("Applying capture depth: ", maxDepth)


Suggested change

log.Info("Applying capture depth: ", maxDepth)

log.Tracef("Applying capture depth: %d", maxDepth)

usamasaqib · 2024-09-05T07:24:23Z

pkg/dynamicinstrumentation/codegen/templates.go

+// Name={{.Name}} ID={{.ID}} TotalSize={{.TotalSize}} Kind={{.Kind}}
+// Write the kind and size to output buffer
+param_type = {{.Kind}};
+bpf_probe_read_kernel(&event->output[outputOffset], 1, &param_type);


Suggested change

bpf_probe_read_kernel(&event->output[outputOffset], 1, &param_type);

bpf_probe_read_kernel(&event->output[outputOffset], sizeof(param_type), &param_type);

usamasaqib · 2024-09-05T07:24:41Z

pkg/dynamicinstrumentation/codegen/templates.go

+param_type = {{.Kind}};
+bpf_probe_read_kernel(&event->output[outputOffset], 1, &param_type);
+param_size = {{.TotalSize}};
+bpf_probe_read_kernel(&event->output[outputOffset+1], 2, &param_size);


Suggested change

bpf_probe_read_kernel(&event->output[outputOffset+1], 2, &param_size);

bpf_probe_read_kernel(&event->output[outputOffset+1], sizeof(param_size), &param_size);
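For context on why the byte counts in these two calls matter, here is a sketch of how a userspace reader might decode the header that the template writes. It assumes, based only on the literals 1 and 2 above, that `param_type` is one byte and `param_size` is two bytes, and it further assumes little-endian byte order. The Go type and function names are hypothetical and not taken from the PR's event parser.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// paramHeaderSize is 1 byte of kind followed by 2 bytes of size (assumed layout).
const paramHeaderSize = 3

// paramHeader is a hypothetical view of the per-parameter header.
type paramHeader struct {
	Kind uint8
	Size uint16
}

// readParamHeader decodes the header at offset and returns the offset of the
// data that follows it.
func readParamHeader(output []byte, offset int) (paramHeader, int, error) {
	if offset < 0 || offset+paramHeaderSize > len(output) {
		return paramHeader{}, offset, fmt.Errorf(
			"param header at offset %d exceeds buffer of %d bytes", offset, len(output))
	}
	h := paramHeader{
		Kind: output[offset],
		// Little-endian is an assumption about the target architecture.
		Size: binary.LittleEndian.Uint16(output[offset+1 : offset+3]),
	}
	return h, offset + paramHeaderSize, nil
}

func main() {
	buf := []byte{0x02, 0x08, 0x00, 0xAA, 0xBB}
	h, next, err := readParamHeader(buf, 0)
	fmt.Println(h.Kind, h.Size, next, err)
}
```

If the eBPF side and the parser disagree on these widths, every later offset shifts, which is why deriving the lengths from `sizeof` on the C side keeps the two in step when a type changes.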

usamasaqib · 2024-09-05T07:40:41Z

pkg/dynamicinstrumentation/diconfig/config_manager.go

+		err = json.Unmarshal([]byte(configEventParams[2].ValueStr), &conf)
+		if err != nil {
+			diagnostics.Diagnostics.SetError(procInfo.ServiceName, procInfo.RuntimeID, configPath.ProbeUUID.String(), "ATTACH_ERROR", err.Error())
+			log.Infof("could not unmarshal configuration, cannot apply: %s (Probe-ID: %s)\n", err, configPath.ProbeUUID)


Suggested change

log.Infof("could not unmarshal configuration, cannot apply: %s (Probe-ID: %s)\n", err, configPath.ProbeUUID)

log.Infof("could not unmarshal configuration, cannot apply: %v (Probe-ID: %s)", err, configPath.ProbeUUID)

usamasaqib · 2024-09-05T07:40:57Z

pkg/dynamicinstrumentation/diconfig/config_manager.go

+
+		runtimeID, err := uuid.ParseBytes([]byte(configEventParams[0].ValueStr))
+		if err != nil {
+			log.Infof("Runtime ID \"%s\" is not a UUID: %s)\n", runtimeID, err)


Suggested change

log.Infof("Runtime ID \"%s\" is not a UUID: %s)\n", runtimeID, err)

log.Infof("Runtime ID \"%s\" is not a UUID: %s)", runtimeID, err)

usamasaqib · 2024-09-05T07:41:23Z

pkg/dynamicinstrumentation/diconfig/config_manager.go

+}
+
+func applyConfigUpdate(procInfo *ditypes.ProcessInfo, probe *ditypes.Probe) {
+	log.Info("Applying config update", probe)


Suggested change

log.Info("Applying config update", probe)

log.Tracef("Applying config update %v", probe)

usamasaqib · 2024-09-05T07:41:57Z

pkg/dynamicinstrumentation/diconfig/config_manager.go

+	log.Info("Applying config update", probe)
+	err := AnalyzeBinary(procInfo)
+	if err != nil {
+		log.Infof("couldn't inspect binary: %s\n", err)


Suggested change

log.Infof("couldn't inspect binary: %s\n", err)

log.Errorf("couldn't inspect binary: %v", err)

usamasaqib · 2024-09-05T07:44:06Z

pkg/dynamicinstrumentation/ditypes/config.go

+	CaptureParameters       = true  // CaptureParameters is the default value for if probes should capture parameter values
+	ArgumentsMaxSize        = 10000 // ArgumentsMaxSize is the default size in bytes of the output buffer used for param values
+	StringMaxSize           = 512   // StringMaxSize is the default size in bytes of a single string
+	MaxReferenceDepth uint8 = 4     // MaxReferenceDepth is the default depth that DI will traverse datatypes for capturing values
+	MaxFieldCount           = 20    // MaxFieldCount is the default limit for how many fields DI will capture in a single data type
+	SliceMaxSize            = 1800  // SliceMaxSize is the default limit in bytes of a slice
+	SliceMaxLength          = 100   // SliceMaxLength is the default limit in number of elements of a slice


Suggestion: Consider having limits used in eBPF to be power of twos. They help in constraining the verifier without branches.
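The arithmetic behind this suggestion can be sketched as follows. When a limit is a power of two, `idx & (limit - 1)` always yields a value below the limit, and in eBPF C that mask lets the verifier bound an index without a branch. The Go code only demonstrates the arithmetic and how the current defaults would round; the helper names are hypothetical.

```go
package main

import "fmt"

// nextPowerOfTwo returns the smallest power of two that is >= n.
func nextPowerOfTwo(n uint32) uint32 {
	if n <= 1 {
		return 1
	}
	if n > 1<<31 {
		return 1 << 31 // largest power of two representable in uint32
	}
	p := uint32(1)
	for p < n {
		p <<= 1
	}
	return p
}

// maskIndex bounds idx below size, where size must be a power of two.
// Note that out-of-range values wrap around rather than saturate.
func maskIndex(idx, size uint32) uint32 {
	return idx & (size - 1)
}

func main() {
	// Values taken from the defaults quoted above.
	for _, limit := range []uint32{10000, 512, 20, 1800, 100} {
		fmt.Println(limit, "->", nextPowerOfTwo(limit))
	}
	fmt.Println(maskIndex(130, 128))
}
```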

cimi · 2024-09-05T16:01:48Z

cmd/system-probe/modules/dynamic_instrumentation.go

+// DynamicInstrumentation is a system probe module which allows you to add instrumentation into
+// running Go services without restarts.
 var DynamicInstrumentation = module.Factory{
 	Name:             config.DynamicInstrumentationModule,
 	ConfigNamespaces: []string{},
 	Fn: func(agentConfiguration *sysconfigtypes.Config, _ module.FactoryDependencies) (module.Module, error) {
-		config, err := dynamicinstrumentation.NewConfig(agentConfiguration)
+		config, err := dimod.NewConfig(agentConfiguration)
 		if err != nil {
 			return nil, fmt.Errorf("invalid dynamic instrumentation module configuration: %w", err)
 		}

-		m, err := dynamicinstrumentation.NewModule(config)
+		m, err := dimod.NewModule(config)
 		if errors.Is(err, ebpf.ErrNotImplemented) {
 			return nil, module.ErrNotEnabled
 		}


Where is the logic that checks for DD_DYNAMIC_INSTRUMENTATION_ENABLED for datadog-agent?

I saw where we are checking the env var on the target process, but I can't find where the agent config value is used.

On this line the env var is bound to the config value: https://github.com/DataDog/datadog-agent/blob/main/pkg/config/setup/system_probe.go#L169

And it being enabled is checked here: https://github.com/DataDog/datadog-agent/blob/main/cmd/system-probe/api/module/loader.go#L69

usamasaqib

Approved, since most of the feedback can be addressed in subsequent PRs.

cimi

Approving since the module will not run by default, we can address the outstanding comments in future PRs.

grantseltzer · 2024-09-06T21:01:47Z

/merge

dd-devflow · 2024-09-06T21:01:53Z

🚂 MergeQueue: pull request added to the queue

The median merge time in main is 23m.

Use /merge -c to cancel this operation!

grantseltzer requested review from a team as code owners August 21, 2024 19:08

github-actions bot added the component/system-probe label Aug 21, 2024

grantseltzer requested review from cimi and brycekahle August 21, 2024 19:08

github-actions bot added the team/agent-shared-components label Aug 21, 2024

cimi marked this pull request as draft August 22, 2024 11:02

grantseltzer added the changelog/no-changelog label Aug 22, 2024

grantseltzer force-pushed the grantseltzer/move-in-go-di branch from 6d5766e to 247ef95 Compare August 25, 2024 01:24

brycekahle reviewed Aug 29, 2024

grantseltzer force-pushed the grantseltzer/move-in-go-di branch 2 times, most recently from 08d3f20 to 2776133 Compare August 30, 2024 17:46

grantseltzer and others added 8 commits August 30, 2024 12:53

Start of moving things over

9ad2101

Signed-off-by: grant.seltzerrichman <grant.seltzerrichman@datadoghq.com>

Update CODEOWNERS

3b4ccea

Signed-off-by: grant.seltzerrichman <grant.seltzerrichman@datadoghq.com>

Set up configuration and stats method

df14663

Signed-off-by: grantseltzer <grantseltzer@gmail.com>

Move all code over and update imports

3e1cb6b

Signed-off-by: grantseltzer <grantseltzer@gmail.com>

Move all code over and update imports

ddf7f7f

Signed-off-by: grantseltzer <grantseltzer@gmail.com>

Fix missing file content, update log to util package

f497eb8

Signed-off-by: grantseltzer <grantseltzer@gmail.com>

Fix system probe build

5ad7ea4

Signed-off-by: vagrant <vagrant@dev-new-ubuntu-22>

Don't use event stream

a6a58c4

Signed-off-by: grantseltzer <grantseltzer@gmail.com>

Fix broken header templates

7f711ad

Signed-off-by: grantseltzer <grantseltzer@gmail.com>

grantseltzer force-pushed the grantseltzer/move-in-go-di branch from 4dff90a to 7f711ad Compare August 30, 2024 20:40

use io.Writer instead of passing around strings

701ff7a

Signed-off-by: grantseltzer <grantseltzer@gmail.com>

gjulianm reviewed Sep 4, 2024

usamasaqib reviewed Sep 4, 2024

grantseltzer added 4 commits September 4, 2024 11:07

Respond to PR feedback

137ef7d

Signed-off-by: grantseltzer <grantseltzer@gmail.com>

Reformat

9771050

Signed-off-by: grantseltzer <grantseltzer@gmail.com>

Fix buffer used for slice headers

c8ba3a2

Signed-off-by: grantseltzer <grantseltzer@gmail.com>

Merge branch 'main' into grantseltzer/move-in-go-di

03f0902

usamasaqib reviewed Sep 5, 2024

cimi reviewed Sep 5, 2024

Fix verifier issue, revert change with bpf_probe_read calls

5a7aedf

Signed-off-by: grantseltzer <grantseltzer@gmail.com>

cimi marked this pull request as ready for review September 6, 2024 14:01

cimi requested a review from a team as a code owner September 6, 2024 14:01

grantseltzer added 2 commits September 6, 2024 10:06

Fix formatting, error handling, and size of read calls

07d6499

Signed-off-by: grantseltzer <grantseltzer@gmail.com>

More formatting/error handling fixes

546df1d

Signed-off-by: grantseltzer <grantseltzer@gmail.com>

usamasaqib approved these changes Sep 6, 2024

move gitignore line, fix conflict

173ed10

Signed-off-by: grantseltzer <grantseltzer@gmail.com>

grantseltzer removed the request for review from a team September 6, 2024 20:08

Merge branch 'main' into grantseltzer/move-in-go-di

3250fb9

grantseltzer removed the request for review from a team September 6, 2024 20:20

cimi approved these changes Sep 6, 2024

brycekahle added this to the 7.58.0 milestone Sep 6, 2024

dd-mergequeue bot merged commit d003e99 into main Sep 6, 2024
303 of 310 checks passed

dd-mergequeue bot deleted the grantseltzer/move-in-go-di branch September 6, 2024 21:25
