Track un-active instrumentation for better reporting #2016

RonFed · 2024-12-16T15:48:30Z

This PR changes when the instrumentation manager starts tracking an instrumentation.
Once an instrumentation is initialized, we should start tracking it even if it fails to load.
The reason is that we need to call the reporter once the process exits (so it can report and clean up). If we only track the PID once the instrumentation is loaded, we won't be able to invoke the reporter when a process whose instrumentation failed to load exits and needs to be cleaned up.
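
A minimal sketch of the idea, not the actual manager code: the `Reporter` interface, `recordingReporter`, and the `handleInit`/`handleExit` names are hypothetical and exist only for illustration. It shows that tracking a PID at initialization lets the exit path reach the reporter even when the load failed.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnsupported is a made-up load error used by this sketch.
var errUnsupported = errors.New("unsupported binary")

// Reporter is a hypothetical stand-in for the reporter mentioned above.
type Reporter interface {
	OnLoad(pid int, err error)
	OnExit(pid int)
}

// recordingReporter keeps one record per PID until the process exits.
type recordingReporter struct {
	records map[int]string
}

func (r *recordingReporter) OnLoad(pid int, err error) {
	if err != nil {
		r.records[pid] = "load failed: " + err.Error()
		return
	}
	r.records[pid] = "loaded"
}

func (r *recordingReporter) OnExit(pid int) { delete(r.records, pid) }

// manager tracks every initialized PID, whether or not the load succeeded.
type manager struct {
	tracked  map[int]bool // value is true when the load succeeded
	reporter Reporter
}

func (m *manager) handleInit(pid int, loadErr error) {
	m.reporter.OnLoad(pid, loadErr)
	m.tracked[pid] = loadErr == nil // tracked even on failure
}

func (m *manager) handleExit(pid int) {
	if _, ok := m.tracked[pid]; !ok {
		return // never initialized, nothing to clean up
	}
	delete(m.tracked, pid)
	m.reporter.OnExit(pid)
}

func main() {
	rep := &recordingReporter{records: map[int]string{}}
	m := &manager{tracked: map[int]bool{}, reporter: rep}

	m.handleInit(42, errUnsupported)
	fmt.Println(len(rep.records)) // prints 1: the failed load left a record

	m.handleExit(42)
	fmt.Println(len(rep.records)) // prints 0: the record was cleaned on exit
}
```

If `handleInit` only stored the PID on success, `handleExit` would return early for PID 42 and the failed-load record would never be removed.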

Each tracked PID is marked as active (loaded successfully) or not, according to whether its inst field is non-nil or nil.
For inactive PIDs we won't apply config updates.
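
As a rough illustration of the nil check (the `Instrumentation` interface, `details` struct, and `applyConfigToAll` function are assumptions made for this sketch, not names taken from the PR):

```go
package main

import "fmt"

// Instrumentation is a hypothetical interface; the real one is not shown here.
type Instrumentation interface {
	ApplyConfig(cfg string) error
}

// fakeInst records every config it receives.
type fakeInst struct{ applied []string }

func (f *fakeInst) ApplyConfig(cfg string) error {
	f.applied = append(f.applied, cfg)
	return nil
}

// details is what the manager keeps per PID.
// A nil inst means "tracked but not active".
type details struct {
	inst Instrumentation
}

func (d *details) active() bool { return d.inst != nil }

// applyConfigToAll updates only active instrumentations and
// returns how many were updated.
func applyConfigToAll(tracked map[int]*details, cfg string) int {
	updated := 0
	for _, d := range tracked {
		if !d.active() {
			continue // inactive PID: nothing loaded, nothing to configure
		}
		if err := d.inst.ApplyConfig(cfg); err == nil {
			updated++
		}
	}
	return updated
}

func main() {
	tracked := map[int]*details{
		100: {inst: &fakeInst{}}, // loaded successfully
		200: {},                  // load failed, inst is nil
	}
	fmt.Println(applyConfigToAll(tracked, "sampling=0.5")) // prints 1
}
```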

In addition, if we get an exec event for an inactive PID, we try to instrument it again. This helps in chain-loading cases, where the first executable is written in a language we can't instrument while the second one can be instrumented.
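
The retry can be sketched as follows. Everything here is hypothetical (`fakeLoad`, `handleExec`, the paths), and the sketch assumes a loader that fails for the first executable and succeeds for the second:

```go
package main

import (
	"errors"
	"fmt"
)

var errUnsupported = errors.New("unsupported executable")

// fakeLoad pretends that only "/app/server" can be instrumented.
func fakeLoad(exe string) error {
	if exe != "/app/server" {
		return errUnsupported
	}
	return nil
}

type manager struct {
	active map[int]bool // tracked PIDs; true when the instrumentation is loaded
	load   func(exe string) error
	loads  int // number of load attempts, kept for illustration
}

// handleExec is called for every exec event of a PID.
func (m *manager) handleExec(pid int, exe string) {
	if m.active[pid] {
		return // already instrumented, ignore repeated events
	}
	// Either a new PID or a tracked-but-inactive one: try (again) to load.
	m.loads++
	m.active[pid] = m.load(exe) == nil
}

func main() {
	m := &manager{active: map[int]bool{}, load: fakeLoad}

	m.handleExec(1, "/bin/sh") // e.g. a shell wrapper we can't instrument
	fmt.Println(m.active[1])   // prints false

	m.handleExec(1, "/app/server") // the wrapper execs the real binary
	fmt.Println(m.active[1])       // prints true
}
```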

blumamir · 2024-12-16T17:40:55Z

instrumentation/manager.go

+	// we need to track the instrumentation even if the load failed.
+	// consider a reporter which writes a persistent record for a failed/successful load
+	// we need to notify the reporter once that PID exits to clean up the resources
+	m.startTrackInstrumentation(e.PID, inst, pg, configGroup, loadErr == nil)


If we had an error with the Load, the inst is no longer relevant at this point. We will not need to call Close or apply config to it, since it's not loaded. I therefore recommend passing nil instead of inst in this case, so that it can't be called accidentally and cause a hard-to-find bug.
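
A rough sketch of what I mean, with made-up names (`trackedPID`, `newTracked`, and `cleanup` are not from the codebase):

```go
package main

import (
	"errors"
	"fmt"
)

var errLoad = errors.New("load failed")

// Instrumentation is a hypothetical interface used only for this sketch.
type Instrumentation interface {
	Close() error
}

// fakeInst counts how many times Close is called.
type fakeInst struct{ closeCalls int }

func (f *fakeInst) Close() error {
	f.closeCalls++
	return nil
}

// trackedPID holds what is remembered for one process.
type trackedPID struct {
	inst Instrumentation // nil when the load failed
}

// newTracked drops the instrumentation when the load failed, so later code
// cannot call Close or apply config on something that was never loaded.
func newTracked(inst Instrumentation, loadErr error) trackedPID {
	if loadErr != nil {
		return trackedPID{}
	}
	return trackedPID{inst: inst}
}

// cleanup closes the instrumentation only if one was kept.
func (t trackedPID) cleanup() error {
	if t.inst == nil {
		return nil
	}
	return t.inst.Close()
}

func main() {
	failed := &fakeInst{}
	t := newTracked(failed, errLoad)
	_ = t.cleanup()
	fmt.Println(failed.closeCalls) // prints 0: Close was never reached
}
```

One Go detail to watch: if a nil concrete pointer is stored in the interface field, `t.inst == nil` is false, so the field should be left as a plain nil as above.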

blumamir · 2024-12-16T17:44:46Z

instrumentation/manager.go

+	// active is used to track if the instrumentation is loaded successfully or not.
+	// we want to track the instrumentation even if it failed to load, to be able to report the error
+	// and clean up the instrumentation resources and the reporter resources once the process exits.
+	active bool


Consider:

Suggested change

-	active bool
+	loaded bool

to stay consistent with the existing terminology.


Track un-active instrumentation for better reporting

35616c7

RonFed marked this pull request as draft December 16, 2024 15:48

Check if active if same PID event happens more than once

a740abf

RonFed marked this pull request as ready for review December 16, 2024 16:53

RonFed requested a review from blumamir December 16, 2024 17:00

blumamir approved these changes Dec 16, 2024


RonFed added 2 commits December 17, 2024 11:36

track a nil instrumentation which is not active/loaded but used by reporter

aaffcf4

Merge branch 'main' into ebpf_manager_track_failed_loads

83c5cca

RonFed merged commit 716aac7 into odigos-io:main Dec 17, 2024
31 checks passed

RonFed deleted the ebpf_manager_track_failed_loads branch December 17, 2024 10:55
