kallsyms: change Modules caching strategy, cache address lookups #1590

ti-mo · 2024-10-17T18:34:17Z

See individual commits for context.

At a high level:

add caching for __ksym reference resolving
stop caching all text symbols that have an associated kmod
for both addresses and modules, cache only the requested symbols (pull-through cache)
at the beginning of Collection creation, scan all ProgramSpecs and instructions for module and symbol references and request them in two batches
perform individual symbol lookups (from the populated cache) in newProgramWithOptions onwards
don't open /proc/kallsyms during CollectionSpec loading

Since this function allows lookups to be performed in batches, caching lookup results becomes interesting, without having to resort to caching all of kallsyms. Signed-off-by: Timo Beckers <timo@isovalent.com>

patrickpichler · 2024-10-18T07:53:05Z

collection.go

+		}
+	}
+	if len(mods) != 0 {
+		if err := kallsyms.AssignModules(mods); err != nil {


I tried this branch locally and it fails for me with an ambiguous error for kernel module load_elf_phdrs. If I understood the old code correctly, it didn't care about duplicate symbols in /proc/kallsyms.

Correct, it would ignore kallsyms entries without modules, which is not correct. If a symbol is ambiguous between the core kernel and a module, we should not attach to it. The old code might've picked the address of the base symbol to attach to, but provided the BTF of the symbol in the kernel module.

@patrickpichler Is this a breaking change for you, or were you just testing out ambiguous symbol handling during kmod attach?

It is a breaking change, but I guess it was not right before due to ambiguity and anyway would have been a bug 😅

Yes, if you'd actually make use of CO:RE things, build your BPF C program against headers of one kernel, deploy your app to another host, then be unlucky enough for some of the fields to have changed between versions, you would've likely hit a verifier error at runtime.

'Unfortunately', in most cases like load_elf_phdrs, it's actually the same function with the same signature and underlying types that gets included into multiple compilation units (by way of including a common header) and then gets merged into the same binary during linking of the kernel binary. Things would've kept working fine in your case since it's likely an exact duplicate, but you would run into funky cases where some code paths in the kernel invoke one version of the symbol, while other paths invoke the other, meaning you'd miss events since you're only attaching to one of the symbols.

Let's see if any other skeletons fall out of the ceiling when we ship this in a release, but this has been a bugbear for bpf since the beginning.

dylandreimerink

This all makes sense. Very nice

alban · 2024-10-21T08:17:29Z

@ti-mo Thanks, I tested this PR in Inspektor Gadget, and it uses between 8 and 10MB less memory than before.

Renamed SymbolModule to Module and introduced AssignModules, a batch version of Module similar to the existing AssignAddresses. Along with this change, ambiguous symbols (with non-unique names) are now rejected to avoid sending the wrong BTF ID for a symbol for the kernel to use during verification. Similar to AssignAddresses, this API now only caches requested symbols instead of all of kallsyms. Some users found the permanent memory allocation burdensome and relied on flushing them after loading the programs they needed. btf.FlushKernelSpec now no longer calls FlushKernelModuleCache, and the latter has been removed due to the significantly lowered memory consumption. Signed-off-by: Timo Beckers <timo@isovalent.com>

The original implementation opened /proc/kallsyms and resolved all symbols during CollectionSpec loading. This creates issues like bpf2go not being able to generate its output when run unprivileged, since it can't properly resolve ksym addresses. /proc/kallsyms addresses are zero in this case. Also, some users load CollectionSpecs in non-linux environments like wasm, so we need the spec loading to remain hermetic. Probably a good idea to add some tests for this at some point, though not clear how at this time. Signed-off-by: Timo Beckers <timo@isovalent.com>

As mentioned in the code comments, scan all programs in the CollectionSpec to allow batching together kallsyms lookups, avoiding repeated scanning of /proc/kallsyms. Signed-off-by: Timo Beckers <timo@isovalent.com>

kallsyms: cache lookups from AssignAddresses

f3018c1

Since this function allows lookups to be performed in batches, caching lookup results becomes interesting, without having to resort to caching all of kallsyms. Signed-off-by: Timo Beckers <timo@isovalent.com>

ti-mo requested review from dylandreimerink and a team as code owners October 17, 2024 18:34

github-actions bot added the breaking-change Changes exported API label Oct 17, 2024

This was referenced Oct 17, 2024

Explicitly check ksym address was found during relocations #1587

Closed

prog: don't search for kmod when not needed #1585

Closed

patrickpichler reviewed Oct 18, 2024

View reviewed changes

dylandreimerink approved these changes Oct 18, 2024

View reviewed changes

alban mentioned this pull request Oct 21, 2024

Flush memory allocations when no longer needed inspektor-gadget/inspektor-gadget#3571

Closed

ti-mo added 3 commits October 23, 2024 17:28

collection: populate kallsyms caches in newCollectionLoader

2e21f0b

As mentioned in the code comments, scan all programs in the CollectionSpec to allow batching together kallsyms lookups, avoiding repeated scanning of /proc/kallsyms. Signed-off-by: Timo Beckers <timo@isovalent.com>

ti-mo force-pushed the tb/kallsyms-caching branch from daf2425 to 2e21f0b Compare October 23, 2024 15:30

ti-mo merged commit d3c63ab into cilium:main Oct 23, 2024
17 checks passed

ti-mo deleted the tb/kallsyms-caching branch October 23, 2024 15:44

This was referenced Oct 23, 2024

go.mod: update cilium/ebpf with more memory optimization inspektor-gadget/inspektor-gadget#3616

Merged

cilium/ebpf memory optimizations inspektor-gadget/inspektor-gadget#3583

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

kallsyms: change Modules caching strategy, cache address lookups #1590

kallsyms: change Modules caching strategy, cache address lookups #1590

ti-mo commented Oct 17, 2024 •

edited

Loading

patrickpichler Oct 18, 2024

ti-mo Oct 21, 2024

ti-mo Oct 23, 2024

patrickpichler Oct 23, 2024

ti-mo Oct 23, 2024

dylandreimerink left a comment

alban commented Oct 21, 2024

kallsyms: change Modules caching strategy, cache address lookups #1590

kallsyms: change Modules caching strategy, cache address lookups #1590

Conversation

ti-mo commented Oct 17, 2024 • edited Loading

patrickpichler Oct 18, 2024

Choose a reason for hiding this comment

ti-mo Oct 21, 2024

Choose a reason for hiding this comment

ti-mo Oct 23, 2024

Choose a reason for hiding this comment

patrickpichler Oct 23, 2024

Choose a reason for hiding this comment

ti-mo Oct 23, 2024

Choose a reason for hiding this comment

dylandreimerink left a comment

Choose a reason for hiding this comment

alban commented Oct 21, 2024

ti-mo commented Oct 17, 2024 •

edited

Loading