Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kallsyms: change Modules caching strategy, cache address lookups #1590

Merged
merged 4 commits into from
Oct 23, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 0 additions & 3 deletions btf/kernel.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,6 @@ import (
"sync"

"github.com/cilium/ebpf/internal"
"github.com/cilium/ebpf/internal/kallsyms"
"github.com/cilium/ebpf/internal/linux"
)

Expand All @@ -22,8 +21,6 @@ var kernelBTF = struct {

// FlushKernelSpec removes any cached kernel type information.
func FlushKernelSpec() {
kallsyms.FlushKernelModuleCache()

kernelBTF.Lock()
defer kernelBTF.Unlock()

Expand Down
44 changes: 44 additions & 0 deletions collection.go
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ import (
"github.com/cilium/ebpf/asm"
"github.com/cilium/ebpf/btf"
"github.com/cilium/ebpf/internal"
"github.com/cilium/ebpf/internal/kallsyms"
"github.com/cilium/ebpf/internal/kconfig"
"github.com/cilium/ebpf/internal/linux"
)
Expand Down Expand Up @@ -400,6 +401,10 @@ func newCollectionLoader(coll *CollectionSpec, opts *CollectionOptions) (*collec
}
}

if err := populateKallsyms(coll.Programs); err != nil {
return nil, fmt.Errorf("populating kallsyms caches: %w", err)
}

return &collectionLoader{
coll,
opts,
Expand All @@ -408,6 +413,45 @@ func newCollectionLoader(coll *CollectionSpec, opts *CollectionOptions) (*collec
}, nil
}

// populateKallsyms populates kallsyms caches, making lookups cheaper later on
// during individual program loading. Since we have less context available
// at those stages, we batch the lookups here instead to avoid redundant work.
func populateKallsyms(progs map[string]*ProgramSpec) error {
// Look up associated kernel modules for all symbols referenced by
// ProgramSpec.AttachTo for program types that support attaching to kmods.
mods := make(map[string]string)
for _, p := range progs {
if p.AttachTo != "" && p.targetsKernelModule() {
mods[p.AttachTo] = ""
}
}
if len(mods) != 0 {
if err := kallsyms.AssignModules(mods); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this branch locally and it fails for me with an ambiguous error for kernel module load_elf_phdrs. If I understood the old code correctly, it didn't care about duplicate symbols in /proc/kallsyms.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, it would ignore kallsyms entries without modules, which is not correct. If a symbol is ambiguous between the core kernel and a module, we should not attach to it. The old code might've picked the address of the base symbol to attach to, but provided the BTF of the symbol in the kernel module.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@patrickpichler Is this a breaking change for you, or were you just testing out ambiguous symbol handling during kmod attach?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a breaking change, but I guess it was not right before due to ambiguity and anyway would have been a bug 😅

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if you'd actually make use of CO:RE things, build your BPF C program against headers of one kernel, deploy your app to another host, then be unlucky enough for some of the fields to have changed between versions, you would've likely hit a verifier error at runtime.

'Unfortunately', in most cases like load_elf_phdrs, it's actually the same function with the same signature and underlying types that gets included into multiple compilation units (by way of including a common header) and then gets merged into the same binary during linking of the kernel binary. Things would've kept working fine in your case since it's likely an exact duplicate, but you would run into funky cases where some code paths in the kernel invoke one version of the symbol, while other paths invoke the other, meaning you'd miss events since you're only attaching to one of the symbols.

Let's see if any other skeletons fall out of the ceiling when we ship this in a release, but this has been a bugbear for bpf since the beginning.

return fmt.Errorf("getting modules from kallsyms: %w", err)
}
}

// Look up addresses of all kernel symbols referenced by all programs.
addrs := make(map[string]uint64)
for _, p := range progs {
iter := p.Instructions.Iterate()
for iter.Next() {
ins := iter.Ins
meta, _ := ins.Metadata.Get(ksymMetaKey{}).(*ksymMeta)
if meta != nil {
addrs[meta.Name] = 0
}
}
}
if len(addrs) != 0 {
if err := kallsyms.AssignAddresses(addrs); err != nil {
return fmt.Errorf("getting addresses from kallsyms: %w", err)
}
}

return nil
}

// close all resources left over in the collectionLoader.
func (cl *collectionLoader) close() {
for _, m := range cl.maps {
Expand Down
19 changes: 6 additions & 13 deletions elf_reader.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,6 @@ import (
"github.com/cilium/ebpf/asm"
"github.com/cilium/ebpf/btf"
"github.com/cilium/ebpf/internal"
"github.com/cilium/ebpf/internal/kallsyms"
"github.com/cilium/ebpf/internal/sys"
)

Expand All @@ -37,7 +36,6 @@ type ksymMetaKey struct{}

type ksymMeta struct {
Binding elf.SymBind
Addr uint64
Name string
}

Expand All @@ -53,7 +51,7 @@ type elfCode struct {
maps map[string]*MapSpec
vars map[string]*VariableSpec
kfuncs map[string]*btf.Func
ksyms map[string]uint64
ksyms map[string]struct{}
kconfig *MapSpec
}

Expand Down Expand Up @@ -146,7 +144,7 @@ func LoadCollectionSpecFromReader(rd io.ReaderAt) (*CollectionSpec, error) {
maps: make(map[string]*MapSpec),
vars: make(map[string]*VariableSpec),
kfuncs: make(map[string]*btf.Func),
ksyms: make(map[string]uint64),
ksyms: make(map[string]struct{}),
}

symbols, err := f.Symbols()
Expand Down Expand Up @@ -638,7 +636,7 @@ func (ec *elfCode) relocateInstruction(ins *asm.Instruction, rel elf.Symbol) err
}

kf := ec.kfuncs[name]
ksymAddr := ec.ksyms[name]
_, ks := ec.ksyms[name]

switch {
// If a Call / DWordLoad instruction is found and the datasec has a btf.Func with a Name
Expand All @@ -660,14 +658,13 @@ func (ec *elfCode) relocateInstruction(ins *asm.Instruction, rel elf.Symbol) err

ins.Constant = 0

case ksymAddr != 0 && ins.OpCode.IsDWordLoad():
case ks && ins.OpCode.IsDWordLoad():
if bind != elf.STB_GLOBAL && bind != elf.STB_WEAK {
return fmt.Errorf("asm relocation: %s: %w: %s", name, errUnsupportedBinding, bind)
}
ins.Metadata.Set(ksymMetaKey{}, &ksymMeta{
Binding: bind,
Name: name,
Addr: ksymAddr,
})

// If no kconfig map is found, this must be a symbol reference from inline
Expand Down Expand Up @@ -1307,16 +1304,12 @@ func (ec *elfCode) loadKsymsSection() error {
case *btf.Func:
ec.kfuncs[t.TypeName()] = t
case *btf.Var:
ec.ksyms[t.TypeName()] = 0
ec.ksyms[t.TypeName()] = struct{}{}
default:
return fmt.Errorf("unexpected variable type in .ksysm: %T", v)
return fmt.Errorf("unexpected variable type in .ksyms: %T", v)
}
}

if err := kallsyms.AssignAddresses(ec.ksyms); err != nil {
return fmt.Errorf("error while loading ksym addresses: %w", err)
}

return nil
}

Expand Down
13 changes: 7 additions & 6 deletions elf_reader_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -794,19 +794,20 @@ func TestKsym(t *testing.T) {
qt.Assert(t, qt.IsNil(err))

ksyms := map[string]uint64{
"socket_file_ops": 0,
"tty_fops": 0,
"bpf_init": 0,
"bpf_trace_run1": 0,
}

qt.Assert(t, qt.IsNil(kallsyms.AssignAddresses(ksyms)))
qt.Assert(t, qt.Not(qt.Equals(ksyms["bpf_init"], 0)))
qt.Assert(t, qt.Not(qt.Equals(ksyms["bpf_trace_run1"], 0)))

var value uint64
qt.Assert(t, qt.IsNil(obj.ArrayMap.Lookup(uint32(0), &value)))
qt.Assert(t, qt.Equals(value, ksyms["socket_file_ops"]))
qt.Assert(t, qt.Equals(value, ksyms["bpf_init"]))

err = obj.ArrayMap.Lookup(uint32(1), &value)
qt.Assert(t, qt.IsNil(err))
qt.Assert(t, qt.Equals(value, ksyms["tty_fops"]))
qt.Assert(t, qt.IsNil(obj.ArrayMap.Lookup(uint32(1), &value)))
qt.Assert(t, qt.Equals(value, ksyms["bpf_trace_run1"]))
}

func TestKsymWeakMissing(t *testing.T) {
Expand Down
20 changes: 20 additions & 0 deletions internal/kallsyms/cache.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
package kallsyms

import "sync"

type cache[K, V comparable] struct {
m sync.Map
}

func (c *cache[K, V]) Load(key K) (value V, _ bool) {
v, ok := c.m.Load(key)
if !ok {
return value, false
}
value = v.(V)
return value, true
}

func (c *cache[K, V]) Store(key K, value V) {
c.m.Store(key, value)
}
Loading
Loading