See where you use LibC the most.
Trace calls failing tests. Then - roast!
LibSee is a single-file library for profiling LibC calls and 🔜 fuzz testing. To download and compile the library and run your favorite program with it:
gcc -g -O2 -fno-builtin -fPIC -nostdlib -nostartfiles -shared -o libsee.so libsee.c
LibSee overrides LibC symbols using LD_PRELOAD, profiling the most commonly used functions and, optionally, fuzzing their behavior for testing.
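To make the override mechanism concrete, here is a minimal sketch of a single interposed function. It is not LibSee's actual code: the `getenv_calls` counter and the choice of `getenv` are assumptions made for illustration, and it leans on LibC itself, which the real library (built with `-nostdlib`) cannot do.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes RTLD_NEXT on glibc */
#endif
#include <dlfcn.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical call counter, standing in for real profiling state. */
static unsigned long getenv_calls = 0;

/* Pointer to the next `getenv` in the lookup order, resolved lazily. */
static char *(*real_getenv)(const char *) = NULL;

/* Same name and signature as the LibC function, so callers bind to this
 * definition whenever it comes earlier in the symbol search order. */
char *getenv(const char *name) {
    if (!real_getenv) {
        /* Look up the original on the first invocation only. */
        real_getenv = (char *(*)(const char *))dlsym(RTLD_NEXT, "getenv");
        if (!real_getenv) return NULL; /* nothing to forward to */
    }
    ++getenv_calls;
    return real_getenv(name);
}
```

Compiled into a shared object and listed in LD_PRELOAD, a wrapper like this would see every `getenv` call the target program makes through the dynamic linker. Older glibc versions also need `-ldl` at link time.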
The library yields a few binaries when compiled:
libsee.so # Profiles LibC calls
libsee_and_knee.so # Correct LibC behavior, but fuzzed!
There are several things worth knowing that came in handy while implementing this.
- One way to implement this library would be to override the _start symbol, but implementing the correct loading sequence for a binary is tricky, so I use the conventional dlsym to look up the symbols on the first function invocation.
- On the x86_64 architecture, the rdtscp instruction yields both the CPU cycle counter and the unique identifier of the core. Very handy if you are profiling a multi-threaded application.
- Once the unloading sequence reaches libsee.so, STDOUT is already closed. So if you want to print to the console, you may want to reopen the /dev/tty device before printing usage stats.
- The calling convention for system calls on Aarch64 and x86 differs significantly. On Aarch64 I use the generalized openat with syscall number 56. On x86_64 it's number 2, which is the plain open.
- On macOS, sprintf, vsprintf, snprintf, and vsnprintf are macros. You have to #undef them.
- On Release builds, compilers love replacing your code with memset and memcpy calls. As the symbol can't be found from inside LibSee, it will SEGFAULT, so don't forget to disable such optimizations for built-ins with -fno-builtin.
- No symbol versioning is implemented: vanilla dlsym is used instead of dlvsym.
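The rdtscp note above can be sketched roughly like this. The helper name `read_cycles` is hypothetical, the `__rdtscp` intrinsic from `<x86intrin.h>` assumes GCC or Clang, and the way the core number is decoded assumes Linux, where the kernel keeps the CPU number in the low 12 bits of the auxiliary value. Other architectures fall back to a monotonic clock just to keep the sketch portable.

```c
#include <stdint.h>
#include <time.h>
#if defined(__x86_64__)
#include <x86intrin.h> /* __rdtscp on GCC and Clang */
#endif

/* Hypothetical helper: returns a timestamp and stores the core it was read on. */
static uint64_t read_cycles(uint32_t *core_id) {
#if defined(__x86_64__)
    unsigned int aux = 0;
    /* One instruction returns the cycle counter and fills the auxiliary value. */
    uint64_t cycles = __rdtscp(&aux);
    /* Assumption: Linux layout, CPU number in the low 12 bits. */
    *core_id = aux & 0xFFFu;
    return cycles;
#else
    /* Fallback for other architectures: nanoseconds, no core information. */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    *core_id = 0;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}
```

Reading the counter before and after a wrapped call and comparing the two core ids is one way to notice that a thread migrated mid-measurement.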
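For the /dev/tty trick, a minimal sketch might look like the following. The function names are hypothetical, the destructor attribute is an assumption about how a report could be triggered at unload time, and the sketch calls LibC's `open` and `write` for brevity, which a library built with `-nostdlib` would have to replace with raw system calls.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes a message straight to the controlling terminal.
 * Returns the number of bytes written, or -1 if no terminal can be opened. */
static long print_to_terminal(const char *message) {
    int fd = open("/dev/tty", O_WRONLY);
    if (fd < 0) return -1; /* no controlling terminal, e.g. under cron or CI */
    long written = (long)write(fd, message, strlen(message));
    close(fd);
    return written;
}

/* Assumption: a destructor is used to report at unload time (GCC/Clang attribute). */
__attribute__((destructor)) static void report_usage(void) {
    print_to_terminal("libsee-demo: usage stats would go here\n");
}
```

Because the terminal is opened by path rather than through an inherited descriptor, the report still shows up when the program's own output was redirected or already closed.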
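The difference in system call conventions is easier to see side by side. This is a sketch for Linux with GCC or Clang inline assembly, not LibSee's implementation: `open_raw`, `raw_syscall3`, and `raw_syscall4` are made-up names, and any other platform falls back to the regular `open` so the snippet still builds.

```c
#include <fcntl.h> /* O_RDONLY, AT_FDCWD, and the fallback open() */

#if defined(__x86_64__) && defined(__linux__)
/* x86_64: number in rax, arguments in rdi, rsi, rdx.
 * The `syscall` instruction clobbers rcx and r11. */
static long raw_syscall3(long number, long a, long b, long c) {
    long result;
    __asm__ volatile("syscall"
                     : "=a"(result)
                     : "a"(number), "D"(a), "S"(b), "d"(c)
                     : "rcx", "r11", "memory");
    return result;
}
static long open_raw(const char *path, int flags, int mode) {
    return raw_syscall3(2 /* open */, (long)path, flags, mode);
}
#elif defined(__aarch64__) && defined(__linux__)
/* Aarch64: number in x8, arguments in x0..x3, trap with `svc #0`. */
static long raw_syscall4(long number, long a, long b, long c, long d) {
    register long x8 __asm__("x8") = number;
    register long x0 __asm__("x0") = a;
    register long x1 __asm__("x1") = b;
    register long x2 __asm__("x2") = c;
    register long x3 __asm__("x3") = d;
    __asm__ volatile("svc #0"
                     : "+r"(x0)
                     : "r"(x8), "r"(x1), "r"(x2), "r"(x3)
                     : "memory");
    return x0;
}
static long open_raw(const char *path, int flags, int mode) {
    /* Aarch64 Linux has no plain open, so openat relative to the CWD is used. */
    return raw_syscall4(56 /* openat */, AT_FDCWD, (long)path, flags, mode);
}
#else
/* Fallback for other platforms: defer to LibC. */
static long open_raw(const char *path, int flags, int mode) {
    return open(path, flags, mode);
}
#endif
```

Note that the raw variants return a negative errno value on failure instead of setting `errno`, so callers should check for a negative result rather than exactly -1.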
The LibC standard is surprisingly long, so not all of the functions are covered. Feel free to suggest PRs covering the rest:
- memory management
- byte-strings
- algorithms
- date and time
- input/output
- wide-character strings
- concurrency and atomics
- retrieving error numbers
- numerics
- multi-byte strings
- wide-character IO
- localization
- anything newer than C11
There are a few other C libraries that most of the world reuses, rather than implementing from scratch in other languages:
- BLAS and LAPACK
- PCRE RegEx
- hsearch, tsearch, and pattern matching extensions
Covering program support utilities isn't planned.