The protolesshooks library provides a hooking engine that works without information on target function prototypes.
This code is intended for use in the upcoming API Spy plugin for IO Ninja; API Spy is going to be an advanced cross-platform alternative to ltrace.
The idea of API hooking is to intercept calls to system or 3rd-party libraries and redirect those through your spy functions, also known as hooks. Hooking is often required in reverse-engineering and many other non-trivial debugging scenarios. Depending on the chosen hooking method, with hooks you can:
- Display API function names called by the process;
- Measure the time of each call;
- Build a call-graph;
- Inspect/modify function call arguments;
- Inspect/modify return values;
- Block the target function completely.
Most hooking-related libraries, frameworks, and articles focus on injection techniques, i.e., the details of getting your hook called every time before the original function. Once this task is accomplished, the problem is deemed solved -- your hook can now proxy-call the original function, pass its return value back to the caller, and perform logging and argument/retval modification as necessary.
The problem here, however, is that you can't proxy-call without target function prototypes! Yes, it's easy to jump directly to the original function (thus getting the capability (1) of the list above). But for (2), (3), and (5) your hook needs to regain control after return from the target function -- which is trivial with the knowledge of target function prototypes, and quite challenging without.
Not to state the obvious, but to encode prototypes for all the library calls in a process is nearly impossible -- there could be hundreds of different API calls, and many of those may be undocumented.
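For contrast, here is a minimal sketch of the traditional approach: a proxy-call hook that measures the time of each call. It does not use protolesshooks at all; `add`, `addHook`, `g_originalAdd`, and `g_callCount` are hypothetical names made up for this illustration. Note that the hook's signature has to match the target exactly, which is the very dependency on prototypes described above.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>

// Hypothetical target function; in a real scenario it would live in a library.
int add(int a, int b)
{
    return a + b;
}

// Pointer to the original function, saved when the hook is installed.
static int (*g_originalAdd)(int, int) = add;

// Number of intercepted calls so far.
static int g_callCount = 0;

// A traditional proxy-call hook. Its prototype must match the target exactly,
// otherwise the arguments cannot be forwarded correctly.
int addHook(int a, int b)
{
    assert(g_originalAdd != nullptr);
    ++g_callCount;

    auto start = std::chrono::steady_clock::now();
    int result = g_originalAdd(a, b); // proxy-call: requires the prototype
    auto elapsed = std::chrono::steady_clock::now() - start;

    long long ns = static_cast<long long>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count());
    std::printf("add(%d, %d) -> %d in %lld ns\n", a, b, result, ns);

    return result; // the hook regains control here and could modify the retval
}
```

Writing one such hook per function is feasible for a handful of well-documented APIs, but it does not scale to every import in a process.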
The protolesshooks library provides return-hijacking thunks that work without knowledge of target function prototypes. This makes it possible, for example, to enumerate and intercept all shared libraries in a process, gain a bird's-eye overview of the API call-graph, then gradually add prototype information for parameter/retval decoding as necessary.
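The bookkeeping behind return hijacking can be sketched as a small portable simulation. This is an illustration of the general idea under stated assumptions, not the library's actual implementation: the stack slot holding the return address is modeled as a plain integer, and `ReturnSlot`, `kExitThunk`, `onEnter`, and `onLeave` are hypothetical names. A real thunk does the equivalent work in machine code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simulated stack slot that holds the caller's return address.
using ReturnSlot = std::uintptr_t;

// Hypothetical stand-in for the address of the exit thunk.
constexpr std::uintptr_t kExitThunk = 0xE0E0E0E0;

// One record per hooked call that has been entered but not yet left.
struct SavedReturn
{
    const char* name;           // name of the hooked function
    std::uintptr_t originalRet; // where the target was supposed to return
};

// Per-thread shadow stack; nested hooked calls push and pop in LIFO order.
thread_local std::vector<SavedReturn> t_shadowStack;

// Entry side: remember the real return address, then overwrite the slot so
// that the target "returns" into the exit thunk instead of the caller.
void onEnter(const char* name, ReturnSlot& slot)
{
    t_shadowStack.push_back({name, slot});
    slot = kExitThunk;
}

// Exit side: runs when the target returns into the exit thunk. The exit hook
// would be invoked here; afterwards control goes back to the real caller.
std::uintptr_t onLeave()
{
    assert(!t_shadowStack.empty());
    SavedReturn saved = t_shadowStack.back();
    t_shadowStack.pop_back();
    return saved.originalRet;
}
```

Nothing in this scheme depends on the number or types of the target's arguments, which is why no prototype is needed for the hook to regain control after the target returns.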
The prototype information can be incomplete. For instance, we may have some clues about the first two parameters of a particular function, but no idea about the rest. With traditional hooking (when your hook is inserted into the call chain), it's just not going to work -- you need exact information about the expected stack frame! With protolesshooks it's absolutely fine.
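As a rough illustration of decoding with partial knowledge, the sketch below assumes an entry hook that receives a snapshot of the SystemV AMD64 integer argument registers. The `RegArgs` struct and `describeCall` function are hypothetical and are not part of the protolesshooks API; the guess that the first argument is a C string and the second is a flags value is also just an assumption for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical snapshot of the SystemV AMD64 integer argument registers,
// captured at function entry.
struct RegArgs
{
    std::uint64_t rdi, rsi, rdx, rcx, r8, r9;
};

// Decode only what is known: the first argument is assumed to be a C string
// and the second an integer flags value. The remaining registers are ignored.
std::string describeCall(const RegArgs& regs)
{
    const char* path = reinterpret_cast<const char*>(
        static_cast<std::uintptr_t>(regs.rdi));
    unsigned flags = static_cast<unsigned>(regs.rsi);

    char buffer[256];
    std::snprintf(
        buffer,
        sizeof(buffer),
        "path=%s flags=0x%x (rest unknown)",
        path ? path : "(null)",
        flags);

    return buffer;
}
```

Because the hook only inspects the registers and never has to rebuild the call frame, the unknown arguments pass through to the target untouched.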
Features:
- Works without information about target function prototypes;
- Function entry hooks;
- Function exit hooks;
- SEH-exception hooks (Windows x64 only);
- Arguments can be modified before jumping to the target function;
- Retvals can be modified before returning to the caller;
- The target function can be blocked if necessary;
- Thunks can be used with trampoline-based hooking engines, too.
Supported calling conventions:
- Microsoft x64 (MSC);
- SystemV AMD64 (GCC/Clang);
- x86 cdecl (MSC, GCC/Clang);
- x86 stdcall (MSC, GCC/Clang);
- x86 __thiscall (MSC);
- x86 __fastcall (MSC);
- x86 __attribute__((regparm(n))) (GCC/Clang).
Built-in enumerators for import tables:
- PE (Windows)
- ELF (Linux)
- Mach-O (macOS)
On Windows x64, thunks properly dispatch exceptions to lower SEH handlers without losing the hook after the first exception. This is important because multiple exceptions can occur without unwinding (if one of the SEH filters returns EXCEPTION_CONTINUE_EXECUTION), for example:
    void foo()
    {
        // recoverable exception happens here...
        ...
        // now unrecoverable exception happens here...
    }

    int barFilter(EXCEPTION_POINTERS* exception)
    {
        if (/* can recover? */)
        {
            // recover, e.g. commit/protect the faulting page
            return EXCEPTION_CONTINUE_EXECUTION;
        }

        return EXCEPTION_EXECUTE_HANDLER;
    }

    void bar()
    {
        __try
        {
            foo();
        }
        __except (barFilter(GetExceptionInformation()))
        {
            // unrecoverable exception is caught here
        }
    }
Samples:
- The hello-world sample. Allocates a basic enter/leave hook for a void function with no arguments, then calls it directly.
- Demonstrates how to decode register/stack arguments and return values.
- Demonstrates how to enumerate all loaded modules and imports for each module.
- Demonstrates the global interception of all imports in all loaded modules.
- Demonstrates how to modify register/stack arguments and return values.
- Demonstrates how to pass-through, proxy-call, or completely block the target function.