x86 ABI

This page documents the ABI (Application Binary Interface) used by MLWorks on x86 processors: the calling convention, the stack frame layout, allocation sequences, and so on.

Calling Convention

Registers

When one ML function calls another, the register usage is as follows:

Register	Preserved	Description
EDI	Yes	Callee Closure - the closure of the called function.
EBP	Yes	Caller Closure - the closure of the calling function.
EBX	No	Argument - the function argument.
ESP	Yes	Stack Pointer
ESI	Yes	Implicit Vector (Thread)
EAX	Yes	General callee-save
EDX	Yes	General callee-save
ECX	No	Scratch register

On function return, the result is in EBX. All other registers except ECX are unchanged (that is, they are callee-save).

ESI points to the 'implicit vector', a per-thread structure managed by the runtime. ML machine code uses it to access any state that is not closed over (for instance, the allocation pointer). See the Implicit Vector section below.

Stack

Frames on the ML stack are linked: every frame has a 'frame pointer' to the next frame. Each frame looks like this (offsets are from the frame pointer):

Offset	Description
0	Frame pointer
4	Closure
...	GC stack slots (saves and spills)
...	non-GC stack slots (saves and spills)
fp-4N-4	Return address
...
fp-8	stack argument 1
fp-4	stack argument 0 (pushed by caller)
fp	next frame

So at the point of function entry, the stack looks like this:

Offset	Description
0	Return address
4	stack argument 0
8	stack argument 1
...
4N+4	stack argument N
4N+8	caller's stack frame

Here N is function-dependent and is stored in the ancillary word for the function, as is the size of the non-GC area of the stack frame. These are the CCODEARGS and CCODENONGC slots of the ancillary word, respectively (see the Object Format page).
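The entry layout above reduces to simple word arithmetic. The following is a minimal sketch in C, assuming the offsets in the entry table are relative to the stack pointer at the moment of entry (where the return address sits after an x86 call). The helper names are hypothetical and do not come from the MLWorks sources.

```c
#include <assert.h>

enum { WORD_BYTES = 4 };  /* one x86 machine word */

/* Byte offset of stack argument i at function entry, relative to the
   stack pointer (the return address is at offset 0). Arguments are
   numbered 0..n, as in the entry table. Hypothetical helper. */
static unsigned stack_arg_offset(unsigned i, unsigned n)
{
    assert(i <= n);
    return WORD_BYTES * i + WORD_BYTES;
}

/* Byte offset of the caller's stack frame at function entry: it follows
   the return address and stack arguments 0..n. Hypothetical helper. */
static unsigned caller_frame_offset(unsigned n)
{
    return WORD_BYTES * n + 2 * WORD_BYTES;
}
```

For example, with N = 2 the stack arguments sit at offsets 4, 8 and 12, and the caller's frame begins at offset 16.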

Implicit Vector

The ESI register points to a structure managed by the MLWorks runtime, which contains a large number of values used by ML code at runtime. Most of these values are pointers to code in the runtime; others are pointers to key data structures or memory areas. Within the runtime, this is a struct thread_state (see threads.h), which begins with a struct implicit_vector (see implicit.h).

The runtime build system generates rts/gen/__implicit.sml from rts/src/implicit.h using rts/awk/__implicit.awk. The generated ML file defines an SML structure, ImplicitVector_, containing the offset (as an ML integer) of each slot in the implicit vector. This system depends on a couple of facts:

The layout of implicit.h is very consistent;
Every slot in the implicit vector is one word (32 bits) in size.

The implicit vector is the same for every platform, and changes very rarely. On register-rich architectures, several slots shadow registers, which contain the 'live' values: the values on the implicit vector may be out of date. On x86, there are no such slots: all the values in the implicit vector are 'live'.

It is laid out as follows:

Offset (hex)	Name	Description
0	ref_chain	List of arrays modified by ML code since the last GC
4	gc	code to enter GC
8	gc_leaf	code to enter GC for a leaf function
c	external	code to look up an environment function
10	extend	code to handle stack overflow
14	raise_code	code to raise an exception
18	leaf_raise_code	code to raise an exception in a leaf function
1c	replace	code for replacing a function
20	replace_leaf	code for replacing a leaf function
24	intercept	code for intercepting a function
28	intercept_leaf	code for intercepting a leaf function
2c	interrupt	flag indicating that we are in an interrupt handler
30	event_check	code to handle an asynchronous event
34	event_check_leaf	code to handle an asynchronous event in a leaf function
38	profile_alloc	code for allocation during space profiling
3c	profile_alloc_2	(see below)
40	profile_alloc_3	(see below)
44	profile_alloc_leaf	code for allocation in a leaf function during space profiling
48	profile_alloc_leaf_2	(see below)
4c	profile_alloc_leaf_3	(see below)
50	gc_base	the current allocation point
54	gc_limit	the limit of the allocation area (except when space profiling on x86)
58	real_gc_limit	the actual allocation area limit
5c	handler	linked list of handler frames (see below)
60	stack_limit	the true ML stack limit
64	register_stack_limit	the stack limit, or -1 if there is a pending interrupt
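Because every slot is one 32-bit word, the table can be checked mechanically. The sketch below is not the real implicit.h: it is a hypothetical C struct that reuses the slot names from the table, models every slot as a uint32_t (so that it also compiles on a 64-bit host), and uses compile-time assertions to confirm a few of the hexadecimal offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t word;  /* every slot is one 32-bit word */

/* Hypothetical model of the slot layout; not the real struct implicit_vector. */
struct implicit_vector_sketch {
    word ref_chain;
    word gc;
    word gc_leaf;
    word external;
    word extend;
    word raise_code;
    word leaf_raise_code;
    word replace;
    word replace_leaf;
    word intercept;
    word intercept_leaf;
    word interrupt;
    word event_check;
    word event_check_leaf;
    word profile_alloc;
    word profile_alloc_2;
    word profile_alloc_3;
    word profile_alloc_leaf;
    word profile_alloc_leaf_2;
    word profile_alloc_leaf_3;
    word gc_base;
    word gc_limit;
    word real_gc_limit;
    word handler;
    word stack_limit;
    word register_stack_limit;
};

/* Spot-check offsets against the table (values in hex). */
static_assert(offsetof(struct implicit_vector_sketch, external) == 0x0c, "external");
static_assert(offsetof(struct implicit_vector_sketch, interrupt) == 0x2c, "interrupt");
static_assert(offsetof(struct implicit_vector_sketch, gc_base) == 0x50, "gc_base");
static_assert(offsetof(struct implicit_vector_sketch, handler) == 0x5c, "handler");
static_assert(offsetof(struct implicit_vector_sketch, register_stack_limit) == 0x64,
              "register_stack_limit");
```

With one word per slot, the offset of a slot is simply four times its position in the table, which is the property the offset-generation step relies on.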

On SPARC platforms, the code for allocation during space profiling is actually stored on the implicit vector, which is why three words are used. On other platforms, the profile_alloc_2 and profile_alloc_3 slots may be used for temporaries during the space-profiling allocation.

When space profiling on x86 platforms, the gc_limit slot is set to the base of the allocation area, so that every allocation enters the runtime (on other platforms, allocation code sequences are modified). The actual limit of the allocation area is in the real_gc_limit slot.
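The effect of lowering gc_limit can be illustrated with a toy model. This page does not describe the real allocation sequence, so the sketch below assumes a simple bump allocator that advances gc_base and enters the runtime whenever the new allocation point would pass gc_limit. The struct and function names are hypothetical, and addresses are modelled as plain 32-bit integers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Only the three slots relevant here; names follow the slot table. */
struct alloc_slots {
    uint32_t gc_base;        /* current allocation point */
    uint32_t gc_limit;       /* limit tested by ML code */
    uint32_t real_gc_limit;  /* actual limit of the allocation area */
};

/* Assumed inline allocation test: returns true and bumps gc_base if the
   request fits below gc_limit; returns false if ML code would have to
   enter the runtime instead. */
static bool try_inline_alloc(struct alloc_slots *iv, uint32_t bytes)
{
    uint32_t new_base = iv->gc_base + bytes;
    if (new_base > iv->gc_limit)
        return false;
    iv->gc_base = new_base;
    return true;
}

/* Assumed setup for space profiling on x86: the tested limit drops to
   the base of the allocation area, while the true limit is kept in
   real_gc_limit. */
static void enable_space_profiling(struct alloc_slots *iv,
                                   uint32_t area_base, uint32_t area_limit)
{
    iv->gc_limit = area_base;
    iv->real_gc_limit = area_limit;
}
```

Under this assumption, once gc_limit equals the base of the area, every non-empty request fails the inline test, so every allocation is routed through the runtime, where it can be recorded.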

Exception Handling

Exception handling uses a linked list of "handler frames", which are 4-tuples allocated on the stack. The head of the list is held in the implicit vector (implicit->handler). To create a handler, a new handler frame is allocated in the current stack frame, filled in, and pushed onto the head of this list. When control flow leaves the handler's scope, the handler is popped off the list.

Each handler frame has the following contents:

Offset (hex)	Name	Description
-1	previous	Previous handler
3	sp	Stack pointer of creator
7	closure	Handler function closure
b	continuation	Offset within creator of continuation code

The handler frame pointer is offset by 1 from the first field, so that it is tagged as a pointer, and its contents can be accessed by code as if it were a tuple.
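The tagged-pointer arithmetic and the push/pop discipline can be sketched in C. This is a toy model, not the runtime's code: memory is a small byte array standing in for a 32-bit address space, and all function and constant names are hypothetical. Only the field offsets (-1, 3, 7, b) and the "pointer is first field plus 1" rule come from the table and text above.

```c
#include <stdint.h>
#include <string.h>

/* Field offsets relative to the tagged handler frame pointer. */
enum { H_PREVIOUS = -1, H_SP = 3, H_CLOSURE = 7, H_CONTINUATION = 0xb };

static uint8_t memory[256];  /* toy 32-bit address space for this sketch */

static uint32_t load_word(uint32_t addr)
{
    uint32_t w;
    memcpy(&w, &memory[addr], sizeof w);
    return w;
}

static void store_word(uint32_t addr, uint32_t w)
{
    memcpy(&memory[addr], &w, sizeof w);
}

/* Read a field through a tagged handler frame pointer. */
static uint32_t handler_field(uint32_t handler, int offset)
{
    return load_word(handler + (uint32_t)offset);
}

/* Fill in a frame whose first word lives at frame_addr, push it on the
   list whose head is *head, and return the tagged pointer to it. */
static uint32_t push_handler(uint32_t *head, uint32_t frame_addr,
                             uint32_t sp, uint32_t closure,
                             uint32_t continuation)
{
    uint32_t handler = frame_addr + 1;  /* tag: first field + 1 */
    store_word(handler + (uint32_t)H_PREVIOUS, *head);
    store_word(handler + H_SP, sp);
    store_word(handler + H_CLOSURE, closure);
    store_word(handler + H_CONTINUATION, continuation);
    *head = handler;
    return handler;
}

/* Pop the innermost handler: the head becomes its 'previous' field. */
static void pop_handler(uint32_t *head)
{
    *head = handler_field(*head, H_PREVIOUS);
}
```

In this model the head variable plays the role of the handler slot in the implicit vector, and each push or pop is a single update of that slot.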

Code in the runtime (ml_raise in interface.S) raises an exception by building a fake stack frame and calling the handler. If the handler returns (and therefore the exception has been successfully handled), ml_raise then unwinds the stack to the creator's frame and jumps to the continuation. If the handler doesn't handle the exception, or raises a different one, it calls ml_raise again.

TODO

More on the calling convention
Something on larger-scale stack organisation
Something on allocation
