
x86 FPU semantic model

Peter Matula edited this page Sep 9, 2019 · 5 revisions

This wiki page describes how x86 FPU (x87) instruction semantics are modeled in the Capstone2llvmir library.

x86 FPU intro

  • FPU has 8 registers, st(0) to st(7).
  • Registers form a stack.
  • st(0) denotes the register at the top of the stack, st(i) denotes the register at distance i from the top of the stack.
  • Numbers are pushed onto the stack from memory, and popped off the stack to memory.
  • FPU instructions generally work on:
    • A single register on top of the stack: st(0).
    • The two registers on top of the stack: st(0), st(1).
    • Two registers, one on top of the stack: st(0), and one at a specified offset i from the top of the stack: st(i).
  • Example:
fld qword [a]       ; st(0) = a
fld qword [b]       ; st(0) = b,         st(1) = a
fld qword [c]       ; st(0) = c,         st(1) = b, st(2) = a
fmul st(0), st(2)   ; st(0) = c * a,     st(1) = b, st(2) = a
fmulp st(1), st(0)  ; st(0) = b * c * a, st(1) = a
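
The example above can be replayed with a small Python sketch. It only illustrates stack-relative addressing and is not part of Capstone2llvmir: the helper names (fld, fmul, fmulp) are made up here, a Python list stands in for the register stack with index 0 as st(0), and the values chosen for a, b and c are arbitrary.

```python
# Toy model of the x87 register stack: index 0 is st(0), index i is st(i).
# Hypothetical helpers for illustration only; stack overflow and tags are ignored.
stack = []

def fld(value):
    """Push a value; every existing entry moves one slot deeper."""
    stack.insert(0, float(value))

def fmul(dst, src):
    """st(dst) = st(dst) * st(src), without popping."""
    stack[dst] = stack[dst] * stack[src]

def fmulp(dst, src):
    """st(dst) = st(dst) * st(src), then pop st(0)."""
    fmul(dst, src)
    stack.pop(0)

a, b, c = 2.0, 3.0, 5.0   # arbitrary sample values
fld(a)        # st(0) = a
fld(b)        # st(0) = b, st(1) = a
fld(c)        # st(0) = c, st(1) = b, st(2) = a
fmul(0, 2)    # st(0) = c * a
fmulp(1, 0)   # st(0) = b * c * a, st(1) = a
print(stack)
```

After the last instruction the list holds two entries, the product b * c * a followed by a, which matches the comments in the assembly listing.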

x86 FPU in Capstone

  • Registers:
    • Data: X86_REG_ST0, X86_REG_ST1, X86_REG_ST2, X86_REG_ST3, X86_REG_ST4, X86_REG_ST5, X86_REG_ST6, X86_REG_ST7
    • Data registers represent registers relative to the current top of the stack.
    • FPU status register: X86_REG_FPSW
    • FPU tag registers: missing?
    • FPU control register: missing?
  • Instructions:
    • All x86 FPU instructions are represented.
    • X86_REG_STx are operands (implicit/explicit) of these instructions.
    • !!! Registers denote stack slots relative to the current FPU stack top !!! For example, st(1) in the following two fmul instructions does not refer to the same stack slot, because the fld between them changes the stack:
                   ; st(0) = a,         st(1) = b
fmul st(0), st(1)  ; st(0) = a * b,     st(1) = b
fld qword [c]      ; st(0) = c,         st(1) = a * b, st(2) = b
fmul st(0), st(1)  ; st(0) = c * a * b, st(1) = a * b, st(2) = b

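Therefore, we cannot simply represent the fmul st(0), st(1) instruction with fixed global variables, because @st1 would always name the same storage location regardless of the current stack top: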

; global variables representing registers
@st0 = internal global x86_fp80
@st1 = internal global x86_fp80
; fmul st(0), st(1)
%op0 = load x86_fp80, x86_fp80* @st0
%op1 = load x86_fp80, x86_fp80* @st1
%res = fmul x86_fp80 %op0, %op1
store x86_fp80 %res, x86_fp80* @st0
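
To see why fixed globals are wrong here, it helps to compute which physical register each st(i) names. The sketch below assumes the usual x87 rule that st(i) is physical register (TOP + i) mod 8. The function name physical_index and the starting TOP value are chosen for illustration only.

```python
def physical_index(top, i):
    """Physical x87 data register named by st(i), assuming the rule (TOP + i) mod 8."""
    return (top + i) % 8

top = 6                            # assume two values (a, b) are already on the stack
first = physical_index(top, 1)     # st(1) as seen by the first fmul
top = (top - 1) % 8                # fld qword [c] pushes, so TOP is decremented
second = physical_index(top, 1)    # st(1) as seen by the second fmul
print(first, second)               # two different physical registers
```

The same operand text, st(1), resolves to two different physical registers, so a single global variable cannot stand for it.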

x86 FPU in Capstone2llvmir

  • Registers:

    • Capstone2llvmir library adds x86 FPU registers on top of Capstone registers.
    • enum x87_reg_status defines registers representing parts of X86_REG_FPSW status register: X87_REG_IE, X87_REG_DE, X87_REG_ZE, X87_REG_OE, X87_REG_UE, X87_REG_PE, X87_REG_SF, X87_REG_ES, X87_REG_C0, X87_REG_C1, X87_REG_C2, X87_REG_C3, X87_REG_TOP, X87_REG_B.
    • The most important register here is X87_REG_TOP of i3 type, which represents the current top of the stack, i.e. how deep in the stack we currently are.
    • enum x87_reg_control defines registers representing the FPU control register: X87_REG_IM, X87_REG_DM, X87_REG_ZM, X87_REG_OM, X87_REG_UM, X87_REG_PM, X87_REG_PC, X87_REG_RC, X87_REG_X.
    • enum x87_reg_tag represents the FPU tag registers associated with X86_REG_STx registers: X87_REG_TAG0, X87_REG_TAG1, X87_REG_TAG2, X87_REG_TAG3, X87_REG_TAG4, X87_REG_TAG5, X87_REG_TAG6, X87_REG_TAG7.
    • Capstone's X86_REG_STx registers are represented in LLVM IR as stx global variables of x86_fp80 type. They represent concrete FPU registers, not stack slots relative to the current stack top.
  • Instructions:

    • Instructions do not work with concrete FPU registers X86_REG_STx (global variables stx), since at translation time we do not know which register is actually used; we only know register offsets from the current stack top.
    • Instructions are modeled as operations on a stack, so together they implement a sort of stack machine.
    • It is up to a later analysis (in bin2llvmir) to analyze the FPU stack, assign concrete FPU registers to stack machine operations, and replace these operations with instances of concrete registers stx.
  • Stack machine:

    • Global variable (register) representing the current stack TOP - its position in stack.
      • @fpu_stat_TOP = internal global i3 0
      • The value points to the last pushed (occupied) stack slot, not to the first empty slot. This is easier to work with: e.g. fmul st(0), st(1) can load the value and use it directly to get st(0), and only needs to add one to get st(1). If it pointed to the first empty slot, an add operation would be needed to reach any st(i), including st(0).
      • The assumed initial value is 8. Although 8 cannot be represented in the i3 type, the subsequent FPU analysis can easily make this assumption. It represents an empty FPU stack, where nothing has been pushed.
      • The stack grows from 8 down to zero.
    • Push operation decrements TOP.
      • e.g. top = 6 -> push -> top = 5
    • Pop operation increments TOP.
      • e.g. top = 5 -> pop -> top = 6
    • There are 4 pseudo functions used to get/set arbitrary stack slots. When used together with stack TOP and addition/subtraction, we can get/set stack slots relative to the current TOP.
      • void _x87DataStoreFunction(i3, fp80): stores fp80 value to stack position (FPU data stack) indicated by an i3 value.
      • void _x87TagStoreFunction(i3, i2): stores i2 value to stack position (FPU tag stack) indicated by an i3 value.
      • fp80 _x87DataLoadFunction(i3): loads fp80 value from stack position (FPU data stack) indicated by an i3 value.
      • i2 _x87TagLoadFunction(i3): loads i2 value from stack position (FPU tag stack) indicated by an i3 value.
  • Example push (pseudo function names are arbitrary):

    • fld ds:dbl_4090C0
    • DD /0 FLD m64fp Push m64fp onto the FPU register stack.
%21 = load double, double* inttoptr (i32 4231360 to double*) ; load double from memory 4231360
%22 = fpext double %21 to x86_fp80                           ; convert double value to fp80 value
%23 = load i3, i3* @fpu_stat_TOP                             ; get the current TOP
%24 = sub i3 %23, 1                                          ; decrement TOP -> get to the next empty slot
%25 = fcmp oeq x86_fp80 %22, 0xK00000000000000000000         ; compute tag based on value to push
%26 = select i1 %25, i2 1, i2 0                              ; compute tag based on value to push
call void @__x87_reg_store.fpu_tag(i3 %24, i2 %26)           ; set computed tag to the next empty tag slot
call void @__x87_reg_store.fpr(i3 %24, x86_fp80 %22)         ; set loaded value to the next empty data slot
store i3 %24, i3* @fpu_stat_TOP                              ; store decremented TOP -> it now points to the just-pushed value
  • Example pop (pseudo function names are arbitrary):
    • fstp qword ptr [esp+4]
    • DD /3 FSTP m64fp Copy ST(0) to m64fp and pop register stack.
%27 = load i3, i3* @fpu_stat_TOP                       ; get the current TOP
%28 = call x86_fp80 @__x87_reg_load.fpr(i3 %27)        ; get value from the data slot at the current top
%29 = load i32, i32* @esp                              ; get stack pointer
%30 = add i32 %29, 4                                   ; add +4 to stack pointer
%31 = fptrunc x86_fp80 %28 to double                   ; convert fp80 value to double value
%32 = inttoptr i32 %30 to double*                      ; convert esp+4 value to double pointer
store double %31, double* %32                          ; store FP value to esp+4
call void @__x87_reg_store.fpu_tag(i3 %27, i2 -1)      ; clear the current tag slot
%33 = add i3 %27, 1                                    ; compute TOP + 1
store i3 %33, i3* @fpu_stat_TOP                        ; store incremented TOP -> it points to the next FPU stack slot; the current one is skipped, i.e. popped
  • Example fmul operation (pseudo function names are arbitrary):
    • fmulp st(1), st
    • DE C9 FMULP Multiply ST(1) by ST(0), store result in ST(1), and pop the register stack.
%60 = load i3, i3* @fpu_stat_TOP                          ; get the current TOP
%61 = add i3 %60, 1                                       ; TOP + 1 -> index of st(1)
%62 = call x86_fp80 @__x87_reg_load.fpr(i3 %61)           ; get st(1) - value below the current top (top + 1)
%63 = call x86_fp80 @__x87_reg_load.fpr(i3 %60)           ; get st(0) - value at the current top
%64 = fmul x86_fp80 %62, %63                              ; st(1) * st(0)
%65 = fcmp oeq x86_fp80 %64, 0xK00000000000000000000      ; compute tag based on value to set
%66 = select i1 %65, i2 1, i2 0                           ; compute tag based on value to set
call void @__x87_reg_store.fpu_tag(i3 %61, i2 %66)        ; set computed tag to st(1) tag slot
call void @__x87_reg_store.fpr(i3 %61, x86_fp80 %64)      ; set computed value to st(1)
call void @__x87_reg_store.fpu_tag(i3 %60, i2 -1)         ; clear the current TOP tag slot
%67 = add i3 %60, 1                                       ; compute TOP + 1
store i3 %67, i3* @fpu_stat_TOP                           ; store incremented TOP -> st(0) is popped, st(1) becomes st(0)
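
The three IR listings above can be mirrored by a small Python simulation of the stack machine. This is a sketch under assumptions, not the Capstone2llvmir implementation: the class and method names are invented, TOP is kept modulo 8 to imitate the i3 type (so the assumed initial value 8 is stored as 0), and the tag values follow the listings (0 for a non-zero value, 1 for zero, and i2 -1, i.e. 3 when read as unsigned, for an empty slot).

```python
TAG_VALID, TAG_ZERO, TAG_EMPTY = 0, 1, 3   # i2 -1 in the listings is 3 as unsigned

class X87StackMachine:
    """Hypothetical model of the stack machine; all names are illustrative."""

    def __init__(self):
        self.top = 8 % 8               # assumed initial TOP of 8, wrapped like an i3
        self.data = [0.0] * 8          # stands in for the FPU data slots
        self.tag = [TAG_EMPTY] * 8     # stands in for the FPU tag slots

    # Counterparts of the four pseudo functions; slots wrap like i3 arithmetic.
    def data_store(self, slot, value):
        self.data[slot % 8] = value

    def tag_store(self, slot, tag):
        self.tag[slot % 8] = tag

    def data_load(self, slot):
        return self.data[slot % 8]

    def tag_load(self, slot):
        return self.tag[slot % 8]

    def _store_with_tag(self, slot, value):
        # Tag is computed from the value, as in the fcmp/select pair.
        self.tag_store(slot, TAG_ZERO if value == 0.0 else TAG_VALID)
        self.data_store(slot, value)

    def push(self, value):             # mirrors the fld listing
        new_top = (self.top - 1) % 8
        self._store_with_tag(new_top, value)
        self.top = new_top

    def pop(self):                     # mirrors the register side of the fstp listing
        value = self.data_load(self.top)
        self.tag_store(self.top, TAG_EMPTY)
        self.top = (self.top + 1) % 8
        return value

    def fmulp_st1_st0(self):           # mirrors the fmulp st(1), st listing
        result = self.data_load(self.top + 1) * self.data_load(self.top)
        self._store_with_tag(self.top + 1, result)
        self.tag_store(self.top, TAG_EMPTY)
        self.top = (self.top + 1) % 8

fpu = X87StackMachine()
fpu.push(3.0)          # TOP: 8 -> 7
fpu.push(4.0)          # TOP: 7 -> 6
fpu.fmulp_st1_st0()    # st(1) = 3.0 * 4.0, then pop; TOP: 6 -> 7
product = fpu.pop()    # TOP: 7 -> 8 (stored as 0); the stack is empty again
print(product, fpu.top, fpu.tag)
```

Unlike the translated IR, this model resolves slots immediately. In Capstone2llvmir the pseudo function calls stay symbolic until the later analysis assigns concrete registers.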

x86 FPU analysis in Bin2llvmir

  • Assumes initial value for @fpu_stat_TOP is 8.
  • Tracks the value throughout the program.
  • Each time the value is used to get or set an FPU data or tag register, the current value x (between 0 and 7) selects the X86_REG_STx register (the stx LLVM IR global variable).
  • Pseudo functions used in the stack machine are replaced by loads and stores of these concrete registers (LLVM IR global variables).
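
The resolution step can be pictured with the following Python sketch. It is not the bin2llvmir algorithm, only an illustration for straight-line code: the tuple-based operation encoding and the function name resolve_slots are invented here, and it assumes that slot number x maps directly to the stx register, as the list above describes.

```python
def resolve_slots(ops, initial_top=8):
    """Track TOP through straight-line code and name the register of each access.

    ops is a list of hypothetical tuples:
      ("access", k)  - a pseudo load/store call on slot TOP + k
      ("adjust", d)  - TOP + d is stored back to TOP (d = -1 for push, +1 for pop)
    """
    top = initial_top
    resolved = []
    for kind, amount in ops:
        if kind == "access":
            slot = top + amount
            if not 0 <= slot <= 7:
                raise ValueError(f"slot {slot} is outside the 8-entry FPU stack")
            resolved.append(f"st{slot}")
        elif kind == "adjust":
            top += amount
        else:
            raise ValueError(f"unknown operation {kind!r}")
    return resolved

# Encodes: fld; fld; fmulp st(1), st; fstp
ops = [
    ("access", -1), ("adjust", -1),   # fld: store to TOP - 1, then decrement TOP
    ("access", -1), ("adjust", -1),   # fld
    ("access", 1), ("access", 0),     # fmulp: load st(1), load st(0)
    ("access", 1), ("adjust", 1),     # fmulp: store st(1), then pop
    ("access", 0), ("adjust", 1),     # fstp: load st(0), then pop
]
registers = resolve_slots(ops)
print(registers)
```

A real analysis also has to handle control flow, where TOP may differ between incoming paths. This sketch deliberately leaves that out.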