This is a rudimentary attempt to build an autogenerated formal-ish model of x86 intrinsics by interpreting Intel's instruction pseudocode, transforming it into a model for Z3.
Intel's Intrinsics Guide provides an interactive guide with data for each x86 intrinsic instruction, including a pseudocode that specifies the instruction's behavior. The intrinsics guide is backed by an XML file with all of this data in an easily parseable format, which we use here to build a Z3 model.
So far, only these features of Intel's pseudocode are supported:
- Basic unary/binary arithmetic/bitwise operations, and ternary conditionals
- Bit slices, both reading and writing
IF
/CASE
conditionals (which are predicated when they can't be statically resolved)FOR
loops- Function definitions/calls
See
tokens
/rules
inparse.py
for the most up-to-date information.
There are many functions used in Intel's documentation that are not explicitly given in the XML. For now,
these are almost all unsupported (see functions defined in intr_builtins.py
for the current list).
This method of generating models is inherently limited. The intrinsics only cover a subset of x86 instructions (presumably until a PDF reader is added), so this will mostly be useful for investigating hand-rolled SIMD code, etc. Handling memory might be possible but would certainly be very slow. Handling control flow is likely out of scope for now too.
There are quite possibly bugs in this implementation, which can be pretty hard to find and fix. There are also definitely bugs in Intel's code (see below), so this shouldn't be relied on for anything serious.
So far, this can do some interesting non-trivial things, like derive lookup
tables for vpternlogd
:
check_print(_mm512_ternarylogic_epi32(_mm512_set1_epi8(0xAA),
_mm512_set1_epi8(0xCC), _mm512_set1_epi8(0xF0), i) == _mm512_set1_epi8(0x57))
# -> [y = 0x1f]
...or find an index vector for vpermb
that reverses its input:
values = range(2, 3*64, 3)
check(_mm512_set_epi8(*values) == _mm512_permutexvar_epi8(b, _mm512_set_epi8(*reversed(values))))
# <- [b = 0x000102030405060708090a0b0c0d0e0f1011...]
...or find a bug in Intel's pseudocode. Turns out that last one gives unsat
, which didn't make sense.
Investigating further, I noticed that I wasn't getting the right result for a _mm512_set_epi8(*range(64))
,
which led me to find this bug in the _mm512_set_epi8
pseudocode:
dst[495:488] := e61
dst[503:496] := e62
-dst[511:503] := e63
+dst[511:504] := e63
dst[MAX:512] := 0
All of these intrinsic functions used in these examples (and the ones in test.py
) are completely
autogenerated from Intel's data; there's no manual intrinsics created whatsoever.
To use this, you must first download the latest XML from Intel's Intrinsics Guide
(currently here: https://www.intel.com/content/dam/develop/public/us/en/include/intrinsics-guide/data-latest.xml)
and save it as data.xml
in the current directory. I don't believe I can legally
redistribute this file myself.
This project requires Z3, and my sprdpl parsing library (included as a submodule).
The library has two primary APIs for looking up and using intrinsics:
- The
parse_whitelist()
function, which returns a dictionary of intrinsic objects, suitable for injecting directly into global scope:
intrinsics = parse_whitelist('data.xml', regex='_mm256_set1_epi(8|32)')
globals().update(intrinsics)
check_print(_mm256_set1_epi8(0) == _mm256_set1_epi32(0))
- The
parse_meta()
function, which returns a magic object that lazily parses intrinsics pseudocode whenever attributes are accessed, like so:
meta = parse_meta('data.xml')
check_print(meta._mm256_xor_si256(0, 1) == 1)
# .prefixed() allows reducing duplicated prefixes:
avx = meta.prefixed('_mm256_')
check_print(avx.set1_epi8(0) == 0)
See test.py
for various example use cases.