Syntax for the 8086 emulator

This document describes the syntax accepted by the 8086 emulator. The main reference was the 8086 Family User's Manual : https://edge.edx.org/c4x/BITSPilani/EEE231/asset/8086_family_Users_Manual_1_.pdf
Some details were also taken from https://css.csail.mit.edu/6.858/2014/readings/i386/c17.htm, which documents the 80386 instruction set; because most of it is backward compatible, it applies to the 8086 as well.

Opcodes and directives are case insensitive, but labels are case sensitive. In general the emulator uses little-endian format: the lower byte is stored first, then the higher byte.
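
As a quick illustration of the byte order described above, the following Python sketch shows how a 16-bit word would be laid out in memory, lower byte first. It is not part of the emulator, and the helper name `word_to_bytes` is made up for this example.

```python
def word_to_bytes(value: int) -> bytes:
    """Return the two bytes of a 16-bit word in little-endian order."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value must fit in an unsigned 16-bit word")
    return value.to_bytes(2, byteorder="little")


# 0x1234 is stored as 0x34 at the lower address, followed by 0x12.
layout = word_to_bytes(0x1234)
print([hex(b) for b in layout])  # ['0x34', '0x12']
```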

Structure of Program

Note that all data directives must come before code directives and opcodes

Data directives

Code directives and opcodes

Commonly used terms :

  • Label : a single word containing no spaces, immediately followed by a colon (:) when the label is defined. It can contain _, 0-9, a-z, A-Z, but must not start with a digit.
  • byte : the actual word "byte"
  • word : the actual word "word"
  • number : A number can be specified in three formats :
    • Decimal : using 0-9.
    • Binary : using 0 and 1, must start with 0b, eg : 5 = 0b0101
    • Hexadecimal : using 0-9,a-f, must start with 0x, eg : 5 = 0x5
    A value can also be given using the offset data directive.
  • unsigned byte number : number in range 0 -> 255
  • signed byte number : number in range -128 -> 127. Only decimal numbers can be negated with '-'; for the other formats, write the negative value in 2's complement form.
  • unsigned word number : number in range 0 -> 65535.
  • signed word number : number in range -32768 -> 32767. Only decimal numbers can be negated with '-'; for the other formats, write the negative value in 2's complement form.
  • Word Registers : AX,BX,CX,DX,BP,SP,SI,DI
  • Byte Registers : AL,AH,BL,BH,CL,CH,DL,DH
  • Segment Registers : ES,DS,SS,CS
  • memory : 8086 allows four types of memory addressing. In all cases, the segment used is SS when BP is used and DS otherwise, unless a segment register is given as an override, in which case that register is used as the segment.
    • (segment-register) [offset] : offset of the data from the current DS or the given segment register.
    • (segment-register) [bx/bp/si/di] : The offset of the value is taken from the specified register.
    • (segment-register) [bx/bp/si/di , signed word number] : The offset is taken from the register, and the number is added to it.
    • (segment-register) [bx/bp , si/di , (signed word number) ] : The offset is taken from the base register, and the value in the index register as well as the number is added to it. The number is optional.
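
The addressing forms above can be sketched in Python as follows. This is only an illustration, not the emulator's code: the function name `physical_address` and the register dictionary are hypothetical, and it assumes the standard 8086 rule that the physical address is segment * 16 + offset, with the offset wrapping at 16 bits and the address at 20 bits.

```python
from typing import Optional


def physical_address(regs: dict, base: Optional[str] = None,
                     index: Optional[str] = None, disp: int = 0,
                     override: Optional[str] = None) -> int:
    """Compute a 20-bit physical address for (override) [base, index, disp]."""
    offset = disp
    if base is not None:
        offset += regs[base]
    if index is not None:
        offset += regs[index]
    offset &= 0xFFFF  # assumed: offsets wrap within a 64 KiB segment

    # Segment choice as described in the text: an explicit override wins,
    # otherwise SS when BP is used and DS in every other case.
    if override is not None:
        segment = override
    elif base == "bp":
        segment = "ss"
    else:
        segment = "ds"
    return ((regs[segment] << 4) + offset) & 0xFFFFF


regs = {"ds": 0x1000, "ss": 0x2000, "bx": 0x0010, "bp": 0x0020, "si": 0x0004}
print(hex(physical_address(regs, base="bx", index="si", disp=-2)))  # 0x10012
print(hex(physical_address(regs, base="bp", disp=8)))               # 0x20028
```

A plain `[offset]` operand corresponds to calling the function with only `disp` set, and a single register such as `[si]` can be passed as `base`.
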
Data Directives

Data directives supported by emulator

  • set : set directive is used for setting the value of ds when storing the data.

    syntax : set unsigned-word-number

  • DB : used to store a single byte

    syntax : label is optional

    • (label) DB signed/unsigned byte number : sets a single byte to given value
    • (label) DB [unsigned word number] : sets given number of bytes to 0 (can be used to declare empty array)
    • (label) DB [signed/unsigned byte number ; unsigned word number] : sets given number of bytes (second argument) to given value (first argument).
    • (label) DB "string" : stores a string; characters are not escaped, eg : \n will be stored as \ and n.

  • DW : used to store a word number

    syntax : label is optional

    • (label) DW signed/unsigned word number : sets a word to given value
    • (label) DW [unsigned word number] : sets given number of words to 0 (can be used to declare empty array)
    • (label) DW [signed/unsigned word number ; unsigned word number] : sets given number of words (second argument) to given value (first argument).
    • (label) DW "string" : stores a string; characters are not escaped, eg : \n will be stored as \ and n.

  • offset : used to get the offset of a value from the data segment it was defined in. Note that this only gives the offset from the segment the value was defined in, so if DS was changed using set, it will contain the offset from that value.

    syntax :

    • offset label_name : can be used in place of number, as this is determined at compile time.
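
To make the DB forms listed above concrete, here is a small Python model of the bytes each form would produce. It is a sketch under assumptions, not the emulator's implementation: the helper names are invented, negative values are assumed to be stored in 2's complement, and strings are assumed to take one byte per ASCII character.

```python
def db_value(value: int) -> bytes:
    """DB <number>: a single byte; negative values use 2's complement."""
    if not -128 <= value <= 255:
        raise ValueError("value must fit in a signed or unsigned byte")
    return bytes([value & 0xFF])


def db_array(count: int, value: int = 0) -> bytes:
    """DB [count] or DB [value ; count]: `count` copies of `value`."""
    if not 0 <= count <= 0xFFFF:
        raise ValueError("count must be an unsigned word number")
    return db_value(value) * count


def db_string(text: str) -> bytes:
    """DB "string": characters stored as written, escapes not interpreted."""
    return text.encode("ascii")


print(list(db_value(-1)))        # [255]
print(list(db_array(3)))         # [0, 0, 0]
print(list(db_array(2, 0x41)))   # [65, 65]
print(list(db_string(r"a\n")))   # [97, 92, 110]
```

The last line shows the "not escaped" rule: the two characters \ and n are stored as two separate bytes rather than as a newline.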

Code Directives

Code directives supported by emulator

  • macro definition : used to define macros, which can be used to put code in place, where parameters are replaced by given values at compile time

    syntax : macro macro_name (comma separated parameter list) -> replace string <-

    The code between '->' and '<-' will be placed in place of macro use, where the parameters will be replaced by the ones given in macro call.
    Note that recursive macros, direct or indirect, are not supported. For a macro with no parameters, use a single _ as the parameter in the definition as well as in the use.
    To pass a macro name to another macro for invocation, make sure to leave a space between the parameter name and the brackets :
    MACRO a(q) -> ADD AX,q <-
    MACRO b(k,q) -> k **this space** (q) <-
    b(a,5)
  • macro use : used to 'call' a macro; the code defined in the macro is placed here, with the parameters replaced.

    syntax : macro_name (comma separated value list)

    The code between '->' and '<-' will be placed in place of macro use, where the parameters will be replaced by the ones given in macro call.
  • procedure definition : used to define procedure

    syntax : def procedure_name {opcodes/macro use}
    The procedure name has the same format as a label, but without the trailing ':'.
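
The parameter replacement described for macros can be pictured with the following Python sketch. It only models the idea of textual substitution; the function `expand_macro` is hypothetical, and matching parameters as whole words is an assumption, so the emulator's actual rules may differ.

```python
import re


def expand_macro(params, body, args):
    """Replace each parameter name in `body` with the matching argument."""
    if len(params) != len(args):
        raise ValueError("argument count does not match parameter count")
    if not params:
        return body
    mapping = dict(zip(params, args))
    # A single pass over the body, so text that was just substituted
    # is never substituted a second time.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, params)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], body)


# MACRO a(q) -> ADD AX,q <-   used as   a(5)
print(expand_macro(["q"], "ADD AX,q", ["5"]))           # ADD AX,5
# MACRO b(k,q) -> k (q) <-    used as   b(a,5)
print(expand_macro(["k", "q"], "k (q)", ["a", "5"]))    # a (5)
```

The second call produces `a (5)`, which is itself a use of macro `a`; expanding it again gives `ADD AX,5`.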

Print Statements

This section shows the syntax of the print commands, which can be used in the code as well as at the interactive user prompt.

  • print flags : This will print the value of various flags.
  • print reg : This will print the value of registers.
  • print mem start -> end : This will print the value of memory from start to end, both inclusive. start and end are unsigned numbers in range 0 -> 1048575.
  • print mem start:offset : This will print the value of memory from start, to start+offset. Value of start and start+offset must lie in 0 -> 1048575
  • print mem :offset : This will print the value of memory from start of current data segment till offset, both inclusive.

For opcodes, a detailed explanation of what they do is given in the 8086 family manual; this document explains only the syntax.
Control Instructions
These are single opcode instructions.
STC,CLC,CMC,STD,CLD,STI,CLI,HLT,NOP are supported.
WAIT, ESC, and LOCK are not supported
Syntax : opcode
Control Transfer Instructions
  • jump instructions :
    jmp, ja,jnbe,jae,jnb,jb,jnae,jbe,jna,jc,je,jz,jg,jnle,jge,jnl,jl,jnge,jle,jng,jnc,jne,jnz,jno,jnp,jpo,jns,jo,jp,jpe,js,jcxz
  • loop instructions :
    loop,loope,loopz,loopne,loopnz
  • Syntax : opcode label
int : Following interrupts are supported
  • int 3 : Can be used for debugging, displays user prompt
  • int 0x10 : value of AH allowed are : 0AH,13H
    0AH ignores BH & BL (page number and page attribute)
    13H ignores AL (write mode), BH & BL (page number and attributes), DH (row to print the string on), supports DL (column to print string on)
  • int 0x21 : value of AH allowed are : 1H,2H,0AH

into and iret are not supported

  • call : used for calling a procedure.

    syntax : call proc_name

  • ret : used for returning from a procedure.

    syntax : ret

Bit Manipulation Instructions
  • not : bitwise not
    syntax :
    not byte register
    not word register
    not byte memory
    not word memory
    not byte label
    not word label
  • binary logical : and,or,xor,test
    syntax :
    opcode byte-register , byte-register
    opcode word-register , word-register
    opcode byte-register , byte memory
    opcode word-register , word memory
    opcode byte-register , byte label
    opcode word-register , word label
    opcode byte memory , byte-register
    opcode word memory , word-register
    opcode byte label , byte-register
    opcode word label , word-register
    opcode byte-register , unsigned byte number
    opcode word-register , unsigned word number
    opcode byte memory , unsigned byte number
    opcode word memory , unsigned word number
    opcode byte label , unsigned byte number
    opcode word label , unsigned word number
  • shifts and rotates : sal,shl,sar,shr,rol,ror,rcl,rcr
    syntax :
    opcode byte-register , unsigned byte number
    opcode word-register , unsigned byte number
    opcode byte-register , cl
    opcode word-register , cl
    opcode byte memory , unsigned byte number
    opcode word memory , unsigned byte number
    opcode byte memory , cl
    opcode word memory , cl
    opcode byte label , unsigned byte number
    opcode word label , unsigned byte number
    opcode byte label , cl
    opcode word label , cl
Arithmetic Instructions
  • No operands : aaa,aad,aam,aas,daa,das,cbw,cwd
    syntax : opcode
  • Single operand : dec,inc,neg,mul,imul,div,idiv
    syntax :
    opcode byte-register
    opcode word-register
    opcode byte memory
    opcode word memory
    opcode byte label
    opcode word label
  • Binary opcodes : add, adc, sub, sbb, cmp
    syntax :
    opcode byte-register , byte-register
    opcode word-register , word-register
    opcode byte-register , byte memory
    opcode word-register , word memory
    opcode byte-register , byte label
    opcode word-register , word label
    opcode byte memory , byte-register
    opcode word memory , word-register
    opcode byte label , byte-register
    opcode word label , word-register
    opcode byte-register , unsigned/signed byte number
    opcode word-register , unsigned/signed word number
    opcode byte memory , unsigned/signed byte number
    opcode word memory , unsigned/signed word number
    opcode byte label , unsigned/signed byte number
    opcode word label , unsigned/signed word number
String Instructions
Instructions are : movs, lods,stos,cmps,scas
movsb, movsw are not supported
Syntax :
opcode byte
opcode word
The word and byte specifies if the string is byte string or word string
repeat instructions
rep supports movs, lods, stos
repe, repz, repne, repnz support cmps, scas
Data Transfer Instructions
in,out,lds,les are not supported
  • No operands : lahf,sahf,pushf,popf,xlat
    syntax : opcode
  • lea :
    syntax : lea word-register word memory
    lea word-register word label
  • push : supports only word length memory
    syntax : push word-register
    push segment-register (cs register allowed)
    push word memory
    push word label
  • pop : supports only word length memory
    syntax : pop word-register
    pop segment-register (cs register not allowed)
    pop word memory
    pop word label
  • xchg :
    syntax : xchg byte-register , byte-register
    xchg word-register , word-register
    xchg byte memory , byte-register
    xchg byte-register , byte memory
    xchg word memory , word-register
    xchg word-register , word memory
    xchg byte label , byte-register
    xchg byte-register , byte label
    xchg word label , word-register
    xchg word-register , word label
  • mov :
    syntax : mov byte-register , byte-register
    mov word-register , word-register
    mov byte-register , byte memory
    mov word-register , word memory
    mov byte-register , byte label
    mov word-register , word label
    mov byte memory , byte-register
    mov word memory , word-register
    mov byte label , byte-register
    mov word label , word-register
    mov byte-register , unsigned/signed byte number
    mov word-register , unsigned/signed word number
    mov byte memory , unsigned/signed byte number
    mov word memory , unsigned/signed word number
    mov byte label , unsigned/signed byte number
    mov word label , unsigned/signed word number
    mov segment-register , word-register
    mov word-register , segment-register
    mov segment-register , word memory
    mov segment-register , word label
    mov word memory , segment-register
    mov word label , segment-register