x86_64 decoding issues #936
Comments
I will get back to you on the SAL and PUSH instructions. On your LOCK instruction, objdump is incorrect, because LOCK is only valid for instructions that write to memory. See http://docs.oracle.com/cd/E19620-01/805-4693/instructionset-128/index.html |
I think aquynh is right about 'AND', because I tried to run a binary with this code and it failed with an "instruction not permitted" error. Also, nasm says "lock.asm:3: warning: instruction is not lockable" when I try to assemble this instruction. |
After browsing the internet and still having doubts, I found more information (the Oracle reference is not great): it is not clear whether the undefined-opcode exception MAY occur depending on the CPU. And 'may' implies that some CPUs may not behave this way, which would make the decoding valid even if it is not really sensible. By the way, is there a reason why nasm gives a warning instead of an error? |
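To make the LOCK point concrete, here is a minimal sketch (plain Python, no disassembler involved) that checks whether the destination of a LOCK-prefixed AND is a memory operand. It only handles the four register/memory AND opcodes 0x20-0x23 and ignores any other prefixes; the function name `and_is_lockable` is made up for this illustration. It relies on the usual encoding rule that bit 1 of these opcodes selects the direction (0: r/m is the destination, 1: reg is the destination).

```python
def and_is_lockable(code: bytes) -> bool:
    """Return True if a LOCK-prefixed AND has a memory destination.

    Hypothetical helper for illustration: handles only F0 followed by
    one of the AND opcodes 0x20-0x23 and a ModRM byte.
    """
    if len(code) < 3 or code[0] != 0xF0:
        raise ValueError("expected a LOCK (F0) prefix followed by opcode and ModRM")
    opcode, modrm = code[1], code[2]
    if opcode not in (0x20, 0x21, 0x22, 0x23):
        raise ValueError("only AND opcodes 0x20-0x23 are handled in this sketch")
    mod = modrm >> 6                      # mod == 3 means r/m is a register
    dest_is_rm = (opcode & 0x02) == 0     # direction bit clear: r/m is the destination
    return dest_is_rm and mod != 3


# The bytes from the report: opcode 0x22 is AND r8, r/m8, so the
# destination is the register (bh) and memory is only read.
reported = b"\xf0\x22\xbd\x71\x20\x17\x00"
# Same bytes with opcode 0x20 (AND r/m8, r8): the destination is memory.
flipped = b"\xf0\x20\xbd\x71\x20\x17\x00"

print(and_is_lockable(reported))  # False
print(and_is_lockable(flipped))   # True
```

Under these assumptions the reported bytes have a register destination, which is consistent with the nasm "instruction is not lockable" warning mentioned above.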
In fact, it may be considered valid because those are aliases of D3 /4. /4 is supposed to be SHL, /5 SHR, /7 SAR, and so /6 would be SAL. But SHL and SAL do the same thing, so /6 is an alias of /4 in terms of execution.
EDIT: unless, on AMD64, CPUs no longer have to honor that old aliasing, that is, when running in 64-bit mode. |
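For readers who want to see where the "/6" comes from, here is a small sketch that splits the ModRM byte of the reported bytes d3 b6 6b 8f ac a0 into its fields and decodes the 32-bit displacement. The table `GROUP2` and the helper `split_modrm` are names invented for this example; the label on /6 reflects the aliasing described in this thread, not the documented AMD encoding.

```python
import struct

# reg-field meanings for the D3 shift/rotate group. The /6 entry is
# labeled per the discussion in this thread (undocumented alias of /4).
GROUP2 = {
    0: "rol", 1: "ror", 2: "rcl", 3: "rcr",
    4: "shl/sal", 5: "shr", 6: "sal (alias of /4, undocumented)", 7: "sar",
}


def split_modrm(byte: int):
    """Split a ModRM byte into its (mod, reg, rm) fields."""
    return byte >> 6, (byte >> 3) & 0x7, byte & 0x7


code = b"\xd3\xb6\x6b\x8f\xac\xa0"
mod, reg, rm = split_modrm(code[1])
# mod == 2 means a signed 32-bit displacement follows the ModRM byte.
disp = struct.unpack_from("<i", code, 2)[0]

print(mod, reg, rm)   # 2 6 6
print(GROUP2[reg])    # sal (alias of /4, undocumented)
print(hex(disp))      # -0x5f537095
```

The reg field is 6, so the bytes really are D3 /6, and the displacement matches the [rsi - 0x5f537095] operand that capstone printed (rm = 6 selects rsi when no REX prefix is present).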
The issue here is probably not with their documentation, because immediate values are always sign-extended (this was already the case for 8-bit and 16-bit immediates before AMD64), but with the lack of information in "push 0xB7": is it an 8-bit, 16-bit or 32-bit immediate push? It is ambiguous because the different forms would not give the same value in a 64-bit stack entry. In nasm, shouldn't we write "push byte 0xb7"? As for objdump, the disassembly is a bit misleading and awkward, since there is no real "pushQ" opcode, and it still lacks information even though the value shown for the 64-bit stack entry is correct. |
OK, so the sal seems to be an issue with objdump and I should probably report it to them. But I still don't understand the way 'push' works. If I understand the doc correctly, in 64-bit mode you can push a 16-bit or a 64-bit value (the default is 64) to the stack, using immediate values of size 8, 16 or 32. I've tested all three combinations (b"\x6a\xb7", b"\x66\x68\xb7\x00" and b"\x68\xb7\x00\x00\x00"). They are all decoded by capstone as "push 0xb7", but they don't have the same effect on the stack. Testing with gdb, I see that the first pushes 0xffffffffffffffb7 (64 bits), the second pushes 0x00b7 (16 bits) and the last pushes 0x00000000000000b7 (64 bits) to the stack. Where is this detail hidden in capstone? |
Exactly, that is the point where I don't agree with Capstone's choice: the same representation can lead to two different results on the stack, which is bad in my opinion and must be addressed. One possible way is to use "push byte 0xb7" and "push dword 0xb7"... it can be anything, as long as there is a rationale that makes the representation unambiguous with respect to the result. |
I just remembered that stack operations on AMD64 can be a little confusing. The 16-bit immediate form does push a 16-bit value (2 bytes on the stack), while both the 8-bit and 32-bit immediate forms push a 64-bit value (8 bytes on the stack), because you cannot push a 32-bit value in 64-bit mode (32-bit and 64-bit modes use the same opcode to push a value of their respective default operand size). |
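The size rules described above can be sketched in a few lines of Python. This is only an illustration of the sign-extension behaviour for the immediate forms 6A ib and 68 iw/id in 64-bit mode (with an optional 66 operand-size prefix); `decode_push_imm` is a made-up helper, not part of capstone, and it ignores every other prefix.

```python
import struct


def decode_push_imm(code: bytes):
    """Return (value_written_to_stack, bytes_pushed) for PUSH imm in 64-bit mode.

    Hypothetical helper: handles 6A ib and 68 iw/id, optionally preceded
    by a single 66 operand-size prefix.
    """
    operand16 = code[0] == 0x66
    if operand16:
        code = code[1:]
    opcode = code[0]
    if opcode == 0x6A:                      # PUSH imm8, sign-extended
        imm = struct.unpack_from("<b", code, 1)[0]
    elif opcode == 0x68:                    # PUSH imm16 (with 66) or imm32
        imm = struct.unpack_from("<h" if operand16 else "<i", code, 1)[0]
    else:
        raise ValueError("not a PUSH imm encoding handled by this sketch")
    size = 2 if operand16 else 8            # no 32-bit push exists in 64-bit mode
    mask = (1 << (8 * size)) - 1
    return imm & mask, size                 # two's-complement view of the stack slot


for raw in (b"\x6a\xb7", b"\x66\x68\xb7\x00", b"\x68\xb7\x00\x00\x00"):
    value, size = decode_push_imm(raw)
    print(raw.hex(), hex(value), size)
# 6ab7 0xffffffffffffffb7 8
# 6668b700 0xb7 2
# 68b7000000 0xb7 8
```

The three results agree with the gdb observations reported earlier in the thread: only the 8-bit form turns 0xb7 into 0xffffffffffffffb7, which is why a bare "push 0xb7" is ambiguous.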
Well, there is an optional "detail" structure and some API to get that info, but I agree that if you don't use it (and rely only on the textual disassembly), you're in trouble. |
I think I was also bit by this, in radare2:
|
Hi, I've run some tests and found three instructions on x86_64 that don't seem to always be decoded properly. I'm using the first basic example in http://www.capstone-engine.org/lang_python.html, except that I replace the CODE string with something else.
With b'\x6a\xb7', I get "push 0xb7" with capstone, but "pushq $0xffffffffffffffb7" with objdump. According to AMD's manual, 6A i8 is "Push an 8-bit immediate value (sign-extended to 16, 32, or 64 bits) onto the stack". So I think objdump is right here.
With b'\xf0\x22\xbd\x71\x20\x17\x00', I get no result with capstone (nothing is decoded), but objdump decodes it as "lock and 0x172071(%rbp),%bh", which looks correct to me.
With b'\xd3\xb6\x6b\x8f\xac\xa0', I get "sal dword ptr [rsi - 0x5f537095], cl" with capstone, but according to AMD's manual, sal can only be encoded as D3 /4, whereas the string contains D3 /6. Objdump says 'd3' is bad.
I ran that with the latest git commit.