IMUL mem/Ib DWORD; OPC_IMUL_GvEvIb; "imul eax, [ecx+0x41], 0x10" #364

Closed
egberts opened this issue Jan 9, 2016 · 72 comments

Comments

@egberts
Contributor

egberts commented Jan 9, 2016

The Intel IMUL opcode in question is described in several equivalent ways:

  • 6B /r ib IMUL r32, imm8 doubleword register = doubleword register * sign-extended immediate byte
  • imul Gv, Ev, Ib
  • OPC_IMUL_GvEvIb
  • imul eax, [ecx+0x41], 0x10
  • 6b414110 imul eax,DWORD PTR [ecx+0x41],0x10

When the emulator engine encounters an i386 IMUL opcode with an 8-bit immediate multiplier, it does not correctly multiply 0x5151494a by 0x10 to get the expected result 0x151494a0.

Interestingly, a standalone IMUL instruction works, but this stack-based AlphaMixed code snippet (derived from Metasploit) fails.

Will submit working proof-of-failure code soon.

@aquynh
Member

aquynh commented Jan 10, 2016

yes, please put the testcases under tests/. if your code is in C, preferred place is under tests/unit.

thanks.

@egberts
Contributor Author

egberts commented Jan 11, 2016

Unable to procure a working fix for the QEMU engine, but I have a clearly documented unit test C file for troubleshooting "IMUL eax, mem, Ib" and establishing a pass/fail scenario.

test_i386_imul_eax_r_ib.c.zip

$ unzip test_i386_imul_eax_r_ib.c.zip
$ gcc test_i386_imul_eax_r_ib.c -lunicorn
$ ./a.out

(Just started refitting my C code for tests/unit style)

@egberts egberts changed the title x86_64 IMUL r/Ib DWORD x86_64 IMUL mem/Ib DWORD Jan 11, 2016
@egberts
Contributor Author

egberts commented Jan 11, 2016

I'm staring at the tcg_gen_muls2_i32 function in GDB... perhaps this is @farmdve's cup of tea, because this signed IMUL operation got messed up before it stores the corrupted value into the EAX register?

        case MO_32:
            tcg_gen_trunc_tl_i32(tcg_ctx, cpu_tmp2_i32, *cpu_T[0]);
            tcg_gen_trunc_tl_i32(tcg_ctx, cpu_tmp3_i32, *cpu_T[1]);
            tcg_gen_muls2_i32(tcg_ctx, cpu_tmp2_i32, cpu_tmp3_i32,
                              cpu_tmp2_i32, cpu_tmp3_i32);
            tcg_gen_extu_i32_tl(tcg_ctx, *cpu_regs[reg], cpu_tmp2_i32);
            tcg_gen_sari_i32(tcg_ctx, cpu_tmp2_i32, cpu_tmp2_i32, 31);
            tcg_gen_mov_tl(tcg_ctx, cpu_cc_dst, *cpu_regs[reg]);
            tcg_gen_sub_i32(tcg_ctx, cpu_tmp2_i32, cpu_tmp2_i32, cpu_tmp3_i32);
            tcg_gen_extu_i32_tl(tcg_ctx, cpu_cc_src, cpu_tmp2_i32);

https://github.com/unicorn-engine/unicorn/blob/master/qemu/target-i386/translate.c#L5473

@egberts egberts changed the title x86_64 IMUL mem/Ib DWORD x86_64 IMUL mem/Ib DWORD; OPC_IMUL_GvEvIb; "imul eax, [ecx+0x41], 0x10" Jan 11, 2016
@egberts egberts changed the title x86_64 IMUL mem/Ib DWORD; OPC_IMUL_GvEvIb; "imul eax, [ecx+0x41], 0x10" IMUL mem/Ib DWORD; OPC_IMUL_GvEvIb; "imul eax, [ecx+0x41], 0x10" Jan 11, 2016
@egberts
Contributor Author

egberts commented Jan 11, 2016

@farmdve
Contributor

farmdve commented Jan 11, 2016

The problem is not with the instruction decoder. Your code is self-modifying: it changes the IMUL multiplier from 0x51 to 0x10. QEMU usually handles self-modifying code, so this should work, but for some reason QEMU is using the old value 0x51 instead of 0x10 (despite what uc_mem_read returns). This means the translation cache is not flushed.

0x5151494a * 0x51 = 0xBAB8306A
0x5151494a * 0x10 = 0x151494A0

There have been other instances of Unicorn failing with self-modifying code, so thanks for finding this.

@aquynh
Somewhere we need to flush the translation block.

@aquynh
Member

aquynh commented Jan 16, 2016

@egberts, so can you put your testcase under tests/ directory?

@egberts
Contributor Author

egberts commented Jan 16, 2016

Yes.

egberts pushed a commit to egberts/unicorn that referenced this issue Jan 16, 2016
…eated

for exercising proper flushing of the instruction translation cache.
@egberts
Contributor Author

egberts commented Jan 16, 2016

@aquynh , done.

First attempt was incorrectly added under tests/regress and that 1st pull-request has been closed, 2nd attempt was recoded to fit tests/unit style and 2nd pull-request opened at #380

@farmdve
Contributor

farmdve commented Jan 17, 2016

I traced the issue to the self-modified code check in translate-all.c. Even though the code is modified, this check fails

if (!(tb_end <= start || tb_start >= end)) in tb_invalidate_phys_page_range and the page isn't invalidated.

@egberts
Contributor Author

egberts commented Jan 18, 2016

Setting a breakpoint at tb_invalidate_phys_page_range() isn't working for me, as that function isn't mapped yet at GDB main() time... Do we have an awesome .gdbinit or defines to pass to make.sh (or something) to help us get to this threaded point faster?

@lunixbochs
Contributor

I've had luck just letting a program run once in gdb before trying to add the library breakpoint.

@farmdve
Contributor

farmdve commented Jan 18, 2016

@egberts

The functions have a suffix which depends on the arch. tb_invalidate_phys_page_range_x86_64 for X86 arch, and tb_invalidate_phys_page_range_arm for ARM.

@farmdve
Contributor

farmdve commented Jan 27, 2016

This is the most bothersome bug. I have not looked at it in a while, but it's the first time that self-modifying code is not detected properly and thus TB page/region is not invalidated.

@aquynh
Member

aquynh commented Jan 27, 2016

a quick glance, and this testcase seems a bit too complicated. can someone make a minimal testcase, so it is shorter & simpler?

@egberts
Contributor Author

egberts commented Jan 27, 2016

Ummm, possibly... Are you looking for a smaller x86 code footprint (size of buffer)? Or are you looking to keep the test code simpler?

@aquynh
Member

aquynh commented Jan 27, 2016

Yes, please make the x86 code as small as possible, by just keeping relevant instructions. Thanks

@egberts
Contributor Author

egberts commented Jan 28, 2016

Will do

@egberts
Contributor Author

egberts commented Jan 31, 2016

OK. Got the RIP (PC) to start exactly at the self-modifying opcode (0x60000021) by taking a snapshot of the register set and setting these register values in the tests/unit/test_tb_x86.c test code:

0x021: 30 41 30    xor     byte ptr [ecx + 0x30], al      # modify immediate operand of imul opcode
0x024: 41          inc     ecx
0x025: 6b 41 41 51 imul    eax, dword ptr [ecx + 0x41], 0x51

Train your eye on the stack region dump in the tests/unit/test_tb_x86.c output after each instruction

60000020: 50 30 41 30 41 6b 41 41 51 32 41 42 32 42 42 30

until the imul opcode's immediate byte operand at 0x60000028 is modified from 0x51 to 0x10:

60000020: 50 30 41 30 41 6b 41 41 10 32 41 42 32 42 42 30

I wanted to keep the original code sequence for a later, fuller unit test of translation cache invalidation, so a C preprocessor define, #define RIP_NEXT_TO_THE_SELFMODIFY_OPCODE, is inserted, set and used. Will PR next.

egberts pushed a commit to egberts/unicorn that referenced this issue Jan 31, 2016
…elf-modifying code

which modified the 2nd next instruction (imul) in which that escaped
our wonderful ability to invalidate the
instruction translation cache in which we badly need to pick up the
self-modification being made.
aquynh added a commit that referenced this issue Jan 31, 2016
Pull Request for Issue #364: Invalidating Translation Cache after self-modifying code
@egberts
Contributor Author

egberts commented Feb 8, 2016

Would it make sense to do the following in a UC_MEM_WRITE hook routine:

  • check whether the write access touches any of Unicorn's memory region(s)
  • call tb_flush()

If so, we would need an API for that, no?

@egberts
Contributor Author

egberts commented Feb 8, 2016

tb_flush() in the UC_MEM_WRITE hook didn't work. (Yeah, I made myself a new temporary API.)

When the already self-modified code is executed, cpu_ldub_code_x86_64()/glue() does not obtain the modified instructions as expected at disas_insn:4794.

Going back to the modifying instruction to see what it takes to 'push' the modified instructions down (perhaps looking for a cpu_stub_code_x86_64() equivalent, which doesn't exist).

@egberts
Contributor Author

egberts commented Feb 8, 2016

Uses TCGv_i64 scratch memory:
0x15 - contains value 0x10 (AL register content)
0x16 - scratch XOR operation
0x17 - contains indexing result of [ecx + 0x30]

This snippet in translate.c:1519

case OP_XORL:
    tcg_gen_xor_tl(tcg_ctx, *cpu_T[0], *cpu_T[0], *cpu_T[1]);  /* constructs 0x10 */
    gen_op_st_rm_T0_A0(s, ot, d);  /* stores 0x10 to memory */
    gen_op_update1_cc(tcg_ctx);
    set_cc_op(s, CC_OP_LOGICB + ot);
    break;

gen_op_st_rm_T0_A0(DisasContext, 0, OR_TMP0) calls
gen_op_st_v(s, idx=0, t0=0x15, a0=0x17), which calls
tcg_gen_qemu_st_tl(s->uc, t0=0x15, addr=0x17, idx=0x2, memop=MO_8), which calls
tcg_canonicalize_memop_x86_64(op=MO_8, is64=0x1, st=0x1)

Nothing is invalidating the TCG. It proceeds to run the TCG code unhindered. I was expecting some kind of flush_icache_range() after passing a memory region boundary test or something. (cpu_flush_icache_range() is being ignored while in TCG mode.)

@farmdve
Contributor

farmdve commented Feb 8, 2016

I think you are looking at the wrong code, however tb_flush should in fact flush the entire cache and retranslate, so...not sure why it isn't.

@egberts
Contributor Author

egberts commented Feb 8, 2016

I flipped the DEBUG_FLUSH define on, and within my hook_write(), performed the following:

  1. check range of write memory access region
static void
hook_mem32_write(uc_engine *uc, uc_mem_type type, uint64_t address, int size, int64_t value, void *UNUSED(user_data))
{
    switch(type) {
      default: break;
      case UC_MEM_WRITE:
        if (0 != uc_mem_check(uc, address, size)) { // bool: True/False
            printf("SELF-MODIFYING CODE\n");
            uc_tb_flush(uc);
        }
    }
}
  2. call my new uc_tb_flush(), which in turn does the following:
uc.c
UNICORN_EXPORT
uc_err uc_tb_flush(uc_engine *uc)
{
    CPUState *cpu = uc->cpu;
    void *env = cpu->env_ptr;
    printf("FLUSHED!\n\n");
    tb_flush_x86_64(env);
    return 0;
}

and got this:

qemu: flush code_size=2240 nb_tbs=1 avg_tb_size=2240

A later uc_mem_read() of the modified instruction's memory shows the correct byte; however, in the actual TCG behavior the immediate operand of IMUL still hasn't been updated by the 'self-modifying' xor instruction.

@egberts
Contributor Author

egberts commented Feb 8, 2016

having the following

#define DEBUG_TB_INVALIDATE 
#define DEBUG_TB_CHECK

Both reveal nothing.

@egberts
Contributor Author

egberts commented Mar 14, 2016

Now working on #437.

@farmdve
Contributor

farmdve commented Mar 16, 2016

@egberts

Have you managed to fix the bug? Or identify the root cause? Perhaps you can consolidate the information?

@egberts
Contributor Author

egberts commented Mar 17, 2016

SUMMARY

This bug revealed bug 437, @farmdve, see this comment

In the test_tb_x86 unit test, the emulated XOR (two instructions before the imul) writes its result to host physical memory, AND that memory location lies in the same TB as the XOR itself (same TB, self-modifying).

As a result, during emulation (SMC?), the XOR somehow performs the physical write to this same address TWICE, so the operand value reverts to its original value by virtue of XOR logic, which is what the test_tb_x86 unit test is seeing now. The XOR is performed twice because there is a duplicate xor helper in the TB. Somebody called for two gen_xor_helpers...

SIDE NOTE: this double operation also gave rise to #437, where we see two HOOK_CODE callbacks added to the TB whenever a write is made to the same TB as the code.

At the moment, I'm not exactly closer to a solution because I am slowly learning the design of TCG/TB. I'm just a good debugger.

I currently gather that our focus is to avoid duplication of its "gen_xor_helper"/gen_uc_tracecode pair during TCG (which should in turn fix that double XOR emulation during SMC mode in test_tb_x86).

Each of the two points of the duplicate TB create/replace operations, for the same operand write address, are (hopefully) clearly documented in #437, with complete backtraces for each.

@egberts
Contributor Author

egberts commented Mar 17, 2016

; ecx = 0x60000028, or that immediate operand of imul, 0x10
; al = 0x42
60000021   xor     byte ptr [ecx], al   <=== primary focus
60000024   <immaterial opcode, ignore this>
60000025   imul   ..., ..., 0x10

XOR emulation

During emulation of the code space in test_tb_x86, when it came time to perform the xor helper, as directed by its output operand, gen_xor_helper performs that write operation.

This operand write operation is doing a REAL CPU WRITE ACCESS.

During this operand write operation, it detects that this address is in the same physical page as its neighbor opcode, and then marks any matching TB as invalid (tb_invalidate_phys_page_range).

tb_invalidate_phys_page_range

Before the invalidate begins, it finds the PageDesc block

Invalidation occurs in the form of removing all matching TBs and then creating a new TB. There is only one TB chain to work with. The TB address has its lower 2 bits cleared (index 0 of tb->page_next[0]).

There are no counts in tb->cflags.

current_tb_modified = 1 is performed. THIS IS THE KEY.

Within the loop that invalidates TB chains (one chain, actually), it performs:

1214:  cpu_restore_state_from_tb()

The above is that FIRST of the duplicate gen_xor_helper/gen_uc_tracecode(HOOK_CODE) pair.

Then it exits the loop of invalidating TB chains.

After this end of the loop, this current_tb_modified is detected as set to 1, and does the following:

1251: tb_gen_code()

It is this point where the second set of gen_xor_helper/gen_uc_tracecode(HOOK_CODE) pair gets inserted. But I noticed a new TB pointer there.

@egberts
Contributor Author

egberts commented Mar 17, 2016

For the same write operation of XOR output operand, the double-generation starts here:

XX  0x0000000000415494 in tb_invalidate_phys_page_fast_x86_64 (uc=0x14f61230, 
    start=0x28, len=0x1) at /git/unicorn-master/qemu/translate-all.c:1414
XX  0x000000000043d233 in notdirty_mem_write (uc=0x14f61230, opaque=0x0, 
    ram_addr=0x28, val=0x10, size=0x1) at /git/unicorn-master/qemu/exec.c:1389
XX  0x0000000000430f3d in memory_region_write_accessor_x86_64 (mr=0x14f61558, 
    addr=0x28, value=0x7ffff59b6678, size=0x1, shift=0x0, mask=0xff)
    at /git/unicorn-master/qemu/memory.c:511
XX  0x000000000043108a in access_with_adjusted_size_x86_64 (addr=0x28, 
    value=0x7ffff59b6678, size=0x1, access_size_min=0x1, access_size_max=0x4, 
    access=0x430ea7 <memory_region_write_accessor_x86_64 at
/git/unicorn-master/qemu/memory.c:504>, mr=0x14f61558) at
/git/unicorn-master/qemu/memory.c:548
XX 0x0000000000434397 in memory_region_dispatch_write_x86_64 (mr=0x14f61558, 
    addr=0x28, data=0x10, size=0x1) at /git/unicorn-master/qemu/memory.c:1182
XX 0x00000000004375cf in io_mem_write_x86_64 (mr=0x14f61558, addr=0x28, 
    val=0x10, size=0x1) at /git/unicorn-master/qemu/memory.c:1888
XX 0x0000000000423009 in io_writeb_x86_64 (env=0x14f83170, physaddr=0x28, 
    val=0x10, addr=0x60000028, retaddr=0xf5ee1e)
    at /git/unicorn-master/qemu/softmmu_template.h:633
XX 0x00000000004237d8 in helper_ret_stb_mmu_x86_64 (env=0x14f83170, 
    addr=0x60000028, val=0x10, mmu_idx=0x2, retaddr=0xf5ee1e)
    at /git/unicorn-master/qemu/softmmu_template.h:739
XX 0x0000000000f5ee20 in static_code_gen_buffer ()

Starting with the first gen_uc_tracecode/gen_xor_helper (see traceback below),

#0  gen_uc_tracecode (tcg_ctx=0x7ffff77b8010, size=0xf1f1f1f1, type=0x2, 
    uc=0x14f61230, pc=0x60000021) at /git/unicorn-master/qemu/tcg/tcg-op.h:35
#1  0x00000000005893d3 in disas_insn (env=0x14f83170, s=0x7ffff59b6210, 
    pc_start=0x60000021)
    at /git/unicorn-master/qemu/target-i386/translate.c:4786
#2  0x0000000000596e68 in gen_intermediate_code_internal_x86_64 (
    gen_opc_cc_op=0x7ffff780db7c "", cpu=0x14f7af40, tb=0x7ffff59b8010, 
    search_pc=0x1) at /git/unicorn-master/qemu/target-i386/translate.c:8429
#3  0x00000000005970e6 in gen_intermediate_code_pc_x86_64 (env=0x14f83170, 
    tb=0x7ffff59b8010) at /git/unicorn-master/qemu/target-i386/translate.c:8485
#4  0x0000000000413909 in cpu_restore_state_from_tb_x86_64 (cpu=0x14f7af40, 
    tb=0x7ffff59b8010, searched_pc=0xf5ee1e)
    at /git/unicorn-master/qemu/translate-all.c:242
#5  0x0000000000415177 in tb_invalidate_phys_page_range_x86_64 (uc=0x14f61230, 
    start=0x28, end=0x29, is_cpu_write_access=0x1)
    at /git/unicorn-master/qemu/translate-all.c:1214
#6  0x0000000000415494 in tb_invalidate_phys_page_fast_x86_64 (uc=0x14f61230, 
    start=0x28, len=0x1) at /git/unicorn-master/qemu/translate-all.c:1414

The detailed design description is as follows. Starting with tb_invalidate_phys_page_fast_x86_64, the systemtap tool revealed the following call sequence of interest:

  • cpu_restore_state_from_tb, which in turn calls
  • tcg_func_start

For the second gen_uc_tracecode/gen_xor_helper (see traceback below),

#0  gen_uc_tracecode (tcg_ctx=0x7ffff77b8010, size=0xf1f1f1f1, type=0x2, 
    uc=0x14f61230, pc=0x60000021) at /git/unicorn-master/qemu/tcg/tcg-op.h:35
#1  0x00000000005893d3 in disas_insn (env=0x14f83170, s=0x7ffff59b61a0, 
    pc_start=0x60000021)
    at /git/unicorn-master/qemu/target-i386/translate.c:4786
#2  0x0000000000596e68 in gen_intermediate_code_internal_x86_64 (
    gen_opc_cc_op=0x7ffff780db7c "", cpu=0x14f7af40, tb=0x7ffff59b8088, 
    search_pc=0x0) at /git/unicorn-master/qemu/target-i386/translate.c:8429
#3  0x000000000059706a in gen_intermediate_code_x86_64 (env=0x14f83170, 
    tb=0x7ffff59b8088) at /git/unicorn-master/qemu/target-i386/translate.c:8478
#4  0x0000000000413794 in cpu_x86_gen_code (env=0x14f83170, tb=0x7ffff59b8088, 
    gen_code_size_ptr=0x7ffff59b63f4)
    at /git/unicorn-master/qemu/translate-all.c:179
#5  0x0000000000414de7 in tb_gen_code_x86_64 (cpu=0x14f7af40, pc=0x60000021, 
    cs_base=0x0, flags=0x4000f4, cflags=0x1)
    at /git/unicorn-master/qemu/translate-all.c:1100
#6  0x00000000004152b4 in tb_invalidate_phys_page_range_x86_64 (uc=0x14f61230, 
    start=0x28, end=0x29, is_cpu_write_access=0x1)
    at /git/unicorn-master/qemu/translate-all.c:1251
#7  0x0000000000415494 in tb_invalidate_phys_page_fast_x86_64 (uc=0x14f61230, 
    start=0x28, len=0x1) at /git/unicorn-master/qemu/translate-all.c:1414

The detailed design description (for the second) is as follows. Starting with tb_invalidate_phys_page_fast_x86_64, systemtap reveals that it called

  • tb_gen_code, which calls
  • cpu_gen_code, which in turn calls
  • tcg_func_start

We don't need two tcg_func_start calls for the same write operation of its XOR output operand.

Removing the second tb_gen_code/tcg_func_start results in an infinite loop stuck on the same RIP/PC. So, there is a difference between the two TB regeneration sequences.

@egberts
Contributor Author

egberts commented Mar 17, 2016

First one is:

cpu_restore_state_from_tb:

tcg_func_start
gen_intermediate_code_pc
  gen_intermediate_code_internal(,,,true)
    disas_insn
      gen_uc_tracecode(HOOK_CODE)
      "gen_xor_helper"
tcg_gen_code_search_pc
restore_state_to_opc

second one is:

tb_gen_code:

get_page_addr_code
tb_alloc
cpu_gen_code
  tcg_func_start
  gen_intermediate_code
    gen_intermediate_code_internal(,,,false)
      disas_insn
        gen_uc_tracecode(HOOK_CODE)
        "gen_xor_helper"
tb_link_page

@egberts
Contributor Author

egberts commented Mar 17, 2016

Above partial call trace is derived from a full run of call traces and their function argument values... (courtesy of systemtap).

Wrote a little indenter so we can see 'depth' of the call stack.

unicorn-#364-stap-callgraph.indented.txt

@egberts
Contributor Author

egberts commented Mar 22, 2017

It is still a major vulnerability, despite what Chromium team reported.

https://bugs.chromium.org/p/project-zero/issues/detail?id=1122

@0xDEADFED5

0xDEADFED5 commented Nov 21, 2019

I can't reproduce this bug on Windows with the most recent commit. It was kind of a pain to get test_tb_x86 going on Windows, but it passes for me...mostly. The test, as written, isn't actually able to pass (technically). After all the test parameters are passed, the VM keeps executing the shellcode, which loops back around, triggers the same tests as before, but this time ECX doesn't match the test parameters and it fails after already passing.

hook_code32: Address: 60000029, Opcode Size: 3
Register dump:
eax 151494a0 ecx 5ffffff9 edx 5ffffff8 ebx 034a129b
esp 6010229a ebp 60000002 esi 1f350211 edi 488ac239
Opcode: 32 41 42
Stack region dump
60000000: 89 e1 d9 cd
PASS
Register dump:
eax 151494a0 ecx 5ffffff9 edx 5ffffff8 ebx 034a129b
esp 6010229a ebp 60000002 esi 1f350211 edi 488ac239

Update: I tried this unit test using unicorn32.dll from a project by @Coldzer0 which he confirmed as exhibiting this bug...but the test still passed. So I'm not sure if this test is bad or what.

@egberts
Contributor Author

egberts commented Nov 21, 2019 via email

@wtdcode
Member

wtdcode commented Apr 2, 2021

Self-modifying code emulation should work well on Unicorn2. Link to #1217 for now.

> OK. Got the RIP (PC) to start exactly at the self-modifying opcode (0x60000021) by taking a snapshot of the register set and setting these register values in the tests/unit/test_tb_x86.c test code:
>
> 0x021: 30 41 30    xor     byte ptr [ecx + 0x30], al      # modify immediate operand of imul opcode
> 0x024: 41          inc     ecx
> 0x025: 6b 41 41 51 imul    eax, dword ptr [ecx + 0x41], 0x51
>
> Train your eye on the stack region dump in the tests/unit/test_tb_x86.c output after each instruction
>
> 60000020: 50 30 41 30 41 6b 41 41 51 32 41 42 32 42 42 30
>
> until the imul opcode's immediate byte operand at 0x60000028 is modified from 0x51 to 0x10:
>
> 60000020: 50 30 41 30 41 6b 41 41 10 32 41 42 32 42 42 30
>
> I wanted to keep the original code sequence for a later, fuller unit test of translation cache invalidation, so a C preprocessor define, #define RIP_NEXT_TO_THE_SELFMODIFY_OPCODE, is inserted, set and used. Will PR next.

@egberts Could you provide a more minimal reproducer? For this imul case, could you provide your registers snapshot?

@egberts
Contributor Author

egberts commented Apr 13, 2021

A minimal IMUL test case (with register set and RIP) has been provided and checked in to the Unicorn test cases. Look for the test case file with IMUL in its name.

@wtdcode
Member

wtdcode commented Apr 13, 2021 via email

@egberts
Contributor Author

egberts commented Apr 13, 2021

Well, in that case, you can write a single XOR followed by an IMUL whose operand gets modified by the prior XOR, which is what the test case does.

@egberts
Contributor Author

egberts commented Aug 17, 2021

One or two lines of code? Setting up RIP/registers requires more than 1 or 2 LOCs.

@wtdcode wtdcode closed this as completed Oct 3, 2021
@egberts
Contributor Author

egberts commented Oct 3, 2021

Definitely still an open issue.

@wtdcode
Member

wtdcode commented Oct 3, 2021

> Definitely still an open issue.

Hello, this one I think shall work on Unicorn2, would you like another check?

@wtdcode
Member

wtdcode commented Oct 3, 2021

Link to #1449 for further examination. If no more information is provided, I would close it again after a week or so.

@wtdcode wtdcode reopened this Oct 3, 2021
@wtdcode wtdcode added this to the Unicorn2 Official Release milestone Oct 5, 2021
@wtdcode
Member

wtdcode commented May 7, 2022

I assume this has been covered by

static void test_x86_smc_xor(void)
and some relevant commits I made a few weeks ago. If your case still doesn't work, let me know and include a reproduction script.

@wtdcode wtdcode closed this as completed May 7, 2022
@egberts
Contributor Author

egberts commented Jun 13, 2022

Happy to say that this has been fixed.
