Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Efficiency of WASM from Rust->WASM may depend on num loop iterations #162

Open
Tracked by #175
mooori opened this issue Dec 15, 2023 · 1 comment
Open
Tracked by #175

Comments

@mooori
Copy link

mooori commented Dec 15, 2023

Modifying the Rust code for Fibonacci by @Akashin to loop over 0..9998 instead of 0..10000 yields quite different WASM:

(module
  (type (;0;) (func (param i64 i64)))
  (type (;1;) (func))
  (import "env" "assert_eq" (func $assert_eq (type 0)))
  (func $main (type 1)
    (local i64 i64 i32)
    i64.const 1
    local.set 0
    i64.const 0
    local.set 1
    i32.const 9998
    local.set 2
    loop  ;; label = @1
      local.get 0
      local.get 1
      i64.add
      local.tee 1
      local.get 0
      i64.add
      local.set 0
      local.get 2
      i32.const -2
      i32.add
      local.tee 2
      br_if 0 (;@1;)
    end
    local.get 1
    i64.const -5660816691627274375
    call $assert_eq)
  (start $main)
  (memory (;0;) 16)
  (global $__stack_pointer (mut i32) (i32.const 1048576))
  (global (;1;) i32 (i32.const 1048576))
  (global (;2;) i32 (i32.const 1048576))
  (export "memory" (memory 0))
  (export "main" (func $main))
  (export "__data_end" (global 1))
  (export "__heap_base" (global 2)))

Note that (start $main) was added manually.

Number of cycles

I’ve compiled from_rust_9998.wat to zkASM and executed it in the same framework as existing benchmarks. Although the Rust code has fewer loop iterations (9_998 vs 10_000), the resulting from_rust_9998.zkasm requires more cycles: 100_004 cycles.

That’s almost twice as many as from_rust.zkasm, but still less than handwritten WAT.

Test,Status,Cycles
from_rust,pass,53023
handwritten,pass,50008
handwritten_wat,pass,170023

What happens in from_rust_9998.wat

  • Per loop iteration an extra element of the sequence is added.
    • That’s probably why it requires less cycles than handwritten.wat which adds only one element per loop.
  • It is counting down instead of up.

Possible cause

Just some initial thoughts: In meetings it was mentioned that from_rust.wat is efficient due to loop unrolling. With a number of iterations less favorable to loop unrolling the generated WASM might become less efficient.

Num loop iterations known at compile time?

Often the number of loop iterations is not known at compile time. We may want to figure out how that affects the generated WASM and zkASM.

Low number of loop iterations

For instance for 0..100 the compiler seems to propagate and fold constants as there is no loop at all in the resulting WAT:

(module
  (type (;0;) (func (param i64 i64)))
  (type (;1;) (func))
  (import "env" "assert_eq" (func $assert_eq (type 0)))
  (func $main (type 1)
    i64.const 3736710778780434371
    i64.const -5660816691627274375
    call $assert_eq)
  (memory (;0;) 16)
  (global $__stack_pointer (mut i32) (i32.const 1048576))
  (global (;1;) i32 (i32.const 1048576))
  (global (;2;) i32 (i32.const 1048576))
  (export "memory" (memory 0))
  (export "main" (func $main))
  (export "__data_end" (global 1))
  (export "__heap_base" (global 2)))

Potential action items

  • Should we add Fibonacci benchmarks with different numbers of loop iterations to the repo?
  • Try to find out what WASM is generated by the Rust compiler if it can’t figure out the number of loop iterations?
@aborg-dev
Copy link

Two more ideas to try here that we discussed last week:

  • Enable compiler optimizations in WASM -> ZKASM path for benchmarks
  • Pass number of iterations as an external input to the program to prevent these kinds of constant-folding optimizations

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants