Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some Questions Related to BinaryenExpressionPrint()'s Output #6016

Closed
mobsceneZ opened this issue Oct 17, 2023 · 4 comments
Closed

Some Questions Related to BinaryenExpressionPrint()'s Output #6016

mobsceneZ opened this issue Oct 17, 2023 · 4 comments

Comments

@mobsceneZ
Copy link
Contributor

Hello developers, I am currently working on a work that modifies the contents of wasm modules, since I could not find a way to traverse the Binaryen IR like we do in LLVM IR (which contains control flow graph), I try to inspect how Binaryen constructs a function's body by the following code:

#include <binaryen-c.h>
void test(BinaryenModuleRef mod) {

    BinaryenIndex numFuncs = BinaryenGetNumFunctions(mod);
    for (BinaryenIndex i = 0; i < numFuncs; i++) {

        BinaryenFunctionRef   func = BinaryenGetFunctionByIndex(mod, i);
        BinaryenExpressionRef body = BinaryenFunctionGetBody(func);

        BinaryenExpressionPrint(body);

    }

}

And the sample module provided is of the following WAT form:

(func (export "multi") (param i32) (result i32 i32)
  (if (local.get 0) (then (call $dummy) (call $dummy) (call $dummy)))
  (if (local.get 0) (then) (else (call $dummy) (call $dummy) (call $dummy)))
  (if (result i32) (local.get 0)
    (then (call $dummy) (call $dummy) (i32.const 8) (call $dummy))
    (else (call $dummy) (call $dummy) (i32.const 9) (call $dummy))
  )
  (if (result i32 i64 i32) (local.get 0)
    (then
      (call $dummy) (call $dummy) (i32.const 1) (call $dummy)
      (call $dummy) (call $dummy) (i64.const 2) (call $dummy)
      (call $dummy) (call $dummy) (i32.const 3) (call $dummy)
    )
    (else
      (call $dummy) (call $dummy) (i32.const -1) (call $dummy)
      (call $dummy) (call $dummy) (i64.const -2) (call $dummy)
      (call $dummy) (call $dummy) (i32.const -3) (call $dummy)
    )
  )
  (drop) (drop)
)

The resulting output is listed below, which is a little bit tedious, but in essense, I have three questions:

  • Why there are tuple.make instruction appearing in the result, which is not a valid WebAssembly instruction?
  • Why i32.const instructions are surrounded by local.set instructions, what's the point of doing this or is it related to optimizations?
  • If the behaviors mentioned above are configurable, how can I turn off these options or how can I get a Binaryen IR as close as the original wat content?

Thanks in advance for any helpful replies ;) !

(local.set $21
  (if (result i32 i64 i32)
   (local.get $0)
   (block (result i32 i64 i32)
    (call $0)
    (call $0)
    (local.set $4
     (i32.const 1)
    )
    (call $0)
    (call $0)
    (call $0)
    (tuple.make
     (block (result i32)
      (local.set $18
       (local.get $4)
      )
      (local.set $12
       (i64.const 2)
      )
      (call $0)
      (call $0)
      (call $0)
      (local.get $18)
     )
     (block (result i64)
      (local.set $17
       (local.get $12)
      )
      (local.set $3
       (i32.const 3)
      )
      (call $0)
      (local.get $17)
     )
     (local.get $3)
    )
   )
   (block (result i32 i64 i32)
    (call $0)
    (call $0)
    (local.set $6
     (i32.const -1)
    )
    (call $0)
    (call $0)
    (call $0)
    (tuple.make
     (block (result i32)
      (local.set $20
       (local.get $6)
      )
      (local.set $13
       (i64.const -2)
      )
      (call $0)
      (call $0)
      (call $0)
      (local.get $20)
     )
     (block (result i64)
      (local.set $19
       (local.get $13)
      )
      (local.set $5
       (i32.const -3)
      )
      (call $0)
      (local.get $19)
     )
     (local.get $5)
    )
   )
  )
 )
 (local.set $7
  (block (result i32)
   (local.set $23
    (tuple.extract 0
     (local.get $21)
    )
   )
   (local.set $14
    (block (result i64)
     (local.set $22
      (tuple.extract 1
       (local.get $21)
      )
     )
     (local.set $8
      (tuple.extract 2
       (local.get $21)
      )
     )
     (local.get $22)
    )
   )
   (local.get $23)
  )
 )
 (tuple.make
  (block (result i32)
   (local.set $24
    (local.get $10)
   )
   (local.set $9
    (local.get $7)
   )
   (local.set $15
    (local.get $14)
   )
   (drop
    (local.get $8)
   )
   (drop
    (local.get $15)
   )
   (local.get $24)
  )
  (local.get $9)
 )
)
@tlively
Copy link
Member

tlively commented Oct 17, 2023

Binaryen's IR is an AST where each node produces at most one value consumed by its parent node. Wherever a control flow structure returns multiple values, we have to package them up as a single tuple value in the AST structure. That's where the tuple pseudoinstructions are coming from.

Similarly, whenever you have "stacky" code that leaves values on the value stack while it goes and does some other operation, that code doesn't translate cleanly into the AST form. We end up putting the values left on the stack into locals to resolve the mismatch. That's where the extra locals are coming from.

If you optimize this module, Binaryen should be able to remove some of the bloat that the parser inserted.

@mobsceneZ
Copy link
Contributor Author

Aha I get it, just to make sure I understand it right, does the second situation (extra locals) similar to the situation mentioned in issue #663? In other words, the following code will also introduce extra locals to save values on the stack, right?

call gimme-i32
call gimme-i32
call no-value-returned
i32.add

@tlively
Copy link
Member

tlively commented Oct 18, 2023

Yes, exactly

@mobsceneZ
Copy link
Contributor Author

Thanks for your kind response, that helps a lot :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants