Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WebAssembly] AsmTypeCheck does not handle block return values of unreachable block end #107524

Closed
aheejin opened this issue Sep 6, 2024 · 1 comment · Fixed by #110770
Closed

Comments

@aheejin
Copy link
Member

aheejin commented Sep 6, 2024

AsmTypeCheck does not handle block/loop's concrete return values when their end is not reachable, for example,

block i32
i32.const 0
br 0
end

This is a valid wasm program but does not pass AsmTypeCheck.

Currently Wasm backend does not generate block/loop return types, but one exception is, the ones generated by WebAssemblyCFGStackify::fixEndsAtEndOfFunction:

/// In normal assembly languages, when the end of a function is unreachable,
/// because the function ends in an infinite loop or a noreturn call or similar,
/// it isn't necessary to worry about the function return type at the end of
/// the function, because it's never reached. However, in WebAssembly, blocks
/// that end at the function end need to have a return type signature that
/// matches the function signature, even though it's unreachable. This function
/// checks for such cases and fixes up the signatures.
void WebAssemblyCFGStackify::fixEndsAtEndOfFunction(MachineFunction &MF) {
const auto &MFI = *MF.getInfo<WebAssemblyFunctionInfo>();
if (MFI.getResults().empty())
return;
// MCInstLower will add the proper types to multivalue signatures based on the
// function return type
WebAssembly::BlockType RetType =
MFI.getResults().size() > 1
? WebAssembly::BlockType::Multivalue
: WebAssembly::BlockType(
WebAssembly::toValType(MFI.getResults().front()));
SmallVector<MachineBasicBlock::reverse_iterator, 4> Worklist;
Worklist.push_back(MF.rbegin()->rbegin());
auto Process = [&](MachineBasicBlock::reverse_iterator It) {
auto *MBB = It->getParent();
while (It != MBB->rend()) {
MachineInstr &MI = *It++;
if (MI.isPosition() || MI.isDebugInstr())
continue;
switch (MI.getOpcode()) {
case WebAssembly::END_TRY: {
// If a 'try''s return type is fixed, both its try body and catch body
// should satisfy the return type, so we need to search 'end'
// instructions before its corresponding 'catch' too.
auto *EHPad = TryToEHPad.lookup(EndToBegin[&MI]);
assert(EHPad);
auto NextIt =
std::next(WebAssembly::findCatch(EHPad)->getReverseIterator());
if (NextIt != EHPad->rend())
Worklist.push_back(NextIt);
[[fallthrough]];
}
case WebAssembly::END_BLOCK:
case WebAssembly::END_LOOP:
case WebAssembly::DELEGATE:
EndToBegin[&MI]->getOperand(0).setImm(int32_t(RetType));
continue;
default:
// Something other than an `end`. We're done for this BB.
return;
}
}
// We've reached the beginning of a BB. Continue the search in the previous
// BB.
Worklist.push_back(MBB->getPrevNode()->rbegin());
};
while (!Worklist.empty())
Process(Worklist.pop_back_val());
}

An example program that would generate this case is

;; test.ll
target triple = "wasm32-unknown-unknown"

define i32 @loop_i32() {
entry:
  br label %header
header:
  br label %header
}

which generates

loop i32
br 0
end

The reproducing commands:

$ llc test.ll
$ llvm-mc -triple=wasm32-unknown-unknown test.s

The error will be

...
        loop            i32                             # label0:
        br              0                               # 0: up to label0
.LBB0_2:
b.s:15:2: error: end: insufficient values on the type stack
        end_loop
        ^
        end_function
...
@llvmbot
Copy link
Member

llvmbot commented Sep 6, 2024

@llvm/issue-subscribers-backend-webassembly

Author: Heejin Ahn (aheejin)

`AsmTypeCheck` does not handle block/loop's concrete return values when their end is not reachable, for example, ```s block i32 br 0 end ``` This is a valid wasm program but does not pass `AsmTypeCheck`.

Currently Wasm backend does not generate block/loop return types, but one exception is, the ones generated by WebAssemblyCFGStackify::fixEndsAtEndOfFunction:

/// In normal assembly languages, when the end of a function is unreachable,
/// because the function ends in an infinite loop or a noreturn call or similar,
/// it isn't necessary to worry about the function return type at the end of
/// the function, because it's never reached. However, in WebAssembly, blocks
/// that end at the function end need to have a return type signature that
/// matches the function signature, even though it's unreachable. This function
/// checks for such cases and fixes up the signatures.
void WebAssemblyCFGStackify::fixEndsAtEndOfFunction(MachineFunction &MF) {
const auto &MFI = *MF.getInfo<WebAssemblyFunctionInfo>();
if (MFI.getResults().empty())
return;
// MCInstLower will add the proper types to multivalue signatures based on the
// function return type
WebAssembly::BlockType RetType =
MFI.getResults().size() > 1
? WebAssembly::BlockType::Multivalue
: WebAssembly::BlockType(
WebAssembly::toValType(MFI.getResults().front()));
SmallVector<MachineBasicBlock::reverse_iterator, 4> Worklist;
Worklist.push_back(MF.rbegin()->rbegin());
auto Process = [&](MachineBasicBlock::reverse_iterator It) {
auto *MBB = It->getParent();
while (It != MBB->rend()) {
MachineInstr &MI = *It++;
if (MI.isPosition() || MI.isDebugInstr())
continue;
switch (MI.getOpcode()) {
case WebAssembly::END_TRY: {
// If a 'try''s return type is fixed, both its try body and catch body
// should satisfy the return type, so we need to search 'end'
// instructions before its corresponding 'catch' too.
auto *EHPad = TryToEHPad.lookup(EndToBegin[&MI]);
assert(EHPad);
auto NextIt =
std::next(WebAssembly::findCatch(EHPad)->getReverseIterator());
if (NextIt != EHPad->rend())
Worklist.push_back(NextIt);
[[fallthrough]];
}
case WebAssembly::END_BLOCK:
case WebAssembly::END_LOOP:
case WebAssembly::DELEGATE:
EndToBegin[&MI]->getOperand(0).setImm(int32_t(RetType));
continue;
default:
// Something other than an `end`. We're done for this BB.
return;
}
}
// We've reached the beginning of a BB. Continue the search in the previous
// BB.
Worklist.push_back(MBB->getPrevNode()->rbegin());
};
while (!Worklist.empty())
Process(Worklist.pop_back_val());
}

An example program that would generate this case is

;; test.ll
target triple = "wasm32-unknown-unknown"

define i32 @<!-- -->loop_i32() {
entry:
  br label %header
header:
  br label %header
}

which generates

loop i32
br 0
end

The reproducing commands:

$ llc test.ll
$ llvm-mc -triple=wasm32-unknown-unknown test.s

aheejin added a commit to aheejin/llvm-project that referenced this issue Sep 29, 2024
aheejin added a commit to aheejin/llvm-project that referenced this issue Sep 29, 2024
aheejin added a commit to aheejin/llvm-project that referenced this issue Sep 30, 2024
aheejin added a commit to aheejin/llvm-project that referenced this issue Oct 1, 2024
aheejin added a commit to aheejin/llvm-project that referenced this issue Oct 1, 2024
aheejin added a commit to aheejin/llvm-project that referenced this issue Oct 1, 2024
aheejin added a commit to aheejin/llvm-project that referenced this issue Oct 1, 2024
aheejin added a commit to aheejin/llvm-project that referenced this issue Oct 2, 2024
This makes the type checker handle blocks with input parameters and
return types, branches, and polymorphic stacks correctly.

We maintain the stack of "block info", which contains its input
parameter type, return type, and whether it is a loop or not. And this
is used when checking the validity of the value stack at the `end`
marker and all branches targeting the block.

`StackType` now supports a new variant `Polymorphic`, which indicates
the stack is in the polymorphic state. `Polymorphic`s are not popped
even when `popType` is executed; they are only popped when the current
block ends.

When popping from the value stack, we ensure we don't pop more than we
are allowed to at the given block level and print appropriate error
messages instead. Also after a block ends, the value stack is guaranteed
to have the right types based on the block return type. For example,
```wast
block i32
  unreachable
end_block
;; You can expect to have an i32 on the stack here
```

This also adds handling for `br_if`. Previously only `br`s were checked.

`checkEnd` and `checkBr` were removed and their contents have been
inlined to the main `typeCheck` function, because they are called only
from a single callsite.

This also fixes two existing bugs in AsmParser, which were required to
make the tests passing.

This modifies several existing invalid tests, those that passed
(incorrectly) before but do not pass with the new type checker anymore.

Fixes llvm#107524.
@aheejin aheejin closed this as completed in a268bda Oct 2, 2024
xgupta pushed a commit to xgupta/llvm-project that referenced this issue Oct 4, 2024
…m#110770)

This makes the type checker handle blocks with input parameters and
return types, branches, and polymorphic stacks correctly.

We maintain the stack of "block info", which contains its input
parameter type, return type, and whether it is a loop or not. And this
is used when checking the validity of the value stack at the `end`
marker and all branches targeting the block.

`StackType` now supports a new variant `Polymorphic`, which indicates
the stack is in the polymorphic state. `Polymorphic`s are not popped
even when `popType` is executed; they are only popped when the current
block ends.

When popping from the value stack, we ensure we don't pop more than we
are allowed to at the given block level and print appropriate error
messages instead. Also after a block ends, the value stack is guaranteed
to have the right types based on the block return type. For example,
```wast
block i32
  unreachable
end_block
;; You can expect to have an i32 on the stack here
```

This also adds handling for `br_if`. Previously only `br`s were checked.

`checkEnd` and `checkBr` were removed and their contents have been
inlined to the main `typeCheck` function, because they are called only
from a single callsite.

This also fixes two existing bugs in AsmParser, which were required to
make the tests passing. I added Github comments about them inline.

This modifies several existing invalid tests, those that passed
(incorrectly) before but do not pass with the new type checker anymore.

Fixes llvm#107524.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants