test_compiler_recursion_limit fails on wasm32-wasi #95335

iritkatriel · 2022-07-27T14:36:28Z

The problem showed up after we merged #95107
and a couple of tests were disabled for WASI in #95296

With the patch in iritkatriel@4ce85e8 we get:

Here is a full WASM stack trace generated with

WASMTIME_BACKTRACE_DETAILS=1 \
  wasmtime run --env PYTHONPATH=/builddir/wasi/$(cat pybuilddir.txt) --mapdir /::../../ -- \
  python.wasm -m test -v test_compile -m test_compiler_recursion_limit

wasi_stacktrace.txt

test_compiler_recursion_limit (test.test_compile.TestSpecifics.test_compiler_recursion_limit) ... Error: failed to run main module `python.wasm`

Caused by:
    0: failed to invoke command default
    1: wasm trap: out of bounds memory access
       wasm backtrace:
           0: 0x15d8eb - compiler_visit_expr
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5868:37
           1: 0x15ee62 - compiler_call
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:4846:5
                     - compiler_visit_expr1
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5802:16
                     - compiler_visit_expr
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5870:15
           2: 0x15ee62 - compiler_call
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:4846:5
                     - compiler_visit_expr1
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5802:16
                     - compiler_visit_expr
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5870:15
           3: 0x15ee62 - compiler_call
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:4846:5
                     - compiler_visit_expr1
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5802:16
                     - compiler_visit_expr
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5870:15
           4: 0x15ee62 - compiler_call
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:4846:5
                     - compiler_visit_expr1
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5802:16
                     - compiler_visit_expr
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5870:15
           5: 0x15ee62 - compiler_call
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:4846:5
                     - compiler_visit_expr1
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5802:16
                     - compiler_visit_expr
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5870:15
           6: 0x15ee62 - compiler_call
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:4846:5
                     - compiler_visit_expr1
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5802:16
                     - compiler_visit_expr
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5870:15

Originally posted by @tiran in #93678 (comment)

tiran · 2022-07-27T14:49:34Z

wasmtime-cli 0.39.1
WASI-SDK 16.0 with clang version 14.0.4

tiran · 2022-07-27T15:34:26Z

To reproduce a similar problem before #95107, add sys.setrecursionlimit(1700) at the beginning of test_compiler_recursion_limit. The default recursion limit on WASI is 600. It will result in a call stack exhausted trap with 3,609 calls instead of an out of bounds memory access trap after 991 calls.

test_compiler_recursion_limit (test.test_compile.TestSpecifics.test_compiler_recursion_limit) ... Error: failed to run main module `python.wasm`

Caused by:
    0: failed to invoke command default
    1: wasm trap: call stack exhausted
       wasm backtrace:
           0: <unknown>!validate_keywords
           1: 0x15c9e4 - compiler_call
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:4844:9
                     - compiler_visit_expr1
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5813:16
                     - compiler_visit_expr
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5881:15
           2: 0x15cde6 - compiler_call
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:4857:5
                     - compiler_visit_expr1
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5813:16
                     - compiler_visit_expr
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5881:15
           3: 0x15cde6 - compiler_call
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:4857:5
                     - compiler_visit_expr1
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5813:16
                     - compiler_visit_expr
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5881:15
           4: 0x15cde6 - compiler_call
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:4857:5
                     - compiler_visit_expr1
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5813:16
                     - compiler_visit_expr
                           at /python-wasm/cpython/builddir/wasi/../../Python/compile.c:5881:15
...
         3602: 0x1b6c85 - pymain_run_module
                           at /python-wasm/cpython/builddir/wasi/../../Modules/main.c:300:14
         3603: 0x1b63f1 - pymain_run_python
                           at /python-wasm/cpython/builddir/wasi/../../Modules/main.c:604:21
                     - Py_RunMain
                           at /python-wasm/cpython/builddir/wasi/../../Modules/main.c:689:5
         3604: 0x1b70d7 - pymain_main
                           at /python-wasm/cpython/builddir/wasi/../../Modules/main.c:719:12
         3605: 0x1b7186 - Py_BytesMain
                           at /python-wasm/cpython/builddir/wasi/../../Modules/main.c:743:12
         3606: 0x4c85 - __main_argc_argv
                           at /python-wasm/cpython/builddir/wasi/../../Programs/python.c:15:12
         3607: 0x321d5b - <unknown>!__main_void
         3608: 0x4c69 - <unknown>!_start
         3609: 0x339092 - <unknown>!_start.command_export
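
For anyone who wants to poke at this outside the test suite, here is a minimal sketch of the kind of input that drives compiler_call recursively, assuming the test builds its sources by repeating a suffix such as "()" after a name. The helper names nested_call_source and compiles_cleanly are hypothetical and not part of the test.

```python
def nested_call_source(depth):
    """Return source such as 'a()()()' with `depth` chained calls."""
    return "a" + "()" * depth


def compiles_cleanly(source):
    """Return True if compile() succeeds, False if it fails gracefully.

    A graceful failure is assumed to surface as RecursionError or
    MemoryError; a trap in the WASM runtime never reaches this handler.
    """
    try:
        compile(source, "<sketch>", "eval")
    except (RecursionError, MemoryError):
        return False
    return True


if __name__ == "__main__":
    # A shallow chain should compile on any platform.
    print(compiles_cleanly(nested_call_source(20)))
```

With a large depth, a healthy build would be expected to return False rather than trap. Whether that holds on a given WASI runtime is what this issue is about.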

markshannon · 2022-07-28T08:59:13Z

This reminds me of #30855 (comment)

Fiddling with the code to keep the stack use below some arbitrary limit imposed by the compiler and platform is not the way to fix this or other bugs of the sort.

The proper fix is to use some sort of C stack check: #91079

For now, I would recommend reducing the recursion limit in the compiler.

TBH, we shouldn't care if some really weird code f()()()()()()()... fails in the compiler, as long as it fails gracefully.

The only thing we need to handle is long chains of if: .. elifs, which should be handled iteratively to avoid consuming excessive stack.

markshannon · 2022-07-28T09:04:13Z

Until we implement #91079 we should use the standard recursion limit counter in the compiler, otherwise we can crash on this code:

def f(n):
    if n: 
        f(n-1)
    else:
        compile_deeply_recursive_ast()
f(900)

As we consume most of the stack in f, then crash in the compiler.
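
A rough sketch of the effect, assuming CPython (sys._getframe is an implementation detail) and using hypothetical helper names: the budget left under the standard recursion limit shrinks with every frame of f, so a counter private to the compiler that always starts from zero cannot see how much stack has already been used.

```python
import sys


def current_depth():
    """Count Python frames from the caller up to the outermost frame."""
    depth = 0
    frame = sys._getframe(1)  # CPython-specific: the caller's frame
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth


def remaining_budget():
    """Frames still available before the standard recursion limit."""
    return sys.getrecursionlimit() - current_depth()


def budget_after(n):
    """Recurse n times, as f(n) does above, then report what is left."""
    if n:
        return budget_after(n - 1)
    return remaining_budget()


if __name__ == "__main__":
    # Each extra level of recursion leaves one frame less for the compiler.
    print(budget_after(0) - budget_after(50))
```

This only counts Python frames. It says nothing about how many bytes of C stack each frame costs, which is the part #91079 is meant to address.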

iritkatriel · 2022-07-28T09:20:48Z

I’m not actually sure my PR added function calls that get repeated in a recursion. I don’t really know what in my PR would have caused this.

markshannon · 2022-07-28T09:28:30Z

It was probably passing locations by value, instead of by pointer, that increased the stack consumption.
But, as I've said, it could be any change that tips the stack use over the limit.

We need to reduce the recursion by using iterative approaches, not fiddle with the code.

markshannon · 2022-07-28T09:29:50Z

Converting recursion into iteration is a pain. So, I'd just reduce the depth for everything but if/elifs and handle those iteratively.
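
To illustrate what "handle those iteratively" could look like, here is a sketch at the Python level using the ast module. It is not the C compiler code, and elif_chain_length is a hypothetical helper. In the current AST each elif is an If node stored as the only statement in the previous node's orelse, so a loop can follow the chain without using one stack frame per clause.

```python
import ast


def elif_chain_length(if_node):
    """Count the If nodes in an if/elif chain without recursing.

    Note: 'else:' followed by a lone nested 'if' has the same shape in
    the AST, so it is counted as part of the chain as well.
    """
    count = 1
    node = if_node
    while len(node.orelse) == 1 and isinstance(node.orelse[0], ast.If):
        node = node.orelse[0]
        count += 1
    return count


if __name__ == "__main__":
    source = "if a:\n    pass\nelif b:\n    pass\nelif c:\n    pass\nelse:\n    pass\n"
    print(elif_chain_length(ast.parse(source).body[0]))
```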

markshannon · 2022-07-28T09:30:39Z

@pablogsal might have thoughts on converting if/elifs into a list from a tree in the parser.

iritkatriel · 2022-07-28T09:31:45Z

I’m not sure the locations are passed down deep recursions though.

markshannon · 2022-07-28T09:32:49Z

Probably not. But they still need stack space to be passed into the ADDOP...s

tiran · 2022-07-28T09:37:11Z

It is quite possible that your commit changed some unrelated properties of the code, resulting in different optimizations.

I asked the wasmtime developers about stack limits in my ticket bytecodealliance/wasmtime#4214.

Currently there is no way for a wasm module to determine its own stack limit or otherwise see how big stack frames are, so there's no way for the wasm module itself to detect what its recursion limit should be.

The stack limit also depends on the WASM runtime and possibly other parameters. So far we have been testing with wasmtime only.

pablogsal · 2022-07-28T09:40:33Z

@pablogsal might have thoughts on converting if/elifs into a list from a tree in the parser.

I'm personally not very enthusiastic, but I also don't mind much if it is going to make our life easier: it's a change we are allowed to make, but it will also break everyone analyzing ASTs.

markshannon · 2022-07-28T10:31:09Z

Let's not break everyone.

Can the parser produce the tree without being recursive?

pablogsal · 2022-07-29T14:51:47Z

Can the parser produce the tree without being recursive?

I will look into it, but the parser is an RDP so its very nature is to recurse. We have a prototype with a stack machine, but it was a pain to debug, a bit slower, and not really better, so we never used it.

I am slightly confused, why do you want the parser not to recurse? The parser is not reaching any limits here, no?

markshannon · 2022-08-01T08:23:40Z

The parser is not reaching any limits here, no?

Not yet, no, but if we change the compiler to not recurse, then the parser will be the limiting factor.

This is really only an issue for if/elif statements. We can quite reasonably reject expressions hundreds deep, but we need to accept if/elifs with thousands of clauses.

What about a grammar change, so that the parser iterates?
It would still need post-processing to create the current AST, though.

pablogsal · 2022-08-01T12:59:10Z

What about a grammar change, so that the parser iterates?
It would still need post-processing to create the current AST, though.

It depends on what you mean by "iterating". If you are referring to making the elif be represented as a list, then that's a simple change but we need to alter the AST. If you mean that the parser needs to iterate over the clauses and construct a linear version and then the tree version, then as I mentioned previously, in general, that goes strongly against the current parser design. Not only is iterating quite foreign to the parser design, but we also put quite a lot of effort into making the parser emit the final AST with no post-processing needed. I would prefer not to go this way.

If you want the parser to fully not recurse we need to bring back the prototype we had to make the parser work as a bytecode interpreter. That would be the best solution, but that's not without downsides, though.

In any case, there are even more limiting factors: the ast constructors, optimizers, visitors and validators will recurse and they will become the limiting factor because they currently use the C stack to go down. So even if you have a tree created without recursion, the fact that the tree is very deep will crash these other parts unless you also redesign them.

pablogsal · 2022-08-01T13:03:22Z

I think the best change here is to bite the bullet and change the ast so clauses are in a list instead of a nested tree. That would break everyone but it won't put a lot of crazy redesign burden on the interpreter.

Another alternative is just the status quo, which is suboptimal but not that bad. In my machine the parser handles a conditional with less than 10000 elifs with no problem (although the parser fails and this is platform specific). More than that we raise MemoryError. I agree that having arbitrarily big chains of conditionals would be a great thing, but if the cost is redesigning several parts of the compiler pipeline, I don't think it is worth the effort and the risk of introducing bugs, behaviour changes and other stuff.

markshannon · 2022-08-01T13:31:08Z

It depends on what do you mean by "iterating".

I meant changing the grammar, not the parser.
Something like (in vaguely EBNF syntax)

if_stmt:
   "if" cond ":" clause ( "elif" cond ":" clause )* ( "else" ":" clause )?

We could then post-process to convert back to the deeper tree the AST currently has.
The transformation is not complicated, but it does slow things down.

In my machine the parser handles a conditional with less than 10000 elifs with no problem

On Linux, sure. But we want portability, which means controlling recursion.

pablogsal · 2022-08-01T13:38:09Z

Ah, I see what you mean. 👍

We could then post-process to convert back to the deeper tree the AST currently has.

I would be ok with that if the parser is the one doing the transformation (no post-processing needed). I argued before why it is important that the parser produces the final AST with no post-processing needed. The key here would be how to avoid consuming C stack during the processing, which I think should not be a problem. Something like (pseudocode):

current_node = root
for clause in clauses:
    current_node.elif = Elif_Node(clause)
    current_node = current_node.elif

This will consume double memory for the elif chain because all nodes need to be alive at the same time.
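
For reference, a runnable rendering of the pseudocode above, using Python's ast module as a stand-in for the node constructors the C parser would call. That substitution is an assumption made only for illustration, and chain_clauses is a hypothetical name. Clauses arrive as a flat list and get linked into the nested shape the AST has today, with a loop rather than recursion.

```python
import ast


def chain_clauses(clauses, else_body=None):
    """Link a flat list of (test, body) pairs into nested ast.If nodes.

    The first pair is the 'if'; every later pair becomes an 'elif',
    stored as the single statement in the previous node's orelse.
    Location information is not filled in by this sketch.
    """
    first_test, first_body = clauses[0]
    root = ast.If(test=first_test, body=first_body, orelse=[])
    current = root
    for test, body in clauses[1:]:
        nested = ast.If(test=test, body=body, orelse=[])
        current.orelse = [nested]
        current = nested
    if else_body:
        current.orelse = else_body
    return root


if __name__ == "__main__":
    pairs = [(ast.Name(id=name, ctx=ast.Load()), [ast.Pass()]) for name in "abc"]
    tree = chain_clauses(pairs, else_body=[ast.Pass()])
    print(tree.orelse[0].orelse[0].test.id)
```

The loop itself uses constant stack, but the result is still a deep tree, so anything that later walks it recursively has the same problem.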

The transformation is not complicated, but it does slow things down.

I suppose it depends on how much slower this gets, but we should be aware that the tradeoff may not be worth it.

In any case, I think it is going to be much much much better for everyone in the long run if we do something like you propose (( "elif" cond ":" clause )* ) but we just make the elif a list in the AST instead of using a tree branch. That would solve it for every single piece of the pipeline. The only downside is that it will break every AST tool, but this is a change we are allowed to make.

In any case, this problem exists for more things, for example:

x+x+x+x+x... thousands of times ... +x+x+x

also creates a tree that will exhaust the stack in the compiler, the parser and every single piece in the middle.
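
A small sketch to make the growth visible, assuming the ast module's view of the tree matches what the compiler sees. max_ast_depth and plus_chain are hypothetical helpers, and the walk uses an explicit stack so the measurement itself does not recurse.

```python
import ast


def max_ast_depth(tree):
    """Return the depth of the deepest node, using an explicit stack."""
    deepest = 0
    stack = [(tree, 1)]
    while stack:
        node, depth = stack.pop()
        deepest = max(deepest, depth)
        stack.extend((child, depth + 1) for child in ast.iter_child_nodes(node))
    return deepest


def plus_chain(terms):
    """Return source such as 'x+x+x' with `terms` operands."""
    return "+".join(["x"] * terms)


if __name__ == "__main__":
    # Each extra '+x' nests one more BinOp on the left-hand side.
    for terms in (2, 3, 10):
        print(terms, max_ast_depth(ast.parse(plus_chain(terms))))
```

Because + is left-associative, the depth grows by one per operand, so a chain with thousands of terms gives a tree thousands of levels deep.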

brettcannon · 2024-03-13T23:54:34Z

Since the tests are all passing on WASI at this point, I'm closing this as fixed.

iritkatriel mentioned this issue Jul 27, 2022

Direct unit tests for compiler optimisations #93678

Closed

tiran changed the title ~~test_compiler_recursion_limit fails on WASM~~ test_compiler_recursion_limit fails on wasm32-wasi Jul 27, 2022

This was referenced Jul 27, 2022

gh-95335: extract 'struct cfg_builder' from the compiler so that the CFG can be manipulated directly #95107

Merged

gh-95335: reduce stack consumption in the compiler #95333

Closed

brettcannon added the OS-wasi label Oct 4, 2022

iritkatriel added the tests Tests in the Lib/test dir label Nov 23, 2023

brettcannon added type-bug An unexpected behavior, bug, or error and removed type-bug An unexpected behavior, bug, or error labels Mar 1, 2024

brettcannon closed this as completed Mar 13, 2024
