Interpreter issues with let #35
Comments
I'm sorry if this is a stupid question, but how does an interpreter handle branch instructions if it doesn't track control scopes? Doesn't it need to pop the stack? |
There is no need for an explicit control stack. An interpreter has a side data structure to make branches efficient. The side data structure is a function of The |
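A minimal sketch of one possible branch side table, assuming a toy instruction encoding (tuples such as `("block",)`, `("loop",)`, `("br", depth)`, `("end",)`) that is purely illustrative and not necessarily the structure the commenter has in mind. The control stack exists only during the prepass; operand stack height adjustments are left out for brevity.

```python
def build_branch_table(code):
    """Map the pc of every `br` to its target pc.

    `code` is a list of tuples in a hypothetical toy encoding. The function
    body is treated as an implicit outermost block closed by the final `end`.
    """
    table = {}
    # Each entry: (kind, start_pc, list of forward branches awaiting the end).
    control = [("block", -1, [])]
    for pc, instr in enumerate(code):
        op = instr[0]
        if op in ("block", "loop"):
            control.append((op, pc, []))
        elif op == "br":
            kind, start, pending = control[-1 - instr[1]]
            if kind == "loop":
                table[pc] = start + 1   # backward branch: re-enter the loop body
            else:
                pending.append(pc)      # forward branch: resolved at the matching end
        elif op == "end":
            _, _, pending = control.pop()
            for br_pc in pending:
                table[br_pc] = pc + 1   # continue just after the block's end
    return table


code = [
    ("block",),   # pc 0
    ("loop",),    # pc 1
    ("br", 1),    # pc 2: leaves the block
    ("br", 0),    # pc 3: repeats the loop
    ("end",),     # pc 4: closes the loop
    ("end",),     # pc 5: closes the block
    ("nop",),     # pc 6
    ("end",),     # pc 7: closes the function body
]
branch_table = build_branch_table(code)
```

At run time a branch is then a single lookup such as `branch_table[2]`, with no control stack to maintain.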
Does it make sense to use a different block terminator for |
I considered that as an option, but it still means that the interpreter would need an additional stack pointer (typically already has 2 stack pointers) because now the stack grows in an additional direction.

For context, a typical interpreter will do the following:

    storage = | locals A | operand stack A ---> .... |

with a "stack pointer" pointing to the top of the operand stack, moving up and down with each instruction, and a non-moving "frame pointer" pointing to the bottom of the locals. The storage may be part of the interpreter's actual call stack (i.e. allocated on the "C" or "machine stack") or another array (just another data structure). Either way, a call to B would push a new frame like so:

    storage = | locals A | operand stack A | locals B | operand stack B --> ... |

That doesn't work if

    storage = | .....<-- | B locals | locals A | operand stack A | operand stack B --> ... |

Or you can put the locals and stack at different ends of that array,

    storage = | locals A | locals B | --> ... ... <-- operand stack B | operand stack A |

but you can't do either of those tricks with the interpreter's actual execution stack, which forces you to use a data structure, which you now have to manage with 3 pointers. Either way, it costs you, and it's complicated.

There is also the aesthetic aspect of having a special terminator for Also, all of the above assumes that we don't do de Bruijn indexing for let-bound locals, because none of that would work anyway. Separate issues, but worth mentioning. |
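The single-array layout described above can be sketched roughly as follows. The class and method names (`Interp`, `enter`, `leave`) and the list of saved frame pointers are hypothetical; a real interpreter would keep this bookkeeping in the frame itself.

```python
class Interp:
    """Toy model of | locals | operand stack --> | frames in one array."""

    def __init__(self, size=64):
        self.storage = [0] * size
        self.fp = 0            # frame pointer: bottom of the current locals, fixed per frame
        self.sp = 0            # stack pointer: one past the top of the operand stack
        self.saved_fps = []    # hypothetical: callers' frame pointers

    def enter(self, num_locals):
        """Push a new frame directly above the caller's operand stack."""
        self.saved_fps.append(self.fp)
        self.fp = self.sp
        self.sp = self.fp + num_locals
        for i in range(self.fp, self.sp):
            self.storage[i] = 0        # locals start zero-initialised

    def leave(self):
        """Drop the current frame and restore the caller's pointers."""
        self.sp = self.fp
        self.fp = self.saved_fps.pop()

    def push(self, value):
        self.storage[self.sp] = value
        self.sp += 1

    def pop(self):
        self.sp -= 1
        return self.storage[self.sp]

    def local_get(self, index):
        self.push(self.storage[self.fp + index])   # fixed offset from fp

    def local_set(self, index):
        self.storage[self.fp + index] = self.pop()


interp = Interp()
interp.enter(2)        # frame A: two locals
interp.push(7)
interp.local_set(1)    # A's local 1 = 7
interp.push(5)         # A's operand stack: [5]
interp.enter(1)        # frame B sits above A's operand stack
interp.push(9)
interp.local_set(0)    # B's local 0 = 9
b_fp = interp.fp
interp.leave()         # back in A
```

Only two pointers move here: `sp` on every push and pop, and `fp` on call and return. Local access is always a fixed offset from `fp`.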
Sounds reasonable. To be clear, I think asking for an upper bound on let-depth is perfectly reasonable. |
If the interpreter is already making a pre-pass to generate the branch side table, can't it do the same to generate a local lookup side-table? That is, |
@binji that'd be too much space. I suppose the prepass could compute the maximum let size (or simply include it in the total number of locals), but the indexing issue still remains. |
I agree that the prepass should be able to compute the maximum easily. As for the relative indexing, that's designed to correspond to a stack-like handling of the local index space. So I think all you'd need to do is treat the locals array as a downward-growing stack (whose maximum size you've precomputed), initialise it with the function-locals in reverse order, and then index all locals from top of stack. |
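One way to read this suggestion, as a rough sketch: the locals live in a fixed-size array used as a downward-growing stack, and every local index is an offset from the current top. `LocalsStack`, `enter_let`, and `exit_let` are illustrative names, and `max_let_slots` is assumed to come from a prepass.

```python
class LocalsStack:
    """Locals as a downward-growing stack indexed from its top."""

    def __init__(self, function_locals, max_let_slots):
        self.slots = [None] * (len(function_locals) + max_let_slots)
        self.top = len(self.slots)          # empty stack: top is one past the end
        # Push in reverse order so that local 0 ends up on top.
        for value in reversed(function_locals):
            self._push(value)

    def _push(self, value):
        self.top -= 1
        self.slots[self.top] = value

    def enter_let(self, values):
        """Bind new let-locals; the first one becomes index 0."""
        for value in reversed(values):
            self._push(value)

    def exit_let(self, count):
        """Unbind `count` let-locals when their scope ends."""
        self.top += count

    def get(self, index):
        return self.slots[self.top + index]

    def set(self, index, value):
        self.slots[self.top + index] = value


locals_stack = LocalsStack([10, 20], max_let_slots=2)
before = [locals_stack.get(0), locals_stack.get(1)]
locals_stack.enter_let([1, 2])
inside = [locals_stack.get(i) for i in range(4)]
locals_stack.exit_let(2)
after = [locals_stack.get(0), locals_stack.get(1)]
```

Inside the let scope the function locals are reached through indices 2 and 3 rather than 0 and 1, which is the relative numbering under discussion. Note that `top` is an extra pointer that moves on scope entry and exit.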
I understand the motivation for this numbering scheme for But for an interpreter or a simple JIT that would follow an interpreter's stack frame layout, this numbering scheme doesn't quite work no matter which way you point stacks, even if you know the maximum.

I thought through multiple variants of what you suggest, but it only works if you copy arguments upon a function call. The interpreter can't overlap the stack of caller and callee like it can now, because "local 0" (i.e. the first argument) is deeper in the stack than "local 1". A new let bound variable would then inherit index 0 (knocking local 0 up to local 1) and thus a stack pointer adjustment would be necessary.

But adjusting the stack pointer to be below the old local 0 upon (dynamic) execution of a But really, the biggest problem is with dynamic adjustment of the stack pointer with It's worth mentioning for posterity it would be possible to avoid that argument copy if there was a different instruction to access let-bound variables, like If let-bound variables extend the existing numbering scheme, then I think the prepass to compute the maximum is enough for an interpreter and there will be no overhead at all for using Thus I would propose that

Sorry for the noise on this issue. I hate making an interpreter a design criterion, as it was never one in the past, but this one kind of blew up in my face as I was implementing it :-) |
@titzer, to check that I understood this correctly, are you saying that you want to keep function arguments in place on the operand stack and push locals there as well, so that you can directly index args off their original location? And you'd want to reserve additional space on the operand stack to copy let-locals after those?

If so, then I'm afraid you'll eventually run into more problems than just let. For example, leaving arguments in place won't work with func.bind, because it would require inserting the bound arguments before the ones on stack when invoking call_ref. I suspect it also requires an extra round of copying for tail calls.

I was assuming an interpreter would pop a function's arguments and move them to an array of locals (which might still be allocated on the operand stack). That's almost as simple but with none of the problems. AFAICS, the only disadvantage is the need to move the arguments, but is that really a notable cost for an interpreter? |
Yes to the first paragraph. That already works smoothly for locals that are not let-bound.
It's not necessary to use two arrays of values in an interpreter; one will do, because frames can overlap in this way. That's how Java's operand stack works and our current numbering scheme allows that same trick. Copying isn't necessary now. But copying is not the main problem, it's the dynamic adjustment of the stack pointer that would require tracking control scopes to find matching ends as I described in the first comment. Maybe I should write shorter comments. |
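The overlapping-frames trick can be illustrated with a small sketch. The helper names `call` and `ret` are hypothetical, and a single result value is assumed.

```python
def call(storage, sp, num_params, num_extra_locals):
    """Enter a callee whose first locals are the arguments the caller pushed.

    Returns the callee's (fp, sp). No argument is copied: the top
    `num_params` operand slots of the caller simply become locals 0..n-1.
    """
    fp = sp - num_params
    for i in range(num_extra_locals):
        storage[sp + i] = 0                # declared locals follow the arguments
    return fp, sp + num_extra_locals


def ret(storage, fp, sp):
    """Return one result: it replaces the arguments on the caller's stack."""
    storage[fp] = storage[sp - 1]
    return fp + 1                          # the caller's new stack pointer


storage = [0] * 16
storage[2], storage[3] = 4, 5              # caller has pushed two arguments
caller_sp = 4

fp, sp = call(storage, caller_sp, num_params=2, num_extra_locals=1)
args_seen = (storage[fp + 0], storage[fp + 1], storage[fp + 2])

storage[sp] = storage[fp + 0] + storage[fp + 1]   # callee pushes arg0 + arg1
sp += 1
caller_sp = ret(storage, fp, sp)
```

In this layout local 0 sits at the lowest address of the frame, so there is no free slot below it for a let-bound variable that is supposed to take over index 0.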
Re the latter problem: how do you deal with adjusting the operand stack height when branching? For that, don't you need to record a stack height with each control construct in the prepass data? Couldn't you reuse this mechanism for the locals' height?

I realise this is slightly more work for an interpreter than indexing the other way round. OTOH, the nice property of relative indexing is that it is consistent with the treatment of labels. And that it composes: inner indices don't depend on outer context. This is particularly nice for code transformations and in a single-pass producer. Would be unfortunate to lose that.

For closures: if you have a locals array then dealing with nested closures is easy I think: you go depth first and copy each one's bindings to the array and finally pop the remaining operands off the stack. No stack twiddling required. |
To what extent can this interpreter modify the original bytecode? If it can, then I can imagine other optimizations here. For example, you could rewrite the local index directly if there is space (which there typically should be). You can have a sentinel value to a side table when that's not possible. |
@binji The interpreter is designed to never modify the original bytecode (think: the original module bytes are mapped read-only in memory). It would only copy and mutate a copy of the bytecode for debugging.

@rossberg Earlier in the thread I explained the side data structure and how it works for control scopes. It's space prohibitive to have side data structure entries for local variable access instructions. I get your point about local code transforms but it doesn't really work because any time you move any code that references a local index into or out of a let scope, you have to renumber anyway. It really only works if you move an entire scope at once, which is just a special case. |
OK, if you can't modify the bytecode, then you can still collect additional information in the prepass:
Then your side table doesn't need to be stored for all local accesses (which I agree would be prohibitive -- I measured 2.5 million in a 19MB wasm file). |
I have to mull your suggestions over a bit, but for context:
At this point I'm not hopeful about finding a better solution and even more dubious that there is any material benefit to the current numbering scheme. |
@titzer, what I was implying above is that, if you organise locals as a stack within the function's frame (an array with pre-computed max size), then all you need to add to your side table is another field The only additional issue is that you'll also need this There should be no need to store anything about individual local.get/set/tee instructions, nor to distinguish function- from let-locals.
You only need to renumber indices that point to locals outside the code you move -- same as for labels. That's the usual shift operation for de Bruijn indices. It's notably simpler than a general index substitution. |
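The shift operation mentioned here can be sketched as below, assuming a hypothetical nested toy encoding in which a let is written `("let", count, body)`. Indices bound inside the moved fragment stay as they are; only indices that reach outside it are adjusted.

```python
LOCAL_OPS = ("local.get", "local.set", "local.tee")


def shift(code, delta, cutoff=0):
    """Shift free local indices in `code` by `delta`.

    `cutoff` counts the let-locals bound inside the fragment at this point;
    an index below it refers to one of those and is left untouched.
    """
    out = []
    for instr in code:
        op = instr[0]
        if op == "let":
            _, count, body = instr
            out.append(("let", count, shift(body, delta, cutoff + count)))
        elif op in LOCAL_OPS:
            index = instr[1]
            out.append((op, index + delta if index >= cutoff else index))
        else:
            out.append(instr)
    return out


fragment = [
    ("local.get", 0),                    # free: refers outside the fragment
    ("let", 1, [
        ("local.get", 0),                # bound by the let just above
        ("local.get", 1),                # free: refers outside the fragment
    ]),
]
# Moving the fragment under a let that binds two locals: shift by +2.
moved = shift(fragment, 2)
```

Moving code out of a scope uses a negative `delta` in the same way.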
Let was removed, so this is obsolete. |
The let instruction is difficult to implement in an interpreter without significant overhead. In particular, a let instruction introduces a new block scope and binds new local variable indices that are accessible (and writeable) in that scope. This requires the use of a second stack pointer if using the common implementation technique where local variables are preallocated before the operand stack on a single array that increases with the call stack. Secondly, leaving the scope of the let block must unbind or pop these variable indices. As the scope of a let is terminated with a normal end, there is no way for an interpreter to know where the scope ends unless by dynamically tracking control scopes. This is not the case with any of the existing control constructs in Wasm, and is specific to let. Dynamically tracking control scopes is not necessary for any reason currently, and would penalize all functions, even ones without let.

A simple solution to this problem is to require that functions pre-declare all of the space that they will need for let bindings in their bodies. This allows an interpreter to preallocate space for let bindings, doesn't require a second stack pointer, and also benefits JITs, since they must also use a dynamic data structure (e.g. to track SSA values) that would have to grow and shrink with lets. Since the preamble of a function is just a series of local variable declarations with value types, we will probably need to reserve a value from the binary encoding of the value type space to indicate a pre-declaration of let indexing space.

I think the other alternatives for avoiding this problem (e.g. not having let, but requiring definite initialization of locals with value types that have no default value such as non-null references) are worse.
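As a rough illustration of the space a function would have to pre-declare, the sketch below computes the peak number of simultaneously live let-bound slots in a function body. It assumes a flat toy encoding where `("let", n)` binds `n` locals and every construct is closed by `("end",)`. The function name is hypothetical, and a control stack is needed only during this pass.

```python
def max_let_slots(code):
    """Return the peak number of let-bound locals live at any point."""
    open_constructs = []   # let-slot count per open construct (0 if not a let)
    live = 0
    peak = 0
    for instr in code:
        op = instr[0]
        if op == "let":
            open_constructs.append(instr[1])
            live += instr[1]
            peak = max(peak, live)
        elif op in ("block", "loop", "if"):
            open_constructs.append(0)
        elif op == "end" and open_constructs:
            # The final `end` of the function body finds the list empty.
            live -= open_constructs.pop()
    return peak


body = [
    ("let", 2),
    ("block",),
    ("let", 3),      # 5 let-locals are live here
    ("end",),
    ("end",),
    ("end",),
    ("let", 1),
    ("end",),
    ("end",),        # end of the function body
]
peak_slots = max_let_slots(body)
```

With this number known up front, the let slots can be allocated once together with the function's other locals.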