Multiple values support. #66

Closed

@ghost ghost commented Apr 7, 2016

This is a WIP branch adding some multiple-value expression support to the interpreter. It uses bottom-up typing and demonstrates that even this single-pass compiler can get enough type information to generate useful output, and that adding type information to blocks is not necessary. The code is rough and far from optimized, but it seems to be enough to demonstrate the type system and the ability to handle these operations, which is the main goal.

The extensions:

  • The values operator constructs a multiple value expression from its arguments, and the values type subsumes the current void and single value types: (values) == (nop) and (values (i32.const 1)) == (i32.const 1). A single value is consumed from each argument and the rest are discarded: (values (values (i32.const 1) (i32.const 2)) (values (i32.const 3) (i32.const 4))) == (values (i32.const 1) (i32.const 3))
  • The conc_values operator constructs a multiple value expression from the concatenation of all the values of its arguments. For example (conc_values (i32.const 1) (nop) (values (i32.const 2) (i32.const 3))) == (values (i32.const 1) (i32.const 2) (i32.const 3)). Valid code must have a fixed number of values for each argument to ensure that the number of consumed values for each expression is static, for example the following is invalid: (conc_values (if (i32.const 1) (nop) (i32.const 1))). This operator would challenge top-down typing, which might require encoding the number of values consumed from each argument.
  • The mv_call operator is similar to the call operator but accepts a single expression argument and passes all of its values to the callee. For example (mv_call $f (values (i32.const 1) (i32.const 2))) == (call $f (i32.const 1) (i32.const 2)). This could be generalized to accept multiple arguments and concatenate all the values like conc_values, but this can be done with (mv_call $f (conc_values ...)) so it would just be an extra convenience.
  • Where there are multiple exits or inputs/outputs for an operator, such as block/break, if, and select, the type system accepts conflicting value types and missing values so long as they are not consumed. This makes it convenient and efficient to discard unused values, which is a nice property in a language in which all operators are expressions. In the limit this gives the current behaviour for the single value types and the void/zero-value type, where single values can be discarded when not used. For example (i32.add (call $fn_returning_i32_f32) (call $fn_returning_i32)) is valid and (i32.add (call $fn_returning_i32_f32) (call $fn_returning_i32_f64)) is also valid, and in both cases only the first value of each call is consumed.
  • The br, br_if, return, and potentially the br_table operators pass on all the values of their single expression. The text syntax accepts a missing expression, which is encoded as a nop: (br $l) == (br $l (nop)) == (br $l (values)). For example (values (i32.const 1) (i32.const 2)) == (block (values (i32.const 1) (i32.const 2))) == (block $l (br $l (values (i32.const 1) (i32.const 2)))).
  • Functions can return multiple values. For example (func $f1 (result i32 i64) (values (i32.const 1) (i64.const 2))) for which (call $f1) => (values (i32.const 1) (i64.const 2)). The interpreter supports returning multiple values for exported functions.
  • A block1 operator returns the values of the first expression and discards the values of all other top-level expressions of the block, in contrast to the block operator, which returns the values of the last expression. Combined with block this allows returning the values of an arbitrary expression in an effective block. For example (block (exp1) (block1 (exp2) (exp3))) returns the values of (exp2).
  • TODO: Support for storing multiple values into multiple local variables would also be needed to make this viable. Perhaps this could be a pick-style operation allowing some values to be ignored and some used multiple times, or just the first n values; there are lots of options.
  • TODO: Support could also be added for picking/duplicating values if this helped.
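
The value-level behaviour of the operators above can be sketched outside the interpreter. This is only a rough Python model, not the code in this branch: a multiple value result is modelled as a tuple, (nop) as the empty tuple, and the helper names const, NOP, block and the error raised when values is handed an argument with no value are all assumptions made for illustration.

```python
from typing import Callable, Tuple

# A multiple value result is modelled as a tuple (an assumption of this sketch).
Values = Tuple[object, ...]

NOP: Values = ()  # (nop) == (values): zero values


def const(x: object) -> Values:
    """Stand-in for (i32.const x) and friends: exactly one value."""
    return (x,)


def values(*args: Values) -> Values:
    """`values`: consume the first value of each argument, discard the rest."""
    out = []
    for arg in args:
        if not arg:
            # Assumption: consuming a missing value is treated as invalid.
            raise ValueError("values: argument has no value to consume")
        out.append(arg[0])
    return tuple(out)


def conc_values(*args: Values) -> Values:
    """`conc_values`: concatenate all the values of all arguments."""
    return tuple(v for arg in args for v in arg)


def mv_call(fn: Callable[..., Values], arg: Values) -> Values:
    """`mv_call`: pass every value of the single argument to the callee."""
    return fn(*arg)


def block(*exprs: Values) -> Values:
    """`block`: result is the values of the last expression."""
    return exprs[-1] if exprs else NOP


def block1(*exprs: Values) -> Values:
    """`block1`: result is the values of the first expression."""
    return exprs[0] if exprs else NOP
```

With this model the equalities listed above can be checked directly, for example `values(values(const(1), const(2)), values(const(3), const(4)))` gives `(1, 3)` and `conc_values(const(1), NOP, values(const(2), const(3)))` gives `(1, 2, 3)`. The static restriction on conc_values (a fixed number of values per argument) is a validation rule and is not modelled here.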

The type system:

  • The base types are empty and values. The empty type is the set of no types and represents the type of unreachable code. The values type is a sequence of element types, each of which is one of the value types i32, i64, f32, f64, the union type which represents any of i32, i64, f32, or f64, or the optional type which also includes the value being missing.
  • Where an expression has potentially multiple result types at runtime, the result is computed with a union operation. The union of any type with the empty type is itself. The union of values types is computed per element: if either element is the optional type or missing then the result type is optional, otherwise if the types differ the result type is union, otherwise the types are the same and this is the result type for the element. Note that this computation is an expansion of the set of result value types, not the most specific type, but wasm only validly consumes values elements with a matching type, or expressions with a fixed number of values, so the union operators can be simplified to a flat sequence. For example, the precise union applied to (if (cond) (values (i32.const 1) (f32.const 2)) (values (f32.const 3) (i32.const 4))) is (or (values i32 f32) (values f32 i32)), but it is sufficient for wasm to expand this set to (values union union) == (or (values i32 i32) (values i32 i64) (values i32 f32) (values i32 f64) (values i64 i32) ...).
  • Type checking is a subset relationship - the actual type must be a subset of the expected type. Wasm consumers always expect a fixed number of values and all values consumed have a fixed expected type, so the expected type cannot be the empty type or have values type elements of type union or optional. (TODO: an expected element type for non-consumed elements.) If the actual type is the empty type then this test is always true for any expected type, as the empty set is a subset of all sets. If the actual and expected types are values types then all the consumed expected element types must match their respective element in the actual type, and other elements in the actual type are ignored.
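
The union and checking rules above can be sketched as follows. This is a minimal Python model under stated assumptions, not the interpreter's implementation: the empty type is represented as None, a values type as a tuple of element names, and the expected type as the tuple of consumed element types taken from the front of the sequence. The names EMPTY, union_elem, union_types and check are hypothetical.

```python
from itertools import zip_longest
from typing import Optional, Tuple

ValuesType = Tuple[str, ...]          # elements: i32, i64, f32, f64, union, optional
Type = Optional[ValuesType]           # None models the empty type (assumption)

EMPTY: Type = None                    # type of unreachable code
CONCRETE = {"i32", "i64", "f32", "f64"}


def union_elem(a: Optional[str], b: Optional[str]) -> str:
    """Union of two element types; None means the element is missing."""
    if a is None or b is None or "optional" in (a, b):
        return "optional"
    return a if a == b else "union"


def union_types(t1: Type, t2: Type) -> Type:
    """Union of two result types, computed per element."""
    if t1 is EMPTY:
        return t2
    if t2 is EMPTY:
        return t1
    # zip_longest pads the shorter sequence with None, i.e. a missing element.
    return tuple(union_elem(a, b) for a, b in zip_longest(t1, t2))


def check(actual: Type, expected: ValuesType) -> bool:
    """True if `actual` is acceptable where `expected` values are consumed.

    Assumption: the consumed elements are the first len(expected) elements.
    """
    if any(e not in CONCRETE for e in expected):
        raise ValueError("expected type must use only i32, i64, f32, f64")
    if actual is EMPTY:
        return True  # the empty set is a subset of every set
    if len(actual) < len(expected):
        return False  # a consumed value is missing
    # Extra elements in `actual` are ignored; consumed ones must match exactly.
    return all(a == e for a, e in zip(actual, expected))
```

For the if example above, `union_types(("i32", "f32"), ("f32", "i32"))` gives `("union", "union")`, which no consumer accepts. For the i32.add example, `union_types(("i32", "f32"), ("i32", "f64"))` gives `("i32", "union")`, and `check` of that against `("i32",)` succeeds because only the first element is consumed.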

@binji
Member

binji commented Apr 8, 2016

Interesting, thanks for sharing this.

@ghost
Author

ghost commented Apr 16, 2016

I see little prospect of this making the cut. A wasm focused on expressionless operators might be far simpler and meet the use cases where the call operator just returns multiple values to multiple local variables.

If I can be permitted to express an opinion: expressions in wasm seem to have brought it well into bike-shedding territory and made progress almost impossible. Whatever the outcome for expressions, it seems that they will just be some baggage to work around, to be canonicalized into a more primitive expressionless form and rebuilt into expressions for presentation. The pre-order/post-order saga seems unnecessary without expressions. I have seen a number of issues with the type system and been unable to make progress. I started out thinking that expressions were the saving grace for wasm but have come full circle and have completely given up on them. Giving up on expressions in the wasm code section does not mean giving up on them in the file as a whole, as they can be rebuilt with the aid of meta information for presentation. I have re-encoded AngryBots using a simple expressionless encoding, and already it is only 15% larger uncompressed and 16% larger brotli-encoded, and I see some prospect of closing this gap. I have explored writing a few of the expressionless instructions in v8 and their decoding seems efficient.

@ghost ghost closed this Apr 16, 2016
@ghost ghost deleted the multiple-values branch April 16, 2016 09:16
This pull request was closed.