-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Final opcodes #372
Final opcodes #372
Conversation
Looks OK. The bulk array instructions (from #363) appear to be missing. Personally, I would have compacted the opcode encoding space a lot more, essentially to assign a contiguous range. My reasoning is that whatever gaps we choose to leave now have an exceedingly small likelihood to turn out to have appropriate size, so as we add instructions in future proposals we'll likely end up with a wildly fragmented mix of remaining gaps that were too large and others that were too small where we then had to simply append new opcodes at the end of the range or squeeze them into "non-fitting" gaps. OTOH, it's not a big issue because (1) at the end of the day we're just choosing between different kinds of ugly for the encoding space a couple of proposals down the line, and (2) prettiness or ugliness of the binary encoding doesn't actually matter. |
FWIW, we tried to leave useful gaps in the SIMD proposal opcode space, and they did not turn out to be useful at all. If we had instead used a compact opcode space, we might not have needed to use an extra LEB byte for the relaxed SIMD opcodes. I don't feel strongly either way, though. |
+1 to just compacting the opcode space without gaps. The 1-byte opcode space is important, but IMO there is marginal benefit, if any, in leaving holes in the prefixed opcode spaces. |
Nit: I'd suggest to keep different encodings for the same instruction together, i.e. put |
I agree that we should fill up gaps before going to two bytes, but currently we are far away from hitting that boundary, and never may. Is there any particular advantage in compacting before that? I thought that leaving some gaps actually worked reasonably well for some of the Wasm 1.0 code space. @askeksa-google, good point, fixed. |
Is there any particular advantage to leaving gaps? FWIW, here's the summary:
One data point to illustrate the low likelihood of gap sizes being lucky guesses: Aske has previously suggested a set of 8 branching i31 instructions, which (if we end up deciding to adopt them in a future proposal) wouldn't have an obvious fit in any of these gaps. |
I don't think we should leave gaps in the prefixed opcode spaces. First, as mentioned above, I think it's unlikely we'll predict the size of gaps properly, meaning there will likely always be vestigial gaps. Second, leaving gaps reduces the number of two-byte encodings being utilized; in the future, will new instructions start filling in gaps if they'd otherwise overflow into the 3 byte space? Third, engines may way to use some of the unused opcode space for internal things, like quickened versions of bytecodes, which benefits interpreters. |
Coming back to this:
@titzer, I didn't follow your point about custom opcodes. Isn't that orthogonal? |
It's a minor point, but it's often useful to have a few opcodes left over for the VM, in each of the opcode spaces. This is easier to do if the opcode space fills up from the lowest numbers first. It doesn't work very well to try to use an opcode that is currently a hole and later filled up.
The 1-byte opcode space is a much more important space, but in retrospect, I think we would have been fine to have not left holes in it. |
I see, but this is mostly a question of leaving sufficient space at the end, right? Which I would think we have even now. |
Let's discuss this at the subgroup meeting tomorrow. If we don't come to an agreement through discussion, I propose that we take a popularity vote to settle the issue. It doesn't seem important enough to be worth spending much additional energy or time on. |
I won't be able to attend tomorrow because of a conflict, but I would generally prefer that we pack the prefixed opcode spaces rather than leaving holes. Is a wider CG discussion warranted, as this could be a potentially precedent-setting decision? |
If the purpose is to have some nice groupings, we could still do that while leaving significantly smaller holes, e.g.:
|
Maybe, but it seems extremely low priority and CG time is hard to come by these days. We didn't have a critical mass of people to resolve this at the meeting yesterday, but I'd like to take the temperature here. We are only adding 31 prefixed opcodes, so even if we leave generous holes, we will not be close to overflowing to the second LEB byte, which would require 128 opcodes. Given that, do you: 🚀 : prefer leaving holes |
Whatever direction this goes in, I think it will have significant aesthetic value, and probably some practical value as well, if encodings that are obvious pairs (last 8 in the currently proposed ordering in particular) are aligned to only differ in the lsb. |
As discussed at the last subgroup meeting, I've pushed a commit that simply removes all of the holes in the instruction opcode space but preserves the order of instructions. Here's a sheet to help visualize the difference: https://docs.google.com/spreadsheets/d/1eaMdpL-MwLOwWgj-mWgv_tNq5crP_nemQ-PW54ss-Q8/edit?usp=sharing Unfortunately, when you just close the holes like that, the obvious pairs of instructions all start on odd numbers, so they differ in their low two bits rather than just their lowest bit. I added a third column to the spreadsheet skipping a single opcode to "fix" that, but I don't think it's worth it. |
proposals/gc/MVP.md
Outdated
| 0xfb43 | `ref.cast (ref null ht)` | `ht : heaptype` | | ||
| 0xfb48 | `br_on_cast $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` | | ||
| 0xfb49 | `br_on_cast_fail $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` | | ||
| 0xfb02 | `struct.get $t i` | `$t : typeidx`, `i : fieldidx` | struct accessors (0x08+) | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The notes about group prefixes are no longer accurate.
proposals/gc/MVP.md
Outdated
| 0xfb1a | `ref.test (ref null ht)` | `ht : heaptype` | | ||
| 0xfb1b | `ref.cast (ref ht)` | `ht : heaptype` | | ||
| 0xfb1c | `ref.cast (ref null ht)` | `ht : heaptype` | | ||
| 0xfb1d | `br_on_cast $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| 0xfb1d | `br_on_cast $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` | | |
| 0xfb1d | `br_on_cast $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`,` $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` | |
proposals/gc/MVP.md
Outdated
| 0xfb1b | `ref.cast (ref ht)` | `ht : heaptype` | | ||
| 0xfb1c | `ref.cast (ref null ht)` | `ht : heaptype` | | ||
| 0xfb1d | `br_on_cast $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` | | ||
| 0xfb1e | `br_on_cast_fail $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| 0xfb1e | `br_on_cast_fail $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, $l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` | | |
| 0xfb1e | `br_on_cast_fail $l (ref null1? ht1) (ref null2? ht2)` | `flags : u8`, `$l : labelidx`, `ht1 : heaptype`, `ht2 : heaptype` | |
proposals/gc/MVP.md
Outdated
| -0x31 | `rec dt*` | `dt* : vec(subtype)` | | | ||
| -0x32 | `sub final $t* st` | `$t* : vec(typeidx)`, `st : strtype` | shorthand | | ||
| -0x31 | `sub final $t* st` | `$t* : vec(typeidx)`, `st : strtype` | shorthand | | ||
| -0x34 | `rec dt*` | `dt* : vec(subtype)` | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the interpreter wasn't changed accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, thanks. Fixed. Also changed the rec opcode to -0x32, not sure why I picked something else before.
I do think it's worth skipping an opcode to align the pairs, especially since for We still stay within the first 32 opcodes. |
FWIW, if we care for bits that way, shouldn't we also make it so that any _u/_s pairs of instructions only differ by LSB? |
Perhaps, although I would say that |
I notice that this is already not the case for some MVP instructions, so I think we can leave it as is. |
Cool, let's put this into something of a "final comment period." If you are ok with the currently proposed encoding, please give this comment a 🚀 react. Otherwise, please give it an 👀 react and comment with a specific change you would like to see. After a week or so, if we have consensus, we can go ahead and merge this. |
Thanks, everyone. I'll go ahead and merge this now since there have been no objections. |
Update the encodings for ref.as_non_null, br_on_null, (ref ht), and (ref null ht) for consistency with the final encodings chosen in WebAssembly/gc#372.
Update the encodings for ref.as_non_null, br_on_null, (ref ht), and (ref null ht) for consistency with the final encodings chosen in WebAssembly/gc#372. Fixes #103.
Fix various places in the spec text and interpreter where binary encodings were inconsistent with those we decided on in #372.
Fix various places in the spec text and interpreter where binary encodings were inconsistent with those we decided on in #372.
Here is my proposal for the final opcode reordering (fixes #337 and #370).
Trying to group related instructions together and leave space for future extensions in respective opcode ranges.
Also reordering type opcodes to group nullary constructors together and before non-nullary ones (c.f. #337).