Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reconcile reftype short hands wrt nullability #266

Closed
rossberg opened this issue Jan 19, 2022 · 14 comments
Closed

Reconcile reftype short hands wrt nullability #266

rossberg opened this issue Jan 19, 2022 · 14 comments

Comments

@rossberg
Copy link
Member

Currently, i31ref, dataref, and arrayref expand to non-null, while anyref, eqref, funcref, and externref are null. Should probably be more principled.

@tlively
Copy link
Member

tlively commented Jan 19, 2022

Yes, please. I'm always forgetting which ones are null and which are not. I suggest that we make them all nullable because writing anyref instead of (ref any) is less of an improvement than writing it instead of (ref null any).

@rossberg
Copy link
Member Author

Makes sense to me. Other opinions?

@manoskouk
Copy link
Contributor

We are already committed to funcref and externref being nullable, therefore if we want a uniform treatment, nullable is the only choice we have without making breaking changes. i31ref is somewhat counterintuitive as nullable in my opinion, although that might not be relevant depending on the resolution of #130.

@titzer
Copy link
Contributor

titzer commented Jan 21, 2022

I agree with making all the shorthands nullable.

@jakobkummerow
Copy link
Contributor

(Moving my comment from the PR)

I think it would be nice to get some data: take as many GC-using Wasm modules as you can find, and count how often they use (ref X) and (ref null X) for all relevant values of X. Then we can define the meaning of the shorthands such that they actually save a maximum number of bytes. If we find that they don't save many bytes at all, then we might as well drop them entirely, as replacing a two-byte (ref X) or (ref null X) with a one-byte xref is pretty much their only purpose.

Note that I'm not opposed to flipping every single default if need be. I'm just saying it'd be good to have a solid reason for doing so. With Wasm being a compilation target, convenience, aesthetics, and symmetry/regularity/uniformity matter far less than they do in human-written languages, and hence are not convincing arguments for making design decisions.

@tlively
Copy link
Member

tlively commented Feb 18, 2022

FWIW, I've mostly been thinking of this as a text format change and I only recently connected the dots and realized there would be a binary encoding change as well.

I just measured the effect of never using the binary shorthands on the 10MB optimized J2CL module and it only increased code size by 13 bytes. I assume that's not representative of all use cases, but after seeing that I would certainly be in favor of making the text shorthands nullable and removing the binary shorthands that haven't already been standardized. Or removing the new text shorthands as well.

@rossberg
Copy link
Member Author

rossberg commented Feb 20, 2022

Interesting. Does J2CL use non-concrete types or i31 much, though?

I did the same experiments with removing all (new) binary shorthands and saw a +0.34% size increase on the total size of the Waml test suite modules and +0.64% for the Wob test suite.

On the other hand, #277 costed +7 bytes in total on the Waml test suite, and made no difference whatsoever on the Wob one.

With everything else being pretty much equal like this, I'd let consistency take precedent.

@tlively
Copy link
Member

tlively commented Feb 21, 2022

Interesting. Does J2CL use non-concrete types or i31 much, though?

No, I don't think so.

Anyway, given the relatively small effect this choice seems to have on code size, I'd also be in favor of choosing the most self-consistent design. Since this does come with a backward-incompatible binary format change, we'll have to wait to batch up the change with other incompatible changes and won't be able to implement the change right away, but that's not a big deal.

@jakobkummerow
Copy link
Contributor

How likely are we to eventually regret having spent encoding space on these shorthands? We can use the entire range of 7 bits (where 0x7f..0x79 and 0x70..0x66 are used as of this proposal, and 0x65..0x0 are still free), right? In that case I think the answer is "unlikely", so no concerns there.
(The current encoding choices at first glance look like we may want to avoid collisions with 0x60 = "func", but I think that's misleading and there's no actual technical reason to avoid such collisions? If I'm mistaken about this, and the available encoding range does end at 0x61, then I think that would be a reason to not introduce these shorthands in the binary format, to keep sufficient room for future types.)

@rossberg
Copy link
Member Author

We have the range from 0x40..0x7f for single-byte type constructors (since they have to be negative sleb's). That's quite a few for types. But we can go beyond a single byte, so it's effectively unlimited (or at least any negative s33).

I would want to avoid overlapping constructors, even across different type sorts, since there may be reasons to unify some of the sorts later. Under that constraint, the shorthands don't occupy additional coding space.

@tlively
Copy link
Member

tlively commented Mar 9, 2022

@jakobkummerow, do you have any remaining concerns about landing #277?

@jakobkummerow
Copy link
Contributor

Based on the data you collected (thanks for that!), I'm not convinced that adding new shorthands is useful, so if it was up to me, I'd just drop them. But I don't feel strongly about it, so you may consider this a "Neutral" vote.

Note that this is a breaking change, so we'll have to be careful about logistics when we update implementations (both toolchains and engines).

@rossberg
Copy link
Member Author

Anybody wants to approve #277 then?

@tlively
Copy link
Member

tlively commented Apr 4, 2022

#277 was merged, so I'll go ahead and close this.

@tlively tlively closed this as completed Apr 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants